KeyBy

2019-08-25  本文已影响0人  yayooo

逻辑地将一个流拆分成不相交的分区,每个分区包含具有相同key的元素,再内部以hash的形式实现的。
DataStream → KeyedStream
其中KeyedStream是DataStream的子类

package com.atguigu.apiTest

import org.apache.flink.streaming.api.scala._

object TranformTest {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    //将全局并行度设置为1
    env.setParallelism(1)

    val streamFromFile: DataStream[String] = env.readTextFile("C:\\Users\\Administrator\\Desktop\\0311Flink\\flink\\src\\main\\resources\\hello.txt")

    //1.基本转换算子和简单聚合算子
    val dataStream: DataStream[SensorReading] = streamFromFile.map(data => {
      val dataArray: Array[String] = data.split(",")
      //包装成样例类
      SensorReading(dataArray(0).trim, dataArray(1).trim.toLong, dataArray(2).trim.toDouble)

    })

    //keyBy()
   val keyByStream: DataStream[SensorReading] = dataStream.keyBy("id").sum("temperature")

    keyByStream.print()


    env.execute()

  }

}

hello.txt文件如下:

sensor_1, 1547718199, 35.80018327300259
sensor_6, 1547718201, 15.402984393403084
sensor_7, 1547718202, 6.720945201171228
sensor_10, 1547718205, 38.101067604893444
sensor_7, 1547718202, 6

输出结果:

SensorReading(sensor_1,1547718199,35.80018327300259)
SensorReading(sensor_6,1547718201,15.402984393403084)
SensorReading(sensor_7,1547718202,6.720945201171228)
SensorReading(sensor_10,1547718205,38.101067604893444)
SensorReading(sensor_7,1547718202,12.720945201171228)

上一篇下一篇

猜你喜欢

热点阅读