flink 问题总结(13)flink 上kafka相同grou

2021-08-05  本文已影响0人  ZYvette

背景

flink1.12版本中使用了flinksql,固定了groupid。但是因为重复上了两个相同任务之后,发现数据消费重复。

create table A (
    ...
 ) with (
  'connector' = 'kafka',
   'topic' = 'test',
   'properties.bootstrap.servers' = 'localhost:9092',
   'properties.group.id' = 'flink-test',
   'scan.startup.mode' = 'latest-offset',
   'format' = 'json',
    'json.fail-on-missing-field'='true',
   'json.map-null-key.mode'='DROP',
   'sink.partitioner'='round-robin'
 );

下图sink中创建两个相同任务,会消费相同数据。

问题原因

两个任务同时处理,并没有在一个consume group里,所以不会共同消费。

"Internally, the Flink Kafka connectors don’t use the consumer group management functionality because they are using lower-level APIs (SimpleConsumer in 0.8, and KafkaConsumer#assign(…) in 0.9) on each parallel instance for more control on individual partition consumption. So, essentially, the “group.id” setting in the Flink Kafka connector is only used for committing offsets back to ZK / Kafka brokers."

参考

https://stackoverflow.com/questions/38639019/flink-kafka-consumer-groupid-not-working

上一篇 下一篇

猜你喜欢

热点阅读