flink 问题总结(13)flink 上kafka相同grou
2021-08-05 本文已影响0人
ZYvette
背景
flink1.12版本中使用了flinksql,固定了groupid。但是因为重复上了两个相同任务之后,发现数据消费重复。
create table A (
...
) with (
'connector' = 'kafka',
'topic' = 'test',
'properties.bootstrap.servers' = 'localhost:9092',
'properties.group.id' = 'flink-test',
'scan.startup.mode' = 'latest-offset',
'format' = 'json',
'json.fail-on-missing-field'='true',
'json.map-null-key.mode'='DROP',
'sink.partitioner'='round-robin'
);
下图sink中创建两个相同任务,会消费相同数据。
问题原因
两个任务同时处理,并没有在一个consume group里,所以不会共同消费。
"Internally, the Flink Kafka connectors don’t use the consumer group management functionality because they are using lower-level APIs (SimpleConsumer in 0.8, and KafkaConsumer#assign(…) in 0.9) on each parallel instance for more control on individual partition consumption. So, essentially, the “group.id” setting in the Flink Kafka connector is only used for committing offsets back to ZK / Kafka brokers."
参考
https://stackoverflow.com/questions/38639019/flink-kafka-consumer-groupid-not-working