Flink recovery

2019-02-25 大大大大大大大熊

The recovery process

Recovery involves four steps: restarting the cluster, restoring the cluster's state, rewinding the Kafka consumers to the offsets recorded in the checkpoint, and replaying the events from that point.
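The restore, rewind, and replay steps can be illustrated with a toy model. This is only a sketch under stated assumptions, not Flink's API: `ToyJob`, `take_checkpoint`, and `recover` are hypothetical names, and a plain Python list stands in for a replayable Kafka partition.

```python
class ToyJob:
    """Hypothetical stand-in for a streaming job that sums events from a replayable log."""

    def __init__(self, log):
        self.log = log            # replayable source (plays the role of a Kafka partition)
        self.offset = 0           # next position to read from the log
        self.total = 0            # operator state: running sum of processed events
        self.checkpoint = (0, 0)  # last snapshot as (offset, state)

    def process(self, n):
        """Consume up to n events, updating state and offset together."""
        for _ in range(n):
            if self.offset >= len(self.log):
                break
            self.total += self.log[self.offset]
            self.offset += 1

    def take_checkpoint(self):
        """Record the offset and the state at the same logical point."""
        self.checkpoint = (self.offset, self.total)

    def recover(self):
        """Restore state and rewind the consumer to the checkpointed offset."""
        self.offset, self.total = self.checkpoint


log = [1, 2, 3, 4, 5, 6]
job = ToyJob(log)
job.process(2)
job.take_checkpoint()      # snapshot after two events
job.process(2)             # progress made after the checkpoint is lost on failure
job.recover()              # restart: restore state, rewind the offset
offset_after_recover = job.offset
job.process(len(log))      # replay from the checkpointed offset to the end
```

Because the offset and the state are snapshotted together, the events processed after the checkpoint are replayed exactly once against the restored state, and the final sum matches a run with no failure.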

When a single TaskManager fails

It doesn't matter that only one TaskManager went down. All of the TaskManagers are restarted, and they all restore the state recorded in the most recent checkpoint, rewind to the offsets stored in it, and resume processing the events replayed from that point. Checkpoints are global, spanning the entire cluster, so there is no support for partial recovery.
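A small sketch of why one failure costs every task its progress, assuming a global checkpoint is simply one (offset, count) entry per task. The names `snapshot`, `advance`, `recover_all`, `tm-1`, and `tm-2` are made up for illustration and do not correspond to Flink classes or configuration.

```python
def snapshot(tasks):
    """Take a global checkpoint: one (offset, count) entry for every task."""
    return {name: (t["offset"], t["count"]) for name, t in tasks.items()}


def advance(task, n):
    """Pretend the task consumed n events; offset and state move together."""
    task["offset"] += n
    task["count"] += n


def recover_all(tasks, checkpoint):
    """Roll every task back to the checkpoint, including the healthy ones.

    Returns how many events each task has to process again.
    """
    replayed = {}
    for name, t in tasks.items():
        ckpt_offset, ckpt_count = checkpoint[name]
        replayed[name] = t["offset"] - ckpt_offset
        t["offset"], t["count"] = ckpt_offset, ckpt_count
    return replayed


tasks = {
    "tm-1": {"offset": 0, "count": 0},
    "tm-2": {"offset": 0, "count": 0},
}
advance(tasks["tm-1"], 3)
advance(tasks["tm-2"], 5)
checkpoint = snapshot(tasks)   # global checkpoint across both tasks

advance(tasks["tm-1"], 3)
advance(tasks["tm-2"], 3)      # tm-2 is healthy and keeps making progress

# Suppose tm-1 fails here: recovery still rewinds both tasks.
replayed = recover_all(tasks, checkpoint)
```

In this model `tm-2` never failed, yet it also rewinds to its checkpointed offset and reprocesses the three events it had already handled, which is what a global, all-or-nothing recovery implies.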

[Figure: image.png]

Why recovery takes a long time

[Figure: image.png]
