Kafka事务分析
Kafka 幂等性
-
Kafka幂等性含义
幂等性起初是在HTTP协议中定义,是指一次和多次请求同一个资源对资源本身应当具有同样的效果。从Kafka的角度来说,客户端多次发送同一条消息与发送一次消息,kafka服务只应当保存一次该消息。
-
kafka幂等性
-
问题
- 在生成消息时,Broker保存消息之后,在发回ack之前,broker宕机,producer将会重试发送该消息,可能引发消息重复
- 前一条消息发送失败,后一条消息发送成功,当前一条消息重试成功后,将导致消息乱序
-
方案
kafka引入了Producer ID (PID)和Sequence Number(SN)来实现kafka的幂等性。
- producer端:
每个Kafka producer初始化一个PID, 每发送消息时,同时发送Producer SN,单调自增
- Broker端:
Broker端维护了每个{PID,Topic,Partition}维护一个序列号Broker SN,每次check完之后写入消息时,该序列号变更为消息的SN
当producer发送消息,Broker接收到消息后:
如果Producer SN = Broker SN + 1 则Broker保留此消息,并将Broker SN + 1
如果Producer SN > Broker SN + 1,则说明数据尚未写入,出现了乱序,则会被拒绝
如果Producer SN < Broker SN, 说明消息重复发送,则消息会被丢弃
-
Kafka 事务
关于Kafka的事务网上有很多介绍,本文基于API调用顺序及API实现的角度对consume-trans-produce完整流程进行分析。
- KakfaProducer#initTransactions()
查找Transaction Coordinator 并获取 Producer ID (pid)
-
Transaction Coordinator 的查找是隐藏的,在TransactionManager的initializeTransactions中会生成InitProducerIdHandler并放入pendingRequests中
-
Sender的run方法中调用maybeSendTransactionalRequest,在处理InitProducerHandler时,先取出该request,由于此时没有Transaction Coordinator,于是调用transactionManager.lookupCoordinator将FindCoordinatorHandler放入pendingRequests,并在此放入InitProducerIdHandler
- Transaction Coordinator的处理逻辑
Coordinator的请求时调用 handleFindCoordinatorRequest,其实现与响应查找Group Coordinator请求一样,其实现也是一直到,主要逻辑是根据transactionId的哈希值和topic的partitionCount去余找出对应的partition Id,并返回topic信息
Utils.abs(transactionalId.hashCode) % transactionTopicPartitionCount
- Transaction Coordinator 通过handleInitProducerIdRequest(request方法生成PID
该方法首先会根据transactionId查找之前是否有相关事务,并根据之前的事务状态进行不同的处理逻辑,如果出事状态为空,会为该请求生成相应的PID,并将该信息写入_transaction_state,并返回InitProducerIdResult
Request 信息:
Request InitProducerIdRequestData(transactionId,transactionTImeout) Response case class InitProducerIdResult(producerId: Long, producerEpoch: Short, error: Errors) 写入__transaction_state信息:
topic __transaction_state Key transactionalId Value private[transaction] case class TxnTransitMetadata(producerId: Long, producerEpoch: Short,txnTimeoutMs: Int, txnState: TransactionState, topicPartitions: immutable.Set[TopicPartition], txnStartTimestamp: Long, txnLastUpdateTimestamp: Long) Time time.milliseconds() 消息格式如下:
baseOffset: 4 lastOffset: 4 count: 1 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 516 CreateTime: 1571627852548 size: 119 magic: 2 compresscodec: NONE crc: 4102323319 isvalid: true
| offset: 4 CreateTime: 1571627852548 keysize: 14 valuesize: 37 sequence: -1 headerKeys: [] key:
TesttranId payload: ��`����m�Q���������Producer epoch 在producer发送InitProducerIdResult请求,服务端查询对于同一个transactionId之前有存储相关状态时,则会返回客户端之前的ProducerId,同时将Producer的epoch增加1
-
连续发送lookupCoordinator,InitProducerIdHandler
- KakfaProducer#beginTransaction()
改方法主要进行客户端
状态初始化,不涉及与集群交互
-
KafkaConsumer.poll()
该方法实现从kafka中获取数据,根据kafka的consumer group的逻辑,consumer会发送
SYNC_GROUP
请求,而GroupCoordinator接收到此消息后,会将消费组信息写入内部__consumer_offset的topic中服务端(GroupCoordinator)处理消息 : case ApiKeys.SYNC_GROUP => handleSyncGroupRequest(request)
改方法会写入消息到__consumer_offset中,类似: baseOffset: 0 lastOffset: 0 count: 1 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 0 CreateTime: 1571627856304 size: 311 magic: 2 compresscodec: NONE crc: 30386071 isvalid: true | offset: 0 CreateTime: 1571627856304 keysize: 14 valuesize: 227 sequence: -1 headerKeys: [] key: /consumer-1-99af3db9-1ffb-4fa5-a43a-2efd12f976d2��b9-1ffb-4fa5-a43a-2efd12f976d2m�Q� consumer-1 /10.4.43.182���� tranTopic tranTopic
-
KafkaProducer.send()
KafkaProducer的消息发送方法主要分为如下几个主要步骤:
-
addPartitionsToTxnRequest,在发送消息前检查消息的TopicPartition是否已经在该事务中,如果没有,则需要向Transaction Coordinator 发送addPartitionsToTxnRequest
服务端调用 handleAddPartitionToTxnRequest(request) 处理该请求,将该TopicPartition信息写入__transaction_state(主要是更新了TopicPartition信息,可能事务起始时间),
写入磁盘信息:
topic __transaction_state Key transactionalId Value private[transaction] case class TxnTransitMetadata(producerId: Long, producerEpoch: Short,txnTimeoutMs: Int, txnState: TransactionState, topicPartitions: immutable.Set[TopicPartition], txnStartTimestamp: Long, txnLastUpdateTimestamp: Long) Time time.milliseconds() 写入消息如下:
baseOffset: 5 lastOffset: 5 count: 1 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 635 CreateTime: 1571627857554 size: 139 magic: 2 compresscodec: NONE crc: 3124473386 isvalid: true
| offset: 5 CreateTime: 1571627857554 keysize: 14 valuesize: 56 sequence: -1 headerKeys: [] key:
TesttranId payload: ��` TopicTestm�Q��m�Q�� -
发送生产消息请求值对应的TopicPartition的leader
leader通过handleProduceRequest(request)将消息写入对应的日志文件中,消息格式如下:
baseOffset: 6 lastOffset: 7 count: 2 baseSequence: 0 lastSequence: 1 producerId: 5002 producerEpoch: 1 partitionLeaderEpoch: 0 isTransactional: true isControl: false position: 350 CreateTime: 1571627623516 size: 97 magic: 2 compresscodec: NONE crc: 1101266558 isvalid: true
| offset: 6 CreateTime: 1571627623510 keysize: -1 valuesize: 9 sequence: 0 headerKeys: [] payload: FirstMess
| offset: 7 CreateTime: 1571627623516 keysize: -1 valuesize: 13 sequence: 1 headerKeys: [] payload: SecondMessage
-
-
KafkaProducer.sendOffsetsToTransaction()
sendOffsetsToTransaction 主要包含两部分,分别是发送ADD_OFFSETS_TO_TXN请求至Transaction Coordinator和发送TXN_OFFSET_COMMIT至GroupCoordinator
-
Transaction Coordinator 通过handleAddPartitionToTxnRequest(request) 将相关信息写入__transaction_state,消息格式如下:
topic __transaction_state Key transactionalId Value private[transaction] case class TxnTransitMetadata(producerId: Long, producerEpoch: Short,txnTimeoutMs: Int, txnState: TransactionState, topicPartitions: immutable.Set[TopicPartition], txnStartTimestamp: Long, txnLastUpdateTimestamp: Long) Time time.milliseconds() baseOffset: 6 lastOffset: 6 count: 1 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 774 CreateTime: 1571627857608 size: 168 magic: 2 compresscodec: NONE crc: 3868568548 isvalid: true
| offset: 6 CreateTime: 1571627857608 keysize: 14 valuesize: 84 sequence: -1 headerKeys: [] key:
TesttranId payload: ��`__consumer_offsets TopicTestm�Q��m�Q��此时会将consumer的groupid对应的
__comsumer_offset的partition信息写入__transaction_state
-
Group Coordinator通过handleTxnOffsetCommitRequest方法将相关信息写入__consumer_offsets,消息格式较为复杂,可参考GroupMetadataManager#storeOffsets方法
baseOffset: 1 lastOffset: 1 count: 1 baseSequence: 0 lastSequence: 0 producerId: 5002 producerEpoch: 1 partitionLeaderEpoch: 0 isTransactional: true isControl: false position: 311 CreateTime: 1571627858221 size: 137 magic: 2 compresscodec: NONE crc: 1296828467 isvalid: true
| offset: 1 CreateTime: 1571627858221 keysize: 29 valuesize: 39 sequence: 0 headerKeys: [] key:
MyConsumer tranTopic payload: ����sfsdfesdfsdssdfm�Q�'
-
-
KafkaProducer.commitTransaction()/KafkaProducer.abortTransaction()
如上两个方法均表示客户端终止事务的请求,两个方法均是直接向Transaction Coordinator发送END_TXN请求,服务端在处理该请求时,是通过handleEndTxnRequest(request)完成的。
-
Transaction Coordinator在接收到客户端的结束事务的请求后,先在__transaction_state中写入PrepareComplete/PrepareAbort,写入消息格式如下:
topic __transaction_state Key transactionalId
| Value | private[transaction] case class TxnTransitMetadata(producerId: Long, producerEpoch: Short,txnTimeoutMs: Int, txnState: TransactionState, topicPartitions: immutable.Set[TopicPartition], txnStartTimestamp: Long, txnLastUpdateTimestamp: Long) |
| Time | time.milliseconds() |请求 EndTxnRequest(version, transactionalId, producerId, producerEpoch, result) > baseOffset: 7 lastOffset: 7 count: 1 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 942 CreateTime: 1571627858574 size: 168 magic: 2 compresscodec: NONE crc: 3579425289 isvalid: true > | offset: 7 CreateTime: 1571627858574 keysize: 14 valuesize: 84 sequence: -1 headerKeys: [] key: > TesttranId payload: ��`__consumer_offsets TopicTestm�Q��m�Q��
-
在写入消息完成后,TransactionCoordinator将
向该事务的所有消息的TopicPartition的leader
以及__consumer_offset中对应groupId的partition的leader
发送WRITE_TXN_MARKERS请求,各broker接收到此请求后,通过handleWriteTxnMarkersRequest(request)将事务的状态写入log中,消息格式较为复杂,可参考handleWriteTxnMarkersRequest.handleWriteTxnMarkersRequest(request)TopicPartition :
baseOffset: 8 lastOffset: 8 count: 1 baseSequence: -1 lastSequence: -1 producerId: 5002 producerEpoch: 1 partitionLeaderEpoch: 0 isTransactional: true isControl: true position: 447 CreateTime: 1571627858578 size: 78 magic: 2 compresscodec: NONE crc: 1728728300 isvalid: true
| offset: 8 CreateTime: 1571627858578 keysize: 4 valuesize: 6 sequence: -1 headerKeys: [] endTxnMarker: COMMIT coordinatorEpoch: 0 -
当所有的TopicPartition均将事务标记写入log文件之后,Transaction Coordinator会将事务结束的标志写入__transaction_state中,消息格式如下:
topic __transaction_state Key transactionalId Value private[transaction] case class TxnTransitMetadata(producerId: Long, producerEpoch: Short,txnTimeoutMs: Int, txnState: TransactionState, topicPartitions: immutable.Set[TopicPartition], txnStartTimestamp: Long, txnLastUpdateTimestamp: Long) Time time.milliseconds() baseOffset: 8 lastOffset: 8 count: 1 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 1110 CreateTime: 1571627858584 size: 119 magic: 2 compresscodec: NONE crc: 2367176543 isvalid: true
| offset: 8 CreateTime: 1571627858584 keysize: 14 valuesize: 37 sequence: -1 headerKeys: [] key:
TesttranId payload: ��`m�Q��m�Q��
PS1: 从以上可以看出,所有TransactionCoordinator写入__transaction_state的消息格式都是一样的。
PS2 : Kafka内部的消息(
__transaction_state, __consumer_offset
)的查看,可以使用kafka自带的控制台消费命令,如果直接消费,将打印乱码,无法识别。 不过我们可以使用指定消息类型的方式来友好的展示其信息内容,如查看__transaction_state
的消息类型,如使用transactionId为“TesttranId”的producer, 如下是查看该producer的事务交互信息:-
查找TransactionId对应的__transaction_state的partition
scala> Math.abs("TesttranId".hashCode)%50
res0: Int = 48 -
使用如下命令查看该partition内容:
kafka-console-consumer.sh --topic __transaction_state --partition 48 --bootstrap-server 10.1.236.66:9092 --formatter "kafka.coordinator.transaction.TransactionLog$TransactionLogMessageFormatter"
-
打印消息格式如下(可以看出其信息与上文中所述的存储格式是一致的):
TesttranId::TransactionMetadata(transactionalId=TesttranId, producerId=9075, producerEpoch=3, txnTimeoutMs=60000, state=Ongoing, pendingState=None, topicPartitions=Set(__consumer_offsets-22, TopicTest-0), txnStartTimestamp=1572316082745, txnLastUpdateTimestamp=1572316082795)
对于
__transaction_state, __consumer_offset
两中类型的主题,可分别使用如下的formatter类打印其存储信息主题 formatter类 __transaction_state
"kafka.coordinator.transaction.TransactionLog$TransactionLogMessageFormatter" __consumer_offsets
"kafka.coordinator.group.GroupMetadataManager$OffsetsMessageFormatter" PS3 : 所有交互的发送方,相应方对照:
请求 发起方 响应方 查找Transaction Coordinator 客户端producer 任一broker 查找PID 客户端producer Transaction Coordinator SYNC_GROUP 客户端consumer Group Coordinator AddPartitionsToTxnRequest 客户端producer Transaction Coordinator Produce 客户端producer TopicPartition 的Leader ADD_OFFSETS_TO_TXN 客户端producer Transaction Coordinator TXN_OFFSET_COMMIT 客户端producer Group Coordinator END_TXN 客户端producer Transaction Coordinator WRITE_TXN_MARKERS Transaction Coordinator 所有partition的leader -