RocketMQ Source Code Analysis: Long Polling
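Preamble
This article is mainly about long polling in RocketMQ. Why is it called long polling? I honestly don't know; everyone else calls it that, so I will too. As long as the meaning gets across.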
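Main text
RocketMQ consumers work in pull mode: they repeatedly ask the Broker for messages to consume. In the source code this is implemented by a background thread that keeps pulling (it first takes a PullRequest from a queue, and blocks while the queue is empty). Not long after I first studied the consumer's pull code, a question came up:
- If there are no messages for a long time and the consumer keeps requesting, won't that put a heavy load on the Broker?
I didn't think much about it at the time and only figured it out later. So let's first review how the Broker handles the case where there are no messages. The first step, of course, is to try to fetch messages (the code below is in PullMessageProcessor.java):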
// ... other code omitted
final GetMessageResult getMessageResult =
        this.brokerController.getMessageStore().getMessage(requestHeader.getConsumerGroup(), requestHeader.getTopic(),
                requestHeader.getQueueId(), requestHeader.getQueueOffset(), requestHeader.getMaxMsgNums(), subscriptionData);
// ... other code omitted
Then look at how the result is checked:
switch (getMessageResult.getStatus()) {
    // ... other code omitted
    case NO_MATCHED_LOGIC_QUEUE:
    case NO_MESSAGE_IN_QUEUE:
        if (0 != requestHeader.getQueueOffset()) {
            // ... other code omitted
        } else {
            response.setCode(ResponseCode.PULL_NOT_FOUND);
        }
        break;
    case OFFSET_FOUND_NULL:
        response.setCode(ResponseCode.PULL_NOT_FOUND);
        break;
    case OFFSET_OVERFLOW_ONE:
        response.setCode(ResponseCode.PULL_NOT_FOUND);
        break;
}
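The above covers the cases where no message is found. response is what gets sent back to the consumer, and here it is given a specific code, PULL_NOT_FOUND.
Next, look at how the response code is handled: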
switch (response.getCode()) {
    case ResponseCode.PULL_NOT_FOUND:
        if (brokerAllowSuspend && hasSuspendFlag) {
            // If long polling is enabled on the broker, hold the request for the suspend timeout
            // sent by the consumer; otherwise hold it for the short polling time (1s by default)
            long pollingTimeMills = suspendTimeoutMillisLong;
            if (!this.brokerController.getBrokerConfig().isLongPollingEnable()) {
                pollingTimeMills = this.brokerController.getBrokerConfig().getShortPollingTimeMills();
            }
            String topic = requestHeader.getTopic();
            long offset = requestHeader.getQueueOffset();
            int queueId = requestHeader.getQueueId();
            // Wrap everything about this request, including the channel, into a PullRequest and
            // save it in pullRequestTable, i.e. hold the current request
            PullRequest pullRequest = new PullRequest(request, channel, pollingTimeMills,
                    this.brokerController.getMessageStore().now(), offset, subscriptionData);
            this.brokerController.getPullRequestHoldService().suspendPullRequest(topic, queueId, pullRequest);
            response = null;
            break;
        }
}
Notice response = null. When the response is null, nothing is sent back to the consumer, so the consumer cannot run its callback yet.
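The branch above boils down to one small decision: hold the request or answer straight away, and if holding, for how long. Here is a minimal sketch of that decision as a pure function. The class name SuspendDecisionSketch, the method holdMillis and the -1 sentinel are made up for illustration and are not part of RocketMQ; only the flag names mirror the excerpt.

```java
// Illustrative sketch only: SuspendDecisionSketch and holdMillis are hypothetical names.
class SuspendDecisionSketch {
    /**
     * Returns how long (in ms) the broker would hold a pull that found no message,
     * or -1 when it should answer PULL_NOT_FOUND immediately.
     */
    static long holdMillis(boolean brokerAllowSuspend, boolean hasSuspendFlag,
                           boolean longPollingEnable, long suspendTimeoutMillis,
                           long shortPollingTimeMillis) {
        if (!(brokerAllowSuspend && hasSuspendFlag)) {
            return -1; // suspension not allowed or not requested: respond right away
        }
        // Long polling uses the consumer-supplied timeout, short polling the broker setting.
        return longPollingEnable ? suspendTimeoutMillis : shortPollingTimeMillis;
    }
}
```

For example, holdMillis(true, true, false, 15000, 1000) yields 1000, because long polling is switched off and the broker's short polling time applies. The values 15000 and 1000 are just sample inputs.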
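That raises two more questions:
- What difference does it make to the consumer whether it gets a response or not?
- If no response is sent now, when will the consumer get one?
Start with the first question. Its answer is in pullMessageAsync of MQClientAPIImpl (the source pulls asynchronously here, so this async method is the one called; the synchronous case behaves differently):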
this.remotingClient.invokeAsync(addr, request, timeoutMillis, new InvokeCallback() {
    @Override
    public void operationComplete(ResponseFuture responseFuture) {
        RemotingCommand response = responseFuture.getResponseCommand();
        if (response != null) {
            try {
                PullResult pullResult = MQClientAPIImpl.this.processPullResponse(response);
                assert pullResult != null;
                pullCallback.onSuccess(pullResult);
            } catch (Exception e) {
                pullCallback.onException(e);
            }
        } else {
            // ....
        }
    }
});
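The response == null branch is not the focus here. First look at what happens when a response does arrive: the callback pullCallback is invoked with the parsed PullResult. That callback is defined in the pullMessage method of DefaultMQPushConsumerImpl: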
// other code omitted
PullCallback pullCallback = new PullCallback() {
    @Override
    public void onSuccess(PullResult pullResult) {
        if (pullResult != null) {
            switch (pullResult.getPullStatus()) {
                case FOUND:
                    if (pullResult.getMsgFoundList() == null || pullResult.getMsgFoundList().isEmpty()) {
                        DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                    } else {
                        if (DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval() > 0) {
                            DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest,
                                    DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval());
                        } else {
                            DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                        }
                    }
                    break;
                case NO_NEW_MSG:
                    DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                    break;
                case NO_MATCHED_MSG:
                    DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                    break;
            }
        }
    }
    @Override
    public void onException(Throwable e) {
        DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest, PullTimeDelayMillsWhenException);
    }
};
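Once a response arrives, the callback ends up calling executePullRequestImmediately or executePullRequestLater (which also calls executePullRequestImmediately underneath, just after a delay). Either way, the effect is to put the pullRequest back into the queue. That gives the answer to the first question:
- On success there is nothing special: the messages are consumed, the pullRequest goes back into the queue, and the background thread takes it out again to request the next batch of messages;
- If no message is found, the Broker has not responded, so this code is never reached. The background thread stays blocked until a response (or a failure) finally comes back and the pullRequest is returned to the queue; only then does the next attempt start.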
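The two cases above can be sketched as a tiny consumer loop in which a request only re-enters the queue from the response callback. This is a simplified model under stated assumptions, not RocketMQ's code: the class PullRequeueSketch, the runOnce method, and the use of a String as the pull request and a CompletableFuture as the broker reply are all hypothetical. Only the name executePullRequestImmediately is borrowed from the source.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Illustrative sketch only; everything except the method name
// executePullRequestImmediately is a hypothetical stand-in.
class PullRequeueSketch {
    private final BlockingQueue<String> pullRequestQueue = new LinkedBlockingQueue<>();
    private final Function<String, CompletableFuture<String>> broker;
    private int pullsSent = 0;

    PullRequeueSketch(Function<String, CompletableFuture<String>> broker) {
        this.broker = broker;
    }

    // Put the request back so the background loop can send it again.
    void executePullRequestImmediately(String pullRequest) {
        pullRequestQueue.offer(pullRequest);
    }

    // One iteration of the background loop: wait for a queued request and send it.
    // Returns false when nothing was queued within waitMillis.
    boolean runOnce(long waitMillis) {
        String pullRequest;
        try {
            pullRequest = pullRequestQueue.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (pullRequest == null) {
            return false; // the only request is still held by the broker
        }
        pullsSent++;
        // The request is re-queued only when the broker finally answers.
        broker.apply(pullRequest)
              .thenAccept(result -> executePullRequestImmediately(pullRequest));
        return true;
    }

    int pullsSent() {
        return pullsSent;
    }
}
```

With a broker function that returns a future it has not completed yet, the first runOnce sends one pull and every later runOnce returns false. The consumer issues no further requests until the future completes, which is why a held request does not turn into a busy loop against the broker.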
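Now for the second question. For this one it helps to first read the article "ConsumeQueue Introduction and Its Construction Process". In the periodic ConsumeQueue build process there are these few lines: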
if (BrokerRole.SLAVE != DefaultMessageStore.this.getMessageStoreConfig().getBrokerRole() && DefaultMessageStore.this.brokerConfig.isLongPollingEnable()) {
    DefaultMessageStore.this.messageArrivingListener.arriving(dispatchRequest.getTopic(), dispatchRequest.getQueueId(), dispatchRequest.getConsumeQueueOffset() + 1, dispatchRequest.getTagsCode());
}
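When a message is written to the CommitLog (that is, a Producer has sent a message), the ConsumeQueue is built and the code above is invoked. So what does this arriving method actually do? Stepping in, its core is a call to notifyMessageArriving of PullRequestHoldService: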
public void notifyMessageArriving(final String topic, final int queueId, final long maxOffset, final Long tagsCode) {
    String key = this.buildKey(topic, queueId);
    ManyPullRequest mpr = this.pullRequestTable.get(key);
    if (mpr != null) {
        List<PullRequest> requestList = mpr.cloneListAndClear();
        if (requestList != null) {
            List<PullRequest> replayList = new ArrayList<PullRequest>();
            for (PullRequest request : requestList) {
                long newestOffset = maxOffset;
                if (newestOffset <= request.getPullFromThisOffset()) {
                    newestOffset = this.brokerController.getMessageStore().getMaxOffsetInQuque(topic, queueId);
                }
                Long tmp = tagsCode;
                if (newestOffset > request.getPullFromThisOffset()) {
                    if (tagsCode == null) {
                        // tmp = getLatestMessageTagsCode(topic, queueId,
                        // maxOffset);
                    }
                    if (this.messageFilter.isMessageMatched(request.getSubscriptionData(), tmp)) {
                        try {
                            this.brokerController.getPullMessageProcessor().excuteRequestWhenWakeup(request.getClientChannel(),
                                    request.getRequestCommand());
                        } catch (RemotingCommandException e) {
                            log.error("", e);
                        }
                        continue;
                    }
                }
                // If the hold has timed out, try the pull once more; if that still finds nothing, respond directly
                if (System.currentTimeMillis() >= (request.getSuspendTimestamp() + request.getTimeoutMillis())) {
                    try {
                        this.brokerController.getPullMessageProcessor().excuteRequestWhenWakeup(request.getClientChannel(),
                                request.getRequestCommand());
                    } catch (RemotingCommandException e) {
                        log.error("", e);
                    }
                    continue;
                }
                replayList.add(request);
            }
            if (!replayList.isEmpty()) {
                mpr.addPullRequest(replayList);
            }
        }
    }
}
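The logic is simple. First take the PullRequests held for this topic + queueId out of pullRequestTable. If a newer offset exists and the tag matches, call excuteRequestWhenWakeup, which re-runs PullMessageProcessor.this.processRequest as if the broker had just received the consumer's request and sends the response back. A request whose hold time has expired is re-processed the same way. If this second attempt still finds no message, the request is not held again (brokerAllowSuspend is false on this path) and the response is returned directly.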
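The hold-and-wake-up bookkeeping just described can be sketched in a few lines. This is a simplified model, not the real PullRequestHoldService. The class HoldServiceSketch, the HeldPull record-like type, the woken list and the "topic@queueId" key format are assumptions made for illustration. Tag filtering is left out, and the current time is passed in as a parameter so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; class, field and key-format choices are hypothetical.
class HoldServiceSketch {
    // One suspended pull: who asked, from which offset, and when the hold expires.
    static final class HeldPull {
        final String consumer;
        final long pullFromThisOffset;
        final long suspendTimestamp;
        final long timeoutMillis;

        HeldPull(String consumer, long pullFromThisOffset, long suspendTimestamp, long timeoutMillis) {
            this.consumer = consumer;
            this.pullFromThisOffset = pullFromThisOffset;
            this.suspendTimestamp = suspendTimestamp;
            this.timeoutMillis = timeoutMillis;
        }
    }

    private final Map<String, List<HeldPull>> pullRequestTable = new HashMap<>();
    // Consumers whose request was re-processed, in wake-up order.
    final List<String> woken = new ArrayList<>();

    private static String buildKey(String topic, int queueId) {
        return topic + "@" + queueId; // key format is an assumption
    }

    synchronized void suspendPullRequest(String topic, int queueId, HeldPull pull) {
        pullRequestTable.computeIfAbsent(buildKey(topic, queueId), k -> new ArrayList<>()).add(pull);
    }

    // Called when the queue's max offset advances; now is the current time in ms.
    synchronized void notifyMessageArriving(String topic, int queueId, long maxOffset, long now) {
        List<HeldPull> held = pullRequestTable.remove(buildKey(topic, queueId));
        if (held == null) {
            return;
        }
        List<HeldPull> replayList = new ArrayList<>();
        for (HeldPull pull : held) {
            boolean hasNewMessage = maxOffset > pull.pullFromThisOffset;
            boolean timedOut = now >= pull.suspendTimestamp + pull.timeoutMillis;
            if (hasNewMessage || timedOut) {
                woken.add(pull.consumer); // stands in for re-running the pull and replying
            } else {
                replayList.add(pull);     // nothing new yet: keep holding
            }
        }
        if (!replayList.isEmpty()) {
            pullRequestTable.put(buildKey(topic, queueId), replayList);
        }
    }

    synchronized int heldCount(String topic, int queueId) {
        List<HeldPull> held = pullRequestTable.get(buildKey(topic, queueId));
        return held == null ? 0 : held.size();
    }
}
```

Suppose two consumers are held on the same queue, one waiting from offset 5 and one from offset 9. A notification with maxOffset 6 wakes only the first; the second goes back into the table and is released by a later notification, either when the offset passes 9 or when its hold time has run out.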
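With that, the second question is answered too, and the whole long polling mechanism is clear. To sum up:
- The consumer keeps taking requests from the PullRequest queue and sending them to the broker; once the broker responds, it processes the result and puts the PullRequest back into the queue for the next pull
- When the broker finds no message it holds the request; as ReputMessageService keeps building the ConsumeQueue, the held requests are taken out and processed a second time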