Spring整合rabbitmq实践(二):扩展
Spring整合rabbitmq实践(一):基础
Spring整合rabbitmq实践(三):源码
3. 扩展实践
3.1. MessageConverter
前面提到只要在RabbitTemplate中配置了MessageConverter,在发送和接收消息的时候就能自动完成Message和自定义java对象的自动转换。
MessageConverter接口只有两个方法:
public interface MessageConverter {
/**
* Convert a Java object to a Message.
* @param object the object to convert
* @param messageProperties The message properties.
* @return the Message
* @throws MessageConversionException in case of conversion failure
*/
Message toMessage(Object object, MessageProperties messageProperties) throws MessageConversionException;
/**
* Convert from a Message to a Java object.
* @param message the message to convert
* @return the converted Java object
* @throws MessageConversionException in case of conversion failure
*/
Object fromMessage(Message message) throws MessageConversionException;
}
即使不手动配置MessageConverter,也会有一个默认的SimpleMessageConverter,
它会直接将java对象序列化。
官方文档不建议使用这个MessageConverter,因为SimpleMessageConverter是将java对象在producer端序列化,然后在consumer端反序列化,这会将producer和consumer紧密地耦合在一起,并且仅限于java平台。
推荐用JsonMessageConverter、Jackson2JsonMessageConverter,这两个是都将java对象转化为json再转为byte[]来构造Message对象,前一个用的是jackson json lib,后一个用的是jackson 2 json lib。
@Bean
public MessageConverter jsonMessageConverter() {
return new Jackson2JsonMessageConverter();
}
@Bean
@Autowired
public RabbitTemplate rabbitTemplate(ConnectionFactory connectionFactory,
MessageConverter messageConverter) {
RabbitTemplate template = new RabbitTemplate(connectionFactory);
template.setMessageConverter(messageConverter);
return template;
}
还有一些其它的MessageConverter实现类,当然如果有需要也可以自己实现。
3.2. Exception Handling
有两个error handler类可以对@RabbitListener注解的方法中抛出的异常进行处理。
一个是RabbitListenerErrorHandler接口,并将其设置到@RabbitListener注解中,如下:
@Bean
public RabbitListenerErrorHandler rabbitListenerErrorHandler(){
return new RabbitListenerErrorHandler() {
@Override
public Object handleError(Message amqpMessage, org.springframework.messaging.Message<?> message, ListenerExecutionFailedException exception) throws Exception {
System.out.println(message);
throw exception;
}
};
}
@RabbitListener(queues = "test_queue_1", errorHandler = "rabbitListenerErrorHandler")
public void listen(Message message){
...
}
另一个是ErrorHandler接口,并将其设置到RabbitListenerContainerFactory中。
public interface ErrorHandler {
/**
* Handle the given error, possibly rethrowing it as a fatal exception.
*/
void handleError(Throwable t);
}
@RabbitListener注解的方法中抛出的异常,首先会进入RabbitListenerErrorHandler,这里如果没有能力处理这个异常,需要将其重新抛出(否则不会进入ErrorHandler),然后异常将会进入ErrorHandler,一旦异常进入ErrorHandler就意味着消息消费失败了(所以不需要重新抛出异常)。
RabbitListenerErrorHandler没有默认配置,ErrorHandler有一个默认的ConditionalRejectingErrorHandler类,它的处理方式是打印日志,然后辨别异常类型,如果属于以下几种异常,
o.s.amqp...MessageConversionException
o.s.messaging...MessageConversionException
o.s.messaging...MethodArgumentNotValidException
o.s.messaging...MethodArgumentTypeMismatchException
java.lang.NoSuchMethodException
java.lang.ClassCastException
则包装成AmqpRejectAndDontRequeueException抛出,这个异常的作用是,忽略defaultRequeueRejected(前文已经讲过)的设置,强制让rabbitmq丢弃此条处理失败消息,不放回queue。
这样处理是因为这些异常是不可挽回的,就算再重新执行也一样会抛异常,如果放回到queue就会陷入“消费失败-放回queue-消费失败...”的死循环。不过这是1.3.2版本之后新增的功能,之前的版本如果设置放回queue会陷入死循环,需要自己实现ErrorHandler来处理。
3.3. Transactions
rabbitmq和spring-amqp官方文档对事务的描述都非常少,简单介绍一下了解到的信息。
rabbitmq官方文档对amqp事务的整体定位是这样的:
Overall the behaviour of the AMQP tx class, and more so its implementation on RabbitMQ, is closer to providing a 'batching' feature than ACID capabilities known from the database world.
amqp事务仅仅适用于publish和ack,rabbitmq增加了reject的事务。其它操作都不具备事务特性。也就是说,rabbitmq本身的事务可以保证producer端发出的消息成功被broker收到(不能保证一定会进入queue),consumer端发出的确认信息成功被broker收到,其它诸如consumer端具体的消费逻辑之类如果想要获得事务功能,需要引入外部事务。
引入rabbitmq事务很简单,将RabbitTemplate或者RabbitListenerContainerFactory的channelTransacted属性设为true即可,示例:
@Autowired
@Bean
public AmqpTemplate amqpTemplate(ConnectionFactory amqpConnectionFactory){
RabbitTemplate rabbitTemplate = new RabbitTemplate();
rabbitTemplate.setConnectionFactory(amqpConnectionFactory);
rabbitTemplate.setChannelTransacted(true);
return rabbitTemplate;
}
这样,获得的Channnel就有了事务功能。
也可以直接操作Channel:
Channel channel = cachingConnectionFactory.createConnection().createChannel(true);
try {
//channel.txSelect();上面createChannel已经设为true了,这句可以去掉
channel.basicPublish("xxx", "xxx", new AMQP.BasicProperties(), JSON.toJSONString(event).getBytes());
channel.txCommit();
} catch (IOException e) {
try {
channel.txRollback();
} catch (IOException e1) {
}
} finally {
try {
channel.close()
} catch (Exception e) {
}
}
需要注意的是,直接通过Connection获取的Channel需要手动close:
Channels used within the framework (e.g. RabbitTemplate) will be reliably returned to the cache. If you create channels outside of the framework, (e.g. by accessing the connection(s) directly and invoking createChannel()), you must return them (by closing) reliably, perhaps in a finally block, to avoid running out of channels.
对于producer端,同样的发送一条消息到一个不存在的exchange:
amqpTemplate.convertAndSend("notExistExchange", "routingKey", object);
如果关闭事务,如上文提到过,CachingConnectionFactory会打出一条错误日志,但程序会正常运行。
如果打开事务,由于消息没有到达broker,这里会抛出异常。
对于consumer端,当consumer正在处理一条消息时:
如果broker挂掉,程序会不断尝试重连,当broker恢复时,会重新收到这条消息;
如果程序挂掉,broker发现还没有收到consumer的确认信息但consumer没了,会将这条消息恢复;
长时间没有收到consumer端的确认信息,也会将消息从unacked状态变成ready状态;
如果程序处理消息期间抛异常,broker会收到一个nack或者reject,也会将这条消息恢复。
所以,rabbitmq是可以将没有成功消费的消息恢复的,个人觉得consumer端使用rabbitmq事务的意义并不是很大,也许可以用于consumer端消息去重:
consumer处理成功向rabbitmq发出了ack,consumer默认rabbitmq收到了这个ack所以consumer认为这条消息处理结束,但实际可能rabbitmq没有收到ack又将这条消息放回queue然后重新发给consumer导致消息重复处理。如果开启了事务,能保证rabbitmq一定能收到确认信息,否则事务提交失败。
另外,需要注意的是,开启事务会大幅降低消息发送及接收效率,因为当已经有一个事务存在时,后面的消息是不能被发送或者接收(对同一个consumer而言)的,所以以上两种场景都不推荐使用事务来解决。
3.4. Listeners
@Bean
public ChannelListener channelListener() {
return new ChannelListener() {
@Override
public void onCreate(Channel channel, boolean transactional) {
logger.info("channel number:{}, nextPublishSqlNo:{}",
channel.getChannelNumber(),
channel.getNextPublishSeqNo());
}
@Override
public void onShutDown(ShutdownSignalException signal) {
logger.error("channel shutdown, reason:{}, errorLevel:{}",
signal.getReason().protocolMethodName(),
signal.isHardError() ? "connection" : "channel");
}
};
}
ChannelListener接口,监听Channel的创建和异常关闭。
@Bean
public BlockedListener blockedListener() {
return new BlockedListener() {
@Override
public void handleBlocked(String reason) throws IOException {
logger.info("connection blocked, reason:{}", reason);
}
@Override
public void handleUnblocked() throws IOException {
logger.info("connection unblocked");
}
};
}
BlockedListener监听Connection的block和unblock。
@Bean
public ConnectionListener connectionListener() {
return new ConnectionListener() {
@Override
public void onCreate(Connection connection) {
logger.info("connection created.");
}
public void onClose(Connection connection) {
logger.info("connection closed.");
}
public void onShutDown(ShutdownSignalException signal) {
logger.error("connection shutdown, reason:{}, errorLevel:{}",
signal.getReason().protocolMethodName(),
signal.isHardError() ? "connection" : "channel");
}
};
}
ConnectionListener监听Connection的创建、关闭和异常终止。
@Bean
public RecoveryListener recoveryListener() {
return new RecoveryListener() {
@Override
public void handleRecovery(Recoverable recoverable) {
logger.info("automatic recovery completed");
}
@Override
public void handleRecoveryStarted(Recoverable recoverable) {
logger.info("automatic recovery started");
}
};
}
RecoveryListener监听开始自动恢复Connection、自动恢复连接完成。
ConnectionListener、ChannelListener、RecoveryListener设置到ConnectionFactory即可。
@Autowired
@Bean
public CachingConnectionFactory cachingConnectionFactory(ConnectionListener connectionListener,
ChannelListener channelListener,
RecoveryListener recoveryListener) {
CachingConnectionFactory connectionFactory = new CachingConnectionFactory();
connectionFactory.setAddresses(mqConfigBean.getAddresses());
connectionFactory.setUsername(mqConfigBean.getUsername());
connectionFactory.setPassword(mqConfigBean.getPassword());
connectionFactory.setVirtualHost(mqConfigBean.getVirtualHost());
connectionFactory.addConnectionListener(connectionListener);
connectionFactory.addChannelListener(channelListener);
connectionFactory.setRecoveryListener(recoveryListener);
connectionFactory.setChannelCacheSize(3);
return connectionFactory;
}
ConnectionListener、ChannelListener可以正常触发,RecoveryListener暂时还没发现怎么触发。BlockedListener还没有发现应该设置在哪里,ConnectionFactory没有这个设置。
通过ConnectionListener和ChannelListener可以debug看出Connection和Channel都是有缓存的,因为onCreate()方法不会每次都调用。并且Connection和Channel的创建都是lazy的,程序启动时不会创建Connection和Channel,在第一次用到的时候才会创建。
3.5. 多个@RabbitListener消费一个queue
一个服务中可以有多个@RabbitListener注解的方法消费一个queue,如下:
@RabbitListener(queues = "queueName")
public void listener1(Message message) {
...
}
@RabbitListener(queues = "queueName")
public void listener2(Message message) {
...
}
这样写使用的仍是同一个Connection,一条消息也不会被两个方法都调用,如果RabbitListenerContainerFactory中设置concurrentConsumer为3,意味着每个方法产生3个consumer,一共会有6个consumer对这个queue进行消费。
也可以分布在不同的应用程序中,那样会在不同的Connection中。
一个服务中有如上的两个方法消费同一个queue,另一个服务中有一个方法消费同一个queue,产生的结果如下:
image可以看到,有两个消费者Connection,一个有3个Channel,一个有6个Channel。
image共产生了9个consumer。
3.6. publisher confirm and return
为了能让producer端知道消息是否成功进入了queue,并且避免使用事务大幅降低消息发送效率,可以用confirm和return机制来代替事务。
首先实现两个Callback,ReturnCallback和ConfirmCallback,需要哪个实现哪个,不一定都需要。
public RabbitTemplate.ReturnCallback returnCallback() {
return new RabbitTemplate.ReturnCallback() {
@Override
public void returnedMessage(Message message, int replyCode, String replyText, String exchange, String routingKey) {
logger.info("return call back");
}
};
}
public RabbitTemplate.ConfirmCallback confirmCallback() {
return new RabbitTemplate.ConfirmCallback() {
@Override
public void confirm(CorrelationData correlationData, boolean ack, String cause) {
logger.info("confirm call back");
}
};
}
然后将这两个Callback设置到RabbitTemplate中,将mandatory属性设为true(ReturnCallback需要,ConfirmCallback不需要):
rabbitTemplate.setReturnCallback(returnCallback);
rabbitTemplate.setConfirmCallback(confirmCallback);
rabbitTemplate.setMandatory(true);
然后在ConnectionFactory中将这Confirm和Return机制打开:
connectionFactory.setPublisherReturns(true);
connectionFactory.setPublisherConfirms(true);
这样就完成了。
ConfirmCallback和ReturnCallback的调用条件:
ConfirmCallback - 每一条发出的消息都会调用ConfirmCallback;
ReturnCallback - 只有在消息进入exchange但没有进入queue时才会调用。
相关方法入参:
correlationData - RabbitTemplate的send系列方法中有带这个参数的,如果传了这个参数,会在回调时拿到;
ack - 消息进入exchange,为true,未能进入exchange,为false,由于Connection中断发出的消息进入exchange但没有收到confirm信息的情况,也会是false;
cause - 消息发送失败时的失败原因信息。
另外,关于confirm和return官方文档上有下面这段信息,有必要了解一下:
When a rabbit template send operation completes, the channel is closed; this would preclude the reception of confirms or returns in the case when the connection factory cache is full (when there is space in the cache, the channel is not physically closed and the returns/confirms will proceed as normal). When the cache is full, the framework defers the close for up to 5 seconds, in order to allow time for the confirms/returns to be received. When using confirms, the channel will be closed when the last confirm is received. When using only returns, the channel will remain open for the full 5 seconds. It is generally recommended to set the connection factory’s channelCacheSize to a large enough value so that the channel on which a message is published is returned to the cache instead of being closed. You can monitor channel usage using the RabbitMQ management plugin; if you see channels being opened/closed rapidly you should consider increasing the cache size to reduce overhead on the server.
是说异步的接收confirm和return时仍然需要走原来发送消息用到的那个Channel,如果那个Channel被关闭了,是收不到confirm/return信息的。好在根据以上说明,Channel会等到最后一个confirm接收到时才会close,所以应该也不用担心Channel被关闭而接收不到confirm的问题。
3.7. retry
Starting with version 1.3 you can now configure the RabbitTemplate to use a RetryTemplate to help with handling problems with broker connectivity.
重试机制主要是解决网络不稳导致连接中断的问题。所以其实并不是重新发送消息,而是重新建立。
@Bean
public RetryTemplate retryTemplate() {
RetryTemplate retryTemplate = new RetryTemplate();
SimpleRetryPolicy simpleRetryPolicy = new SimpleRetryPolicy(Integer.MAX_VALUE);
retryTemplate.setRetryPolicy(simpleRetryPolicy);
return retryTemplate;
}
如上,配置一个RetryTemplate,再设置到AmqpTemplate即可。
RetryTemplate与spring-amqp及rabbitmq都没有关系,这是spring-retry中的类。以上示例中使用了最简单的重试策略,不断重试,直到Integer.MAX_VALUE次为止。
对producer端而言,如果Connection正常,但发送消息失败是不会重试的,如指定的exchange不存在的情况:
第1条发送完毕
收到第1条confirm,ack:false, correlationData:null
17:26:09.544 [AMQP Connection 127.0.0.1:5672] ERROR [CachingConnectionFactory.java:1344] - Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'xxx' in vhost 'vhost', class-id=60, method-id=40)
第2条发送完毕
收到第2条confirm,ack:false, correlationData:null
17:26:10.552 [AMQP Connection 127.0.0.1:5672] ERROR [CachingConnectionFactory.java:1344] - Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'xxx' in vhost 'vhost', class-id=60, method-id=40)
第3条发送完毕
收到第3条confirm,ack:false, correlationData:null
17:26:11.559 [AMQP Connection 127.0.0.1:5672] ERROR [CachingConnectionFactory.java:1344] - Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'xxx' in vhost 'vhost', class-id=60, method-id=40)
由Connection中断导致的发送消息失败,会进行重试:
第7条发送完毕
收到第7条confirm,ack:true, correlationData:null
第8条发送完毕
收到第8条confirm,ack:true, correlationData:null
第9条发送完毕
收到第9条confirm,ack:true, correlationData:null
第10条发送完毕
收到第10条confirm,ack:true, correlationData:null
第11条发送完毕
收到第11条confirm,ack:true, correlationData:null
17:01:44.000 [AMQP Connection 127.0.0.1:5672] ERROR [CachingConnectionFactory.java:1344] - Channel shutdown: connection error; protocol method: #method<connection.close>(reply-code=320, reply-text=CONNECTION_FORCED - broker forced connection closure with reason 'shutdown', class-id=0, method-id=0)
17:01:44.005 [AMQP Connection 127.0.0.1:5672] WARN [ForgivingExceptionHandler.java:115] - An unexpected connection driver error occured (Exception message: Connection reset)
17:01:44.602 [http-nio-8080-exec-2] INFO [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
...
17:02:23.076 [http-nio-8080-exec-2] INFO [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
17:02:24.578 [http-nio-8080-exec-2] INFO [AbstractConnectionFactory.java:471] - Created new connection: amqpConnectionFactory#3412a3fd:20/SimpleConnection@41298ed [delegate=amqp://guest@0:0:0:0:0:0:0:1:5672/test, localPort= 55092]
第12条发送完毕
收到第12条confirm,ack:true, correlationData:null
第13条发送完毕
收到第13条confirm,ack:true, correlationData:null
第14条发送完毕
收到第14条confirm,ack:true, correlationData:null
第15条发送完毕
收到第15条confirm,ack:true, correlationData:null
没有配置重试,或到达了重试次数依然失败,会抛出异常:
第15条发送完毕
收到第15条confirm,ack:false, correlationData:null
17:41:13.571 [AMQP Connection 127.0.0.1:5672] ERROR [CachingConnectionFactory.java:1344] - Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'xxx' in vhost 'paas_v3_vhost', class-id=60, method-id=40)
第16条发送完毕
收到第16条confirm,ack:false, correlationData:null
17:41:14.583 [AMQP Connection 127.0.0.1:5672] ERROR [CachingConnectionFactory.java:1344] - Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'xxx' in vhost 'paas_v3_vhost', class-id=60, method-id=40)
17:41:15.322 [AMQP Connection 127.0.0.1:5672] WARN [ForgivingExceptionHandler.java:115] - An unexpected connection driver error occured (Exception message: Connection reset)
17:41:15.579 [http-nio-8080-exec-1] INFO [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
17:41:17.609 [http-nio-8080-exec-1] ERROR [ExceptionHandler.java:41] - unknown error
org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused: connect
at org.springframework.amqp.rabbit.support.RabbitExceptionTranslator.convertRabbitAccessException(RabbitExceptionTranslator.java:62)
at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:484)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:626)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createBareChannel(CachingConnectionFactory.java:576)
对consumer端,如果采用的是@RabbitListener或其它类似异步接收消息的方式,则没必要配置重试。consumer端有ack机制,Connection中断导致rabbitmq收不到ack信息,消息会重新入队(可能会导致同一条消息重复消费)。
对于直接调用RabbitTemplate的receive系列方法获取消息的消费方式,则同消息发送端,没有retry或retry次数到达,则抛异常。
3.8. 发送端的消息丢失
这里讨论两种情况可能产生的消息丢失:
(1).rabbitmq没挂,只是短暂的网络异常,连接可以恢复,消息发送出去但没有到exchange。
(2).rabbitmq挂了且长时间无法恢复,消息没有发出去;
3.8.1. 可恢复的Connection中断
在配置了retry的情况下,Connection中断,会根据配置的retry策略尝试重连,即使重新连上了,消息依然可能会丢失。
本地测试,单线程间隔1毫秒循环发送1万条消息,模拟一个不断有消息发出的场景,在发送过程中手动关闭Rabbitmq服务再重新启动,模拟Connection短暂中断的场景。因为每一条消息都带有唯一的messageId(实际上是“线程名-序号”的形式),所以能轻易地从消费端读出所有消息之后找到丢失的消息。
测试结果:发送1万条消息,实际收到9999条,丢失1条。
发送端通过ConfirmCallback打印出所有ack=false的消息:
----------打印ack=false的消息----------
size:4
pool-5-thread-1-5881
pool-5-thread-1-5882
pool-5-thread-1-5883
pool-5-thread-1-5884
消费端读出所有消息后,找出丢失的消息:
--------total:10000---------
----------contain size: 9999----------
----------absent size: 1----------
pool-5-thread-1-5883
可以看到,ack=false的消息有4条,但实际上只丢了一条。因为消息的发送和Confirm是异步进行的,如果在消息发送出去之后,异步的confirm回来之前,Connection中断,那么ConfirmCallback会立即被调用,并且ack=false,原因是Channel被关闭了。
单线程情况下应该最多只会丢失一条,也有可能不会丢。
多线程的情况下丢消息的现象就很严重了。本地测试5个线程发消息的情况,一共50000条消息,丢失了1500多条。但其实如果把这5个线程分到5个请求,一个请求只跑一个线程,情况会好很多,类似于上面单线程的情况。
解决方案
最完美的解决方案是事务,但不推荐,为了rabbitmq的效率,退而求其次,采用confirm机制。
从上面的测试可以看到,在ConfirmCallback中ack=false的消息未必真的没有到达exchange,但没有到达exchange的消息ack一定是false,所以只需要将ack=false的消息重新发送一遍即可。(这种方案会导致消息重复发送,后面再解决这一问题)
实现方案各种各样,这里分享一下自己遇到的问题 。
ConfirmCallback的回调方法中没有Message对象
你可能会想从ConfirmCallback中拿到Message对象,当ack=false的时候将这个Message再重新发出去,但方法入参中没有Message对象。
@Component
public class ReissueMessageConfirmCallback implements RabbitTemplate.ConfirmCallback {
private static final Logger logger = LoggerFactory.getLogger(ReissueMessageConfirmCallback.class);
@Override
public void confirm(CorrelationData correlationData, boolean ack, String cause){
if (correlationData instanceof MessageCorrelationData) {
MessageCorrelationData messageCorrelationData = (MessageCorrelationData) correlationData;
logger.info("------------messageId: " + messageCorrelationData.getMessage().getMessageProperties().getMessageId() +
", ack: " + ack + ", cause:" + cause + "--------------");
if (!ack) {
SendFailedMessageHolder.add(messageCorrelationData);
}
}
}
}
注意到入参中有一个CorrelationData对象,同时在RabbitTemplate中有相应的send方法:
@Override
public void send(final String exchange, final String routingKey,
final Message message, final CorrelationData correlationData)
throws AmqpException {
}
这个方法AmqpTemplate中是没有的,是RabbitTemplate扩展的。
所以,虽然ConfirmCallback不能直接拿到Message,但可以拿到CorrelationData,于是问题就解决了。
直接在ConfirmCallback中调用RabbitTemplate发送消息导致死锁
现在我们可以通过CorrelationData在ConfirmCallback中拿到Message对象了,我们也有办法拿到RabbitTemplate,为了避免bean的循环依赖,我是这样做的:
@Autowired
@Bean
public RabbitTemplate amqpTemplate(ConnectionFactory amqpConnectionFactory,
RetryTemplate retryTemplate,
MessageConverter messageConverter,
//RabbitTemplate.ConfirmCallback confirmCallback,
RabbitTemplate.ReturnCallback returnCallback
){
RabbitTemplate rabbitTemplate = new RabbitTemplate();
rabbitTemplate.setConnectionFactory(amqpConnectionFactory);
rabbitTemplate.setRetryTemplate(retryTemplate);
rabbitTemplate.setMessageConverter(messageConverter);
//rabbitTemplate.setChannelTransacted(true);
rabbitTemplate.setReturnCallback(returnCallback);
rabbitTemplate.setConfirmCallback(new ReissueMessageConfirmCallback(rabbitTemplate));
rabbitTemplate.setMandatory(true);
return rabbitTemplate;
}
ReissueMessageConfirmCallback是自己写的一个实现类,将RabbitTemplate bean自己设置进去。然后我们在ConfirmCallback中发送消息:
@Component
public class ReissueMessageConfirmCallback implements RabbitTemplate.ConfirmCallback {
private static final Logger logger = LoggerFactory.getLogger(ReissueMessageConfirmCallback.class);
private RabbitTemplate rabbitTemplate;
public ReissueMessageConfirmCallback(RabbitTemplate rabbitTemplate){
this.rabbitTemplate = rabbitTemplate;
}
@Override
public void confirm(CorrelationData correlationData, boolean ack, String cause){
if (correlationData instanceof MessageCorrelationData) {
MessageCorrelationData messageCorrelationData = (MessageCorrelationData) correlationData;
String exchange = messageCorrelationData.getExchange();
String routingKey = messageCorrelationData.getRoutingKey();
Message message = messageCorrelationData.getMessage();
if (!ack) {
rabbitTemplate.send(exchange, routingKey, message, messageCorrelationData);
}
}
}
}
MessageCorrelationData是自己写的CorrelationData扩展类,增加了Message、exchange、routingKey属性。
在请求主线程发送1万条消息的过程中,将rabbitmq关闭,这时请求主线程和ConfirmCallback线程都在等待Connection恢复,然后重新启动rabbitmq,当程序重新建立Connection之后,这两个线程会死锁。
可行的方案:定时任务重发
@Component
public class ReissueMessageSchedule implements InitializingBean {
@Autowired
private RabbitTemplate rabbitTemplate;
private ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(2);
public void start(){
scheduledExecutorService.scheduleWithFixedDelay(new ReissueTask(rabbitTemplate), 10, 10, TimeUnit.SECONDS);
}
@Override
public void afterPropertiesSet(){
this.start();
}
}
public class ReissueTask implements Runnable {
private static final Logger logger = LoggerFactory.getLogger(ReissueTask.class);
private RabbitTemplate rabbitTemplate;
public ReissueTask(RabbitTemplate rabbitTemplate){
this.rabbitTemplate = rabbitTemplate;
}
@Override
public void run() {
List<MessageCorrelationData> messageCorrelationDataList = new ArrayList<>(SendFailedMessageHolder.getAll());
logger.info("------------------获取到" + messageCorrelationDataList.size() + "条ack=false的消息,准备重发------------------");
SendFailedMessageHolder.clear();
int i = 1;
for (MessageCorrelationData messageCorrelationData : messageCorrelationDataList) {
Message message = messageCorrelationData.getMessage();
String messageId = message.getMessageProperties().getMessageId();
logger.info("------------------重发第" + i + "条消息,id: " + messageId + "------------------");
i++;
message.getMessageProperties().setMessageId(messageId + "-重发");
rabbitTemplate.send(messageCorrelationData.getExchange(), messageCorrelationData.getRoutingKey(),
messageCorrelationData.getMessage(), messageCorrelationData);
}
logger.info("------------------重发完成------------------");
}
}
重发的消息会在原消息id后面跟上“重发”二字。
本地测试打印出的相关信息:
发送端:
15:07:36.063 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:29] - ------------------获取到13条发送失败的消息,准备重发------------------
15:07:36.063 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第1条消息,id: reactor-http-nio-3-7439------------------
15:07:38.030 [pool-3-thread-1] INFO o.s.a.r.c.CachingConnectionFactory [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
15:07:40.036 [reactor-http-nio-3] INFO o.s.a.r.c.CachingConnectionFactory [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
...
15:08:14.188 [pool-3-thread-1] INFO o.s.a.r.c.CachingConnectionFactory [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
15:08:16.190 [reactor-http-nio-3] INFO o.s.a.r.c.CachingConnectionFactory [AbstractConnectionFactory.java:455] - Attempting to connect to: [localhost:5672]
15:08:16.710 [reactor-http-nio-3] INFO o.s.a.r.c.CachingConnectionFactory [AbstractConnectionFactory.java:471] - Created new connection: amqpConnectionFactory#2127e66e:25/SimpleConnection@ee0d88b [delegate=amqp://guest@127.0.0.1:5672/test, localPort= 57212]
15:08:16.716 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第2条消息,id: reactor-http-nio-3-7440------------------
15:08:16.716 [reactor-http-nio-3] INFO c.l.l.r.p.c.RabbitmqController [RabbitmqController.java:102] - send message, id: reactor-http-nio-3-7452
send message: reactor-http-nio-3-7452
15:08:16.717 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第3条消息,id: reactor-http-nio-3-7441------------------
15:08:16.718 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第4条消息,id: reactor-http-nio-3-7442------------------
15:08:16.718 [reactor-http-nio-3] INFO c.l.l.r.p.c.RabbitmqController [RabbitmqController.java:102] - send message, id: reactor-http-nio-3-7453
send message: reactor-http-nio-3-7453
15:08:16.718 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第5条消息,id: reactor-http-nio-3-7443------------------
15:08:16.719 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第6条消息,id: reactor-http-nio-3-7444------------------
15:08:16.719 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第7条消息,id: reactor-http-nio-3-7445------------------
15:08:16.719 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第8条消息,id: reactor-http-nio-3-7446------------------
15:08:16.720 [reactor-http-nio-3] INFO c.l.l.r.p.c.RabbitmqController [RabbitmqController.java:102] - send message, id: reactor-http-nio-3-7454
send message: reactor-http-nio-3-7454
15:08:16.720 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第9条消息,id: reactor-http-nio-3-7447------------------
15:08:16.720 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第10条消息,id: reactor-http-nio-3-7448------------------
15:08:16.720 [AMQP Connection 127.0.0.1:5672] INFO c.l.l.r.p.r.ReissueMessageConfirmCallback [ReissueMessageConfirmCallback.java:21] - ------------messageId: reactor-http-nio-3-7451, ack: true, cause:null--------------
15:08:16.721 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第11条消息,id: reactor-http-nio-3-7449------------------
15:08:16.721 [reactor-http-nio-3] INFO c.l.l.r.p.c.RabbitmqController [RabbitmqController.java:102] - send message, id: reactor-http-nio-3-7455
send message: reactor-http-nio-3-7455
15:08:16.721 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第12条消息,id: reactor-http-nio-3-7450------------------
15:08:16.722 [reactor-http-nio-3] INFO c.l.l.r.p.c.RabbitmqController [RabbitmqController.java:102] - send message, id: reactor-http-nio-3-7456
send message: reactor-http-nio-3-7456
15:08:16.723 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第13条消息,id: reactor-http-nio-3-7451------------------
15:08:16.723 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:41] - ------------------重发完成------------------
reactor-http-nio-3是请求主线程,pool-3-thread-1是执行重发消息定时任务的线程。
从以上日志信息可以看出,当rabbitmq关闭的时候,主线程与重发线程都在尝试重连,直到rabbitmq重启完成恢复Connection。
重发的消息有13条:reactor-http-nio-3-7439 ~ reactor-http-nio-3-7451。
再看消费端整理并打印出来的接收到的所有消息:
--------should receive:10000---------
----------actually receive: 10013----------
----------absent messages:0---------
----------resend messages: 13----------
reactor-http-nio-3-7439-重发
reactor-http-nio-3-7440-重发
reactor-http-nio-3-7441-重发
reactor-http-nio-3-7442-重发
reactor-http-nio-3-7443-重发
reactor-http-nio-3-7444-重发
reactor-http-nio-3-7446-重发
reactor-http-nio-3-7447-重发
reactor-http-nio-3-7445-重发
reactor-http-nio-3-7449-重发
reactor-http-nio-3-7448-重发
reactor-http-nio-3-7450-重发
reactor-http-nio-3-7451-重发
可以看到,我们正确收到了上面那重发的13条消息。不过这次运气比较好,没有消息遗漏。
同时,这里注意到一件事,消费端代码没有对重发的消息做排序,收到的重发消息的顺序与发送端重发消息的顺序是不匹配的,所以rabbitmq可能不保证先发出的消息一定先被接收。
下面是5个线程同时发送消息的测试结果:
发送端:
15:42:40.602 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:29] - ------------------获取到642条发送失败的消息,准备重发------------------
15:42:40.602 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第1条消息,id: pool-5-thread-4-6951------------------
...
省略重连过程
...
15:43:07.628 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第2条消息,id: pool-5-thread-5-6605------------------
...
省略中间600多条消息的重发
...
15:43:07.794 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第641条消息,id: pool-5-thread-1-6704------------------
15:43:07.794 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:35] - ------------------重发第642条消息,id: pool-5-thread-4-7088------------------
15:43:07.794 [pool-3-thread-1] INFO c.l.l.r.p.schedule.task.ReissueTask [ReissueTask.java:41] - ------------------重发完成------------------
消费端:
--------should receive:50000---------
----------actually receive: 50014----------
----------absent messages:628---------
pool-5-thread-1-6583
pool-5-thread-1-6584
...
pool-5-thread-1-6705
pool-5-thread-2-6538
...
pool-5-thread-2-6653
pool-5-thread-3-6093
...
pool-5-thread-3-6218
pool-5-thread-4-6955
...
pool-5-thread-4-7087
pool-5-thread-5-6605
...
pool-5-thread-5-6733
pool-5-thread-5-6734
----------resend messages: 642----------
pool-5-thread-1-6580-重发
pool-5-thread-1-6581-重发
...
pool-5-thread-1-6705-重发
pool-5-thread-1-6706-重发
pool-5-thread-2-6537-重发
...
pool-5-thread-2-6654-重发
pool-5-thread-3-6093-重发
...
pool-5-thread-3-6219-重发
pool-5-thread-4-6951-重发
...
pool-5-thread-4-7088-重发
pool-5-thread-5-6604-重发
...
pool-5-thread-5-6734-重发
pool-5-thread-5-6735-重发
可以看到,丢失的消息被完美地包含在重发的消息里面了。
3.8.2. 长时间无法恢复的Connection中断
上面讨论了retry之后可以恢复Connection的情况,也有可能长时间retry之后依然不能恢复Connection,如rabbitmq挂掉的情况,不能一直retry下去阻塞接口调用。
这种情况是没有confirm的,因为消息都没有发出去。所以处理就更简单了:
try {
rabbitTemplate.send(messageCorrelationData.getExchange(), messageCorrelationData.getRoutingKey(),
messageCorrelationData.getMessage(), messageCorrelationData);
}catch (AmqpConnectException e) {
SendFailedMessageHolder.add(messageCorrelationData);
}
retry失败或者没有retry机制都会抛出AmqpConnectException,catch之后将消息保存起来即可。
3.9. 消费端的消息去重
如果发送端采用confirm机制来做丢失消息的重发,上面提到,可能会出现没有丢失的消息也被重发了,导致消息重复。
这个问题很容易解决,MessageProperties中是有messageId属性的,每条消息设置一个唯一的messageId即可。
Message message = messageConverter.toMessage(messageId, new MessageProperties());
message.getMessageProperties().setMessageId(messageId);
3.10. 消息发送和接收使用不同的Connection
当一个服务同时作为消息发送端和接收端时,建议使用不同的Connection以避免一方出现故障影响到另一方。
并不需要做很多事情,只需RabbitTemplate配置中加一个属性设置即可:
rabbitTemplate.setUsePublisherConnection(true);
RabbitTemplate在创建Connection时,会根据这个boolean参数选择使用ConnectionFactory本身或者ConnectionFactory中的publisherConnectionFactory(也是一个ConnectionFactory)来创建,相关源码如下:
/**
* Create a connection with this connection factory and/or its publisher factory.
* @param connectionFactory the connection factory.
* @param publisherConnectionIfPossible true to use the publisher factory, if present.
* @return the connection.
* @since 2.0.2
*/
public static Connection createConnection(final ConnectionFactory connectionFactory,
final boolean publisherConnectionIfPossible) {
if (publisherConnectionIfPossible) {
ConnectionFactory publisherFactory = connectionFactory.getPublisherConnectionFactory();
if (publisherFactory != null) {
return publisherFactory.createConnection();
}
}
return connectionFactory.createConnection();
}
3.11. 消息过期
在发送端,可通过如下方式设置消息过期时间:
message.getMessageProperties().setExpiration("30000");
这样,这条消息的有效期是30秒,30秒没有被消费掉会被丢弃。
3.12. dead letter exchange
这个与spring-amqp无关,是rabbitmq的设置。
将一个queue设置了x-dead-letter-exchange及x-dead-letter-routing-key两个参数后,这个queue里丢弃的消息将会进入dead letter exchange,并route到相应的queue里去。
这里,被丢弃的消息包括:
The message is rejected (basic.reject or basic.nack) with requeue=false,
The TTL for the message expires; or
The queue length limit is exceeded.