Consumer disconnects from RabbitMQ and is not reconnecting

We are running NServiceBus 7.4.4.

When a message is sent to the queue and the service takes longer to process it than the “consumer_timeout” configured in RabbitMQ, the service loses the queue and doesn’t process any more messages. Messages just sit in the Ready state.

The only thing that helps is restarting the “dead” service, after which it starts processing all the messages in the queue.

Can you tell us whether this can be fixed through NServiceBus configuration, or whether it has already been fixed in a later version of NServiceBus? Or is this something that can’t be fixed?

Hi @Artem,

You might be experiencing a bug that was fixed in the 6.1.1 release of the transport.

Could that be the case?

Cheers,
Tomek

Yep, that was it, the problem is gone! Thank you!


I’ve been fighting this exact problem for months, only it is happening on the ServiceControl audit instance. Was this (or a similar) fix ever applied to ServiceControl? We are currently running v5.0.5.

Hi @RMarcum,

The 5.0.5 version of ServiceControl uses the already patched version of RabbitMQ transport.

Are there any errors in your log file for the audit instance?

Here are a few relevant lines:

2024-07-16 00:25:53.1172|49|Warn|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’ channel shutdown: AMQP close-reason, initiated by Peer, code=406, text=‘PRECONDITION_FAILED - delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more’, classId=0, methodId=0
2024-07-16 00:25:53.1172|49|Warn|NServiceBus.Transport.RabbitMQ.MessagePumpConnectionFailedCircuitBreaker|The circuit breaker for ‘Particular.Nsb.Nova.Staging MessagePump’ is now in the armed state
2024-07-16 00:25:53.1172|75|Info|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’: Attempting to reconnect in 10 seconds.
2024-07-16 00:26:03.1584|229|Info|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’: Connection to the broker reestablished successfully.
2024-07-16 00:26:03.1584|75|Info|NServiceBus.Transport.RabbitMQ.MessagePumpConnectionFailedCircuitBreaker|The circuit breaker for ‘Particular.Nsb.Nova.Staging MessagePump’ is now disarmed
2024-07-16 00:57:03.1664|173|Warn|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’ channel shutdown: AMQP close-reason, initiated by Peer, code=406, text=‘PRECONDITION_FAILED - delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more’, classId=0, methodId=0
2024-07-16 00:57:03.1664|173|Warn|NServiceBus.Transport.RabbitMQ.MessagePumpConnectionFailedCircuitBreaker|The circuit breaker for ‘Particular.Nsb.Nova.Staging MessagePump’ is now in the armed state
2024-07-16 00:57:03.1822|221|Info|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’: Attempting to reconnect in 10 seconds.
2024-07-16 00:57:13.2072|159|Info|NServiceBus.Transport.RabbitMQ.MessagePumpConnectionFailedCircuitBreaker|The circuit breaker for ‘Particular.Nsb.Nova.Staging MessagePump’ is now disarmed
2024-07-16 00:57:13.2072|229|Info|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’: Connection to the broker reestablished successfully.
2024-07-16 01:28:13.2289|162|Warn|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’ channel shutdown: AMQP close-reason, initiated by Peer, code=406, text=‘PRECONDITION_FAILED - delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more’, classId=0, methodId=0
2024-07-16 01:28:13.2289|162|Warn|NServiceBus.Transport.RabbitMQ.MessagePumpConnectionFailedCircuitBreaker|The circuit breaker for ‘Particular.Nsb.Nova.Staging MessagePump’ is now in the armed state
2024-07-16 01:28:13.2289|131|Info|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’: Attempting to reconnect in 10 seconds.
2024-07-16 01:28:23.3184|220|Info|NServiceBus.Transport.RabbitMQ.MessagePump|‘Particular.Nsb.Nova.Staging MessagePump’: Connection to the broker reestablished successfully.
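
To see how regular these shutdowns are, the timestamps can be compared with a short script. This is only a rough Python sketch: it assumes the pipe-delimited layout visible above (timestamp|thread|level|logger|message), the sample messages are abbreviated from the excerpt, and the function names are made up for illustration.

```python
from datetime import datetime

# Abbreviated copies of the log lines above (assumption: fields are
# separated by "|" in the order timestamp|thread|level|logger|message).
LOG_LINES = [
    "2024-07-16 00:25:53.1172|49|Warn|NServiceBus.Transport.RabbitMQ.MessagePump|channel shutdown",
    "2024-07-16 00:26:03.1584|229|Info|NServiceBus.Transport.RabbitMQ.MessagePump|Connection to the broker reestablished successfully.",
    "2024-07-16 00:57:03.1664|173|Warn|NServiceBus.Transport.RabbitMQ.MessagePump|channel shutdown",
    "2024-07-16 00:57:13.2072|229|Info|NServiceBus.Transport.RabbitMQ.MessagePump|Connection to the broker reestablished successfully.",
    "2024-07-16 01:28:13.2289|162|Warn|NServiceBus.Transport.RabbitMQ.MessagePump|channel shutdown",
    "2024-07-16 01:28:23.3184|220|Info|NServiceBus.Transport.RabbitMQ.MessagePump|Connection to the broker reestablished successfully.",
]


def parse_line(line):
    """Split one log line into (timestamp, message)."""
    timestamp, _thread, _level, _logger, message = line.split("|", 4)
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f"), message


def reconnect_to_shutdown_gaps(lines):
    """Return the time between each reconnect and the next channel shutdown."""
    gaps = []
    last_reconnect = None
    for line in lines:
        when, message = parse_line(line)
        if "reestablished successfully" in message:
            last_reconnect = when
        elif "channel shutdown" in message and last_reconnect is not None:
            gaps.append(when - last_reconnect)
            last_reconnect = None
    return gaps


gaps = reconnect_to_shutdown_gaps(LOG_LINES)
for gap in gaps:
    # Both gaps come out at about 31 minutes.
    print(round(gap.total_seconds() / 60, 1))
```

Each shutdown comes roughly 31 minutes after the previous reconnect, which looks consistent with the 1800000 ms timeout in the close reason rather than with random disconnects.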

Hi @RMarcum,

I’m sorry - somehow this slipped through the cracks and we missed your follow-up logs.

The errors you are seeing are not related to the RabbitMQ transport fix that Tomasz mentioned earlier.

The part from your logs that is important here is:

delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms

This indicates that ServiceControl read a message from a queue and took more than 30 minutes (1800000 ms) to ACK it. This is almost certainly a problem with the internal ServiceControl database.
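
For reference, the timeout value in the log converts as follows. This is just a quick Python check of the arithmetic; the constant name is illustrative and the value is taken from the close reason quoted above.

```python
from datetime import timedelta

# Value reported by the broker in the close reason ("Timeout value used").
TIMEOUT_MS = 1_800_000

timeout = timedelta(milliseconds=TIMEOUT_MS)
print(timeout)                       # 0:30:00
print(timeout.total_seconds() / 60)  # 30.0
```

So a message has to stay unacknowledged for about half an hour before the broker closes the channel.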

Can we ask you to send the full ServiceControl log files (including the RavenDB log files) to the Particular support email address? That way we can look at them in more detail and let you know how to resolve the connection issue.