Unhandled exception with RabbitMQ

I’m using NServiceBus with the RabbitMQ transport. The NServiceBus endpoint is hosted with the .NET Generic Host in an application that runs as a Windows service.

I encountered a situation where the service crashed while processing messages from the queue. It appears to be a network/communication issue with RabbitMQ, but this is strange because RabbitMQ and the Windows service are on the same server.

I see an unhandled exception in the Windows event log:

CoreCLR Version: 4.700.21.26205
.NET Core Version: 3.1.16
Description: The process was terminated due to an unhandled exception.
Exception Info: System.IO.IOException: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   at RabbitMQ.Client.Impl.SocketFrameHandler.Write(Byte[] buffer)
   at RabbitMQ.Client.Impl.SessionBase.Transmit(Command cmd)
   at NServiceBus.Transport.RabbitMQ.ModelExtensions.<>c.<BasicRejectAndRequeueIfOpen>b__2_0(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
   at NServiceBus.Transport.RabbitMQ.MessagePump.Consumer_Received(Object sender, BasicDeliverEventArgs eventArgs)
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__139_1(Object state)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Around the time of the crash, the NServiceBus log file contains multiple entries similar to the following:

2022-11-15 18:27:29.414 ERROR Moving message '7b7563be-1424-4733-bfe0-af4e013d4cf5' to the error queue 'error' because processing failed due to an exception:
RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='Unexpected Exception', classId=0, methodId=0, cause=System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)
   at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
   at RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
   at RabbitMQ.Client.Framing.Impl.Connection.EnsureIsOpen()
   at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
   at NServiceBus.Transport.RabbitMQ.ConfirmsAwareChannel..ctor(IConnection connection, IRoutingTopology routingTopology, Boolean usePublisherConfirms)
   at NServiceBus.Transport.RabbitMQ.ChannelProvider.GetPublishChannel()
   at NServiceBus.Transport.RabbitMQ.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, ContextBag context)
   at NServiceBus.PipelineInvocationExtensions.InvokePipeline[TContext](TContext context)
   at NServiceBus.TransportReceiveToPhysicalMessageConnector.Invoke(ITransportReceiveContext context, Func`2 next)
   at NServiceBus.MainPipelineExecutor.Invoke(MessageContext messageContext)
   at NServiceBus.Transport.RabbitMQ.MessagePump.Process(BasicDeliverEventArgs message)
Exception details:
	Message ID: 7b7563be-1424-4733-bfe0-af4e013d4cf5
2022-11-15 18:27:29.416 FATAL Failed to execute recoverability policy for message with native ID: `7b7563be-1424-4733-bfe0-af4e013d4cf5`
RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='Unexpected Exception', classId=0, methodId=0, cause=System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)
   at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
   at RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
   at RabbitMQ.Client.Framing.Impl.Connection.EnsureIsOpen()
   at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
   at NServiceBus.Transport.RabbitMQ.ConfirmsAwareChannel..ctor(IConnection connection, IRoutingTopology routingTopology, Boolean usePublisherConfirms)
   at NServiceBus.Transport.RabbitMQ.ChannelProvider.GetPublishChannel()
   at NServiceBus.Transport.RabbitMQ.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, ContextBag context)
   at NServiceBus.MoveToErrorsExecutor.MoveToErrorQueue(String errorQueueAddress, IncomingMessage message, Exception exception, TransportTransaction transportTransaction)
   at NServiceBus.RecoverabilityExecutor.MoveToError(ErrorContext errorContext, String errorQueue)
   at NServiceBus.Transport.RabbitMQ.MessagePump.Process(BasicDeliverEventArgs message)

Two questions:

  1. Is there any way to determine what caused the unhandled exception? I have not been able to replicate it in my development environment. If I shut down RabbitMQ, NServiceBus keeps trying to reconnect every 10 seconds, but the service does not crash.
  2. Is there any way to prevent or recover from this situation?

What versions of the NServiceBus.* packages are you using? Please update to the latest minor/patch releases so that you are not affected by breaking changes.

What version of the RabbitMQ Client are you using? Please check whether you are on the latest version of that package too.

The endpoint should not crash, and the RabbitMQ transport message pump should be resilient to such issues, but there may be an edge case that we are not covering.

What is interesting is that the stack trace in the Windows event log is different from the stack traces in the log file.
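
If it helps with diagnosis, one option is to record unhandled exceptions in a file of your own, so they can be lined up with the NServiceBus log by timestamp. The following is only a minimal sketch using standard .NET events; the `CrashDiagnostics` class, its `Register` method, and the log path are hypothetical names for illustration and are not part of NServiceBus or the RabbitMQ client. Note that the `UnhandledException` handler cannot stop the process from terminating; it only captures details before it does.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not part of NServiceBus or the RabbitMQ client.
public static class CrashDiagnostics
{
    // Call once at the very start of Main, before the host is built,
    // so the handlers are in place before any message is processed.
    public static void Register(string logPath)
    {
        // Raised for exceptions that nothing caught, including ones rethrown
        // on a thread pool thread. The process still terminates afterwards;
        // this handler only records the details.
        AppDomain.CurrentDomain.UnhandledException += (sender, args) =>
            Append(logPath, $"UNHANDLED (terminating={args.IsTerminating})", args.ExceptionObject);

        // Raised when a faulted Task is garbage collected without its
        // exception ever having been observed.
        TaskScheduler.UnobservedTaskException += (sender, args) =>
            Append(logPath, "UNOBSERVED TASK EXCEPTION", args.Exception);
    }

    private static void Append(string logPath, string kind, object exception)
    {
        // Formatting an exception as a string includes its inner exceptions
        // and stack traces.
        var entry = $"{DateTime.UtcNow:O} {kind}{Environment.NewLine}{exception}{Environment.NewLine}";
        File.AppendAllText(logPath, entry);
    }
}
```

A call such as `CrashDiagnostics.Register(logPath)` as the first statement in `Main`, with `logPath` pointing at a location the service account can write to, would be enough to wire this up.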

Awaiting more information from your end.

Thank you for the response.

Here are the versions of the NServiceBus.* packages that we are currently using:
NServiceBus 7.3.0
NServiceBus.Extensions.DependencyInjection 1.0.1
NServiceBus.Extensions.Hosting 1.0.1
NServiceBus.Newtonsoft.Json 2.2.0
NServiceBus.RabbitMQ 5.2.0

The RabbitMQ version is 3.9.1.

I can look into updating to the latest versions of the packages, but I assume that will require updating the RabbitMQ version as well.