I’m using NServiceBus with the RabbitMQ transport. The NServiceBus endpoint is hosted with the .NET Generic Host in an application that runs as a Windows service.
I encountered a situation where the service crashed while processing messages from the queue. It appears to be a network/communication issue with RabbitMQ, but this is strange because RabbitMQ and the Windows service are on the same server.
I see an unhandled exception in the Windows event log:
CoreCLR Version: 4.700.21.26205
.NET Core Version: 3.1.16
Description: The process was terminated due to an unhandled exception.
Exception Info: System.IO.IOException: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host..
---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
at RabbitMQ.Client.Impl.SocketFrameHandler.Write(Byte[] buffer)
at RabbitMQ.Client.Impl.SessionBase.Transmit(Command cmd)
at NServiceBus.Transport.RabbitMQ.ModelExtensions.<>c.<BasicRejectAndRequeueIfOpen>b__2_0(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
at NServiceBus.Transport.RabbitMQ.MessagePump.Consumer_Received(Object sender, BasicDeliverEventArgs eventArgs)
at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__139_1(Object state)
at System.Threading.ThreadPoolWorkQueue.Dispatch()
Shortly before and around this time, the NServiceBus log file contains multiple entries similar to the following:
2022-11-15 18:27:29.414 ERROR Moving message '7b7563be-1424-4733-bfe0-af4e013d4cf5' to the error queue 'error' because processing failed due to an exception:
RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='Unexpected Exception', classId=0, methodId=0, cause=System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)
at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
at RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
at RabbitMQ.Client.Framing.Impl.Connection.EnsureIsOpen()
at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
at NServiceBus.Transport.RabbitMQ.ConfirmsAwareChannel..ctor(IConnection connection, IRoutingTopology routingTopology, Boolean usePublisherConfirms)
at NServiceBus.Transport.RabbitMQ.ChannelProvider.GetPublishChannel()
at NServiceBus.Transport.RabbitMQ.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, ContextBag context)
at NServiceBus.PipelineInvocationExtensions.InvokePipeline[TContext](TContext context)
at NServiceBus.TransportReceiveToPhysicalMessageConnector.Invoke(ITransportReceiveContext context, Func`2 next)
at NServiceBus.MainPipelineExecutor.Invoke(MessageContext messageContext)
at NServiceBus.Transport.RabbitMQ.MessagePump.Process(BasicDeliverEventArgs message)
Exception details:
Message ID: 7b7563be-1424-4733-bfe0-af4e013d4cf5
2022-11-15 18:27:29.416 FATAL Failed to execute recoverability policy for message with native ID: `7b7563be-1424-4733-bfe0-af4e013d4cf5`
RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='Unexpected Exception', classId=0, methodId=0, cause=System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)
at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
at RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
at RabbitMQ.Client.Framing.Impl.Connection.EnsureIsOpen()
at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
at NServiceBus.Transport.RabbitMQ.ConfirmsAwareChannel..ctor(IConnection connection, IRoutingTopology routingTopology, Boolean usePublisherConfirms)
at NServiceBus.Transport.RabbitMQ.ChannelProvider.GetPublishChannel()
at NServiceBus.Transport.RabbitMQ.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, ContextBag context)
at NServiceBus.MoveToErrorsExecutor.MoveToErrorQueue(String errorQueueAddress, IncomingMessage message, Exception exception, TransportTransaction transportTransaction)
at NServiceBus.RecoverabilityExecutor.MoveToError(ErrorContext errorContext, String errorQueue)
at NServiceBus.Transport.RabbitMQ.MessagePump.Process(BasicDeliverEventArgs message)
Two questions:
- Is there any way to determine what caused the unhandled exception? I have not been able to reproduce it in my development environment. If I shut down RabbitMQ there, NServiceBus keeps trying to reconnect every 10 seconds, but the service does not crash.
- Is there any way to prevent or recover from this situation?