NServiceBus.Transport.AzureServiceBus.DefaultOutgoingBatchRouter - Error when publishing causing massive amounts of retries

Yesterday we encountered a weird error we haven’t seen before. We are running NSB 6.4.2 on the Azure Service Bus transport. From what I can tell, one of our handlers kept failing as soon as it tried to publish an event; the error we’re seeing over and over again is below. In this handler we add data to a database and then publish an event to signify that it was added successfully. Because the handler kept failing at the publish, the data was added to the database multiple times as duplicates. During this time (roughly 1.5 hours) it happened repeatedly, but ONLY to this one handler. Any idea why this would have happened, or what we should be doing to prevent this in the future?

2018-05-22 17:40:28 – NServiceBus.Transport.AzureServiceBus.DefaultOutgoingBatchRouter - Failed to dispatch a batch with the following message IDs: a28939b0-1de6-452d-8a4f-6dfe444973ea
System.TimeoutException: The operation did not complete within the allotted timeout of 00:00:20. The time allotted to this operation may have been a portion of a longer timeout. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 ---> System.ServiceModel.FaultException`1[System.ServiceModel.ExceptionDetail]: The operation did not complete within the allotted timeout of 00:00:20. The time allotted to this operation may have been a portion of a longer timeout. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.ThrowIfFaultMessage(Message wcfMessage)
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.HandleMessageReceived(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.RequestAsyncResult.<>c.b__9_3(RequestAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult`1.End(IAsyncResult asyncResult)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.RedirectBindingElement.RedirectContainerChannelFactory`1.RedirectContainerSessionChannel.RequestAsyncResult.<>c__DisplayClass8_1.b__4(RequestAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult`1.End(IAsyncResult asyncResult)
   at Microsoft.ServiceBus.Messaging.Sbmp.RedirectBindingElement.RedirectContainerChannelFactory`1.RedirectContainerSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.RequestAsyncResult.<>c.b__9_3(RequestAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult`1.End(IAsyncResult asyncResult)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpTransactionalAsyncResult`1.<>c.b__18_3(TIteratorAsyncResult thisPtr, IAsyncResult a)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpMessageSender.EndSendCommand(IAsyncResult result)
   --- End of inner exception stack trace ---
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpMessageSender.EndSendCommand(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpMessageSender.<.ctor>b__4_1(IAsyncResult result, Boolean forceCleanUp)
   at Microsoft.ServiceBus.Messaging.BatchManager`1.PerformFlushAsyncResult.OnSingleOperationCompleted(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.BatchManager`1.PerformFlushAsyncResult.End(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.BatchManager`1.EndFlush(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.BatchManager`1.BatchedObjectsAsyncResult.End(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.BatchManager`1.EndBatchedOperation(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.BatchManagerAsyncResult`1.OnBatchedCallback(IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult.AsyncCompletionWrapperCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpMessageSender.OnEndSend(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.MessageSender.RetrySenderAsyncResult.<>c.<GetAsyncSteps>b__11_7(RetrySenderAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.MessageSender.<>c.b__49_1(IAsyncResult a)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NServiceBus.Transport.AzureServiceBus.TaskWithRetryExtensions.d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NServiceBus.Transport.AzureServiceBus.DefaultOutgoingBatchRouter.d__14.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at NServiceBus.Transport.AzureServiceBus.DefaultOutgoingBatchRouter.d__13.MoveNext()

Hi Phillip,

First, regarding the multiple records in your database: note that even when the SendsAtomicWithReceive transport transaction mode is used, it only covers message operations, not database operations, and the TransactionScope transport transaction mode cannot be used with Azure Service Bus. Either the handler needs to be idempotent, or the Outbox feature could be used. The Outbox feature does require keeping its data in the same database as the business data.
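
If you decide to go the Outbox route, the configuration itself is small. Below is a minimal sketch, assuming NServiceBus 6 with SQL persistence; the endpoint name and connection string are placeholders, and the exact dialect/connection calls depend on the SQL persistence package version you run.

using System.Data.SqlClient;
using System.Threading.Tasks;
using NServiceBus;
using NServiceBus.Persistence.Sql;

class EndpointStartup
{
    static async Task Run()
    {
        var endpointConfiguration = new EndpointConfiguration("Sales"); // placeholder endpoint name

        // Outbox stores its de-duplication data in the same database as the business data,
        // which is why it needs a persistence such as SQL persistence.
        var persistence = endpointConfiguration.UsePersistence<SqlPersistence>();
        persistence.SqlDialect<SqlDialect.MsSqlServer>();
        persistence.ConnectionBuilder(
            connectionBuilder: () => new SqlConnection("placeholder-connection-string"));

        // De-duplicates incoming messages and holds outgoing messages back
        // until the handler's database transaction has committed.
        endpointConfiguration.EnableOutbox();

        var endpointInstance = await Endpoint.Start(endpointConfiguration);
    }
}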

Regarding the odd behaviour itself: the exception is a server-side exception, not something the client can fix by retrying.
Have you tried investigating this with Azure support?

Is the transport on this endpoint configured to use partitioned queues?
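
If you are not sure, one way to check is to read the queue description with the Microsoft.ServiceBus client library (the same SDK the transport uses underneath). A rough sketch, with the connection string and queue path as placeholders:

using System;
using Microsoft.ServiceBus;

class PartitioningCheck
{
    static void Main()
    {
        // Placeholders: use the namespace connection string and the endpoint's input queue path.
        var namespaceManager = NamespaceManager.CreateFromConnectionString("placeholder-connection-string");
        var queueDescription = namespaceManager.GetQueue("placeholder-queue-path");

        // True means the queue was created as a partitioned entity.
        Console.WriteLine("EnablePartitioning: " + queueDescription.EnablePartitioning);
    }
}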

Hi Phillip,

In my opinion this timeout is most likely caused by a networking issue, either on your local network, at your ISP, or on the Azure side. The retries are expected behavior, as timeouts are usually transient, although they can sometimes last longer.
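
If the immediate retries end up hammering the broker while such a timeout lasts, you can also shift more of the retrying into delayed retries with a back-off. A rough sketch of the NServiceBus 6 recoverability settings; the numbers are only examples to tune for your own situation:

using System;
using NServiceBus;

class RecoverabilityConfig
{
    static void Apply(EndpointConfiguration endpointConfiguration)
    {
        var recoverability = endpointConfiguration.Recoverability();

        // Fewer immediate retries, so a lasting timeout isn't retried in a tight loop...
        recoverability.Immediate(immediate =>
        {
            immediate.NumberOfRetries(1);
        });

        // ...and a handful of delayed retries with an increasing back-off
        // to ride out transient network or broker issues.
        recoverability.Delayed(delayed =>
        {
            delayed.NumberOfRetries(3);
            delayed.TimeIncrease(TimeSpan.FromSeconds(30));
        });
    }
}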

Most Azure services do not support transactions, so any logic inside a handler hosted there should be built in an idempotent way. We have several strategies documented that show how you can achieve idempotency without relying on transactions: Avoiding Transactions in Azure • NServiceBus • Particular Docs
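
As one concrete flavour of those strategies, the handler can check whether the record has already been stored before inserting, so a redelivery becomes a no-op. A rough sketch with made-up message and repository types:

using System.Threading.Tasks;
using NServiceBus;

// Placeholder message, event and repository types, used only for this sketch.
public class AddData : ICommand
{
    public string Id { get; set; }
}

public class DataAdded : IEvent
{
    public string Id { get; set; }
}

public interface IDataRepository
{
    Task<bool> Exists(string id);
    Task Insert(string id);
}

public class IdempotentAddDataHandler : IHandleMessages<AddData>
{
    readonly IDataRepository repository;

    public IdempotentAddDataHandler(IDataRepository repository)
    {
        this.repository = repository;
    }

    public async Task Handle(AddData message, IMessageHandlerContext context)
    {
        // Insert only if this record hasn't been stored yet,
        // so a redelivered message doesn't create a duplicate row.
        if (!await repository.Exists(message.Id))
        {
            await repository.Insert(message.Id);
        }

        // Publishing again on a retry is acceptable: with at-least-once delivery,
        // subscribers have to tolerate the occasional duplicate event anyway.
        await context.Publish(new DataAdded { Id = message.Id });
    }
}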

If your data is stored in a SQL database, which supports local transactions, you can set up the Outbox feature as Sean suggested, but for other databases it won’t work.

Hope this helps to prevent this from happening again.

Kind regards,
Yves

Thanks for your responses. We’ve put code in place to make sure that duplicate entries no longer happen when critical commands like these get retried, so we have that covered. I’ve opened a ticket with Azure support, so hopefully they’ll be able to give a bit more info, as it is most likely an issue on their end. We’re only seeing it from one of our Service Bus namespaces, which is a little odd.

Hi Phillip,

Did you end up resolving this issue?
We are having the same issue. I’d appreciate it if you could share the solution, please.

Regards,

Toby

Hi Toby,

This specific issue was actually caused by an Azure outage that affected one specific region. It did, however, prompt us to look at our code and put in checks and balances to ensure that the same message won’t get processed twice if we know it has already run. Sorry I don’t have a better solution for this.

Thanks,
Phill