Handler Caught in a retry loop caused it to execute thousands of times

nservicebus

(Rommel Villa) #1

Hi,

Currently we are using
NSB v6.5.4
NServiceBus.Azure.Transpots.WindowsAzureServiceBus v7.2.12
NServiceBus.Azure.Transports.WindowsAzureStorageQueues v7.5.4
NServiceBus.DataBus.AzureBlogStorage v1.1.0
NServiceBus.Persistence.AzureStorage v1.5.1
WindowsAzure.Storage v8.2.1

Hi so we found ourselves in a situation where one of our handlers was caught in an execution loop and ended being retried thousands of times, resulting in sending emails to the same person a couple thousand times.

In Service Insights, you can see thousands of those messages (AccountCreatedNotification firing which causes a SendMail message). They obviously are the same message and look like they are part of the same flow. This functionally has been working correctly for more than a year. Funny part there is no failed messages logged anywhere, and these ones were successful.

During this time we found thousands of storage exceptions entries in the nservicebus logs, which grew them to s huge size ( See Below end of this message.).

It’s important to understand what could be causing this and how can we prevent this from happening again as we do not want to find ourselves in the same situation.,

Thanks for any information you can provide.

2019-03-24 23:34:08.762 WARN NServiceBus.AzureStorageQueues.ReceiveStrategy The error pipeline wasn’t able to handle the exception.
Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at Microsoft.WindowsAzure.Storage.Table.Protocol.TableOperationHttpResponseParsers.TableOperationPreProcess(TableResult result, TableOperation operation, HttpWebResponse resp, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Table\Protocol\TableOperationHttpResponseParsers.cs:line 61
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
— End of inner exception stack trace —
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass1`1.b__0(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 66
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.NativeDelayDelivery.d__2.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\NativeDelayDelivery.cs:line 50
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.AzureStorageQueues.Dispatcher.d__2.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\Dispatcher.cs:line 47
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.DelayedRetryExecutor.d__1.MoveNext() in C:\BuildAgent\work\b549d46003942065\src\NServiceBus.Core\Recoverability\DelayedRetryExecutor.cs:line 53
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.RecoverabilityExecutor.d__4.MoveNext() in C:\BuildAgent\work\b549d46003942065\src\NServiceBus.Core\Recoverability\RecoverabilityExecutor.cs:line 101
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.AzureStorageQueues.AtLeastOnceReceiveStrategy.d__1.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\AtLeastOnceReceiveStrategy.cs:line 54
Request Information
RequestID:0bba3595-5002-00a9-119a-e2d6da000000
RequestDate:Sun, 24 Mar 2019 23:34:08 GMT
StatusMessage:Not Found
ErrorCode:TableNotFound
ErrorMessage:The table specified does not exist.
RequestId:0bba3595-5002-00a9-119a-e2d6da000000
Time:2019-03-24T23:34:08.7269790Z

2019-03-24 23:34:08.772 WARN NServiceBus.AzureStorageQueues.ReceiveStrategy The error pipeline wasn’t able to handle the exception.
Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at Microsoft.WindowsAzure.Storage.Table.Protocol.TableOperationHttpResponseParsers.TableOperationPreProcess(TableResult result, TableOperation operation, HttpWebResponse resp, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Table\Protocol\TableOperationHttpResponseParsers.cs:line 61
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
— End of inner exception stack trace —
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass1`1.b__0(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 66
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.NativeDelayDelivery.d__2.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\NativeDelayDelivery.cs:line 50
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.AzureStorageQueues.Dispatcher.d__2.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\Dispatcher.cs:line 47
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.DelayedRetryExecutor.d__1.MoveNext() in C:\BuildAgent\work\b549d46003942065\src\NServiceBus.Core\Recoverability\DelayedRetryExecutor.cs:line 53
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.RecoverabilityExecutor.d__4.MoveNext() in C:\BuildAgent\work\b549d46003942065\src\NServiceBus.Core\Recoverability\RecoverabilityExecutor.cs:line 101
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.AzureStorageQueues.AtLeastOnceReceiveStrategy.d__1.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\AtLeastOnceReceiveStrategy.cs:line 54
Request Information
RequestID:0bba3598-5002-00a9-139a-e2d6da000000
RequestDate:Sun, 24 Mar 2019 23:34:08 GMT
StatusMessage:Not Found
ErrorCode:TableNotFound
ErrorMessage:The table specified does not exist.
RequestId:0bba3598-5002-00a9-139a-e2d6da000000
Time:2019-03-24T23:34:08.7369860Z

2019-03-24 23:34:08.782 WARN NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller Failed to fetch delayed messages from the storage
Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException[T](HttpStatusCode expectedStatusCode, HttpStatusCode actualStatusCode, T retVal, StorageCommandBase1 cmd, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Shared\Protocol\HttpResponseParsers.Common.cs:line 50 at Microsoft.WindowsAzure.Storage.Blob.CloudBlobContainer.<AcquireLeaseImpl>b__e(RESTCommand1 cmd, HttpWebResponse resp, Exception ex, OperationContext ctx) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlobContainer.cs:line 2359
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
— End of inner exception stack trace —
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass1`1.b__0(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 66
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.LockManager.d__5.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\LockManager.cs:line 33
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller.d__17.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\DelayedMessagesPoller.cs:line 96
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller.d__16.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\DelayedMessagesPoller.cs:line 68
Request Information
RequestID:791e8b6a-201e-008f-179a-e24d6e000000
RequestDate:Sun, 24 Mar 2019 23:34:07 GMT
StatusMessage:The specified container does not exist.
ErrorCode:ContainerNotFound
ErrorMessage:The specified container does not exist.
RequestId:791e8b6a-201e-008f-179a-e24d6e000000
Time:2019-03-24T23:34:08.7478189Z

2019-03-24 23:34:08.837 WARN NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller Failed to fetch delayed messages from the storage
Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException[T](HttpStatusCode expectedStatusCode, HttpStatusCode actualStatusCode, T retVal, StorageCommandBase1 cmd, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Shared\Protocol\HttpResponseParsers.Common.cs:line 50 at Microsoft.WindowsAzure.Storage.Blob.CloudBlobContainer.<AcquireLeaseImpl>b__e(RESTCommand1 cmd, HttpWebResponse resp, Exception ex, OperationContext ctx) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlobContainer.cs:line 2359
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
— End of inner exception stack trace —
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass1`1.b__0(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 66
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.LockManager.d__5.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\LockManager.cs:line 33
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller.d__17.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\DelayedMessagesPoller.cs:line 96
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller.d__16.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\DelayedMessagesPoller.cs:line 68
Request Information
RequestID:791e8b73-201e-008f-1f9a-e24d6e000000
RequestDate:Sun, 24 Mar 2019 23:34:07 GMT
StatusMessage:The specified container does not exist.
ErrorCode:ContainerNotFound
ErrorMessage:The specified container does not exist.
RequestId:791e8b73-201e-008f-1f9a-e24d6e000000
Time:2019-03-24T23:34:08.8188685Z

2019-03-24 23:34:08.852 WARN NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller Failed to fetch delayed messages from the storage
Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException[T](HttpStatusCode expectedStatusCode, HttpStatusCode actualStatusCode, T retVal, StorageCommandBase1 cmd, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Shared\Protocol\HttpResponseParsers.Common.cs:line 50 at Microsoft.WindowsAzure.Storage.Blob.CloudBlobContainer.<AcquireLeaseImpl>b__e(RESTCommand1 cmd, HttpWebResponse resp, Exception ex, OperationContext ctx) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlobContainer.cs:line 2359
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
— End of inner exception stack trace —
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass1`1.b__0(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 66
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.LockManager.d__5.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\LockManager.cs:line 33
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller.d__17.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\DelayedMessagesPoller.cs:line 96
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Azure.Transports.WindowsAzureStorageQueues.DelayDelivery.DelayedMessagesPoller.d__16.MoveNext() in C:\BuildAgent\work\e471362c5647d835\src\Transport\DelayDelivery\DelayedMessagesPoller.cs:line 68
Request Information
RequestID:791e8b84-201e-008f-2f9a-e24d6e000000
RequestDate:Sun, 24 Mar 2019 23:34:07 GMT
StatusMessage:The specified container does not exist.
ErrorCode:ContainerNotFound
ErrorMessage:The specified container does not exist.
RequestId:791e8b84-201e-008f-2f9a-e24d6e000000
Time:2019-03-24T23:34:08.8328779Z


(Sean Feldman) #2

Hi Mel,

Quick review of the stack trace, it looks like the endpoint is failing to connect to the delayed messages table. Where any tables removed by any chance? Restarting endpoint would ensure the table is recreated. If that doesn’t help it’s better to handle this via support case (https://particular.net/support).

PS: do you use both transports, NServiceBus.Azure.Transpots.WindowsAzureServiceBus and NServiceBus.Azure.Transports.WindowsAzureStorageQueues?


(Rommel Villa) #3

Hi Sean,

Could it be a failure to connect to those tables? We are seeing storage exceptions in other cases which I am going to post in another thread?


(Sean Feldman) #4

Hard to say w/o seeing handler code, endpoint configuration, and having more context. Hence the ask to route it via support to understand A) what’s happening B) how to reproduce it to fix in case it’s a bug.


(Rommel Villa) #5

Thanks Sean, will do