I have an issue where messages are duplicated across scaled-out nodes when delayed retries execute.
Initially the system used the TimeoutManager, so I assumed that was the cause of the duplication. But switching to the SQL Server transport's native delayed delivery did not change the behavior.
The observed behavior:
As soon as a message on one node is scheduled for a delayed retry, that retry is consumed by multiple nodes when it executes (same message ID on each node). It seems that native delayed delivery is hitting the same race condition as the TimeoutManager did. It is likely due to something in our setup, but I have not been able to figure out what.
I have read through most of the documentation and, as far as I can see, our setup should be able to guarantee 'exactly-once' delivery without the TransactionScope transaction level and DTC. Our setup:
- SQL transport using native delayed delivery.
- Default deployment scenario for the SQL transport, using the SendsAtomicWithReceive transaction level.
- Centralized database used by both the SQL transport and SQL persistence.
- A single shared input queue per endpoint, with multiple competing consumers.
NServiceBus version="6.4.3" targetFramework="net452"
NServiceBus.SqlServer version="3.1.3" targetFramework="net452"
NServiceBus.Persistence.Sql version="3.0.3" targetFramework="net452"
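For reference, an endpoint with the setup listed above would be configured roughly as in the sketch below. This is a minimal illustration and not our actual code: the endpoint name, connection string, error queue name, and retry values are all placeholders, and the API calls are written as I understand them for the package versions listed, so they may need adjusting.

```csharp
using System;
using System.Data.SqlClient;
using System.Threading.Tasks;
using NServiceBus;
using NServiceBus.Persistence.Sql;
using NServiceBus.Transport.SQLServer;

static class Program
{
    // Placeholder connection string: transport and persistence share one database.
    const string ConnectionString =
        @"Data Source=.;Initial Catalog=Shared;Integrated Security=True";

    static void Main()
    {
        RunAsync().GetAwaiter().GetResult();
    }

    static async Task RunAsync()
    {
        // Placeholder endpoint name; every scaled-out node uses the same name
        // and therefore competes on the same input queue.
        var endpointConfiguration = new EndpointConfiguration("Sample.Endpoint");

        var transport = endpointConfiguration.UseTransport<SqlServerTransport>();
        transport.ConnectionString(ConnectionString);
        // Receive and outgoing sends share one native SQL transaction (no DTC).
        transport.Transactions(TransportTransactionMode.SendsAtomicWithReceive);
        // Opt in to native delayed delivery instead of the TimeoutManager.
        transport.UseNativeDelayedDelivery();

        var persistence = endpointConfiguration.UsePersistence<SqlPersistence>();
        persistence.SqlDialect<SqlDialect.MsSqlServer>();
        persistence.ConnectionBuilder(() => new SqlConnection(ConnectionString));

        // Delayed retries; the count and interval are illustrative values only.
        var recoverability = endpointConfiguration.Recoverability();
        recoverability.Delayed(delayed =>
        {
            delayed.NumberOfRetries(3);
            delayed.TimeIncrease(TimeSpan.FromSeconds(10));
        });

        endpointConfiguration.SendFailedMessagesTo("error");

        var endpointInstance = await Endpoint.Start(endpointConfiguration)
            .ConfigureAwait(false);

        Console.WriteLine("Press any key to stop the endpoint.");
        Console.ReadKey();

        await endpointInstance.Stop().ConfigureAwait(false);
    }
}
```

Each scaled-out node runs the same configuration against the same database.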
I would very much appreciate some input on this, and I would like to hear from anyone who has a similar setup, whether it is working as expected or showing the same behavior.