Duplicate messages found in SqlDB

We built resiliency using an NServiceBus saga to handle the case where a service goes down. We use SqlDB to store the failed messages with another unique identifier (CorrelationId), and we use a saga timeout to retry each failed message, checking whether the service is back up and the message has been submitted. Everything works as expected in non-prod.

In our production environment we push messages from 16 boxes to NServiceBus, and the saga application code is deployed on 4 boxes to handle those messages. During a critical failure, we saw that the failed messages were duplicated (same message ID with different CorrelationId values).

We are not sure how those duplicates were created.

Also, is there a way to replicate this issue in a non-prod environment?

How do we solve this?

Where did you see the duplication? In your logs? In the saga table?

To better understand your situation we need to look at your saga code and know which transport and persister you are using (including their versions). I think this is better handled via a support case, so would you be able to email support@particular.net with the details so that we can help you out?

Cheers,

Andreas
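
For anyone reading along, here is a minimal sketch of one way the described symptom (same message ID, different CorrelationId) could arise and be guarded against. It is only an illustration under stated assumptions: nothing in this thread confirms the actual cause, the code does not use the NServiceBus API, and it uses an in-memory SQLite table in place of the real SqlDB. The table name `failed_messages` and the functions `make_store`, `store_failed`, and `count_rows` are hypothetical. The assumption is that the same failed message is delivered more than once (for example, picked up by two of the four handler boxes) and that each delivery generates a fresh CorrelationId before inserting.

```python
import sqlite3
import uuid


def make_store(unique_message_id: bool) -> sqlite3.Connection:
    """Create a hypothetical failed-message table in an in-memory SQLite DB.

    When unique_message_id is True, the message_id column carries a UNIQUE
    constraint, so a second insert for the same message can be rejected.
    """
    conn = sqlite3.connect(":memory:")
    constraint = "UNIQUE" if unique_message_id else ""
    conn.execute(
        "CREATE TABLE failed_messages ("
        "correlation_id TEXT PRIMARY KEY, "
        f"message_id TEXT NOT NULL {constraint})"
    )
    return conn


def store_failed(conn: sqlite3.Connection, message_id: str) -> None:
    """Store a failed message under a freshly generated correlation id.

    Assumption: every delivery generates its own correlation id, which is
    what lets a redelivered message slip in as a second row.
    INSERT OR IGNORE skips the row if it would violate a UNIQUE constraint.
    """
    correlation_id = str(uuid.uuid4())
    conn.execute(
        "INSERT OR IGNORE INTO failed_messages (correlation_id, message_id) "
        "VALUES (?, ?)",
        (correlation_id, message_id),
    )
    conn.commit()


def count_rows(conn: sqlite3.Connection, message_id: str) -> int:
    """Return how many rows exist for the given message id."""
    row = conn.execute(
        "SELECT COUNT(*) FROM failed_messages WHERE message_id = ?",
        (message_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    # Simulate the same message being delivered twice to each store.
    naive = make_store(unique_message_id=False)
    guarded = make_store(unique_message_id=True)
    for _ in range(2):
        store_failed(naive, "msg-1")
        store_failed(guarded, "msg-1")
    print("rows without constraint:", count_rows(naive, "msg-1"))
    print("rows with constraint:", count_rows(guarded, "msg-1"))
```

In this sketch, delivering the same message twice is also a simple way to provoke the duplication outside production. Whether a uniqueness guard on the message ID fits your real schema depends on the saga code, transport, and persister, which is why the support case is still the right route.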