I want to delete a message from its underlying transport if it has not been processed successfully within a given time period (24 hours).
Based on our immediate and delayed retry policies, the messages that would require deletion would be sitting in ServicePulse, waiting for a manual retry (which we want to prevent after 24 hours).
MSMQ is our underlying transport. We’re using SqlServer persistence. NServiceBus 6
FYI, we’re moving away from the NServiceBus Host and towards self-hosting in our next release, so if there is a solution that is based on NSB Host, we won’t be able to use it.
If you are modeling a business process, it is best to model it via a Saga. The ‘discard’ needs to be mapped to a business reason, and it can be implemented as a policy. For example (off the top of my head), in a saga that manages a user’s account activation, if the user does not activate the account after an email is sent, a timeout ends the whole process. Read more about saga timeouts here.
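As a rough sketch of the account-activation example, assuming NServiceBus 6 and the core `Saga<T>` base class: the message, timeout, and saga data types below are made up for illustration, and anything your saga persistence additionally requires is not shown.

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus;

// Hypothetical messages and timeout state, named only for this sketch.
public class StartActivation : ICommand { public Guid UserId { get; set; } }
public class AccountActivated : IEvent { public Guid UserId { get; set; } }
public class ActivationExpired { }

public class AccountActivationData : ContainSagaData
{
    public Guid UserId { get; set; }
}

public class AccountActivationSaga :
    Saga<AccountActivationData>,
    IAmStartedByMessages<StartActivation>,
    IHandleMessages<AccountActivated>,
    IHandleTimeouts<ActivationExpired>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<AccountActivationData> mapper)
    {
        // Correlate both messages to the saga instance by user id.
        mapper.ConfigureMapping<StartActivation>(m => m.UserId).ToSaga(s => s.UserId);
        mapper.ConfigureMapping<AccountActivated>(m => m.UserId).ToSaga(s => s.UserId);
    }

    public Task Handle(StartActivation message, IMessageHandlerContext context)
    {
        // Ask to be woken up in 24 hours if nothing else has happened by then.
        return RequestTimeout<ActivationExpired>(context, TimeSpan.FromHours(24));
    }

    public Task Handle(AccountActivated message, IMessageHandlerContext context)
    {
        // The user activated in time, so the process is finished.
        MarkAsComplete();
        return Task.FromResult(0);
    }

    public Task Timeout(ActivationExpired state, IMessageHandlerContext context)
    {
        // No activation within 24 hours: the business decision here is to end the process.
        MarkAsComplete();
        return Task.FromResult(0);
    }
}
```

The point is that the 24-hour rule becomes an explicit business decision inside the saga rather than something done to the queue.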
You can use the ‘Time To Be Received’ (TTBR) attribute on your message (or apply it via a convention), which essentially says that after a certain time elapses it no longer makes sense to process that message, and the system will discard it instead of processing it. You need to be aware that you might lose messages if the only receiving endpoint is down for some time and the messages expire. Read more about this here. Also, in MSMQ, when a message expires it is NOT moved to the dead-letter queue (DLQ), to avoid disk consumption, so it cannot be recovered.
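For reference, a minimal sketch of both options, assuming NServiceBus 6. The message types are hypothetical, and you would normally pick either the attribute or the convention, not both.

```csharp
using System;
using NServiceBus;

// Option 1: attribute on the message type.
// The value is a TimeSpan string; "1.00:00:00" means one day (24 hours).
[TimeToBeReceived("1.00:00:00")]
public class UpdateCustomerRecord : ICommand
{
    public Guid CustomerId { get; set; }
}

// A second hypothetical message without the attribute, covered by the convention below.
public class RefreshCustomerCache : ICommand
{
    public Guid CustomerId { get; set; }
}

public static class TimeToBeReceivedSetup
{
    // Option 2: a convention, applied while configuring the endpoint.
    public static void Apply(EndpointConfiguration endpointConfiguration)
    {
        var conventions = endpointConfiguration.Conventions();
        conventions.DefiningTimeToBeReceivedAs(type =>
            type == typeof(RefreshCustomerCache)
                ? TimeSpan.FromHours(24)
                : TimeSpan.MaxValue); // all other messages never expire
    }
}
```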
I had another read through your post. I think you also want to expire messages once they get to the audit/error queue, is that correct? We currently have no way of doing that, but it would be a good idea to be able to ‘expire’ messages in the error queue based on TTBR.
@mgmccarthy yeah, unfortunately, that is the case.
Does the process have to be fully automated?
If not, then you could choose an appropriate retention period and archive the error messages. Once they are archived, they will be cleaned up when the retention period elapses. The problem, though, is that the retention period applies to all archived error messages.
Yeah, what I’m looking for is a way to scope that retention period for a given message type. In fact, to revise the overall problem space, if the message is not handled/picked up for processing in 24 hours, I actually don’t think that’s a problem.
The problem is multiple failed messages of the same type in the error queue. The order of retrying becomes a support concern, and there is a good amount of risk in getting it wrong, because we could end up losing data if they are manually retried out of order.
The safer route is to archive them after 24 hours if they’re in the error queue.
Sounds like for now, my best bet is to try to formalize something in code.
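A rough sketch of what that selection rule could look like. `FailedMessage` and `ArchivePolicy` are hypothetical names invented here, not NServiceBus or ServiceControl types, and how the failed messages are fetched and then archived is deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a failed message, for illustration only.
public class FailedMessage
{
    public string Id { get; set; }
    public string MessageType { get; set; }
    public DateTime TimeOfFailureUtc { get; set; }
}

public static class ArchivePolicy
{
    // Returns the failed messages of one message type that have been
    // in the error queue for at least maxAge, oldest first.
    public static IReadOnlyList<FailedMessage> SelectForArchive(
        IEnumerable<FailedMessage> failedMessages,
        string messageType,
        TimeSpan maxAge,
        DateTime utcNow)
    {
        return failedMessages
            .Where(m => m.MessageType == messageType)
            .Where(m => utcNow - m.TimeOfFailureUtc >= maxAge)
            .OrderBy(m => m.TimeOfFailureUtc)
            .ToList();
    }
}
```

Calling `SelectForArchive(failed, "MyCompany.UpdateCustomerRecord", TimeSpan.FromHours(24), DateTime.UtcNow)` on a schedule would give the list to archive, scoped to that one message type (the type name in this call is also made up).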
Thanks @HEskandari and @danielmarbach for the ideas, the information, but most importantly, the conversation around this.