Can someone verify that the assumptions below are correct with regard to the outbox and custom idempotent code?
Assumption 1: Scenario: two endpoints with outbox enabled
When Endpoint A sends a message to Endpoint B, the outbox can only perform de-duplication and guarantee exactly-once processing if the business/application logic uses the same transactional store as the outbox (SQL Server in this example). If the business/application logic of the handler talks to S3 to add a file, then the outbox cannot help with de-duplication and it’s up to the developer to write idempotent code. If so, it sounds like in the real world the outbox is unlikely to be the silver bullet for de-duplication in an endpoint: some handlers will rely on the outbox and others will need custom idempotence logic. Is this correct?
Assumption 2: Scenario: two endpoints with outbox enabled
If you have outbox enabled on an endpoint and the handler isn’t transactional with the outbox, you still benefit from deterministic IDs being generated automatically for outgoing messages. But, as stated above, if the outgoing messages go to a handler that is not transactional, that handler MUST be idempotent. Is this correct?
Can you elaborate on this? I didn’t quite get what you were saying. The deterministic IDs generated by the outbox cannot be used for idempotency checks on the receiver unless the receiver stores the GUIDs in a table somewhere, which would essentially be mimicking the outbox.
The deterministic ID is always generated by NServiceBus (even without the Outbox) for every message. The Outbox uses this ID for its own de-duplication algorithm. The Outbox is the preferred approach, but if for some reason you cannot use it, then you have to implement your own de-duplication algorithm, and “stored the guid’s in a table somewhere” is one part of that. And YES, at some point you may realize that you are implementing your own Outbox, so it’s better to use the NServiceBus Outbox and focus on the business/application logic.
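To make the "store the GUIDs in a table" idea concrete, here is a minimal sketch in Python with SQLite. It is not how the NServiceBus Outbox is implemented; the table names (`processed_messages`, `orders`) and the handler are hypothetical. The point it illustrates is that the de-duplication record and the business data are written in the same local transaction.

```python
import sqlite3


def make_store():
    """Create an in-memory database holding both dedup and business tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, amount INTEGER)")
    conn.commit()
    return conn


def handle_place_order(conn, message_id, order_id, amount):
    """Hypothetical handler: returns True if processed, False if a duplicate."""
    with conn:  # one transaction: commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_messages (message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            # The message ID was already recorded, so skip the business work.
            return False
        conn.execute(
            "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
            (order_id, amount),
        )
    return True
```

This only works because both tables live in the same store. As soon as the handler touches something outside that transaction (S3, an HTTP API), the dedup record no longer protects that side effect.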
The second case you described is when you cannot use the NServiceBus Outbox for de-duplication because the resource involved is, by design, outside the Outbox transaction, for example when calling a third-party HTTP API. In that case there are two options:
Generate a custom unique value and pass it in the message body or in a custom NServiceBus header.
One important thing: when Endpoint A sends a message to Endpoint B and B calls a third-party API, the unique value must be generated by Endpoint A so that it is always the same in Endpoint B.
Use the NServiceBus-generated message ID, which stays the same when recoverability kicks in.
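A rough sketch of the first option, in Python rather than NServiceBus code. Everything here is hypothetical: the header name `idempotency-key`, the namespace, and `FakePaymentApi`, which stands in for a third-party API that accepts an idempotency key (not every API does). The sender derives the key from business data, so every retry in the receiver presents the same key.

```python
import uuid

# Hypothetical namespace used only to derive stable keys for this example.
PAYMENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "payments.example.invalid")


def build_charge_message(order_id):
    """Endpoint A: derive the key from business data, not from a random GUID."""
    key = uuid.uuid5(PAYMENT_NAMESPACE, f"charge:{order_id}")
    return {
        "headers": {"idempotency-key": str(key)},
        "body": {"order_id": order_id},
    }


class FakePaymentApi:
    """Stand-in for a third-party API that de-duplicates on an idempotency key."""

    def __init__(self):
        self.charges = {}

    def charge(self, idempotency_key, order_id):
        # A repeated key returns the original result instead of charging again.
        return self.charges.setdefault(idempotency_key, f"charge-for-{order_id}")


def handle_charge(api, message):
    """Endpoint B: forward the key it received; never generate a new one here."""
    return api.charge(
        message["headers"]["idempotency-key"],
        message["body"]["order_id"],
    )
```

If Endpoint B generated the key itself on each attempt, a retry would look like a brand-new request to the third party.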
Of course, it all depends on the context and the functionality being developed, so sometimes one option fits better and sometimes the other (no silver bullet).
It does, but I believe your understanding of the NServiceBus message ID is incorrect for the case where the outbox is not used. Without the outbox, NServiceBus just generates a new GUID. This means that if the publish/send fails or is retried, for example due to a broker issue, a new message ID will be generated for the outgoing message. With the outbox, the outgoing messages are stored in a table before being sent/published, so you can guarantee that the message ID is deterministic and does not change on every retry from recoverability.
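Why storing the outgoing messages first keeps their IDs stable can be shown with a heavily simplified in-memory sketch. This is not the NServiceBus implementation: `InMemoryOutbox`, `process`, and `place_order_handler` are hypothetical names, and a real outbox persists the record in the same transaction as the business data and also tracks whether dispatch has completed.

```python
import uuid


class InMemoryOutbox:
    """Maps an incoming message ID to the outgoing messages its handler produced."""

    def __init__(self):
        self.records = {}


def place_order_handler():
    """Hypothetical handler: each invocation creates a message with a fresh ID."""
    return [{"id": str(uuid.uuid4()), "type": "OrderPlaced"}]


def process(outbox, incoming_id, handler, dispatch):
    """Run the handler at most once per incoming ID, then dispatch stored messages."""
    if incoming_id not in outbox.records:
        # First attempt: run the handler and store its outgoing messages.
        outbox.records[incoming_id] = handler()
    # First attempt or retry: dispatch what was stored, with the stored IDs.
    for message in outbox.records[incoming_id]:
        dispatch(message)
```

On a retry the handler is not invoked again, so no new GUID is generated. The receiver sees the same message ID and can de-duplicate on it.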
You are right! However, if I choose a transport that supports the “Sends atomic with receive” transaction level, it could work, “since all outgoing operations are atomic with the ongoing receive operation”.
I only described an example with a possible solution to consider. Your way of analysing, asking questions, and looking for weaknesses is the right way to find the correct solution for a concrete problem.
I think we can agree that, when possible, the NServiceBus Outbox should be used, because exactly-once message processing is not as easy to achieve as it seems at first glance.
By the way, the Outbox still helps here by preventing ghost messages: it avoids accidentally emitting messages when a failure occurs on transports that are “ReceiveOnly”.
As @mikedevbo states, this is not an issue for higher-consistency transports.
A ghost message is a message that was transmitted but should not have been, because the handler failed after sending. It may contain identifiers of resources that no longer exist, and it states as fact something that was never committed to storage.
Isolate non-transactional tasks
In general, don’t mix transactional and non-transactional IO in a single handler. When dealing with non-transactional resources use a handler that does a single task.
If you have a multi-task process then:
send messages for each task, or
send a message to the next task after a task completes, or
use an orchestrator pattern implemented via sagas.
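The first option (one message per task) could look roughly like the following Python sketch. The task names and the `failed` list, which stands in for retries or an error queue, are made up for illustration.

```python
def fan_out(order_id):
    """Create one message per non-transactional task instead of one big handler."""
    return [
        {"task": "upload_invoice", "order_id": order_id},
        {"task": "send_email", "order_id": order_id},
    ]


def run_tasks(messages, handlers):
    """Process each message on its own; one failure does not affect the others."""
    failed = []
    for message in messages:
        try:
            handlers[message["task"]](message)
        except Exception:
            # Stand-in for recoverability: only this task is retried later.
            failed.append(message)
    return failed
```

Because each handler performs exactly one task, a retry of a failed message repeats only that task.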
Added benefits are that all tasks can run concurrently and fail independently, and that you don’t have to worry about checking which tasks have already completed.
Non-transactional and non-idempotent
Even if tasks are not idempotent, you can configure such an endpoint for a best-effort attempt at processing each message just once (no scale-out, no recoverability, sequential processing) and have it forward the message to the error queue for manual processing if it fails.