Hey @lochness
What would you expect to happen with an incoming message once your circuit breaker is triggered?
NServiceBus has the concept of “unrecoverable exceptions” as a fail-fast mechanism. If you define an unrecoverable exception, messages failing with it skip all retries. However, this means they immediately end up in the error queue and you have to retry all of those failed messages manually. See the documentation for unrecoverable exceptions here: Recoverability • NServiceBus • Particular Docs
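For example, something along these lines in the endpoint configuration, where `ThirdPartyApiException` is just a stand-in for whatever exception your HTTP calls actually throw:

```csharp
var endpointConfiguration = new EndpointConfiguration("MyEndpoint");
var recoverability = endpointConfiguration.Recoverability();

// ThirdPartyApiException is a placeholder for the exception type thrown by your API client.
// Messages failing with it skip all retries and go straight to the error queue.
recoverability.AddUnrecoverableException<ThirdPartyApiException>();
```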
It sounds like it would make sense to create a dedicated endpoint that is responsible for the HTTP requests to the unreliable API. You can then configure recoverability to fit this specific scenario, e.g. disable immediate retries and configure delayed retries to wait at least one minute between attempts. This way, other messages aren’t affected by the unreliable third party while you can still let recoverability handle the situation. See the documentation about configuring delayed retries: Configure delayed retries • NServiceBus • Particular Docs
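As a rough sketch of what that dedicated endpoint’s configuration could look like (the endpoint name and retry numbers are just examples):

```csharp
var endpointConfiguration = new EndpointConfiguration("ThirdParty.Gateway");
var recoverability = endpointConfiguration.Recoverability();

// No immediate retries; they would only hammer the flaky API in quick succession.
recoverability.Immediate(immediate => immediate.NumberOfRetries(0));

// A handful of delayed retries, waiting at least one minute (and increasing) between attempts.
recoverability.Delayed(delayed =>
{
    delayed.NumberOfRetries(5);
    delayed.TimeIncrease(TimeSpan.FromMinutes(1));
});
```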
A more flexible but more complex approach is a custom behavior in the pipeline that keeps an eye out for specific exceptions from the Web API. It can then act as a circuit breaker by stopping the pipeline invocation early, without invoking the actual handler, and therefore avoiding further requests to the API. This approach requires careful thought to make sure it only affects the relevant messages, plus a plan for what happens to the messages it rejects. Just failing the message lets it run through the recoverability steps very quickly and potentially end up in the error queue as well. By combining the behavior with a custom recoverability policy, you can give messages rejected by the circuit breaker a longer delay by default.
Here’s our documentation about custom behaviors: Manipulate pipeline with behaviors • NServiceBus • Particular Docs
and custom recoverability policies: Custom recoverability policy • NServiceBus • Particular Docs
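To make that a bit more concrete, here’s a rough sketch of how the two pieces could fit together. `CallThirdPartyApi`, `ThirdPartyCircuitBreaker`, `CircuitBreakerOpenException` and the five-minute delay are placeholders for your own message type, circuit breaker state and numbers, not anything NServiceBus provides:

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus;
using NServiceBus.Pipeline;

class ThirdPartyCircuitBreakerBehavior : Behavior<IIncomingLogicalMessageContext>
{
    readonly ThirdPartyCircuitBreaker circuitBreaker;

    public ThirdPartyCircuitBreakerBehavior(ThirdPartyCircuitBreaker circuitBreaker)
        => this.circuitBreaker = circuitBreaker;

    public override Task Invoke(IIncomingLogicalMessageContext context, Func<Task> next)
    {
        // Only short-circuit the messages that would actually call the third-party API.
        if (context.Message.Instance is CallThirdPartyApi && circuitBreaker.IsOpen)
        {
            // Fail fast without invoking the handler; recoverability decides what happens next.
            throw new CircuitBreakerOpenException();
        }

        return next();
    }
}
```

The behavior is registered in the pipeline, and a custom recoverability policy gives the messages it rejects a longer delay while everything else follows the default policy:

```csharp
endpointConfiguration.Pipeline.Register(
    new ThirdPartyCircuitBreakerBehavior(circuitBreaker),
    "Rejects calls to the third-party API while the circuit breaker is open");

var recoverability = endpointConfiguration.Recoverability();
recoverability.CustomPolicy((config, context) =>
{
    // Messages rejected by the circuit breaker get a longer delay than the usual retries.
    if (context.Exception is CircuitBreakerOpenException)
    {
        return RecoverabilityAction.DelayedRetry(TimeSpan.FromMinutes(5));
    }

    // Everything else falls back to the default recoverability policy.
    return DefaultRecoverabilityPolicy.Invoke(config, context);
});
```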
Before diving deeper into custom behaviors and recoverability, I’d recommend checking whether a dedicated endpoint with some adjusted recoverability settings can already solve the problem.