Thanks for your reply, Bob, it’s much appreciated.
I am in Australia, so I’m guessing our timezones don’t overlap very well.
We have a scenario where, for example, we create a user enquiry in our system. After the system has done whatever it needs to do successfully, an event is published to an integration endpoint. The handler for that event then works out the various integration points and sends commands to different endpoints, so that different APIs are called on various external systems, each with its own microservice to handle the integration for its specific API.
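To make the fan-out concrete, here is a rough sketch in Python rather than our actual code. Every name in it (`EnquiryCreated`, `IntegrationCommand`, `fan_out`, the destination names) is made up for illustration and is not taken from our system or from any messaging library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnquiryCreated:
    """Hypothetical event published once the enquiry has been saved."""
    enquiry_id: str


@dataclass(frozen=True)
class IntegrationCommand:
    """Hypothetical command addressed to one integration endpoint."""
    destination: str
    enquiry_id: str


def fan_out(event: EnquiryCreated, destinations: list[str]) -> list[IntegrationCommand]:
    # One command per integration point; each destination endpoint then
    # calls its own external API and applies its own retry policy.
    return [IntegrationCommand(d, event.enquiry_id) for d in destinations]


commands = fan_out(EnquiryCreated("enq-1"), ["crm", "billing"])
```

In the real system the list of destinations is worked out by the handler from the enquiry, not passed in as a fixed list.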
Some of the processing may succeed immediately, some may succeed after various immediate or delayed retries, and some may fail after multiple delayed retries.
Each endpoint has its own circuit breaker and error policy, e.g. some only allow an hour of retries before failing, while others will retry for up to 24 hours.
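As an illustration of what I mean by per-endpoint policies, something along these lines. This is only a sketch: the `RetryPolicy` type, the endpoint names and the retry delays are assumptions, and only the 1 hour and 24 hour windows come from our actual setup.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical per-endpoint policy: how long to keep retrying, and how often."""
    max_retry_window: timedelta
    delay_between_retries: timedelta

    def should_retry(self, elapsed: timedelta) -> bool:
        # Keep retrying only while we are still inside the allowed window.
        return elapsed < self.max_retry_window


# Endpoint names and delays are invented for the example.
POLICIES = {
    "crm": RetryPolicy(timedelta(hours=1), timedelta(minutes=5)),
    "billing": RetryPolicy(timedelta(hours=24), timedelta(minutes=30)),
}
```

So after two hours of failures the "crm" endpoint would have given up, while "billing" would still be retrying.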
On failure, each handler throws an ExternalIntegrationException, which I can subscribe to at an endpoint level.
However, depending on the type of integration, I may want to send a notification back to the user telling them that it failed and that we are continuing to retry in the background. On the final retry failure we would then update the status of an entity and send an email to the user indicating they should retry. Other integrations are handled by internal monitoring of the error queue. Each integration point has its own error handling scenario, some failing fast and some able to retry over an extended period.
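Put as a decision, the behaviour I am after looks roughly like the sketch below. The names (`FailureAction`, `IntegrationFailure`, `USER_FACING`, `decide_action`) and the "payments" integration are hypothetical and only there to show the shape of the logic.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class FailureAction(Enum):
    NOTIFY_USER_STILL_RETRYING = auto()
    MARK_FAILED_AND_EMAIL_USER = auto()
    LEAVE_FOR_ERROR_QUEUE_MONITORING = auto()


@dataclass(frozen=True)
class IntegrationFailure:
    """Hypothetical summary of one failed attempt at an integration endpoint."""
    integration: str
    is_final_attempt: bool


# Assumption: which integrations the user should hear about.
USER_FACING = {"payments"}


def decide_action(failure: IntegrationFailure) -> Optional[FailureAction]:
    if failure.integration in USER_FACING:
        if failure.is_final_attempt:
            # Retries exhausted: update the entity status and email the user.
            return FailureAction.MARK_FAILED_AND_EMAIL_USER
        # Still retrying in the background: just let the user know.
        return FailureAction.NOTIFY_USER_STILL_RETRYING
    if failure.is_final_attempt:
        # Internal integrations end up in the error queue for monitoring.
        return FailureAction.LEAVE_FOR_ERROR_QUEUE_MONITORING
    return None  # nothing to do while an internal integration is still retrying
```

The part I am unsure about is where a decision like this should live, given that each endpoint has its own retry window.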
Hope this makes sense.