How should a saga behave if the requested service fails?

I’ve been trying to figure this out for a while but just can’t wrap my mind around it.

Let’s say I have a saga that orchestrates a sale process.

The steps, executed sequentially, are:

  1. raise a command to OrderService to place order
  2. raise a command to ProcessPayments
  3. publish an event to Ship Order

If there is a transient or semi-transient error in one of these services, the configured immediate and delayed retries kick in. In the worst case, let’s suppose there is an infrastructure failure: the database doesn’t come back up, or an internal web service call keeps failing. The affected messages are then moved to the error queue (where we can retry them later using ServicePulse).

How should the saga behave if the OrderService faces a problem like this and doesn’t send an OrderSuccessful message back to the saga? Are there any best practices?

I’m sure it is a problem that everyone has faced… I feel miserable not being able to figure this out :slight_smile:


Ashish Chettri

This is a good example of where you would use saga timeouts.

Check with your business how long they are prepared to wait before shipping an order, and set a timeout to some value less than that. If the timeout fires before the reply arrives, take whatever action the business wants in that case: notifying the customer that there is a delay, triggering some manual action to get the order shipped, etc.
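
To make the idea concrete, here is a minimal, framework-agnostic sketch in Python. It is not the NServiceBus saga API; the class `SaleSaga`, its method names, the 6-hour budget, and the strings recorded in `actions` are all hypothetical and only illustrate the flow: request a timeout when the saga starts, and when it fires, act only if the OrderSuccessful reply has not arrived.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class SaleSaga:
    """Illustrative saga state; every name here is an assumption, not a real API."""
    order_id: str
    started_at: datetime
    # Hypothetical budget, chosen to be shorter than the business's shipping window.
    order_timeout: timedelta = timedelta(hours=6)
    order_confirmed: bool = False
    completed: bool = False
    actions: List[str] = field(default_factory=list)  # stands in for sent messages

    def start(self) -> None:
        # Send the command and, at the same time, request a timeout for the reply.
        self.actions.append("send PlaceOrder")
        self.actions.append("request OrderTimeout")

    def timeout_due_at(self) -> datetime:
        return self.started_at + self.order_timeout

    def on_order_successful(self) -> None:
        if self.completed:
            return  # the saga already took the timeout path
        self.order_confirmed = True
        self.actions.append("send ProcessPayment")

    def on_order_timeout(self) -> None:
        if self.order_confirmed or self.completed:
            return  # the reply arrived in time, so the timeout is a no-op
        # Whatever the business decided should happen on a delay.
        self.actions.append("notify customer of delay")
        self.actions.append("flag order for manual handling")
        self.completed = True
```

The point of the design is that the timeout handler checks the saga state first, so a late timeout after a successful reply does nothing, and a late reply after the timeout path has run is ignored.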

Side note: This also leads to a discussion on SLAs in terms of managing the error queue. Let’s say you want to ship within 8 hours; you’d then want your process for managing the error queue to handle failed messages within that 8-hour window.
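
As a rough illustration of that SLA check, the Python sketch below flags failed messages that are getting close to the 8-hour window. The function name `messages_at_risk`, the 2-hour warning margin, and the use of the failure time as the start of the window are assumptions made for the example; it does not use any ServicePulse or NServiceBus API.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def messages_at_risk(
    failed_at: Dict[str, datetime],
    now: datetime,
    sla: timedelta = timedelta(hours=8),     # shipping window from the example above
    margin: timedelta = timedelta(hours=2),  # hypothetical early-warning margin
) -> List[str]:
    """Return ids of failed messages whose SLA deadline is within `margin` or already past."""
    return sorted(
        message_id
        for message_id, failure_time in failed_at.items()
        if now >= failure_time + sla - margin
    )
```

For instance, with `now` at 12:00, a message that failed at 05:00 has a 13:00 deadline and is flagged, while one that failed at 11:00 is not.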

Hope this helps!

Hi @andreasohlund,

Thank you so much. Timeouts were something I was considering, but I wasn’t sure if they were best practice. :smile:


Ashish Chettri