Design help for dynamic system

Hi
To start, I’m using NServiceBus 6 (non-Core) with Azure Service Bus (ASB) and Azure Storage.
There are normally 2-3 copies of each NServiceBus message processor running.

My system looks like:

  • Users can upload files that contain multiple email messages to their clients. The sending of these emails can be staggered through the working day (send a fixed amount per minute calculated to send the last of the messages at the end of the day), sent at a particular number per minute, or just all sent at once (the easy case).
  • So far easy enough. The problem comes in because the users can perform multiple uploads per day, change the way that emails are sent (between staggered through the day, number per minute or all at once), change the fixed number per minute, or delete emails - meaning that the number to be sent each minute varies dynamically.
  • There are normally tens of thousands of emails sent per day per user, but sometimes this goes into the hundreds of thousands.
  • The exact timing of the messages is not important, as long as they are all sent by the end of the day when staggered, and they are sent at roughly the rate specified (± 5%).
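To make the three sending modes concrete, here is a minimal sketch of the per-minute calculation. It is plain Python rather than NServiceBus code, and the function name, mode strings, and parameters are all hypothetical. It assumes the count is recalculated from the remaining total on every tick, so that later uploads, deletions, or mode changes are picked up automatically.

```python
import math
from datetime import datetime


def emails_for_this_minute(remaining, mode, now, end_of_day, fixed_rate=0):
    """Return how many emails to release in the current one-minute tick.

    All names here are illustrative; they do not come from the thread.
    """
    if remaining <= 0:
        return 0
    if mode == "all_at_once":
        return remaining
    if mode == "per_minute":
        # Fixed rate chosen by the user, capped by what is left.
        return min(remaining, fixed_rate)
    if mode == "staggered":
        # Spread what is left evenly over the minutes left in the day.
        seconds_left = (end_of_day - now).total_seconds()
        minutes_left = max(1, math.ceil(seconds_left / 60))
        return min(remaining, math.ceil(remaining / minutes_left))
    raise ValueError(f"unknown mode: {mode}")
```

Because the staggered case divides the current remaining total by the minutes still available, a second upload at midday simply raises the rate for the rest of the day.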

Things that I need to avoid at all costs:

  • Any email being sent twice (except maybe one email if the process restarts)

What is the best design for this system?

My current thought is for each user to have a saga

  • When a user uploads a file all the messages are stored in a database and a reference (pk?) for each message is sent to the saga for that user
  • The saga will have a state variable of the emails to send, and a total number remaining to send
  • When the user modifies their office hours, the saga will be notified
  • When the user modifies the send frequency type, the saga will be notified
  • When the user deletes emails, the saga will be notified with the reference for each one (and will decrement the total)
  • The saga will have a one-minute timer. The handler for the timer will calculate the number of emails to send, split them into batches of 99 emails, mark them as sent, and decrement the total
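The batching step in the last bullet could look roughly like this. It is a sketch only: the helper name is hypothetical, and the batch size of 99 is taken from the post as-is.

```python
def split_into_batches(email_ids, batch_size=99):
    """Split a list of email references into consecutive batches.

    The last batch may be smaller than batch_size.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        email_ids[start:start + batch_size]
        for start in range(0, len(email_ids), batch_size)
    ]
```

Each batch would then be handed to whatever actually sends the emails.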

Things I’m concerned about:

  • How do I store > 100,000 references in the saga?
  • If it takes longer than 1 minute to calculate how many to send, what will happen when the timer fires (ASB, Azure Storage and transactions)?
  • Could I store all the information for each email in the saga?

This has been going around my whiteboard for a week now, so I thought I’d ask the far more knowledgeable community. Any help is gratefully received!

Hi @carlbm,

I’m Dennis van der Stelt, a solution architect at Particular Software.

That’s quite a lot of detail and that’s really great. Thanks for that. I still do have some questions though and also a few tips I would like to provide to you.

Would it be a good idea to get on a call and discuss this? We could start by understanding the system and then try to create a design for what you want to build. I expect that to give the best result. If you’re up for that, I could send you some time slots in which we could have the conversation. The only thing I can’t figure out from your profile is which timezone you are in, which would be nice to know when setting up a call.

Let me know if you’re interested, otherwise I’ll write a somewhat lengthier email with questions and so on.

Hi. Thanks!
I’m in the UK, and happy to set up a call. I can’t work out how to send you a private message though :joy:

Otherwise I don’t mind answering questions here in case they can help others in the future.

@Dennis poke :slight_smile:

Hi @carlbm, happy to get on a call and discuss. Can you email me (sean.farmar at particular.net) please?

cheers,

Sean

We discussed the pros and cons of having no sagas, one saga, or many sagas to coordinate this process. We also discussed the pros and cons of different persisters and the Outbox feature in NServiceBus.

Several design possibilities were discussed and @carlbm is investigating several options. As always, if you have any additional questions, don’t hesitate to ask. Depending on the question, we’ll discuss it here or on a call again.

After a few PoCs I eventually decided to go with:

  • A saga to control everything. The saga is notified when any changes are made to the portal. There’s one saga per client
  • A few handlers to process the work (create x email NSB messages, send an email to an SMTP server, etc)
  • A SQL Server table to hold the ‘data’, in this case the message ids. It is accessed as if it were a queue

When the user uploads a bunch of emails, they are stored in an append-only table in the SQL database, each of their ids is added to the SQL queue table, and then a message is sent to the saga with the number added.

The saga has all the data and the logic to calculate how many to send per minute, along with the starting and stopping times etc., which are updated when a change is made at the portal. There is a timer that the saga responds to by sending out a message saying "send x emails for client y from the queue".
The worker process will then do what the saga asked, and reply with the actual number sent.
Additions and deletions are handled by adjusting the saga data, which dynamically changes the number sent per minute. Any deleted emails are also removed from the queue table, and the saga is informed of how many were removed.
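The saga's bookkeeping described above can be sketched as follows. This is plain Python, not the NServiceBus saga API, and the class, method, and field names are hypothetical. For brevity it only models the fixed number-per-minute mode, and it assumes the saga tracks counts rather than individual email ids.

```python
from dataclasses import dataclass


@dataclass
class ClientSendState:
    """Illustrative stand-in for the per-client saga data."""
    client_id: str
    per_minute: int      # target rate, changed from the portal
    remaining: int = 0   # ids the saga believes are in the queue table

    def on_emails_added(self, count):
        self.remaining += count

    def on_emails_removed(self, count):
        self.remaining = max(0, self.remaining - count)

    def on_rate_changed(self, per_minute):
        self.per_minute = per_minute

    def on_timer(self):
        """Return a 'send x emails for client y' command, or None."""
        count = min(self.remaining, self.per_minute)
        if count == 0:
            return None
        return {"client_id": self.client_id, "count": count}

    def on_batch_sent(self, actual_sent):
        # The worker replies with what it really sent, which may be
        # fewer than requested if the queue held fewer rows.
        self.remaining = max(0, self.remaining - actual_sent)
```

Decrementing only when the worker reports the actual number sent keeps the saga's total in line with the queue table, which remains the source of truth for which emails are still pending.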

In terms of the outbox or other mechanisms to stop an NSB message being processed multiple times, the decision was taken that multiple sends are okay after a server restart during periods of heavy sending, but they should be avoided elsewhere. The endpoint that actually sends the messages does not have an outbox configured, and we’ve only detected a handful of duplicated messages out of millions, so all-in-all acceptable.

So far the system as a whole has worked well, with the saga doing what it needs to do, and the SQL queue table working fine too. SQL as a queue gave more flexibility than something like Azure Storage Queues, as we sometimes need to delete from the queue. It also allowed bigger batches (Storage Queues are limited to 32 messages per batch) and more flexibility around the dequeue mechanism. There was more work, but not much more than the infrastructure code for Storage Queues.
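For anyone wondering what "a SQL table as a queue" might look like, here is a minimal sketch. It uses Python with an in-memory SQLite database purely as a stand-in for SQL Server, and the table, column, and function names are hypothetical. The thread does not show the actual schema or dequeue statement, and a real SQL Server version with several competing workers would need appropriate locking.

```python
import sqlite3


def create_queue(conn):
    conn.execute(
        "CREATE TABLE email_queue ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " client_id TEXT NOT NULL,"
        " email_id INTEGER NOT NULL)"
    )


def enqueue(conn, client_id, email_ids):
    """Add ids to the queue; return the number added (reported to the saga)."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO email_queue (client_id, email_id) VALUES (?, ?)",
            [(client_id, email_id) for email_id in email_ids],
        )
    return len(email_ids)


def dequeue(conn, client_id, count):
    """Take up to `count` ids for a client, oldest first."""
    with conn:
        rows = conn.execute(
            "SELECT seq, email_id FROM email_queue"
            " WHERE client_id = ? ORDER BY seq LIMIT ?",
            (client_id, count),
        ).fetchall()
        conn.executemany(
            "DELETE FROM email_queue WHERE seq = ?",
            [(seq,) for seq, _ in rows],
        )
    return [email_id for _, email_id in rows]


def remove(conn, client_id, email_ids):
    """Delete specific ids; return how many were actually still queued."""
    removed = 0
    with conn:
        for email_id in email_ids:
            cursor = conn.execute(
                "DELETE FROM email_queue WHERE client_id = ? AND email_id = ?",
                (client_id, email_id),
            )
            removed += cursor.rowcount
    return removed
```

The batch size passed to `dequeue` is not capped by the queue technology, and `remove` returns the count of rows really deleted, which is the number the saga would be told about.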

I’m happy to explain further should anyone be interested :slight_smile: