Batching multiple events


(Brian Bak Laursen) #1

Hi guys,

We’ve been using NServiceBus in the past months and things are looking good.

One thing that we have come across and want to know more about is whether there is a built-in way of batching multiple events?

Essentially, we want to grab the next 10 or so events and do what is required with them.

Any suggestions on how to use something already built into NServiceBus to achieve this?

Best regards


(Indu Alagarsamy) #2

Hi Brian,

What transport are you currently using? And what version?

Thanks,
Indu Alagarsamy
Particular Software


(Brian Bak Laursen) #3

Hi Indu,

Apologies for not bringing all information to the table.

NServiceBus version: 7.13
Transport: RabbitMQ

Best regards,
Brian


(Daniel Marbach) #4

Hi Brian,

Can you clarify what you mean by batching? Are you talking about batching on the receiver side? I’m guessing that based on what you wrote:

we want to grab the next 10 or something…

Are you using an event-sourced approach, such that the events you are publishing on the bus are highly coupled to each other and you’d like to batch them together on the receiver side to avoid out-of-order processing? Or what is the driver for this question?

Thanks

Regards
Daniel


(Brian Bak Laursen) #5

Hi Daniel,

It is exactly on the receiver side.

The driver is simply that, for us, it can make sense from a performance point of view to batch multiple events into one database update transaction.

The coupling is not high and the events are published individually, but they can be applied with one statement on the receiver side. Let us say that we have 100 events in the queue; in some instances it could make sense for our handlers to just take those 100 events and process them in the same transaction.

Does this make sense?


(Daniel Marbach) #6

Hi Brian

It does make sense, yes. Given that independent events are published to potentially multiple subscribers, and each subscriber has its own ideas about where and how to store them, I’m not sure that looking at it from a performance point of view is a good idea. What is the harm in storing them one by one?

The problem is that once you start introducing something that batches things, either in memory or in storage, the solution becomes complex and error-prone. In the worst case you might even expose yourself to message loss scenarios, because the batching solution needs to get the messages, buffer them (and thus acknowledge the individual transport transactions), and only later store them. There are also quite a few scenarios that need to be taken into account. For example, can you really assume a continuous stream of events coming in? Your buffering would need to be both time-window based and item-count based. Then there are other factors: what previously were individual transactions that could fail independently are suddenly bundled into one storage transaction. How would you retry that? You end up creating an in-memory transport and reimplementing parts of the queuing system, including the failure handling.
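To make the buffering concern concrete, here is a minimal sketch (plain Python, not NServiceBus API; all names such as `EventBatcher`, `flush`, `max_size`, and `max_wait` are hypothetical) of the count-plus-time-window batcher Daniel describes. Note the two weak spots called out in the comments: the batcher needs a periodic `tick()` to flush stale partial batches when the stream pauses, and once messages are acknowledged to the transport, a failure inside `flush` loses the whole batch.

```python
import time


class EventBatcher:
    """Illustrative sketch only: buffers events and flushes when either
    max_size events have accumulated or max_wait seconds have passed
    since the first buffered event."""

    def __init__(self, flush, max_size=10, max_wait=1.0, clock=time.monotonic):
        self.flush = flush          # callback that writes one batch in one DB transaction
        self.max_size = max_size
        self.max_wait = max_wait
        self.clock = clock
        self.buffer = []
        self.first_at = None        # timestamp of the oldest buffered event

    def add(self, event):
        if not self.buffer:
            self.first_at = self.clock()
        self.buffer.append(event)
        self._maybe_flush()

    def tick(self):
        # Must be called periodically: without it, a partial batch would
        # sit in memory forever if the event stream simply stops.
        self._maybe_flush()

    def _maybe_flush(self):
        if not self.buffer:
            return
        full = len(self.buffer) >= self.max_size
        stale = self.clock() - self.first_at >= self.max_wait
        if full or stale:
            batch, self.buffer, self.first_at = self.buffer, [], None
            # Danger zone: if the messages were already acknowledged to the
            # transport and flush() raises here, the whole batch is lost.
            self.flush(batch)
```

Even this toy version has to juggle two triggers and a failure window; a production version would additionally need retry, ordering, and shutdown handling, which is exactly the "reimplementing parts of the queuing system" trap.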

That being said, if you have proof that you really need this, you could use a saga to write the state against and then batch the events together within the saga. The saga then sends out a local command that writes everything into the database. With that you could batch and still have the possibility to retry the inserts. But every saga update would still mean a storage update, so I’m not sure how much you’d gain unless that specific endpoint used a more performant persister.
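The shape of that saga-based approach can be sketched generically (again plain Python, not NServiceBus code; `BatchingSaga`, `send_local`, and the `WriteBatch` command are all made-up names for illustration): each event updates durable saga state, and once the batch is complete, one local command carries the whole batch to a handler that does the single database write.

```python
class BatchingSaga:
    """Illustrative sketch only: accumulate events in saga state, then
    dispatch one local command with the whole batch. A dict stands in
    for the saga persister here."""

    def __init__(self, send_local, batch_size=10):
        self.send_local = send_local   # stands in for sending a local command
        self.batch_size = batch_size
        self.state = {}                # correlation id -> accumulated events

    def handle(self, correlation_id, event):
        items = self.state.setdefault(correlation_id, [])
        items.append(event)            # note: each handle() is still one storage update
        if len(items) >= self.batch_size:
            batch = self.state.pop(correlation_id)
            self.send_local({"type": "WriteBatch", "items": batch})
```

The upside is that the `WriteBatch` command goes through the normal message pipeline, so a failed bulk insert is retried by the transport rather than by hand-rolled code; the downside, as noted above, is that every incoming event still costs a saga-state storage update.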

Can you elaborate a bit more on what drives you towards batching on the receiver side?

Regards

Daniel