Fair distribution of workload among tenants

hbulens · May 5, 2023, 4:13pm

I have a multi-tenant application, which inevitably experiences the noisy neighbor problem. When one tenant processes something in bulk, others will most definitely feel that with delayed execution times.

Is there a way to have NServiceBus dispatch messages evenly across the tenants that have data in the queues? And secondly, is there a way to process the messages for each tenant sequentially? All available threads may be used, but no two messages from the same tenant should be processed at the same time.
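To make the second requirement concrete, here is a minimal sketch in plain Python with asyncio. It is not NServiceBus API; `run`, `handle`, and the `workers` parameter are hypothetical names used only for illustration. Each tenant gets its own lock, so all worker slots can be busy while at most one message per tenant is in flight.

```python
import asyncio
from collections import defaultdict


async def run(messages, workers=4):
    """Process (tenant, payload) pairs with at most one in flight per tenant.

    Returns the highest number of simultaneously processed messages
    observed for each tenant.
    """
    locks = defaultdict(asyncio.Lock)   # one lock per tenant
    limit = asyncio.Semaphore(workers)  # overall concurrency limit
    in_flight = defaultdict(int)
    max_seen = defaultdict(int)

    async def handle(tenant, payload):
        # Take the tenant lock first so that a waiting tenant does not
        # occupy a worker slot while it is blocked.
        async with locks[tenant]:
            async with limit:
                in_flight[tenant] += 1
                max_seen[tenant] = max(max_seen[tenant], in_flight[tenant])
                await asyncio.sleep(0)  # stand-in for the real handler work
                in_flight[tenant] -= 1

    await asyncio.gather(*(handle(t, p) for t, p in messages))
    return dict(max_seen)


if __name__ == "__main__":
    bulk = [("A", i) for i in range(5)] + [("B", 0)]
    print(asyncio.run(run(bulk)))
```

The sketch only shows the concurrency constraint. It says nothing about how messages are fetched from a broker.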

Sagas have previously been suggested, but I am not quite sure this is going to work for me since there is no correlation between operations. The only common denominator is the tenant ID, but the messages themselves don’t have a particular order.

This is pretty much the same topic as written down beautifully here:

I should dig deeper into the subject, but I believe Azure Service Bus sessions could provide a solution. IIRC, though, this is not supported in NServiceBus.

SeanFeldman · May 6, 2023, 4:32pm

github.com/Particular/NServiceBus.Transport.AzureServiceBus

Add support for message sessions

opened 02:06PM - 22 Apr 22 UTC

SeanFeldman

**Is your feature request related to a problem? Please describe.** For a long time Particular has shied away from ordered message delivery for historical reasons. The originally supported MSMQ transport did not provide guaranteed ordered delivery and the rest of the transports were molded into the assumption that ordered message processing is either not something that should be done, or accomplished using the saga feature. More modern technologies such as Azure Service Bus are capable of handling messages in the order those were sent, providing a feature called [Message Sessions](https://docs.microsoft.com/en-us/azure/service-bus-messaging/message-sessions).

**Describe the solution you'd like** Add support for message sessions to allow endpoints that require ordered message processing at scale to do the job without resorting to more complex solutions such as sagas and state persistence.

**Additional context** There are domains where message processing in the order those were sent is critical. Even the [pizza](https://particular.net/blog/you-dont-need-ordered-delivery) preparation process.

hbulens · May 7, 2023, 11:38am

Thanks for the pointer. Upvoted the issue. Have you ever managed to get this working?

Whenever I run NSB with session-enabled subscriptions and queues, I get this error message:

“It is not possible for an entity that requires sessions to create a non-sessionful message receiver.”

SzymonPobiega · May 8, 2023, 10:02am

Hi

Thanks for reaching out. This is a very interesting angle on the session feature. Up until now the only use case we were aware of was sending messages in order. We considered that specific use case an “anti-requirement” in the sense that the order of message processing is never guaranteed in any multi-threaded system with non-blocking failure handling.

I’ll investigate what it would take to support fairness in a multi-tenant system using transport-specific mechanisms such as sessions and get back to you.

danielmarbach · May 8, 2023, 1:53pm

Hi Hendrik,

Thanks for raising this interesting question. Based on my understanding, we are looking at two constraints:

Fairness between tenants
Ordering of messages within a tenant

Let’s look at those individually for a moment.

Fairness

With a single queue and scaled-out consumers, fairness is difficult to achieve: a busy tenant can put far more messages into the queue than the others, and everyone else's messages have to wait behind them. As a result, the critical time of a tenant's messages can fluctuate tremendously depending on how many noisy neighbors share the queue.
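A small simulation makes the effect visible. This is only a sketch in plain Python, not NServiceBus or Azure Service Bus API; `fifo_order` and `round_robin_order` are hypothetical helpers. It compares where a quiet tenant's message ends up when a shared queue is drained in arrival order versus when the consumer rotates across tenants that have pending work.

```python
from collections import OrderedDict, deque


def fifo_order(messages):
    """Dispatch order of a single shared queue: strictly arrival order."""
    return list(messages)


def round_robin_order(messages):
    """Dispatch order when the consumer rotates across tenants with pending work."""
    per_tenant = OrderedDict()
    for tenant, payload in messages:
        per_tenant.setdefault(tenant, deque()).append((tenant, payload))

    order = []
    while per_tenant:
        # Take one message from every tenant that still has work, then repeat.
        for tenant in list(per_tenant):
            queue = per_tenant[tenant]
            order.append(queue.popleft())
            if not queue:
                del per_tenant[tenant]
    return order


if __name__ == "__main__":
    # Tenant A enqueues a bulk of five messages just before tenant B sends one.
    arrivals = [("A", i) for i in range(5)] + [("B", 0)]
    print("FIFO position of B:", fifo_order(arrivals).index(("B", 0)))
    print("Round-robin position of B:", round_robin_order(arrivals).index(("B", 0)))
```

In arrival order, B's single message sits behind the entire bulk. With rotation it is dispatched right after A's first message, while A's messages keep their relative order.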

At first glance message sessions look like a good fit for achieving fairness, but based on my experience with sessions I wonder whether that is really the case. A session processor has to be configured with an appropriate session idle timeout, a maximum number of concurrent sessions, and a maximum concurrency per session. The idle timeout sets the maximum amount of time to wait for a message to arrive on a currently active session; once it has elapsed, the processor closes the session and attempts to process another one. This means a busy session ID can end up being processed continuously, and you would need a lot of fine-tuning in alignment with your tenants to achieve “fairness” even with sessions.
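The behavior described above can be approximated with a deliberately simplified model. This is a sketch in plain Python, not the Azure SDK: `process_with_sessions` is a hypothetical function, concurrency per session is fixed at one, and "the idle timeout elapsed" is modeled as "the session has no messages left".

```python
from collections import deque


def process_with_sessions(pending, max_concurrent_sessions=1):
    """Return the processing order as a list of (session_id, payload).

    pending maps session_id -> list of payloads, in arrival order.
    Each round, every held session processes one message. A session is
    released only when it is empty, which stands in for the idle timeout.
    """
    queues = {sid: deque(msgs) for sid, msgs in pending.items() if msgs}
    waiting = deque(queues)  # sessions not currently held
    held = []
    order = []
    while waiting or held:
        # Accept new sessions while there are free slots.
        while waiting and len(held) < max_concurrent_sessions:
            held.append(waiting.popleft())
        # One message per held session per round.
        for sid in list(held):
            order.append((sid, queues[sid].popleft()))
            if not queues[sid]:
                held.remove(sid)
    return order


if __name__ == "__main__":
    pending = {"busy": [0, 1, 2, 3, 4], "quiet": [0]}
    print(process_with_sessions(pending, max_concurrent_sessions=1))
    print(process_with_sessions(pending, max_concurrent_sessions=2))
```

With one session slot, the quiet tenant waits until the busy session is fully drained. With two slots it is served in the first round. In this model, fairness depends on how the number of concurrent sessions relates to the number of active tenants, which is the kind of tuning mentioned above.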

In my opinion, the message session feature in Azure Service Bus is designed to achieve ordering within a session, not “fairness”. You would be using a feature that might not be fit for purpose, which can lead to other issues down the line.

Order

When it comes to strict ordering, we have found that many cases that seem to require ordering don’t actually need it on closer inspection. Dennis and David wrote an excellent blog post about this titled “You don’t need ordered delivery”. Usually, when we look at those cases more closely and model them with sagas, as you mentioned, we get better tradeoffs than by relying on technical solutions like message sessions. It is true, though, that not all use cases are the same, and message sessions have their value when ordering is really required. Since message sessions are the recommended approach with Azure Service Bus when ordering is required, it might be nice to provide an API that lets our users rely on them.

Sessions in general

When you look at the session feature, it becomes clear that enabling sessions has consequences for the producer of the messages as well as the consumer.

When sessions are enabled on a queue or a subscription, the client applications can no longer send/receive regular messages. All messages must be sent as part of a session (by setting the session id) and received by accepting the session.

This means there is tighter coupling between the producer and the consumer. For me, that means I would only ever apply this pattern to producers and consumers within the same service boundary, which also suggests using it only for commands. Commands already involve a certain coupling by definition (the sender has to know where to send the intent), so I do not think it is necessarily more involved to send the command to a tenant-specific queue/endpoint once you ask yourself the question of multi-tenancy vs. multi-single-tenancy.

Multi tenancy vs multi-single tenancy

When we look at the problem of fairness, I wonder whether this is a question of choosing the “right” tenancy structure. Gregor Hohpe wrote an excellent blog post about this on LinkedIn, in which he talks about multi-single-tenancy (“efficient single family homes”). With such an approach, you would have tenant-specific queues and gain numerous benefits like:

Dedicated resources per tenant that can be right sized depending on the “tier” of the customer. For example, compute and memory, as well as concurrency, can be controlled by the tenant tier.
Metrics, telemetry etc. are clearly separated by tenant, which gives you greater insight into what is going on within a single tenant than you can get with sessions.
It is possible to do tenant-specific scaling of consumers. With a single queue, you can only do competing consumers on the resources of that queue.
Quotas of the queue infrastructure can be aligned with the allowed quotas of the tier and closely monitored per tenant.
Onboarding and offboarding tenants can be fully automated with the infrastructure deployment. When a tenant is offboarded, all the tenant-specific data can either be removed or it is clear how long the infrastructure needs to be kept around until all the tenant-specific data is processed because you get the necessary queue statistics to even answer this question.
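As a rough illustration of the first and third points, tenant-specific queues could be derived from the tenant ID and sized by tier. This is a sketch in plain Python under assumptions: the tier names, the limits, the queue naming scheme, and the helpers `tenant_queue` and `plan` are all hypothetical and not part of NServiceBus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierSettings:
    max_concurrency: int  # messages processed in parallel per consumer
    instances: int        # number of consumer instances for the tenant


# Hypothetical tiers and limits, chosen only for the example.
TIERS = {
    "basic": TierSettings(max_concurrency=1, instances=1),
    "premium": TierSettings(max_concurrency=8, instances=2),
}


def tenant_queue(endpoint, tenant_id):
    """Name of the tenant-specific queue for a logical endpoint (assumed scheme)."""
    return f"{endpoint}.{tenant_id}".lower()


def plan(endpoint, tenants):
    """Map every tenant-specific queue to the settings of the tenant's tier."""
    return {tenant_queue(endpoint, t): TIERS[tier] for t, tier in tenants.items()}


if __name__ == "__main__":
    for queue, settings in plan("Billing", {"Contoso": "premium", "Fabrikam": "basic"}).items():
        print(queue, settings)
```

A plan like this could feed an infrastructure deployment, so that onboarding a tenant amounts to adding one entry.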

I’m aware that there are operational overheads in terms of compute, storage and memory compared to multi-tenancy, but these are tradeoffs that might be worthwhile exploring further given the benefits you get with multi-single-tenancy.

Regards,
Daniel

remyvd · May 10, 2023, 9:10am

I’m interested in this topic.

I also have a multi-tenant application and experience the noisy neighbor problem. I'm looking for a solution that adds fairness.

In my case I often see it happen when a customer is doing a bulk or initialization action. This customer knows he is doing a bulk action and has no problem with a longer critical time for his messages. But his bulk action is affecting other customers.

The ‘sharded queue’ is an interesting concept/pattern, and it would be great if it could be implemented in some way. For me it’s still the same logical endpoint, but with fairness added to the order of messages based on some attribute/header.
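One possible reading of a sharded queue, sketched in plain Python: the logical endpoint is backed by a fixed number of physical queues, and the tenant ID from a header decides which one a message goes to. The shard count, the naming scheme, and the helpers `shard_for` and `shard_queue` are assumptions for illustration only.

```python
import hashlib


def shard_for(tenant_id, shard_count):
    """Map a tenant ID to a shard index that is stable across processes.

    A cryptographic digest is used because Python's built-in hash() for
    strings is randomized per process.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_queue(endpoint, tenant_id, shard_count=4):
    """Physical queue name for a tenant within one logical endpoint (assumed scheme)."""
    return f"{endpoint}-shard-{shard_for(tenant_id, shard_count)}"


if __name__ == "__main__":
    for tenant in ["contoso", "fabrikam", "northwind"]:
        print(tenant, "->", shard_queue("orders", tenant))
```

With this scheme a bulk action only delays tenants that map to the same shard, and a consumer that takes messages from the shards in turn would add fairness between shards.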

The sessions feature in ASB doesn’t feel like a good fit for this.

I don’t know if the partitioning feature in Azure Service Bus could fix this. That seems to be more of an implementation-level feature.