Publishing a high volume of messages per second using the RabbitMQ transport


#1

I am using
NServiceBus 7.1.4
NServiceBus.RabbitMQ 5.0.1

I am trying to publish, or send, a high volume of messages from outside of a message handler, approximately 3,000 messages/s. I’m having trouble publishing this kind of volume using NSB.

The messages are events coming off of a device. Consumer(s) will subscribe to the events and push the information into persistent storage where it can later be aggregated. Batching the messages into groups of 100-1000 per publish doesn’t seem reliable, since any messages still held in memory would be lost if the publisher failed before sending them.

After the first 20-30 messages are published, there is a LONG delay, and upwards of 20K message publish tasks are in-progress before they begin to finish. It takes upwards of 20s for some of the publish tasks to complete.

I tried using RequireImmediateDispatch to improve the overall performance, but it hasn’t helped.

There is a page regarding how to performance tune integration with ASB, but I see nothing like that for RabbitMQ.

Does anyone have any ideas or suggestions on how to improve the publish/send performance to handle this volume?


(Brandon Ording) #2

I would think that you should be able to achieve that sort of sustained message rate, so it sounds like something might be wrong.

Here are a few things that would be helpful to know:

  • What version of RabbitMQ and Erlang are you running?
  • What are the hardware specs of the broker machine? (CPU, RAM, Disk IO)
  • What is the speed of the network being used to talk to the broker?
  • Is the broker clustered?
  • What size are the messages you’re trying to send?

While RabbitMQ calls everything a “publish”, from an NServiceBus perspective, are you sending a command, or publishing an event? If it is an event, how many subscribers are there, and if the broker is clustered, how are those subscribers distributed across the cluster?

Also, can you provide a sample of the code you’re using to send the messages? If it was possible to provide a fully working example, that would be even better!


#3

Thanks for the reply Brandon.

This was using Erlang 20.1 and RabbitMQ 3.6.9.
The broker machine has 6 CPUs, 12 GB of RAM, and a 50 GB SSD for RabbitMQ.
Network speed is 100 GB.
The broker is not clustered.
Message sizes are pretty small, under 1 KB.
I’m using NSB Publish for an event with a single subscriber.
Note: The consumer has 2 instances, so this results in NSB creating 3 RabbitMQ queues.

I identified several causes for the issues:

  1. Each publish to the queue was started with Task.Run(). This was to quickly offload the messages from the primary processing loop, as waiting 8-15 ms for each publish would have caused the consuming loop to fall too far behind. This resulted in NSB trying to use too many RabbitMQ connections at once and led to contention.

  2. If the consumer fell behind on its processing, its queue would grow. If the queue grew too long, RabbitMQ would start flushing messages to disk, which would then cause the publisher to slow down and begin to experience data loss.
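
For illustration, the first cause (an unbounded number of in-flight publish tasks) can be avoided without blocking the processing loop by handing messages to a bounded queue that a small, fixed number of workers drain. The sketch below is in Python rather than against the NServiceBus API, and `publish`, `publisher_worker`, and `run_pipeline` are hypothetical names; it only shows the shape of the idea, not a tuned implementation.

```python
import asyncio


async def publish(message, sent):
    """Hypothetical stand-in for the real asynchronous publish call."""
    await asyncio.sleep(0)  # simulate yielding while the broker round trip completes
    sent.append(message)


async def publisher_worker(queue, sent):
    """Drain the queue until a None sentinel arrives."""
    while True:
        message = await queue.get()
        if message is None:
            return
        await publish(message, sent)


async def run_pipeline(messages, capacity=1000, workers=1):
    """Feed messages through a bounded queue to a fixed number of workers."""
    queue = asyncio.Queue(maxsize=capacity)
    sent = []
    tasks = [asyncio.create_task(publisher_worker(queue, sent)) for _ in range(workers)]

    for message in messages:
        # put() only waits when the queue is full, which applies backpressure
        # instead of letting tens of thousands of publish tasks pile up.
        await queue.put(message)

    # One sentinel per worker so that every worker shuts down.
    for _ in tasks:
        await queue.put(None)
    await asyncio.gather(*tasks)
    return sent
```

With `workers=1` the messages are published in arrival order; with more workers, at most that many publishes are in flight at once. In .NET, a bounded channel or a Dataflow block with a bounded capacity can play the same role as the queue here.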

I decided to work around the problem.

Publishing individual items when rates can exceed 7K/s would have required throwing too many resources at the problem. I added a TPL Dataflow component that sends out messages in batches of 100, or every 1 second, whichever comes first. It is now publishing 60-70 messages per second, each with 100 data items. Significantly fewer resources are needed, and the potential for data loss is acceptable.
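
The "batch of 100, or every 1 second, whichever comes first" rule can be sketched roughly as follows. This is a Python illustration and not the actual TPL Dataflow component; the `Batcher` class, its parameters, and the choice to measure the 1 second from the first item in the batch are all assumptions made for the example.

```python
import time


class Batcher:
    """Collect items and flush them when the batch is full or too old."""

    def __init__(self, flush, max_items=100, max_age=1.0, clock=time.monotonic):
        self._flush = flush          # callback that receives a list of items
        self._max_items = max_items
        self._max_age = max_age      # seconds, measured from the first item in the batch
        self._clock = clock          # injectable so the timing can be tested
        self._items = []
        self._first_at = None

    def add(self, item):
        if not self._items:
            self._first_at = self._clock()
        self._items.append(item)
        if len(self._items) >= self._max_items:
            self.flush()

    def poll(self):
        """Call periodically (for example from a timer) to flush an aged partial batch."""
        if self._items and self._clock() - self._first_at >= self._max_age:
            self.flush()

    def flush(self):
        if self._items:
            batch, self._items = self._items, []
            self._flush(batch)
```

Each flushed batch would then be published as a single message. At 6,000-7,000 items per second and 100 items per batch, that works out to the 60-70 published messages per second mentioned above. In TPL Dataflow, a BatchBlock combined with a timer that calls TriggerBatch is one common way to get similar behavior.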