NServiceBus.RabbitMQ resiliency 5 years later

JoeShook · October 16, 2020, 11:09pm

I know that back in 2015 the NServiceBus.RabbitMQ package used to allow multiple hosts in the connection string (see "NServiceBus.RabbitMQ 3.0.0 - Major Release Available"). Then it was removed, and at the time it seemed like the right thing to do. We have lived for some time talking to a 3-node RabbitMQ HA cluster, where we deploy essentially three environments, each configured to connect to one of the three nodes.
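To make the idea of "multiple hosts in the connection string" concrete, here is a minimal, generic sketch of client-side failover across a host list. It is not NServiceBus or RabbitMQ client code: `connect_with_failover`, `AllHostsFailed`, `fake_connect`, and the `rabbit-1`..`rabbit-3` host names are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllHostsFailed(Exception):
    """Raised when no host in the list accepted a connection."""


def connect_with_failover(
    hosts: Sequence[str], connect: Callable[[str], T]
) -> Tuple[str, T]:
    """Try each host in order and return the first successful connection.

    `connect` is any callable that opens a connection to one host and
    raises ConnectionError on failure (an assumption for this sketch).
    """
    errors: Dict[str, Exception] = {}
    for host in hosts:
        try:
            return host, connect(host)
        except ConnectionError as exc:
            # Remember why this host failed, then move on to the next one.
            errors[host] = exc
    raise AllHostsFailed("could not connect to any of: " + ", ".join(errors))


def fake_connect(host: str) -> str:
    """Stand-in for a real client: pretend the first node left the cluster."""
    if host == "rabbit-1":
        raise ConnectionError("node is not reachable")
    return f"connection to {host}"


host, conn = connect_with_failover(
    ["rabbit-1", "rabbit-2", "rabbit-3"], fake_connect
)
```

With a list like this, losing one node means the client moves to the next host instead of giving up, which is the behavior a load balancer in front of the cluster would otherwise provide.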

We have had a few incidents where a RabbitMQ node fails to stay in the cluster, and then our endpoints connected to that node circuit break and shut down. This is still on NSB 5.0. Back in the day we did not have a budget for load balancers (and did not try a software-based load balancer). We were pretty reliable and things worked pretty well. I asked Udi this question at a conference 5 years ago, and he suggested engaging Pivotal, which was another thing not in the budget. Actually, I also took his course and asked again, hoping for a different answer. Roll the clock forward a lot of years, and we now need that reliability and resiliency, and we have the budget.

But I still have the question of why this feature could not return. Today there are other ways of achieving resiliency when an endpoint becomes unavailable, such as using Consul for discovering microservices. What are the thoughts around this today, 5 years later?

I looked through past topics for this type of question about resiliency in the absence of a load balancer like an F5 and didn't find much. One of the interesting topics was about a HashiCorp Vault rolling credential change resulting in RabbitMQ credential changes. The guidance seemed to be to let the endpoint respond to the critical error by shutting down, and have an external process restart it. I guess that is a solution, but it feels clunky compared to how NSB typically behaves. Is this accurate guidance?
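The "shut down on critical error and let something external restart it" approach can be sketched generically as a small supervisor loop with exponential backoff. This is only an illustration, not NServiceBus code: `CriticalError`, `run_with_restart`, `flaky_endpoint`, and the retry limits are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class CriticalError(Exception):
    """Stand-in for a failure the endpoint cannot recover from by itself."""


def run_with_restart(
    start_endpoint: Callable[[], T],
    max_restarts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run the endpoint, restarting it after each critical error.

    The wait doubles after every failure; once `max_restarts` is
    exceeded, the last error is re-raised to the caller.
    """
    attempt = 0
    while True:
        try:
            return start_endpoint()
        except CriticalError:
            attempt += 1
            if attempt > max_restarts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Demo: an endpoint that fails twice (e.g. stale credentials), then runs.
calls = {"count": 0}


def flaky_endpoint() -> str:
    calls["count"] += 1
    if calls["count"] <= 2:
        raise CriticalError("broker connection lost")
    return "stopped cleanly"


delays: List[float] = []  # record the waits instead of really sleeping
result = run_with_restart(flaky_endpoint, sleep=delays.append)
```

In practice this role is usually played by the host itself (a Windows service recovery policy, systemd, or a container orchestrator) rather than by code inside the process.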

I have at least one type of send-only endpoint that is not hosted conventionally and just doesn't play well with the shutdown-and-restart scenario.

BTW we are moving to NSB 7. I have a prototype of the whole stack moved to NSB 7 so the move to async is well known.

bording · October 27, 2020, 6:58pm

Hi @JoeShook,

We do have the issue "Support multiple RabbitMQ node hostnames" (Issue #525 in Particular/NServiceBus.RabbitMQ on GitHub) open, which is about re-adding support for specifying multiple host names in some way, so it's definitely something we could consider adding as part of a future enhancement release of the transport.