I’m currently analyzing an issue with an endpoint that does not handle messages in our test environment. (It’s a new product, so there is no production environment yet.)
The endpoint easily handles a bunch of messages on the dev machine.
The scenario I set up to analyze this is:
- I put 50 messages of the same type into the queue. Each message starts a saga, which leads to about 10 messages in total per saga.
- I start the endpoint.
- After about a minute or two, all the sagas are completed (including talking to 3rd-party systems).
If I put only a few (3-5) of the same messages into the queue in the test environment, the endpoint will handle maybe 1 message, or none. The queue then stays at its initial size forever. Restarting the endpoint might allow it to handle 1 more message before it stops handling messages again.
I have turned on the Debug logging for NServiceBus.
What I get is mostly
DEBUG NServiceBus.Transport.SQLServer.DelayedMessageHandler [(null)] Scheduling next attempt to move matured delayed messages to input queue in 00:00:01
from time to time
INFO NServiceBus.RecoverabilityExecutor [(null)] Immediate Retry is going to retry message '4328ea74-4816-43fc-b4c8-aa8800018149' because of an exception:
but that’s all. I would expect to at least see some logs of NServiceBus invoking message handlers. It appears that the endpoint is unable to take any messages from the queue.
Other endpoints appear to be doing fine, though they mostly have only simple message handlers and no sagas.
We’re using the SQL Server transport. Queue tables are created exactly the same way using a parameterized script (the same script for all environments).
We also have the Heartbeat and Monitoring plugins installed, and they work fine: I can see heartbeats, queue sizes, etc. in ServicePulse on the test environment.
I’m running out of ideas on how to analyze this issue.
What else can I do to find the cause and resolve the issue? What additional information should I provide to help point me in the right direction?
Any help is really appreciated; this issue is driving me nuts.