Possible bug relating to synchronous handlers returning Task?

I am seeing issues with messages staying put on the input queue, despite the endpoint host being alive and otherwise sending and receiving messages.

This was curious, so I had a peek at the handler, and it is implemented along the following lines:

public SomeDbContext Db { get; set; }
public INotificationConnection NotificationConnection { get; set; }

public Task Handle(SomeMessage message, IMessageHandlerContext context)
{
    return HelperMethod(message.FieldA, message.FieldB);
}

private async Task HelperMethod(int x, decimal y)
{
    var data = await Db.SomeExtensionMethod(x, y).ConfigureAwait(false);
    if (!data.Any()) return;

    await NotificationConnection.Send(x, data).ConfigureAwait(false);
}

I believe this should be all right? As long as the NServiceBus code that invokes the handler awaits the task returned from Handle at some point before it ends the transaction (commit or rollback), I don't see any reason this should not work.

But I do see those messages building up on the queue. I really have no clue whether it actually has any relation to the non-async method that returns a Task, but this is the only thing about the situation that I find at all noteworthy. I hope NSB does not reflect on the assembly to see whether the handler is an async method or not, because as far as I can understand that would be incorrect; what matters is whether it returns a Task.
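To illustrate the point (a minimal standalone sketch with hypothetical names, not NServiceBus code): a caller that awaits the returned Task cannot tell a non-async forwarding method apart from an async one.

```csharp
using System;
using System.Threading.Tasks;

class Demo
{
    // Non-async method that simply forwards the Task from an async helper,
    // just like the Handle method above forwards HelperMethod's Task.
    static Task Handle() => HelperAsync();

    private static async Task HelperAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        Console.WriteLine("helper completed");
    }

    static async Task Main()
    {
        // Awaiting the returned Task behaves the same whether or not
        // Handle itself is marked async: the await completes only once
        // the whole chain of work inside the helper is done.
        await Handle();
        Console.WriteLine("caller observed completion");
    }
}
```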

In any case, this looks like a bug, even if it is unrelated to async. Messages are not supposed to be left on the queue forever while the endpoint is alive and kicking, and if we have done something wrong, we should have gotten an exception somewhere along the way…? (Obviously this may be a tall order in some cases, but it would definitely be preferable to fail hard every time so we'd catch it in testing, instead of discovering such things in production.)

Hi @Dag_Oystein_Johansen

I doubt it is related to whether the method is async. NServiceBus takes care of knowing when the handlers are done and then marks the message as consumed or not, depending on the outcome of the handler. To me it sounds more like something is hanging inside the handler, so the endpoint's concurrency is "eaten up", which makes all other messages in the queue wait until the in-flight handlers complete.
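A simplified standalone sketch of that effect (NServiceBus's internals differ; this just models a pipeline with a concurrency limit of two using a semaphore):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ConcurrencyDemo
{
    // At most two "handlers" may run concurrently.
    static readonly SemaphoreSlim Limit = new SemaphoreSlim(2);

    static async Task Process(int messageId, Func<Task> handler)
    {
        await Limit.WaitAsync();
        try
        {
            Console.WriteLine($"message {messageId} started");
            await handler();
            Console.WriteLine($"message {messageId} done");
        }
        finally
        {
            Limit.Release();
        }
    }

    static async Task Main()
    {
        // Two handlers that never complete consume the whole limit...
        var hung1 = Process(1, () => new TaskCompletionSource<bool>().Task);
        var hung2 = Process(2, () => new TaskCompletionSource<bool>().Task);

        // ...so this perfectly healthy message just sits and waits.
        var healthy = Process(3, () => Task.Delay(10));
        var finished = await Task.WhenAny(healthy, Task.Delay(500));
        Console.WriteLine(finished == healthy
            ? "message 3 processed"
            : "message 3 still waiting: concurrency exhausted");
    }
}
```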

Do you do any sort of implicit or explicit transaction management with your DB context? Do you have anything in the DI container initialization that could lock or deadlock when the dependencies are resolved? What is the NotificationConnection doing? Do you know whether the handlers are actually completing or not? Could you consult the log, or add some additional log statements to that handler, to pin things down?
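For instance (a sketch only, with a hypothetical Log helper, mirroring the handler shown above), logging on entry, after each awaited call, and in a finally block would show exactly where a hang occurs:

```csharp
public async Task Handle(SomeMessage message, IMessageHandlerContext context)
{
    // context.MessageId identifies the incoming message in the log.
    Log.Info($"Handle started for {context.MessageId}");
    try
    {
        var data = await Db.SomeExtensionMethod(message.FieldA, message.FieldB)
            .ConfigureAwait(false);
        Log.Info($"DB call returned for {context.MessageId}");

        if (data.Any())
        {
            await NotificationConnection.Send(message.FieldA, data).ConfigureAwait(false);
            Log.Info($"Notification sent for {context.MessageId}");
        }
    }
    finally
    {
        // Runs whether the handler completes or throws; if this line
        // never appears, the handler is hanging in one of the awaits above.
        Log.Info($"Handle finished for {context.MessageId}");
    }
}
```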