Timeout message not included in metadata registry

I’m using NServiceBus 7.7.3 and have been trying to work out why some message types used only for a timeout in a saga are being directed to the error queue.

The error is the “…logical message from physical message…” one, details added below.

I’m using conventions to identify message types, and I have these in a separate project with no NServiceBus reference. The project containing the saga references this message project in order to consume and handle those messages.

All messages in the message project are in the same namespace, including both the “command” messages and the timeout message.
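A minimal sketch of a namespace-based convention of this kind. The namespace `MyCompany.Messages` and the class name `ConventionSetup` are hypothetical placeholders, not the real names from my project:

```csharp
using NServiceBus;

public static class ConventionSetup
{
    // Hypothetical helper: treats every type in one namespace as a message,
    // so the message project needs no NServiceBus reference.
    public static void Apply(EndpointConfiguration endpointConfiguration)
    {
        var conventions = endpointConfiguration.Conventions();

        // "MyCompany.Messages" is an assumed namespace used for illustration only.
        conventions.DefiningMessagesAs(type => type.Namespace == "MyCompany.Messages");
    }
}
```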

Debug logs for when the endpoint starts show that any command handled by the saga is detected by NServiceBus.Unicast.MessageHandlerRegistry as being associated to the saga, and NServiceBus.Conventions identifies the message “as message type by Modified NServiceBus Marker Interfaces convention”.

The timeout message, however, is detected as being associated with the saga (due to the IHandleTimeouts), but the MyTimeout message is not detected by NServiceBus.Conventions.

If I add a reference to NServiceBus in the message project and make MyTimeout implement IMessage, the message is detected and the exception does not occur. Is this by design, and should I always apply an interface to timeout message types?

I’m going to see if I can put together a simple demo app to be sure it can be replicated outside of my environment. I’ll post an update on that later.

Here are the exception details:

NServiceBus.MessageDeserializationException: An error occurred while attempting to extract logical messages from incoming physical message 206070a0-4002-4c84-a331-3ba4621d0491
 ---> System.Exception: Could not find metadata for 'Newtonsoft.Json.Linq.JObject'.
Ensure the following:
1. 'Newtonsoft.Json.Linq.JObject' is included in initial scanning. 
2. 'Newtonsoft.Json.Linq.JObject' implements either 'IMessage', 'IEvent' or 'ICommand' or alternatively, if you don't want to implement an interface, you can use 'Unobtrusive Mode'.
   at NServiceBus.Unicast.Messages.MessageMetadataRegistry.GetMessageMetadata(Type messageType) in /_/src/NServiceBus.Core/Unicast/Messages/MessageMetadataRegistry.cs:line 46
   at NServiceBus.Pipeline.LogicalMessageFactory.Create(Type messageType, Object message) in /_/src/NServiceBus.Core/Pipeline/Incoming/LogicalMessageFactory.cs:line 51
   at NServiceBus.DeserializeMessageConnector.Extract(IncomingMessage physicalMessage) in /_/src/NServiceBus.Core/Pipeline/Incoming/DeserializeMessageConnector.cs:line 129
   at NServiceBus.DeserializeMessageConnector.ExtractWithExceptionHandling(IncomingMessage message) in /_/src/NServiceBus.Core/Pipeline/Incoming/DeserializeMessageConnector.cs:line 48
   --- End of inner exception stack trace ---
   at NServiceBus.DeserializeMessageConnector.ExtractWithExceptionHandling(IncomingMessage message) in /_/src/NServiceBus.Core/Pipeline/Incoming/DeserializeMessageConnector.cs:line 52
   at NServiceBus.DeserializeMessageConnector.Invoke(IIncomingPhysicalMessageContext context, Func`2 stage) in /_/src/NServiceBus.Core/Pipeline/Incoming/DeserializeMessageConnector.cs:line 30
   at NServiceBus.MutateIncomingTransportMessageBehavior.InvokeIncomingTransportMessagesMutators(IIncomingPhysicalMessageContext context, Func`2 next) in /_/src/NServiceBus.Core/MessageMutators/MutateTransportMessage/MutateIncomingTransportMessageBehavior.cs:line 59
   at NServiceBus.InvokeAuditPipelineBehavior.Invoke(IIncomingPhysicalMessageContext context, Func`2 next) in /_/src/NServiceBus.Core/Audit/InvokeAuditPipelineBehavior.cs:line 18
   at NServiceBus.ProcessingStatisticsBehavior.Invoke(IIncomingPhysicalMessageContext context, Func`2 next) in /_/src/NServiceBus.Core/Performance/Statistics/ProcessingStatisticsBehavior.cs:line 25
   at NServiceBus.TransportReceiveToPhysicalMessageConnector.Invoke(ITransportReceiveContext context, Func`2 next) in /_/src/NServiceBus.Core/Pipeline/Incoming/TransportReceiveToPhysicalMessageConnector.cs:line 37
   at NServiceBus.RetryAcknowledgementBehavior.Invoke(ITransportReceiveContext context, Func`2 next) in /_/src/NServiceBus.Core/ServicePlatform/Retries/RetryAcknowledgementBehavior.cs:line 25
   at NServiceBus.MainPipelineExecutor.Invoke(MessageContext messageContext) in /_/src/NServiceBus.Core/Pipeline/MainPipelineExecutor.cs:line 35
   at NServiceBus.TransportReceiver.InvokePipeline(MessageContext c) in /_/src/NServiceBus.Core/Transports/TransportReceiver.cs:line 58
   at NServiceBus.TransportReceiver.InvokePipeline(MessageContext c) in /_/src/NServiceBus.Core/Transports/TransportReceiver.cs:line 64
   at NServiceBus.Transport.SqlServer.ReceiveStrategy.TryProcessingMessage(Message message, TransportTransaction transportTransaction) in /_/src/NServiceBus.Transport.SqlServer/Receiving/ReceiveStrategy.cs:line 47
   at NServiceBus.Transport.SqlServer.ProcessWithNativeTransaction.TryProcess(Message message, TransportTransaction transportTransaction) in /_/src/NServiceBus.Transport.SqlServer/Receiving/ProcessWithNativeTransaction.cs:line 109

I added a sample: something simple to start with (Sample1), and one edging closer to my actual setup (Sample2). I cannot replicate the issue with this setup, though.

Whilst diagnosing, I noticed that my timeout message types are in an Event folder and have a namespace that is mapped to both messages and events by convention. I’m not sure whether this in itself would be an issue, as a timeout is a Send rather than a Publish from a saga. I have tried to replicate this situation in Sample2 with “BadOrderSaga”, but there appear to be no complaints and the timeout works fine.

I’ve altered my real code so that timeout message types are no longer mapped to the event convention and are only mapped to the message convention. I will report back tomorrow if I still get the errors.

So I’m no longer seeing the error, now that I’ve modified the timeout message namespace so that it is only captured by the message convention and not by the event convention too. To be honest, though, I’m not certain this is what solved it, or whether there was some other underlying problem, such as a saga instance no longer being available when the timeout occurs. I can’t replicate it now, so I will leave it there.

I’ve updated my sample (main branch) with code (under Sample2 folder) to reproduce the issue I am seeing.

The code (like my real-world code) transforms the EnclosedMessageTypes header for outgoing messages to use only the type’s full name rather than the assembly-qualified name. It’s also using the NewtonsoftSerializer, which is why we see JObject errors rather than something XML-like.
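To illustrate the kind of header rewrite involved, here is a minimal sketch of an outgoing behavior. It is an assumption about what such a behavior could look like, not a copy of the PublishFullTypeNameOnlyBehavior in the sample; the string handling is deliberately naive:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using NServiceBus;
using NServiceBus.Pipeline;

// Illustrative sketch only: rewrites the EnclosedMessageTypes header so each
// entry is "Namespace.TypeName" instead of the assembly-qualified name.
public class PublishFullTypeNameOnlyBehavior : Behavior<IOutgoingPhysicalMessageContext>
{
    public override Task Invoke(IOutgoingPhysicalMessageContext context, Func<Task> next)
    {
        if (context.Headers.TryGetValue(Headers.EnclosedMessageTypes, out var enclosedTypes))
        {
            // The header holds a ';'-separated list of type names.
            // Cutting at the first ',' drops the assembly part; this naive split
            // would not handle generic types, whose names contain commas.
            var fullNames = enclosedTypes
                .Split(';')
                .Select(typeName => typeName.Split(',')[0].Trim());

            context.Headers[Headers.EnclosedMessageTypes] = string.Join(";", fullNames);
        }

        return next();
    }
}
```

A behavior like this would be registered on the endpoint with something along the lines of `endpointConfiguration.Pipeline.Register(new PublishFullTypeNameOnlyBehavior(), "Shortens EnclosedMessageTypes to full type names");`.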

Steps to replicate a working process:

  1. Start multiple projects for SampleClient and SampleEndpoint under the Sample2 folder.
  2. Focus on SampleClient window and press Enter to send a StartOrder message to the SampleEndpoint endpoint.
  3. Wait 10 seconds. SampleEndpoint should output details about the order being cancelled after it has processed the OrderTimeout message.
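The flow in the steps above can be sketched as a saga that requests a 10-second timeout when the order starts. This is a simplified illustration rather than the sample's actual code: `OrderSaga`, `OrderSagaData`, and the `OrderId` property are hypothetical names, and the message classes are declared inline only to keep the sketch complete.

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus;

// Plain classes with no NServiceBus interfaces; identified by convention.
public class StartOrder { public Guid OrderId { get; set; } }
public class OrderTimeout { }

// Hypothetical saga state.
public class OrderSagaData : ContainSagaData
{
    public Guid OrderId { get; set; }
}

public class OrderSaga : Saga<OrderSagaData>,
    IAmStartedByMessages<StartOrder>,
    IHandleTimeouts<OrderTimeout>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<OrderSagaData> mapper)
    {
        // Correlate incoming StartOrder messages to saga instances by OrderId.
        mapper.ConfigureMapping<StartOrder>(message => message.OrderId)
              .ToSaga(saga => saga.OrderId);
    }

    public Task Handle(StartOrder message, IMessageHandlerContext context)
    {
        // Ask for an OrderTimeout message to be delivered back in 10 seconds.
        return RequestTimeout<OrderTimeout>(context, TimeSpan.FromSeconds(10));
    }

    public Task Timeout(OrderTimeout state, IMessageHandlerContext context)
    {
        Console.WriteLine($"Order {Data.OrderId} cancelled after timeout.");
        MarkAsComplete();
        return Task.CompletedTask;
    }
}
```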

Steps to replicate the error:

  4. Repeat steps 1 and 2, and just after SampleEndpoint receives the StartOrder message, close the debug session to stop the processes.
  5. Wait any amount of time (it doesn’t matter whether the timeout has passed), then repeat step 1 to start SampleEndpoint again. The OrderExpired timeout will be processed either immediately or once it expires. Either way, the message cannot be deserialized.

As a further test, if the code that registers the PublishFullTypeNameOnlyBehavior is removed or commented out and all the steps are repeated, no exception occurs.

Any ideas why a process restart would cause the issue?