Subscriptions Behaving Oddly

Using NSB v6.4.3 and NSB.RavenDB v4.2.5.

Hard to describe this one, and I will have to post any examples privately via email.

The issue is that the publisher seems to publish events only sporadically to certain endpoints. So if I publish MyEvent five times, as Subscriber1 I might only receive it twice. What’s more, Subscriber2 (which subscribes to the same event) might get it twice as well, but it seems to get two different events from the ones Subscriber1 got. It’s almost as if I have things configured for competing consumers, but we are using a very vanilla configuration and I don’t see anything in the config/setup that could cause this.

I think this is only happening to subscriptions that originate from Developers’ machines (this is in our dev environment).

From the publisher, I will see the publish ([RoutingToDispatchCon] DEBUG Destination: etc) and the list of subscribers published to; however, there is usually only one subscribing Developer machine endpoint in the list (and all subscribing servers I believe).

Assuming this is indeed only happening for developer subscriptions, my suspicion is that duplicate queue names (even on different machines) are causing the problem. For example, in the raven Subscriptions collection we might have:

MyService_Local@MyMachine
MyService_Local@OtherMachine

and in the Client collection we may have the same:

"Queue" : "MyService_Local",
"Machine" : "MyMachine"

"Queue" : "MyService_Local",
"Machine" : "OtherMachine"

I’m not sure where the code that pulls in the Raven Subscription data and loops over it lives (NSB.Core or NSB.Raven). I did pull down the source code but ran out of time to look into it.

Let me know what I can do to help narrow it down. Happy to work with support to do a screen share if that helps.

Thanks,

Phil

What are the endpoint names of these subscribers? If they are the same, but just on another machine, then these represent the same logical subscriber with 2 instances. If that is the case, then yes, a publish will only be sent to one of the instances.

Another potential reason for not seeing all messages being sent: are you awaiting correctly? I’ve seen this behaviour happen if the Send/Publish is not awaited.

– Ramon

Hi Phil,

I suppose we’re talking about MSMQ as Transport.

Can you, as a first thing, double check that expected subscriptions are present in the RavenDB storage?
Are you using versioned subscriptions or the new non-versioned ones?

.m

Yes, MSMQ. We are not disabling versioning, and the expected subscriptions are indeed in the Raven document.

Is there any chance that the subscribers have a different “messages” assembly, with a different version from the publisher’s?

Yes, the endpoint names are the same.

If they are the same, but just on another machine, then these represent the same logical subscriber with 2 instances.

Has this always been the case? I don’t believe this happened in NSB 5.x. We had a system where we had more than one dev using the same endpoint name for months, and never ran into this problem.

No, it’s a pretty simple setup. All the consuming code bases are the same. Again, these are developer machines. Also see my reply to Ramon–I don’t think this happened in NSB 5.x.

The RavenDB subscription storage is not stripping away the machine name. We’re using the transport address, which for MSMQ includes the machine name, so subscriber@machine-A is a different subscriber from subscriber@machine-B.

Source: https://github.com/Particular/NServiceBus.RavenDB/blob/support-4.1/src/NServiceBus.RavenDB/Subscriptions/SubscriptionPersister.cs#L112-L128

However it’s probably NServiceBus Core doing that now: https://github.com/Particular/NServiceBus/blob/support-6.4/src/NServiceBus.Core/Routing/UnicastPublishRouter.cs#L32

If I understand the code correctly, subscribers are grouped by endpoint and then a distribution strategy is applied, so in any case only one destination is selected for each endpoint.
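To make that reading concrete, here is a minimal sketch of "group by endpoint, then pick one instance". It is not the actual UnicastPublishRouter code: the `Subscriber` class, the `PublishRoutingSketch` name, and the simple round-robin counter are all assumptions made for illustration, and the real distribution strategy may behave differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a stored subscription: the logical endpoint name
// plus the transport address (queue@machine for MSMQ).
public class Subscriber
{
    public Subscriber(string endpoint, string transportAddress)
    {
        Endpoint = endpoint;
        TransportAddress = transportAddress;
    }

    public string Endpoint { get; }
    public string TransportAddress { get; }
}

public static class PublishRoutingSketch
{
    // One counter per logical endpoint, used for a naive round-robin.
    static readonly Dictionary<string, int> counters = new Dictionary<string, int>();

    // Returns exactly one destination address per logical endpoint.
    public static List<string> SelectDestinations(IEnumerable<Subscriber> subscribers)
    {
        var destinations = new List<string>();
        foreach (var group in subscribers.GroupBy(s => s.Endpoint))
        {
            var instances = group.Select(s => s.TransportAddress).ToList();
            counters.TryGetValue(group.Key, out var next); // 0 on first use
            destinations.Add(instances[next % instances.Count]);
            counters[group.Key] = next + 1;
        }
        return destinations;
    }
}
```

With two subscriptions that share the endpoint name MyService_Local (one at MyMachine, one at OtherMachine), each call returns a single MyService_Local address, and successive calls alternate between the two machines. That matches the symptom described above, where each developer sees only some of the published events.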

@andreasohlund can you confirm? Has this changed from V5 to V6?

Thanks Mauro. For now, we can write a little code to make sure the endpoint names are unique for dev machines. Still interested in whether this changed between versions.

Also interested to know how others handle having multiple dev machines subscribing to the same endpoint.

Thanks!

Phil

In theory you could use something like:

endpointConfiguration.MakeInstanceUniquelyAddressable("unique-dev-id");

so that each dev machine will be treated as a different endpoint. I’m not really sure how it works with events.

Yes and no. With NServiceBus 5 you would host multiple instances of the same endpoint behind the distributor. Such instances would then be Workers, and when these subscribe they would use the address of the distributor.

If you did NOT use the distributor, then each instance would indeed subscribe independently, but in our opinion this is incorrect. This is a scale-out configuration, and a single logical endpoint that has more than one instance should still only receive one copy of the event.

With version 6 we now use ‘sender side distribution’, which has the logical endpoint information. Hence the behavior you are experiencing: the publisher round-robins between the available instances of the same logical endpoint for load balancing.

If you really want this you just have to make sure that each endpoint is logically unique. Meaning, just add a suffix to the logical endpoint name:

new EndpointConfiguration($"MyEndpoint-{Environment.UserName}" );

I think that is what you want: to logically differentiate between the developer boxes. This way each gets its own copy of an event, as each is logically different and can do anything it likes with the data.

I don’t see a good reason for doing this. Why would multiple dev boxes subscribe to a publisher? Would be interested in learning the use case for this.

– Ramon

Ramon’s suggestion is probably better than mine, as the uniquely
addressable endpoint thing won’t work with publishers, I guess.

We actually still have a service on 5.x that uses the distributor, which has worked great but is an extra codebase that has to be maintained and deployed (although we pretty much deployed it once and haven’t had to touch it since). So getting rid of the distributor when we upgrade will make everyone happy. I have not (yet) read up on sender-side distribution, since we had not been planning on upgrading that particular endpoint for a while.

If you really want this you just have to make sure that each endpoint is logically unique. Meaning, just add a suffix to the logical endpoint name:

new EndpointConfiguration($"MyEndpoint-{Environment.UserName}" );

Yeah, that’s pretty close to what I did. I used the machine name instead of the user name (either would work), and only when the machine is being used for development.
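For anyone finding this later, a minimal sketch of that approach. The `EndpointNaming` helper and the `isDevelopment` flag are hypothetical names, not anything from NServiceBus, and how you detect a dev machine (config setting, environment variable, build symbol) is left open:

```csharp
using System;

public static class EndpointNaming
{
    // Appends the machine name only on developer machines, so each dev box
    // becomes its own logical endpoint while servers keep the shared name.
    // isDevelopment is a hypothetical flag supplied by the caller.
    public static string GetEndpointName(string baseName, bool isDevelopment)
    {
        return isDevelopment
            ? $"{baseName}-{Environment.MachineName}"
            : baseName;
    }
}

// Usage, assuming an isDevelopment value is available at startup:
// var endpointConfiguration =
//     new EndpointConfiguration(EndpointNaming.GetEndpointName("MyEndpoint", isDevelopment));
```

Since the endpoint name also determines the input queue name, each dev machine ends up with its own queue and its own subscription entry.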

I don’t see a good reason for doing this. Why would multiple dev boxes subscribe to a publisher? Would be interested in learning the use case for this.

This makes me wonder if we are doing something weird with our setup. If we have multiple devs working on an endpoint that is a subscriber of any publisher, that machine will subscribe whenever it’s run (we use auto-subscribe). So without the above code that modifies the endpoint name, you’re going to end up with one subscriber with the same endpoint name for each dev that has ever run the project.

Note that the dev may not even be doing anything related to NSB–they may be working on some other aspect of the code base that still requires running the code.

I’m not disagreeing with you, as the small adjustment we will have to make is well worth getting rid of Distributor. I’m just not clear on how other teams handle this. I was really surprised that no one else ran into this problem going from 5.x to 6.x.

Thanks for your help!

Phil

Phil,

Do I understand correctly that devs are working locally on an endpoint that subscribes to a publisher which, at development time, is not running on their machines but in a shared development environment?

.m

Mauro,

Yep, that’s exactly it.

Phil

In that case, @ramonsmits’s suggestion to customize the endpoint name on the dev machine is probably the easiest workaround.

.m