Performance issues on upgrade of NServiceBus.Transport.SqlServer from 6.0.1 to 6.2.0

adbe · June 8, 2021, 6:46am

Current setup

20+ publishers & subscribers with 5 running instances each
One MsSQL db with all messaging tables required for all publishers and subscribers

NServiceBus packages:

NServiceBus 7.4.4
NServiceBus.Extensions.DependencyInjection 1.0.1
NServiceBus.Newtonsoft.Json 2.2.0
NServiceBus.Persistence.Sql 6.2.0 (upgraded from 6.0.4)
NServiceBus.Transport.SqlServer 6.2.0 (upgraded from 6.0.1)

Endpoint configuration

// Notes

// concurrencyLimit varies from component to component, but it's usually around 8

endpointConfiguration.DisableFeature();

endpointConfiguration.PurgeOnStartup(false);

endpointConfiguration.UseContainer(container);

endpointConfiguration.OverrideLocalAddress(endpointAddress);

endpointConfiguration.SendFailedMessagesTo(endpointErrorAddress);

var persistence = endpointConfiguration.UsePersistence<SqlPersistence>();

persistence.SqlDialect<SqlDialect.MsSqlServer>();

persistence.ConnectionBuilder(() => new SqlConnection(connectionString));

persistence.SubscriptionSettings().CacheFor(TimeSpan.FromHours(1));

persistence.TablePrefix(tablePrefix);

var transport = endpointConfiguration.UseTransport<SqlServerTransport>();

transport.QueuePeekerOptions(null, concurrencyLimit * 2);
// transport.QueuePeekerOptions(TimeSpan.FromSeconds(5), concurrencyLimit * 2);

transport.Transactions(TransportTransactionMode.TransactionScope);

transport.ConnectionString(connectionString);

transport.Routing();

endpointConfiguration.LimitMessageProcessingConcurrencyTo(concurrencyLimit);

endpointConfiguration.DefineCriticalErrorAction(OnCriticalError);

endpointConfiguration.Pipeline.Register(loggingRelatedPipeline);

var recoverability = endpointConfiguration.Recoverability();

recoverability.CustomPolicy(customPolicy); // just checks the exception type and decides the retry behavior based on that

Behaviour after upgrade

Batch requests per second on the DB server jump to almost ten times the previous level, causing high CPU usage and general performance degradation.

Changing the queue peeker delay to 5 seconds improves things somewhat: batch requests per second are no longer constantly high, but spikes still happen at short intervals.
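For reference, the 5-second setting mentioned above corresponds to the commented-out `QueuePeekerOptions` line in the endpoint configuration. A minimal sketch of just the transport part follows. The wrapper class and method names are hypothetical, the `using` directives are assumed to cover the 6.x extension methods, and the arguments are passed positionally exactly as in the configuration above:

```csharp
using System;
using NServiceBus;
using NServiceBus.Transport.SqlServer; // assumed namespace for the 6.x transport settings types

// Hypothetical helper, for illustration only.
static class PeekerConfigSketch
{
    public static void ConfigureTransport(
        EndpointConfiguration endpointConfiguration,
        string connectionString,
        int concurrencyLimit)
    {
        var transport = endpointConfiguration.UseTransport<SqlServerTransport>();
        transport.ConnectionString(connectionString);
        transport.Transactions(TransportTransactionMode.TransactionScope);

        // First argument: peek delay. Passing null keeps the transport default;
        // here it is set explicitly to 5 seconds.
        // Second argument: peek batch size, kept at twice the concurrency limit.
        transport.QueuePeekerOptions(TimeSpan.FromSeconds(5), concurrencyLimit * 2);

        endpointConfiguration.LimitMessageProcessingConcurrencyTo(concurrencyLimit);
    }
}
```

A longer peek delay should reduce how often an idle queue is polled, but it may also delay pickup of the first message that arrives in an empty queue, so it is a mitigation and not a fix.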

The same behavior is observed with every NServiceBus.Transport.SqlServer version after 6.0.1, starting from 6.1.1, not just 6.2.0.

The same behavior is observed with or without upgrading NServiceBus.Persistence.Sql from 6.0.4 to 6.2.0.

Are there any other things that need to be configured that we missed?

tmasternak · June 8, 2021, 12:58pm

Hi @adbe,

do I understand correctly that what you are seeing is an increased number of table peek queries generated by the transport?

Cheers,
Tomek

adbe · June 8, 2021, 2:15pm

Hi,

From what we have been able to observe, yes: activity increased on the db containing the queue tables without any increase in the load being processed by our system.

tmasternak · June 9, 2021, 9:05am

Hi adbe,

I suspect that the behavior you are experiencing is a result of changes to how we estimate the number of messages in the input queue (in version 6.1.1 we introduced a query that uses an index seek instead of an index scan).

It might be the case that in your environment the new query takes significantly less time, which in turn causes more queries per second and a higher load on the CPU.

That said, I suggest we continue further investigation using our support channel. Could you send an email to support@particular.net and let me know on this thread so that I can pick it up from there?

Thanks,
Tomek

adbe · June 10, 2021, 1:59pm

Hi,

I sent an email to support as suggested.

Thanks.

Dennis · February 17, 2024, 9:43pm

This has been fixed in a patch. You can find more info in these links:

NServiceBus.SqlServer 7.0.5 and 6.3.7 – Patch releases available
Excessive queries to fetch new messages when no messages are available to be processed · Issue #1293 · Particular/NServiceBus.SqlServer · GitHub