FileLoadException on NServiceBus.Core.dll during Assembly Scanning

Hello,

I have an application with a modular monolith architecture. As part of this we have multiple endpoints, typically one per module, which use Azure Service Bus as the messaging infrastructure. In total there are 8 endpoints, which start one at a time when the monolithic web application starts up. Occasionally we get the error below in production.

This only happens on app services that are processing messages. We have some app services running the same code with read-only endpoints that have never experienced this issue. That makes sense, as the error appears to come from the assembly scanning code, but I am not really sure where to go from here. We are unable to reproduce it in any test environment, and in production it happens in roughly 1 in 20 deployments.

Just wondering if anybody has any suggestions on areas to further my investigation?

Assembly Scanning Code:

internal AssemblyScannerConfiguration IncludeOnly(AssemblyScannerConfiguration configuration, params Assembly[] assembliesToInclude)
{
    // File names of the assemblies that should remain scannable.
    List<string> includedAssemblyNames = assembliesToInclude.Select(b => $"{b.GetName().Name}.dll").ToList();

    // Every DLL under the application's base directory, including subdirectories.
    string[] files = _fileListProvider.GetFiles(AppDomain.CurrentDomain.BaseDirectory, "*.dll", SearchOption.AllDirectories);

    // Exclude everything except the included assemblies and any file whose name contains "nservicebus.".
    string[] excludedAssemblyNames = files
        .Select(path => Path.GetFileName(path))
        .Where(a => !includedAssemblyNames.Contains(a))
        .Where(a => !a.ToLower().Contains("nservicebus."))
        .ToArray();

    configuration.ExcludeAssemblies(excludedAssemblyNames);

    return configuration;
}

Exception:

System.IO.FileLoadException: Could not load file or assembly 'NServiceBus.Core.dll' or one of its dependencies. An assertion failure has occurred. (Exception from HRESULT: 0x8007029C)
File name: 'NServiceBus.Core.dll' ---> System.Runtime.InteropServices.COMException (0x8007029C): An assertion failure has occurred. (Exception from HRESULT: 0x8007029C)
at System.Reflection.AssemblyName.nGetFileInformation(String s)
at System.Reflection.AssemblyName.GetAssemblyName(String assemblyFile)
at NServiceBus.AssemblyValidator.ValidateAssemblyFile(String assemblyPath, Boolean& shouldLoad, String& reason)
at NServiceBus.Hosting.Helpers.AssemblyScanner.TryLoadScannableAssembly(String assemblyPath, AssemblyScannerResults results, Assembly& assembly)
at NServiceBus.Hosting.Helpers.AssemblyScanner.ScanAssembliesInDirectory(String directoryToScan, List`1 assemblies, AssemblyScannerResults results)
at NServiceBus.Hosting.Helpers.AssemblyScanner.GetScannableAssemblies()
at NServiceBus.AssemblyScanningComponent.Initialize(Configuration configuration, SettingsHolder settings)
at NServiceBus.HostCreator.d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at NServiceBus.Endpoint.d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Messaging.Core.EndpointCreation.EndpointBuilder.d__30.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Web.App_Start.NServiceBus.NServiceBusConfig.d__2.MoveNext() in C:\a\1\s\Web\App_Start\NServiceBus\NServiceBusConfig.cs:line 62
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Web.MvcApplication.Application_Start() in C:\a\1\s\Global.asax.cs:line 73

Application is .NET Framework 4.6.2
Using NServiceBus NuGet package v7.6.0

Hi @igobl

Since this only seems to happen occasionally and on Azure App Services, this sounds very hard to reproduce. Unfortunately, without being able to reproduce the issue more reliably, I’m not sure how to further investigate the problem.

A race condition seems unlikely to be the problem, since you say the endpoints are started one by one. Does this exception occur on the first endpoint that is started in the app?

It would be very helpful to have the assembly binding logs, but I’m not sure that’s doable on Azure App Service, and you’re saying it only happens there? If you’re using ASP.NET, have a look at the following guidance: Troubleshoot ASP.NET assembly loading failures using Fusion Logging - Azure App Service

As a first workaround, you could try disabling file-based assembly scanning and rely only on scanning the assemblies already loaded into the app domain. This option is available in NServiceBus 7.7 (see the documentation here: Assembly scanning • NServiceBus • Particular Docs) and can reduce the number of assembly loading failure scenarios, especially in externally hosted environments.
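For anyone landing here later, a minimal sketch of what that setting could look like, assuming NServiceBus 7.7 or later. The class and method names (ScannerSettings, Create) are hypothetical and only there to make the fragment self-contained; the two scanner properties are the ones described in the assembly scanning documentation.

```csharp
using NServiceBus;

public static class ScannerSettings
{
    // Hypothetical helper: the name and shape are illustrative, not taken from the thread.
    public static EndpointConfiguration Create(string endpointName)
    {
        var endpointConfiguration = new EndpointConfiguration(endpointName);

        AssemblyScannerConfiguration scanner = endpointConfiguration.AssemblyScanner();

        // Do not probe DLL files on disk during endpoint startup (NServiceBus 7.7+).
        scanner.ScanFileSystemAssemblies = false;

        // Scan the assemblies already loaded into the AppDomain; set explicitly to make the intent visible.
        scanner.ScanAppDomainAssemblies = true;

        return endpointConfiguration;
    }
}
```

With file-system scanning off, a custom exclusion list built from the DLL files on disk should no longer be needed for this purpose, although that is worth verifying against your own setup.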

Hi Tim,

Thanks for the response. Yes, I realized while typing it that it would be difficult to do any concrete diagnosis. My best guess is that the clash is happening between the two deployment slots in the Azure App Service. I think slots on an app service (the production slot and the staging slot) share a disk. My guess is that when the staging slot starts up during a deployment, it tries to use the DLL that is in use on the production slot.

Thanks for your advice, it definitely gives us some more avenues to explore. Your last suggestion on disabling the file-based assembly scanning sounds like it could be a good solution so I will start there. I’ll report back if it fixes the issue in case anybody else hits the same in the future.

That sounds like a good plan. Eliminating file-based assembly scanning can resolve many categories of assembly loading problems (but you need to take a bit more care that all your required message contract and handler assemblies are loaded by the time the endpoint starts).
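One way to make sure those assemblies are loaded is to reference a type from each of them before the endpoint starts, since evaluating typeof(...) forces the CLR to load the containing assembly. The sketch below is only an illustration: AssemblyPreloader is a hypothetical helper, not part of NServiceBus.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class AssemblyPreloader
{
    // Hypothetical helper: returns the distinct assemblies that contain the given marker types.
    // By the time this method runs, each of those assemblies is loaded into the AppDomain.
    public static Assembly[] EnsureLoaded(params Type[] markerTypes)
    {
        if (markerTypes == null)
        {
            throw new ArgumentNullException(nameof(markerTypes));
        }

        return markerTypes
            .Select(type => type.Assembly)
            .Distinct()
            .ToArray();
    }
}
```

It would be called once per endpoint before startup, passing one message or handler type from each module assembly, for example AssemblyPreloader.EnsureLoaded(typeof(OrderPlaced), typeof(OrderPlacedHandler)), where both type names are placeholders.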

I’d be very interested to hear whether that helps resolve the issue :+1: