Identify / Clean up backing transport/persistence for decommissioned endpoints

The NSB documentation discusses the logistics of decommissioning endpoints, but it doesn’t cover how to programmatically remove the backing transport/persistence resources consumed by a decommissioned endpoint.

Does NSB support or provide any utilities to assist in cleaning up endpoint resources?

I have a scenario where I am constantly creating very short-lived, ephemeral endpoints, and naively cleaning up after each endpoint is quite burdensome. I’m using the Azure Storage transport/persistence, so I need to remove up to 4 queues (main queue, retries, timeouts), blob storage (timeouts/delays), and tables (timeouts/delays). Even if I need to make the Azure API calls myself to delete these resources, it would be very helpful if there were an NSB library I could use to inspect an endpoint and tell me the names of all the resources that endpoint used.
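To illustrate, here is a minimal sketch of doing that cleanup manually with the Azure SDKs (Azure.Storage.Queues, Azure.Data.Tables, Azure.Storage.Blobs). This is not an NServiceBus API, and every resource name below is an assumption; the hard part is knowing, for an arbitrary endpoint, which names the transport and persistence actually generated:

```csharp
using System.Threading.Tasks;
using Azure.Data.Tables;
using Azure.Storage.Blobs;
using Azure.Storage.Queues;

public static class EndpointResourceCleanup
{
    // Deletes the Azure Storage resources assumed to back a single endpoint.
    // All names below are hypothetical; they must match whatever naming your
    // transport/persistence configuration actually produced.
    public static async Task DeleteAsync(string connectionString, string endpointName)
    {
        // Input queue plus the auxiliary queues the transport may have created.
        var queueNames = new[]
        {
            endpointName,
            $"{endpointName}-retries",
            $"{endpointName}-timeouts",
            $"{endpointName}-timeoutsdispatcher"
        };
        foreach (var queueName in queueNames)
        {
            await new QueueClient(connectionString, queueName).DeleteIfExistsAsync();
        }

        // Table holding timeout/delay data (name assumed; throws if the table doesn't exist).
        var tableServiceClient = new TableServiceClient(connectionString);
        await tableServiceClient.DeleteTableAsync($"{endpointName}timeouts");

        // Blob container holding timeout/delay message bodies (name assumed).
        await new BlobContainerClient(connectionString, $"{endpointName}-delays").DeleteIfExistsAsync();
    }
}
```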

Note: this was copied from a GitHub issue: On Decommission Endpoint - Clean up backing transport / persistance · Issue #5196 · Particular/NServiceBus · GitHub

Dennis van der Stelt replied in the GitHub issue:

The short answer, however, is that we do not have such a tool or anything that helps.

That being said, I’d like to add two remarks:

  1. I’m not aware of many customers actually requiring such a tool, but maybe there are. We’d need to think through more carefully whether this is something that is useful and what edge cases we’d need to consider when adding it to NServiceBus or releasing it as a separate tool. I will, however, take this into consideration and discuss it internally.
  2. Why are you creating these short-lived endpoints? If they perform certain business actions, why would they only exist for a short time? And if these ‘respawn’ on a regular basis, can’t you have an endpoint that gets updated/redeployed instead of creating a new one over and over again?

Context:

I work on a SaaS product that makes heavy use of PowerShell. PowerShell has two interesting constraints:

  1. The endpoints I’m connecting to have strict throttling limitations. I am limited both in the number of client connections allowed and in the number of messages that can be sent in a given time period.
  2. The connection handshake is relatively expensive compared with, say, an HTTPS REST endpoint. Constantly creating new connections is disadvantageous both in terms of time/IO and because each new connection counts against the throttling limits.

Additionally, we can have hundreds of worker agents that all need to interact with the same PowerShell tenant endpoint.

The design we came up with, then, was to spin up a dedicated worker for each PowerShell tenant, with its own queue, on demand. This way we can funnel all requests for a specific tenant endpoint to a specific worker, persist the underlying PowerShell connection, and limit the number of simultaneous requests being sent to the PowerShell endpoint.
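A rough sketch of what that per-tenant spin-up could look like, assuming NServiceBus with the Azure Storage Queues transport and Azure Table persistence (API names vary between versions; `tenantId` and `connectionString` are placeholders):

```csharp
using System.Threading.Tasks;
using NServiceBus;

public static class TenantWorker
{
    // Starts a dedicated endpoint (and therefore a dedicated input queue) for one tenant.
    public static Task<IEndpointInstance> StartAsync(string tenantId, string connectionString)
    {
        var endpointConfiguration = new EndpointConfiguration($"tenant-worker-{tenantId}");

        var transport = endpointConfiguration.UseTransport<AzureStorageQueueTransport>();
        transport.ConnectionString(connectionString);

        var persistence = endpointConfiguration.UsePersistence<AzureTablePersistence>();
        persistence.ConnectionString(connectionString);

        // Keep one request in flight at a time so the worker never exceeds the
        // tenant endpoint's connection/throttling limits.
        endpointConfiguration.LimitMessageProcessingConcurrencyTo(1);

        // Creates the queues/tables on startup.
        endpointConfiguration.EnableInstallers();

        return Endpoint.Start(endpointConfiguration);
    }
}
```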

However, our query load is inconsistent - we might make 5 queries in 3 minutes and then not make another query for several hours or days - so we don’t want to pay the compute costs for a dedicated worker that is constantly sitting idle. After a few minutes of inactivity we tear the worker down.

When we tear down a worker, we also want to clean up the NSB transport/persistence resources that worker was using. We don’t want to flood the Azure Storage account we’re using with thousands of queues, tables, and blobs that are no longer in use.
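Tear-down could then look roughly like the following, reusing the two hypothetical helpers sketched above: stop the endpoint after the idle timeout so its queues are no longer being read, then delete the backing resources.

```csharp
using System.Threading.Tasks;
using NServiceBus;

public static class TenantWorkerTeardown
{
    // Stops an idle tenant worker and removes its backing Azure Storage resources.
    public static async Task TearDownAsync(IEndpointInstance endpointInstance, string tenantId, string connectionString)
    {
        await endpointInstance.Stop();

        // Hypothetical helper from the earlier sketch; the endpoint name must
        // match the one used when the worker was started.
        await EndpointResourceCleanup.DeleteAsync(connectionString, $"tenant-worker-{tenantId}");
    }
}
```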

Dennis van der Stelt followed up:

Isn’t another design possible where you just have one logical endpoint (possibly scaled out for performance or high-availability reasons) and orchestrate something inside it to support your multi-tenant scenario? As in, load assemblies on demand and delegate the work to some code from there?

I’m also not exactly aware of what a “PowerShell tenant” is.