Migrate ServiceControl/ServicePulse and backing MSMQ queues to new server

AS-IS:
Windows 2016 application-server hosting ServiceControl, ServicePulse and the MSMQ queues used by ServiceControl to receive messages.

TO-BE:
Windows 2022 application-server hosting ServiceControl, ServicePulse and the MSMQ queues used by ServiceControl to receive messages.

Extra requirement:

  • The applications feeding ServiceControl via MSMQ messages must be able to switch to the new queue address on their own schedule
  • No (0) unhandled ‘Failed Messages’ can get lost during the migration…

Timeline:

  1. Applications send messages to servicecontrol@appserver1, ServiceControl on appserver1 is operational
  2. OPS provisions a new server named appserver2 and sets up ServiceControl.
  3. Developers consult the ServicePulse UI on https://appserver2
  4. Time goes by, applications can keep sending messages to servicecontrol@appserver1
  5. Developers have time to release a new version, pointing to servicecontrol@appserver2
  6. All the producers filling the servicecontrol@appserver1 queue have been migrated
  7. appserver1 can be decommissioned

Is “ServiceControl remote instances” combined with “Zero downtime upgrades” (both in the Particular Docs) the way to go, or are there better alternatives?

The different servicecontrol/servicepulse instances can be taken offline for a bit without a problem.
Some historical audit data can get lost etc…
However no (0) unhandled ‘Failed Messages’ can get lost during the migration…

We also have producers going to an Azure Service Bus queue, and there it’s just a matter of shutting one SC down, migrating the data to the new machine and starting up again, but MSMQ is more strictly tied to the actual physical instance…

I’ve been dabbling a bit with remotes (see “ServiceControl Remotes : Errors not showing up in primary”), but before I go down the rabbit hole I’d better ask for some feedback :slight_smile:

Hi @janv8000,

could you answer the following questions to let me better understand your scenario and provide more concrete guidance on the migration?

  1. What is the version of ServiceControl you are using?
  2. Do I understand correctly that the audit data should ideally be migrated as well?

Cheers,
Tomek

  1. ServiceControl version: v4.33.2
  2. Ideally indeed. However if it’s deemed too much effort, the old servers can be kept online until all the old audit messages have expired

Hi @Jan,

The easiest way would be to:

  • Copy over database and services
  • Set up message forwarding between [error|audit]@appserver1 and [error|audit]@appserver2 until all endpoints have their configuration migrated.

Unfortunately, version 4.33.2 of ServiceControl is running on top of ESENT (used by RavenDB 3.5) which does not provide storage format compatibility between major versions of Windows. As a result, the migration process will need to be more complicated.

I would suggest the following migration plan:

  • Install new error and audit instances on appserver2
  • Upgrade ServiceControl on appserver1 to the latest 4.* release.
  • Stop ingesting messages on appserver1 by setting IngestErrorMessages and IngestAuditMessages to false
  • Set up message forwarding from appserver1 to appserver2. This sample shows how this can be done.
  • Add the audit instance on appserver1 as one of the remote instances on appserver2
  • Migrate the error data from appserver1 to appserver2 following the error instance migration guideline
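A minimal sketch of the "stop ingesting" step from the list above, assuming the two settings live in the appSettings section of each instance's configuration file on appserver1. The key prefixes and file names shown here are assumptions modelled on the other ServiceControl settings in this thread, so verify them against the configuration documentation for your version.

```xml
<!-- Error instance on appserver1 (assumed file: ServiceControl.exe.config) -->
<appSettings>
  <!-- Leave failed messages in the error queue so the forwarder can move them to appserver2 -->
  <add key="ServiceControl/IngestErrorMessages" value="false" />
</appSettings>

<!-- Audit instance on appserver1 (assumed file: ServiceControl.Audit.exe.config) -->
<appSettings>
  <!-- Same idea for the audit queue -->
  <add key="ServiceControl.Audit/IngestAuditMessages" value="false" />
</appSettings>
```

A restart of the corresponding Windows services is typically needed before changed appSettings values take effect.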

At this point, you can configure ServicePulse to use the new instance (running on appserver2) and it will give you access to all the historical data (both error and audit) as well as to the new messages arriving at error and audit queues both on appserver1 and appserver2.
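A minimal sketch of the remote-instance registration behind this combined view, placed in the configuration of the error instance on appserver2. The host names, the port 44444 and the exact JSON shape of the value are assumptions; check the remote instances documentation for the format your version expects.

```xml
<appSettings>
  <!-- Assumed URLs: the new audit instance on appserver2 plus the old one on appserver1 -->
  <add key="ServiceControl/RemoteInstances"
       value="[{'api_uri':'http://localhost:44444/api'},{'api_uri':'http://appserver1:44444/api'}]" />
</appSettings>
```

For this to work, the audit instance on appserver1 has to listen on a host name that appserver2 can reach, rather than on localhost only.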

After the audit message retention period on appserver1 elapses, you can safely remove the old audit instance. Once all endpoints are migrated to use [error|audit]@appserver2, you can remove the message forwarding.

I hope this is helpful. If you have any questions please let me know.

Cheers,
Tomek

Thank you for the detailed step by step.

I assume this can be the most recent release in the 5.x range?

Yes. Sorry for not mentioning that.

I’m at the fourth bullet point, setting up forwarding.
Your plan mentions running a self-written forwarder, can the same be achieved with the built-in forwarding in ServiceControl?

.config on appserver1:

        <add key="ServiceControl/ForwardErrorMessages" value="True" />
        <add key="ServiceBus/ErrorLogQueue" value="error@appserver2" />

Unfortunately, this will not work.

When set to false, the IngestErrorMessages and IngestAuditMessages options prevent ServiceControl from processing any messages, and forwarding is done as part of that processing logic.

And do I have to disable ingestion?
It’s not possible to ingest on appserver1 and have it forward the same message to appserver2 (where it in turn can be ingested)?

Sorry to be hammering on the same nail :slight_smile: , but I haven’t found much documentation concerning the ingestion logic and being able to just use the existing running Windows service to do the forwarding would be quite helpful.

You could keep it on, but I would still advise to go with turning off the ingestion. More details in my second answer.

Keeping the ingestion on would mean that until ServiceControl instances on appserver1 are decommissioned both old and new instances of ServiceControl ingest error (and audit) messages.

As a result, migrating the error messages from the old instance would require you to work out in the UI, for each endpoint, which errors are old (stored only in the old instance) and which are new (stored locally but also forwarded). In principle this can be done, but in my opinion it introduces a significant risk of missing a message or ending up with a duplicated error.

I think you are asking very valid questions and I hope my explanations make the setup a bit clearer :).