Our Particular monitoring runs on an EC2 instance in AWS, and all environments (also on EC2) send their messages to PSP. Network connectivity looks OK, and our infrastructure team has not flagged any issues in this regard.
However, almost every day Particular Service Control and Service Pulse end up with a backlog of events carrying the following message: “Endpoint has failed to send expected heartbeat to ServiceControl. It is possible that the endpoint could be down or is unresponsive. If this condition persists restart the endpoint”. The events often date from two or three hours earlier.
Everything works up to a certain point, and then it just goes down. More often than not, I end up having to reboot the instance running Particular Service Control and Service Pulse. This can sometimes introduce a side effect where the MSMQ service hangs on “starting”, and I end up having to force-kill MSMQ to get it going again.
We are running Service Control v3.6.2 and Service Pulse v1.16.0. The status bar at the bottom of the page shows that it is connected.
What used to be an intermittent problem has turned into a daily occurrence.
Any advice would be appreciated.