We are currently on ServiceControl v1.41.3 and ServicePulse v1.9.1. We recently had a bug in one of our applications that threw thousands of error messages. This backed up ServiceControl and made ServicePulse unusable because of timeouts. We fixed that by recycling the ServiceControl Windows service, and we archived most of the failed messages.
Right now we have two issues. The first is that ServiceControl/RavenDB appears to be rebuilding its indexes or something similar: when I look at ServiceInsight, it only shows messages from 3 months ago (our retention limit), but each time I refresh, more and more messages show up.
The second issue is in ServicePulse (ServiceControl behind the scenes): I believe the failed message count does not reflect the number of failures that actually exist. When I look at the Failed Message grouping and count them, there are about 10 at most, but the failed messages badge at the top indicates a few thousand. Clicking All Failed Messages lists all of those few thousand errors, but when I try to archive a few of them, nothing happens. This makes me think the failed message grouping has the correct errors but something is wrong with ServiceControl's embedded RavenDB.
Edit: We have also noticed that we can’t see any new errors in ServicePulse and haven’t been able to retry messages either. We think the new errors will show up once everything gets “reprocessed” and appears in ServiceInsight. It looks like this will take some time, as there are a lot of messages to go through before it catches up to today.