We have a process that generates over a million messages over the course of an hour or two every morning. My audit queue starts to fill up and messages get processed at first, but eventually messages start failing, get retried, and end up in the audit queue’s DLQ with the reason “MaxDeliveryCountExceeded”. On the dead-lettered messages, the reply-to address is the queue that is doing the processing and the target is the audit queue.
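For a sense of scale, here is a rough back-of-the-envelope estimate of the sustained rate that volume implies. It assumes exactly one million messages spread evenly over the window; the real count is somewhat higher and the load is probably bursty, so treat these as lower bounds rather than measurements.

```python
def required_rate(messages: int, hours: float) -> float:
    """Average messages per second needed to keep up with the given volume."""
    return messages / (hours * 3600)


MESSAGES = 1_000_000  # assumption: "over a million", rounded down

fast = required_rate(MESSAGES, 1)  # everything arrives within one hour
slow = required_rate(MESSAGES, 2)  # everything arrives within two hours

print(f"{slow:.0f} to {fast:.0f} messages/second")
```

So the audit queue would need to be drained at roughly 140 to 280 messages per second on average just to keep pace with what is coming in.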
I have watched the server that is running ServiceControl during this time, and the ServiceControl process's CPU usage ranges between 25% and 100% but is generally around 50%. It’s an m4.xlarge AWS EC2 instance (4 vCPUs, 16 GB RAM).
Do I just need a bigger ServiceControl box, or is there something else to look at? Let me know what else you need to know.