Issue Summary:
EngagementHQ experienced a 15-minute outage in Australia and New Zealand on Thursday, 16 August 2018. The issue started at around 09:56 AM AEST, and the last reported error was at 10:11 AM AEST.
Services in our other regions (Canada, the UK and the US) were unaffected.
Our investigation revealed that the issue was caused by our third-party spam checker, Akismet, taking too long to respond to our requests.
Root Cause:
We use Akismet to stop spammers from making contributions on your site. During this outage, the spam-check requests we sent to Akismet did not receive responses in time. Requests waiting on those responses built up in long queues on our server, which caused new requests to fail. This continued for almost 15 minutes, after which Akismet started responding normally and our server was able to process the queued requests.
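To illustrate the failure mode described above, here is a minimal Python sketch of putting a deadline on an outbound spam-check call so that a slow upstream cannot hold a request indefinitely. It is not our production code: `check_with_deadline`, the `check_fn` parameter, the pool size and the 2-second default are all hypothetical names and values chosen for illustration, and no real Akismet API call is shown.

```python
import concurrent.futures

# Hypothetical shared pool for outbound spam-check calls (size is an assumption).
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def check_with_deadline(check_fn, content, timeout_s=2.0):
    """Run check_fn(content) and return its result, or None on a missed deadline.

    check_fn stands in for whatever function calls the spam checker.
    """
    future = _pool.submit(check_fn, content)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The caller is released here, but the slow call keeps running in the
        # pool, so a small pool can still fill up during a long upstream delay.
        return None
```

A `None` result leaves the decision to the caller, for example accepting the contribution and holding it for later review. As the comment notes, a deadline on its own only frees the waiting request; it does not stop slow calls from occupying workers.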
Timeline of the Outage (AEST):
09:56 AM: Akismet request delay begins
10:11 AM: All services up and running.
Corrective and Preventive Measures:
We are continuing to debug our spam checker module and have contacted Akismet to proceed with a joint investigation. In the meantime, we have put additional monitoring in place that will temporarily stop sending requests to Akismet should this issue recur.
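One common way to temporarily stop calling a failing dependency is a circuit breaker. The Python sketch below shows the general idea only; it is an assumption about how such a safeguard could look, not a description of the monitoring we deployed. The class name, thresholds and cooldown are hypothetical.

```python
import time


class CircuitBreaker:
    """Stop calling a dependency after repeated failures, then retry later."""

    def __init__(self, failure_threshold=3, cooldown_s=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.cooldown_s = cooldown_s                # how long to stay open
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, let requests through as a trial;
        # a further failure re-opens the breaker immediately.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

In use, the caller would check `allow_request()` before contacting the spam checker, skip the check while the breaker is open, and call `record_failure()` on each timeout or `record_success()` on each normal response.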