On October 5, 2022, at approximately 08:24 UTC, we received multiple synthetic monitoring alerts indicating that the Definition Service, which stores and retrieves metadata for pages used by Anaplan, was degraded in our Anaplan Data Center – US-East. As a result, some customers experienced prolonged platform response times and, in some cases, errors when using certain functionality within the New User Experience pages.
Initial investigation revealed high CPU usage and maxed-out database connections on a specific pod within the node. Customer traffic routed to this unhealthy pod would have been affected. We restarted the suspect pod, and service was restored at 09:38 UTC.
Further investigation found that an end user had made multiple attempts to clone an extremely large report page. Because of the page's size, the task was CPU- and memory-intensive, which maxed out the database connections for a specific pod.
To prevent a recurrence, we have added monitoring that increases our visibility of requests whose durations exceed normal thresholds. We are also implementing measures to limit the size of pages that users can clone.
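As a rough illustration of the two measures described above, the following Python sketch shows one way a service could reject oversized clone requests and record requests that run longer than a threshold. It is not Anaplan's implementation: the names `check_clone_allowed`, `track_duration`, and `PageTooLargeError`, and the values for the size limit and slow-request threshold, are hypothetical and chosen only for the example.

```python
import time
from contextlib import contextmanager

# Hypothetical values for illustration only; real limits are not stated
# in this report.
MAX_CLONE_PAGE_BYTES = 5 * 1024 * 1024
SLOW_REQUEST_SECONDS = 2.0


class PageTooLargeError(Exception):
    """Raised when a page exceeds the allowed size for cloning."""


def check_clone_allowed(page_size_bytes, limit=MAX_CLONE_PAGE_BYTES):
    """Reject a clone request up front if the page is larger than the limit."""
    if page_size_bytes > limit:
        raise PageTooLargeError(
            f"page is {page_size_bytes} bytes; clone limit is {limit} bytes"
        )


@contextmanager
def track_duration(name, slow_log, threshold=SLOW_REQUEST_SECONDS,
                   clock=time.monotonic):
    """Time the enclosed block and record it if it exceeds the threshold."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > threshold:
            # A real service would emit a metric or alert here; a list
            # keeps the sketch self-contained.
            slow_log.append((name, elapsed))
```

In this sketch the size check runs before any expensive work starts, so an oversized clone fails fast instead of holding database connections, and the duration tracker flags long-running requests even when they eventually succeed.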
We sincerely apologize for the impact this had on your business operations. We understand the disruption these issues can cause to your business and users, and we are continuously strengthening our systems and procedures to prevent similar problems from happening again. If you have any further questions or concerns, please contact Anaplan Support. Thank you for your patience during this incident, and thank you for being an Anaplan customer.