On November 13, 2024, at 00:10 UTC, our alerting notified us that a back-end component of the authorization service was not performing as expected. At this point there was no impact to customers.
We restarted the impacted component, but errors persisted. We then performed a rolling restart of the authorization service. However, the rolling restart failed, which resulted in all customers being unable to access the platform from 01:39 UTC. This issue affected Anaplan Data Center — U.S. East, Anaplan Data Center — U.S. West, Anaplan Data Center — Germany, Anaplan Data Center — Netherlands, Anaplan Google Cloud Public — U.S. East, Anaplan Google Cloud Public — Japan, Anaplan Amazon Cloud Public — U.S., and Anaplan Amazon Cloud Public — Europe.
Our investigation found that a connectivity issue caused the restart failure. We investigated the connectivity issue but didn't find any outliers. As a result, we performed a rolling restart of a subservice within the authorization service that was impacted by the connectivity issue. Access to the platform was restored shortly after the restart, at 02:38 UTC.
At 03:09 UTC, we started to receive reports that integrations within CloudWorks™ weren't being processed. We identified that integrations were in a stuck state because of an authentication issue. To resolve this issue, we completed a rolling restart of multiple subservices within the CloudWorks™ service. Integrations started to process around 05:38 UTC. However, due to the delay in restoring the CloudWorks™ service, there was a large backlog of integrations, which resulted in degraded performance of the CloudWorks™ service. At 06:44 UTC, the integration backlog had cleared, and performance was restored in Anaplan Data Center — U.S. East, Anaplan Data Center — U.S. West, Anaplan Google Cloud Public — Japan, Anaplan Amazon Cloud Public — U.S., and Anaplan Amazon Cloud Public — Europe.
CloudWorks™ performance remained degraded in Anaplan Data Center — Germany, Anaplan Data Center — Netherlands, and Anaplan Google Cloud Public — U.S. East as the backlog of integrations remained high. We increased resource allocation for CloudWorks™ within these regions to attempt to speed up the processing of the backlog. At 08:24 UTC, the integration backlog had cleared, and performance was restored in Anaplan Data Center — Germany and Anaplan Google Cloud Public — U.S. East.
CloudWorks™ integrations within Anaplan Data Center — Netherlands remained degraded. We continued to review the service and further increased resources, both vertically and horizontally. This helped the backlog of integrations process faster. However, until the backlog was completely cleared, integration processing continued to be slower than usual. The backlog was cleared by 12:44 UTC and CloudWorks™ performance was fully restored.
To prevent a recurrence, we conducted a thorough analysis of the network, infrastructure, and application services. We identified a performance degradation of one of our firewalls due to resource exhaustion, which resulted from an existing policy misconfiguration. The firewall was never inoperable, but inbound and outbound traffic was impacted for a period, affecting most customers from around 01:12 UTC. The firewall degradation also contributed to the failure of the rolling restart of the authorization service. We immediately updated the firewall policy to prevent the resource exhaustion from recurring. Additionally, we are conducting a full review of all firewall policies to ensure no further misconfigurations exist.
We have also reviewed the CloudWorks™ service and identified changes that will improve the processing of integrations. We will add further capacity to the CloudWorks™ service so that new configurations can be applied to provide more processing headroom. These mitigations are in progress, and we expect improvements to service over the next few days.
We sincerely apologize for the impact this has had on your customers' business operations and users. We are continually improving our systems and procedures to prevent similar issues from happening again.
If you have any further questions or concerns, please contact Anaplan Customer Care. Thank you for your patience during this incident, and for being an Anaplan customer.