Atlassian's cross product user search service is currently degraded.
Incident Report for Compass
Postmortem

SUMMARY

On Dec 18, 2023, between 12:29 p.m. and 3:35 p.m. UTC, Atlassian's cloud customers using Atlas, Bitbucket Cloud, Compass, Confluence Cloud, Jira Service Management, Jira Software, Jira Work Management, and Jira Product Discovery were unable to search for users or use the "@mention" functionality. User searches failed or were delayed because the Atlassian service that returns user search results was degraded in several regions.

The incident originated from a computationally intensive operation that was triggered multiple times in rapid succession, which degraded the performance of Atlassian's user search service across several regions. Customers in the EU west region were the most affected. Automated monitoring detected the incident within 2 minutes, and our team responded by recovering unhealthy systems and temporarily scaling up the service's infrastructure. The incident was resolved in 3 hours and 6 minutes.

IMPACT

The impact lasted from 12:29 p.m. UTC to 3:35 p.m. UTC on Dec 18, 2023. The incident caused service disruption to cloud customers worldwide. Customers experienced delayed or failed user searches when using the following Atlassian cloud products:

  • Atlas
  • Bitbucket Cloud
  • Compass
  • Confluence Cloud
  • Jira Service Management
  • Jira Software
  • Jira Work Management
  • Jira Product Discovery

ROOT CAUSE

The incident stemmed from Atlassian's user search service receiving commands to process multiple computationally intensive operations in rapid succession. These operations were directed at the same customer data set, and therefore overloaded resources within a clustered database system, leading to memory exhaustion and subsequent unresponsiveness to user search requests. 

REMEDIAL ACTIONS PLAN & NEXT STEPS

To prevent a recurrence of such incidents, we are implementing the following measures:

  • Implement a mechanism to queue computationally intensive operations so that they are processed without overloading system resources or affecting the customer experience.
  • Fine-tune our clustered database settings to mitigate the impact of resource exhaustion on the overall system. 
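
As an illustration only, the queueing measure above could be sketched along the following lines. This is not Atlassian's implementation: the class name HeavyOperationQueue and the parameters max_concurrent and max_pending are hypothetical, and the sketch assumes that each intensive operation can be represented as a plain callable. Operations submitted in a burst wait in a bounded queue and run on a fixed number of worker threads, so a burst cannot start more concurrent work than the limit allows.

```python
import logging
import queue
import threading
import time
from typing import Callable


class HeavyOperationQueue:
    """Hypothetical sketch: run queued operations on a fixed number of workers."""

    def __init__(self, max_concurrent: int = 1, max_pending: int = 100) -> None:
        # Bounded queue: once max_pending operations are waiting, new ones are rejected.
        self._pending: queue.Queue = queue.Queue(maxsize=max_pending)
        for _ in range(max_concurrent):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, operation: Callable[[], None]) -> bool:
        """Enqueue an operation; return False if the queue is full."""
        try:
            self._pending.put_nowait(operation)
            return True
        except queue.Full:
            return False

    def _run(self) -> None:
        while True:
            operation = self._pending.get()
            try:
                operation()
            except Exception:
                # A failing operation must not stop the worker thread.
                logging.exception("queued operation failed")
            finally:
                self._pending.task_done()

    def wait_until_idle(self) -> None:
        """Block until every accepted operation has finished."""
        self._pending.join()


def demo_peak_concurrency(burst_size: int = 5) -> int:
    """Submit a burst of operations and return the peak number running at once."""
    ops = HeavyOperationQueue(max_concurrent=1)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def heavy() -> None:
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for the expensive work
        with lock:
            state["running"] -= 1

    for _ in range(burst_size):
        ops.submit(heavy)
    ops.wait_until_idle()
    return state["peak"]
```

In this sketch a caller whose submit() returns False would be expected to retry later or report the rejection. Rejecting work at a bounded queue is one design choice among several; delaying or coalescing repeated operations on the same data set would be alternatives.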

We apologize to customers whose services were affected during this incident; we are taking immediate steps to improve the service’s resiliency.

Thanks,

Atlassian Customer Support

Posted Dec 28, 2023 - 03:18 UTC

Resolved
This incident has been resolved. Atlassian's cross product user search is working.
Posted Dec 19, 2023 - 03:28 UTC
Update
Atlassian's cross product user search service is currently healthy. Searches for users within Atlassian products are working as expected.
We are in the process of investigating the root cause of this incident.
Posted Dec 18, 2023 - 16:23 UTC
Update
Atlassian's cross product user search service is recovering. Searches for users within Atlassian products are returning to normal.
Posted Dec 18, 2023 - 16:06 UTC
Update
Atlassian's cross product user search service is recovering. Searches for users within Atlassian products are returning to normal.
Posted Dec 18, 2023 - 16:02 UTC
Update
Atlassian's cross product user search service is recovering. Searches for users within Atlassian products are returning to normal.
Posted Dec 18, 2023 - 16:01 UTC
Investigating
We are investigating reports of intermittent errors for Atlassian, Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Jira Align, Jira Product Discovery, Atlas, and Compass Cloud customers. We will provide more details once we identify the root cause.
Posted Dec 18, 2023 - 15:04 UTC