Increased latency impacting some main cluster apps

Incident Report for Bubble

Postmortem

The three incidents on Jan 6 were caused by one of our database shards experiencing heavier-than-normal load. As a result, applications hosted on that shard, which represent roughly 1/7 of our main cluster applications, may have seen issues loading pages or data.

The root cause was a single app performing a very expensive deletion operation. In response, we have made, or are making, the following short-term and longer-term changes:

We temporarily blocked the operation causing the issues and reached out to the app owner to discuss workarounds.
We discovered that the app was missing some critical database indexes, which we created to prevent the problem from recurring with that app.
We adjusted the rate at which we delete items that are heavily referenced elsewhere in the database, which should protect against other apps causing the same issue.
Longer-term, we are continuing to migrate off a legacy stored procedure framework (targeting completion by the end of Q1). Had that migration already been complete, it would have prevented this incident.
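
For readers who want a concrete picture of the index and deletion-rate changes above, here is a minimal sketch of the general technique. It is not Bubble's actual implementation: the `items` and `refs` tables, the `delete_in_batches` and `build_demo_db` helpers, and the use of SQLite as a stand-in database are all assumptions made for illustration.

```python
import sqlite3
import time


def build_demo_db():
    """Create a hypothetical schema: items, plus refs rows that point at items."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE refs (id INTEGER PRIMARY KEY, item_id INTEGER)")
    # Index the referencing column so each delete can find referencing rows
    # without scanning the whole refs table.
    conn.execute("CREATE INDEX idx_refs_item_id ON refs (item_id)")
    conn.executemany(
        "INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 1001)]
    )
    # Five references per item, so every item is "heavily referenced".
    conn.executemany(
        "INSERT INTO refs (item_id) VALUES (?)",
        [(i % 1000 + 1,) for i in range(5000)],
    )
    conn.commit()
    return conn


def delete_in_batches(conn, item_ids, batch_size=100, pause_seconds=0.0):
    """Delete items and their references in small batches.

    Each batch runs in its own short transaction, and the optional pause
    between batches caps the rate of deletion work sent to the database.
    """
    deleted = 0
    for start in range(0, len(item_ids), batch_size):
        batch = item_ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"DELETE FROM refs WHERE item_id IN ({placeholders})", batch
            )
            cur = conn.execute(
                f"DELETE FROM items WHERE id IN ({placeholders})", batch
            )
            deleted += cur.rowcount
        time.sleep(pause_seconds)
    return deleted
```

Calling `delete_in_batches(conn, ids, batch_size=100, pause_seconds=0.05)` trades a longer total run time for smaller, spaced-out transactions, which is the usual reason to throttle a large deletion on a shared database.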

Posted Jan 09, 2025 - 16:42 EST

Resolved

Our systems are functional and we are closing out this incident.

Posted Jan 06, 2025 - 18:19 EST

Investigating

We are investigating reports of issues with our systems.

Posted Jan 06, 2025 - 18:15 EST

This incident affected: Bubble Core (Main Bubble Environment).