One of the MySQL database clusters was restarted due to high memory consumption.
During the MySQL restart on the cluster replica instances, data corruption was detected on both replica nodes.
Data restoration was conducted from a backup.
### Incident Details:
**Initial Detection**: A database monitoring heartbeat notification was missed.
**Affected Components**: One of the MySQL database clusters.
**User Impact**: The service was down while the primary node was restarted (about 1.5 hours).
### Root Cause Analysis:
**Preliminary Findings**: High memory consumption on all nodes of one of the MySQL database clusters.
**Investigation**: In a rare situation, all nodes of one of the MySQL database clusters restarted simultaneously due to high memory consumption.
Service restarts were performed on all database nodes.
As the cluster finished loading, the replicas issued errors related to data corruption.
### Resolution and Recovery:
- **Immediate Actions**: Data restoration was immediately performed on both replicas from backups.
### Lessons Learned:
- **What Went Well**: Immediate response ensured that replicas were restored to their previous state.
- **What Could Be Improved**: Monitoring of high memory usage on the database nodes should be improved.
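
To illustrate what a memory check could look like, here is a minimal sketch, assuming the database nodes run Linux and expose `/proc/meminfo`. The function names and the 80% / 90% thresholds are hypothetical choices for this example, not values taken from the incident or from the existing monitoring setup.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a {field: kibibytes} mapping."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_used_fraction(meminfo: dict) -> float:
    """Fraction of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return (total - available) / total


def classify(used_fraction: float, warn: float = 0.80, critical: float = 0.90) -> str:
    """Map a usage fraction to an alert level (thresholds are assumptions)."""
    if used_fraction >= critical:
        return "critical"
    if used_fraction >= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Read the live values on a Linux host and print the alert level.
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    used = memory_used_fraction(info)
    print(f"memory used: {used:.1%} -> {classify(used)}")
```

A check like this could run periodically on each node and notify the on-call engineer at the warning level, before memory pressure is high enough to force a restart.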