fix(alerts): set severity of 'etcdMembersDown' from 'critical' to 'warning'

Downgraded the severity of 'etcdMembersDown' from 'critical' to 'warning', because a single unavailable etcd member should not endanger etcd's quorum. If quorum can no longer be reached, 'etcdInsufficientMembers' should fire instead. In addition, the 'for' interval was extended from '10m' to '20m', because rebooting a large physical node, for example, usually takes longer than 10 minutes.

Signed-off-by: Sebastian Gaiser <[email protected]>
sebastiangaiser committed Jan 29, 2025
1 parent 35d20d1 commit 49638b0
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions contrib/mixin/alerts/alerts.libsonnet
@@ -16,9 +16,9 @@
 )
 > 0
 ||| % { etcd_instance_labels: $._config.etcd_instance_labels, etcd_selector: $._config.etcd_selector, network_failure_range: $._config.scrape_interval_seconds * 4 },
-'for': '10m',
+'for': '20m',
 labels: {
-severity: 'critical',
+severity: 'warning',
 },
 annotations: {
 description: 'etcd cluster "{{ $labels.%s }}": members are down ({{ $value }}).' % $._config.clusterLabel,
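The quorum argument in the commit message can be illustrated with a small sketch. It is not part of the mixin: `quorum` and `most_severe_alert` are hypothetical helpers, and the model deliberately ignores the 'for' durations and the network-failure term of the real alert expressions. It only assumes the usual majority rule, under which a cluster of n voting members needs floor(n/2) + 1 healthy members.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def most_severe_alert(members: int, down: int) -> str:
    """Hypothetical helper: the most severe alert one would expect to fire.

    Simplified model (assumption): 'etcdInsufficientMembers' stands for
    quorum loss, 'etcdMembersDown' for any member being down while quorum
    is still intact.
    """
    if not 0 <= down <= members:
        raise ValueError("down must be between 0 and members")
    healthy = members - down
    if healthy < quorum(members):
        return "etcdInsufficientMembers"
    if down > 0:
        return "etcdMembersDown"
    return "ok"


if __name__ == "__main__":
    for members, down in [(3, 0), (3, 1), (3, 2), (5, 2)]:
        print(members, down, most_severe_alert(members, down))
```

In this model a three-member cluster with one member down still has two healthy members, which equals its quorum of two, so only 'etcdMembersDown' applies. That is the case the commit downgrades to 'warning'.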
