-
Notifications
You must be signed in to change notification settings - Fork 2
Comprehensive improvements for alert formatting #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds alert-formatting utilities to Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Parser as parseAlerts()
participant Utils as src/utils.js (formatAlerts)
participant Grafana as Grafana
participant AM as Alertmanager
participant Logs as Logs/Runbook
Parser->>Utils: pass raw alert payload (RESPECT_GROUPBY=1)
Utils->>Utils: segment strings, compute diffs, merge summaries
Utils->>Grafana: build/encode explore & dashboard URLs
Utils->>AM: construct Alertmanager silence links
Utils->>Logs: assemble logs & runbook links
Utils-->>Parser: return single grouped, formatted alert (HTML/details)
Parser-->>Client: emit formatted alert downstream
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (3)
.dir-locals.el(1 hunks)package.json(1 hunks)src/utils.js(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils.js (1)
src/routes.js (2)
alerts(37-37)utils(2-2)
🔇 Additional comments (15)
.dir-locals.el (1)
1-4: LGTM!Editor configuration for consistent code style within the project.
src/utils.js (13)
1-2: LGTM!Standard import for the jsdiff library added to dependencies.
3-5: LGTM!The regex pattern correctly segments strings into alphanumeric and non-alphanumeric parts for diffing.
7-13: LGTM!Correctly adapts the array-based diff back to character-level information.
226-230: LGTM!The dual color attribute approach (data-mx-color + font color) provides good cross-client compatibility. Though currently unused in the new formatAlerts, it's retained for potential future use as mentioned in the PR objectives.
517-550: LGTM!The per-alert detail blocks correctly filter out common labels and annotations, avoiding redundant information display.
554-592: LGTM!The dynamic time window calculation based on alert start times is a solid improvement over static time ranges, ensuring relevant data is visible in Grafana.
594-605: LGTM!Using common labels for the silence filter ensures the silence covers all alerts in the group.
607-626: LGTM!The dashboard URL deduplication and placeholder substitution logic is sound. The documented limitation of using only the first value when multiple exist is reasonable for the initial implementation.
628-635: LGTM!Simple and effective runbook URL deduplication.
659-663: LGTM!The regex-based label replacement correctly generates Loki queries that match all alert instances, with proper value deduplication.
669-680: LGTM!Clean assembly of all generated URLs with appropriate numbering when multiple links of the same type exist.
692-694: LGTM!The opt-in approach via RESPECT_GROUPBY environment variable provides safe rollout with backward compatibility.
438-450: LGTM!The automatic logs template generation for common Kubernetes label patterns is a helpful convenience feature. Consider testing this with your actual alert configurations to ensure the generated Loki queries work as expected.
package.json (1)
35-35: No action required. Version ^8.0.2 of the "diff" package is valid and free from known vulnerabilities. The package version is current and safe to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/utils.js (2)
73-76: Critical: Set.intersection incompatible with Node 14.This issue was already flagged in a previous review. Line 75 uses
Set.prototype.intersection(), which is an ES2025 feature only available in Node.js 22+. The package.json specifies"node": ">= 14", so this will cause a runtime error on Node 14-21.
662-667: Critical: Fix operator precedence error in logs datasource comparison.This issue was already flagged in a previous review. Line 665 has a logic error due to operator precedence. The condition evaluates as
alert.annotations.logs_datasource || (defaultDatasource === logsDatasource)instead of the intended(alert.annotations.logs_datasource || defaultDatasource) === logsDatasource.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
src/utils.js(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils.js (1)
src/routes.js (2)
alerts(37-37)utils(2-2)
🔇 Additional comments (1)
src/utils.js (1)
701-703: LGTM: Clean opt-in implementation.The opt-in behavior via
RESPECT_GROUPBYenvironment variable maintains backward compatibility while enabling the new grouped formatting feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
Dockerfile (1)
1-1: Consider using Alpine variant and pinning to a specific patch version.For production builds, using the Alpine variant reduces image size significantly and pinning to a specific patch version ensures reproducible, deterministic builds:
-FROM node:24 +FROM node:24-alpineIf available, further improve reproducibility by pinning to a specific patch version:
-FROM node:24 +FROM node:24.x.x-alpine
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (2)
Dockerfile(1 hunks)package.json(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- package.json
🔇 Additional comments (1)
Dockerfile (1)
1-1: Node 24 is now a suitable LTS choice; pinning the patch version is optional.Node.js 24 entered Active LTS in October 2025 and is supported through April 30, 2028, making it a valid production choice that satisfies the project's
>= 22engine requirement. The upgrade from node:14 (EOL April 2023) is appropriate.Optional improvements:
- Pin to a specific patch version (e.g.,
node:24.13.0-bookworm) for reproducible builds instead of auto-receiving updates.- Consider using the Alpine variant (e.g.,
node:24-alpine) to reduce image size if small image footprint is a priority.Likely an incorrect or invalid review comment.
Fizzadar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this is a vast improvement!
|
|
||
| const colorize = (color, msg) => { | ||
| // Wrap it with two different ways of doing color, one of which | ||
| // works on Desktop and one of which works on Android. Yes I know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🫠 I feel ya
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/utils.js (1)
646-676: Limit logs URL generation to existing template/datasource combinations.Currently
logsDatasourcesis built from all alerts, independent oflogsTemplate, so the nested loops can emit URLs for(logs_template, logs_datasource)pairs that never actually occur together. Those URLs are still syntactically valid but may point at generic “catch‑all” queries not tied to any alert.You can tighten this by deriving datasources per template:
- const logsTemplates = new Set(data.alerts - .map(alert => alert.annotations.logs_template) - .filter(Boolean)); - for (const logsTemplate of logsTemplates) { - const logsDatasources = new Set( - data.alerts - .map(alert => alert.annotations.logs_datasource || defaultDatasource) - .filter(Boolean) - ); + const logsTemplates = new Set( + data.alerts + .map(alert => alert.annotations.logs_template) + .filter(Boolean) + ); + for (const logsTemplate of logsTemplates) { + const logsDatasources = new Set( + data.alerts + .filter(alert => alert.annotations.logs_template === logsTemplate) + .map(alert => alert.annotations.logs_datasource || defaultDatasource) + .filter(Boolean) + ); for (const logsDatasource of logsDatasources) { const alerts = data.alerts.filter( alert => ( alert.annotations.logs_template === logsTemplate && (alert.annotations.logs_datasource || defaultDatasource) === logsDatasource ), );This keeps the behavior but avoids generating links for template/datasource pairs that do not correspond to any alert.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
Dockerfile(1 hunks)src/utils.js(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils.js (1)
src/routes.js (2)
alerts(37-37)utils(2-2)
🔇 Additional comments (3)
Dockerfile (1)
1-1: Node 24 base image looks appropriate for the updated runtime.Using
node:24-alpineis consistent with a modern Node runtime and should be compatible with the existing build steps (npm install --only=production,npm start).src/utils.js (2)
433-559: GroupedformatAlertsimplementation matches the Alertmanager payload shape.The new
formatAlertscorrectly:
- Groups alerts by
status,- Derives a merged summary via
mergeStrings,- Uses common vs per-alert labels/annotations to avoid repetition, and
- Switches to emoji-based severity/status indicators as per the PR description.
The HTML structure (
<details>,<summary>, common header + per-alert expansions) should produce readable, compact messages for grouped notifications.
701-703:RESPECT_GROUPBYbranch cleanly switches to grouped formatting.When
RESPECT_GROUPBY === "1",parseAlertsnow returns a single-element array containing the grouped HTML fromformatAlerts(data), while preserving the old per-alert behavior otherwise. This keeps the return type (string[]) stable and makes the new behavior opt‑in, which aligns with the rollout/backtesting plan described in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
src/utils.js (3)
75-75: Verify Node 22+ requirement resolvesSet.intersectionincompatibility.The past review flagged
Set.prototype.intersection()(ES2025) as incompatible with Node 14-21. The AI summary indicates the Node engine requirement was updated to >= 22, which would resolve this issue.Please verify that
package.jsonnow specifies Node >= 22 to confirm this issue is resolved:#!/bin/bash # Check Node engine requirement in package.json cat package.json | jq -r '.engines.node'
112-120: Verify emptyvaryingRangesis handled correctly.A past review flagged that when all input strings are identical,
varyingRangesbecomes empty. Tracing through the logic:
- When empty, line 206 sets
onVarying = unchangedRanges[0].start > 0(typically false for identical strings)- The while loop (lines 209-230) should process
unchangedRangesand exit safelyHowever, to reduce risk and improve clarity, consider adding an early-return guard after line 120:
let varyingRanges = []; for (let i = 0; i < unchangedRangesAugmented.length - 1; i++) { let start = unchangedRangesAugmented[i].end; let end = unchangedRangesAugmented[i + 1].start; if (start !== end) { varyingRanges.push({ start, end }); } } + + // If there are no varying ranges, all strings are identical. + if (varyingRanges.length === 0) { + return strings[0]; + }Run the following test to confirm the function handles identical strings correctly:
#!/bin/bash # Test mergeStrings with identical inputs node -e " const utils = require('./src/utils.js'); const jsdiff = require('diff'); const segmentString = (string) => { return Array.from(string.matchAll(/[a-z0-9.-]+|[^a-z0-9.-]+/gi), match => match[0]); } const diffSegmented = (left, right) => { return jsdiff.diffArrays(segmentString(left), segmentString(right)).map(match => { match.value = match.value.join(''); match.count = match.value.length; return match; }); } const mergeStrings = (strings) => { if (!strings || strings.length === 0) { throw new Error(\`No strings to merge!\`); } if (strings.length === 1) { return strings[0]; } const diffs = []; for (const str of strings) { diffs.push(diffSegmented(strings[0], str)); } const unchangedSets = []; for (const diff of diffs) { const unchanged = new Set(); let ptr = 0; for (const edit of diff) { if (!edit.added) { for (let i = 0; i < edit.value.length; i++) { if (!edit.removed) { unchanged.add(ptr); } ptr += 1; } } } if (ptr !== strings[0].length) { throw new Error(\`Diff didn't cover length of original string!\`); } unchangedSets.push(unchanged); } let unchangedCommon = unchangedSets[0]; for (const unchanged of unchangedSets) { unchangedCommon = unchangedCommon.intersection(unchanged); } let unchangedRanges = []; let rangeActive = false, rangeStart = -1; for (let ptr = 0; ptr < strings[0].length + 1; ptr++) { if (unchangedCommon.has(ptr) && !rangeActive) { rangeActive = true, rangeStart = ptr; } else if (!unchangedCommon.has(ptr) && rangeActive) { rangeActive = false, unchangedRanges.push({ start: rangeStart, end: ptr, }); } } let unchangedRangesAugmented = [ { start: -1, end: 0 }, ...unchangedRanges, { start: strings[0].length, end: strings[0].length + 1 }, ]; let varyingRanges = []; for (let i = 0; i < unchangedRangesAugmented.length - 1; i++) { let start = unchangedRangesAugmented[i].end; let end = unchangedRangesAugmented[i + 1].start; if (start !== end) { varyingRanges.push({ start, end }); } } console.log('varyingRanges.length:', varyingRanges.length); console.log('unchangedRanges.length:', unchangedRanges.length); // Simplified - just return first string if no variations if (varyingRanges.length === 0) { return strings[0]; } return 'merged (simplified test)'; } try { const result = mergeStrings(['identical', 'identical', 'identical']); console.log('Result:', result); console.log('✓ Test passed'); } catch (e) { console.error('✗ Test failed:', e.message); process.exit(1); } "
447-459: Verified: Correct use ofannotationsinstead oflabels.The auto-generation logic correctly checks
alert.annotations.logs_urlandalert.annotations.logs_template(not labels), as confirmed by past review fixes.
🧹 Nitpick comments (1)
src/utils.js (1)
628-631: Dashboard URL substitution uses only the first value (by design).When multiple alerts have different values for a dashboard URL placeholder, only the first value is substituted. This is documented as intentional in the comment above (lines 621-624), with a note that future enhancement could duplicate query parameters to pass all values.
For production use, consider whether alerts with differing label values should receive separate dashboard links (numbered, like the Grafana queries) rather than a single link with an arbitrary value choice.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
src/utils.js(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils.js (1)
src/routes.js (2)
alerts(37-37)utils(2-2)
🔇 Additional comments (4)
src/utils.js (4)
3-5: LGTM:matchAlliterator correctly converted to array.The previous issue with calling
.map()on the iterator has been resolved by usingArray.fromwith the mapping function parameter.
7-13: LGTM: Segmented diff logic is sound.The function correctly applies array-based diffing to segmented strings and reconstructs the diff results.
702-704: LGTM: Clean feature flag for grouped formatting.The
RESPECT_GROUPBYenvironment variable provides a safe opt-in mechanism for the new grouped formatting behavior while preserving backward compatibility.
669-673: Verify the logs template regex replacement logic.The regex replacement converts label matchers from
="$label"or=~"$label"format into=~"value1|value2|..."format. This assumes:
- Templates use the pattern
label="$placeholder"orlabel=~"$placeholder"- The regex
=~?"\$([a-z0-9_]+)"correctly captures both forms- Multiple values are joined with
|to form a regex alternationThis logic appears sound for Prometheus/Loki label matchers, but edge cases to consider:
- What if a label has special regex characters (e.g.,
.,*)? They should be escaped.- What if
valuesis empty? The fallback.+matches any value, which may be too broad.Verify the regex escaping for label values:
#!/bin/bash # Check if there's any escaping logic for regex special characters rg -n "replace.*regex|escape.*regex|quotemeta" src/utils.jsIf escaping is missing, consider adding it:
const expr = logsTemplate.replace(/=~?"\$([a-z0-9_]+)"/g, (_, label) => { const values = new Set(alerts.map(alert => alert.labels[label]).filter(Boolean)); - const regex = values.size > 0 ? [...values].join("|") : ".+"; + const escapedValues = [...values].map(v => v.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')); + const regex = escapedValues.length > 0 ? escapedValues.join("|") : ".+"; return `=~"${regex}"`; });
This PR introduces a completely refurbished alertmanager-to-matrix transform pipeline, fixing numerous outstanding issues and paving the way for future improvements to be made quickly and straightforwardly. No changes to upstream alerts are required for the new implementation to work correctly, but we can take advantage of the added features incrementally, for example by adjusting the default
group_byconfiguration in our Alertmanager.group_by implementation
The biggest problem solved here is the lack of
group_bysupport in the original matrix-alertmanager project. When multiple similar alerts fire, thegroup_byfeature of Alertmanager intelligently groups them together based on your configuration, and sends a single notification covering the entire batch. However, matrix-alertmanager was originally undoing all of this work and bridging the single message from Alertmanager as an entire slew of Matrix messages.For example, here is how one alert showed up as eight different messages in the old system, basically impossible to read:
Compare that to the new version, in a single message:
Firing and resolved alerts are grouped apart, and the alert summaries are intelligently merged, adapting to any input format, so that no information is duplicated.
Inside the disclosure triangle, the labels and annotations that are shared amongst every alert in the group are displayed, followed by the individual instances of the alert, which can be individually expanded to see the additional labels and annotations that vary between the instances:
(Note that this alert has a needless
descriptionannotation which duplicates the information already present in the labels. That could be removed to improve readability further.)alert summary combining
The alert summarizer uses a custom algorithm based on the jsdiff library in order to reliably adapt to any set of provided alert summaries, and automatically pick out the shared text even when it differs in more than one place. For example, consider the following set of alert summaries:
These would be detected as similar and coalesced as follows, without any need to add hinting to the underlying alert definition:
The algorithm is well-documented in code comments and should be relatively straightforward to understand if fixes are required in future.
readability improvements
The old implementation attempted to use colors to disambiguate firing from resolved alerts, as well as to disambiguate alert levels. However, in practice, this had two problems:
The coloring scheme has now been replaced with emojis since I believe they're much easier to parse visually, and do not impair readability of the underlying text (as, for example, the old implementation's yellow text would do). The emoji scheme is:
severity="critical"(PagerDuty)severity="error"severity="warning"severity="info"This makes it easy to tell at a glance which alerts were routed to PagerDuty, and makes each level easy to visually distinguish. In case more than one severity of alert is somehow grouped together in the same alert group, multiple emojis will be shown in the summary line, and they will be individually matched to the corresponding alert instances inside the disclosure triangle.
Despite colors no longer being used, I included a working colorization function that prepares colored text that will render correctly on all Beeper clients, just in case a future contributor wishes to re-introduce that feature more robustly.
better url handling
URLs have been deduplicated. Previously, separate URLs would be rendered for every individual instance of an alert, wasting space and impairing usability (the silence link for an alert would create a silence that did not apply to other, also-firing instances of the same alert).
Now, URLs are dynamically adapted to cover the entire set of firing and resolved alerts in the same alert group. For example:
logs_template, a single URL is covered that uses label regexp matching to show logs from each of the firing alerts in the group. No changes to alert definitions are required to make use of this feature.$envplaceholders, which were not substituted by the previous code, are now handled properly. At present, this only provides a single value for each placeholder, even if the value varies across the firing alerts in the group, although such a case seems unlikely to occur in practice. However, it would be straightforward to use Grafana's multi-select variable feature to cover this case as well.general refactoring
Code duplication is significantly reduced, with the concrete result of fixing a number of inconsistencies across different alert types. For example, all Grafana links now populate an accurate time range based on the reported information about the firing alerts, rather than sometimes defaulting to just the last 15 minutes from dashboard load. In addition, the hardcoded cases for automatic Kubernetes log URL synthesis introduced some years ago by Toni have been factored out to a single location and transparently re-use the
log_templatefacility rather than needing to duplicate the Grafana URL logic.rollout and backtesting
All new functionality is opt-in via the environment variable
RESPECT_GROUPBY=1, which can be separately applied to the matrix-alertmanager instance in production to control the rollout. In addition, I've arranged matrix-alertmanager to run locally against a Beeper room in staging, and have pulled the last 30 days of alerts that were sent to bc-infra in order to backtest and verify that none trigger crashes or other serious misbehavior.