dr: adds shadowing docs #1381
base: beta
✅ Deploy Preview for redpanda-docs-preview ready!
Important: Review skipped. Auto incremental reviews are disabled on this repository.
📝 Walkthrough
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Admin as Operator
participant Prim as Primary Cluster
participant Shadow as Shadow Cluster
participant Ctrl as Admin API / rpk
participant Sec as Auth/TLS
participant Obs as Monitoring
rect rgb(235, 245, 255)
note over Admin,Ctrl: Configure Shadowing
Admin->>Ctrl: Create shadow link (templates, filters)
Ctrl->>Sec: Authenticate / TLS handshake
Ctrl->>Prim: Apply link config
Prim-->>Shadow: Establish replication channel
end
rect rgb(245, 255, 235)
note over Prim,Shadow: Ongoing Replication (normal ops)
Prim-->>Shadow: Replicate topics/configs/ACLs/schema
Prim-->>Shadow: Preserve offsets/timestamps (where applicable)
Admin->>Ctrl: rpk/admin queries (status/metrics)
Ctrl-->>Obs: Emit metrics/alerts
end
rect rgb(255, 245, 235)
note right of Admin: Planned ops are handled in Shadowing guide
end
sequenceDiagram
autonumber
actor Admin as Operator
participant Prim as Primary Cluster
participant Shadow as Shadow Cluster
participant Ctrl as Admin API / rpk
participant Apps as Applications/Clients
participant Obs as Monitoring
rect rgb(255, 245, 235)
note over Admin,Prim: Emergency Failover Runbook
Admin->>Prim: Assess incident, document state
Admin->>Shadow: Verify readiness/health
Admin->>Ctrl: Initiate failover (full or selective)
Ctrl->>Shadow: Transition shadow links (FAILING_OVER→ACTIVE)
Shadow-->>Obs: Report progress/status
end
rect rgb(245, 255, 235)
note over Apps,Shadow: Post-failover
Admin->>Apps: Update bootstrap/endpoints, TLS/ACLs
Apps->>Shadow: Reconnect and resume traffic
Admin->>Ctrl: Verify topics/consumer groups/offsets
end
alt Issues detected
Obs-->>Admin: Alerts (PAUSED, stuck states, auth failures)
Admin->>Ctrl: Troubleshoot per runbook steps
else Stable
Admin->>Prim: Plan recovery/back-sync later
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (4)
modules/ROOT/nav.adoc (1)
88-91: Add the emergency failover doc to navigation
Shadowing entry looks good. Add a sibling nav item for the emergency runbook so users can find it.
Example:
**** xref:deploy:redpanda/manual/high-availability.adoc[High Availability]
**** xref:deploy:redpanda/manual/resilience/shadowing.adoc[Shadowing]
+**** xref:deploy:redpanda/manual/resilience/emergency-shadowing.adoc[Emergency Shadowing Failover]
**** xref:deploy:redpanda/manual/sizing-use-cases.adoc[Sizing Use Cases]
modules/deploy/pages/redpanda/manual/resilience/shadowing.adoc (2)
290-299: Avoid promoting plaintext secrets in examples
Add a callout suggesting env vars or file-based secrets for credentials (and mTLS certs/keys), not inline plaintext.
Example:
- Prefer env vars (RPK_SASL_PASSWORD) or reference secret files
- Link to security guidance on managing secrets
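One way to act on this suggestion, sketched in Python: resolve the credential from an environment variable first, then fall back to a secret file. The variable name RPK_SASL_PASSWORD is taken from the comment above and has not been verified against rpk; the `_FILE` suffix convention and the `load_secret` helper are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a secret from env var `name`, or from the file named by `<name>_FILE`.

    Both the helper and the `_FILE` convention are illustrative assumptions,
    not part of rpk or Redpanda.
    """
    env = os.environ if env is None else env

    # 1. Prefer the value passed directly through the environment.
    value = env.get(name)
    if value:
        return value

    # 2. Otherwise read it from a file, so the secret never appears inline.
    file_path = env.get(f"{name}_FILE")
    if file_path:
        return Path(file_path).read_text(encoding="utf-8").strip()

    raise KeyError(f"neither {name} nor {name}_FILE is set")
```

A docs example could then call `load_secret("RPK_SASL_PASSWORD")` instead of embedding the password in a command line or config file.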
38-38: Diagram TODO
If you need help, I can draft a diagram (draw.io/mermaid) showing active→shadow replication, preserved offsets/timestamps, and replicated artifacts.
modules/deploy/pages/redpanda/manual/resilience/emergency-shadowing.adoc (1)
74-83: Call out irreversibility before executing failover
Add an [IMPORTANT] note that failover promotion is irreversible; no automatic fallback. Place immediately before the commands.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- modules/ROOT/nav.adoc (1 hunks)
- modules/deploy/pages/redpanda/manual/resilience/emergency-shadowing.adoc (1 hunks)
- modules/deploy/pages/redpanda/manual/resilience/shadowing.adoc (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Redirect rules - redpanda-docs-preview
- GitHub Check: Header rules - redpanda-docs-preview
- GitHub Check: Pages changed - redpanda-docs-preview
🔇 Additional comments (6)
modules/deploy/pages/redpanda/manual/resilience/emergency-shadowing.adoc (2)
6-10: Enterprise license note is consistent; LGTM
Keep this partial include at the top across both docs for consistency.
48-56: Verify rpk shadow subcommands and flags
Confirm that rpk shadow list, status, failover, delete, resume and their flags (--all, --topic, --no-confirm) used in emergency-shadowing.adoc (and the corresponding sections in shadowing.adoc) match the current output of rpk shadow --help.
modules/deploy/pages/redpanda/manual/resilience/shadowing.adoc (4)
330-425: Verify ShadowLinkConfig schema alignment
Ensure the YAML example’s field names (client_options, authentication_configuration, topic_metadata_sync_options, synced_shadow_topic_properties, consumer_offset_sync_options, security_sync_options) exactly match the ShadowLinkConfig schema in the Admin API or rpk CLI.
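Part of this check can be automated. The Python sketch below scans a YAML example for key names and reports which of the fields listed above do not appear. It uses a plain regular expression rather than a YAML parser, and the `missing_fields` helper is hypothetical. It only confirms that the names occur in the example; comparing them against the authoritative ShadowLinkConfig schema still requires the Admin API or rpk definitions.

```python
import re

# Field names copied from the review comment above.
EXPECTED_FIELDS = [
    "client_options",
    "authentication_configuration",
    "topic_metadata_sync_options",
    "synced_shadow_topic_properties",
    "consumer_offset_sync_options",
    "security_sync_options",
]

# Matches "key:" at the start of a line, optionally preceded by a list dash.
KEY_PATTERN = re.compile(r"^\s*(?:-\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*:", re.MULTILINE)


def missing_fields(yaml_text, expected=EXPECTED_FIELDS):
    """Return the expected field names that never occur as keys in yaml_text."""
    found = set(KEY_PATTERN.findall(yaml_text))
    return [name for name in expected if name not in found]


if __name__ == "__main__":
    # Illustrative input only; this is not the real ShadowLinkConfig layout.
    sample = "client_options: {}\ntopic_metadata_sync_options: {}\n"
    print(missing_fields(sample))
```

Because the scan ignores nesting, a key placed at the wrong level would still pass; treat an empty result as a spelling check, not as schema validation.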
54-57: Verify and cite Shadowing’s minimum version requirement
- Confirm that Shadowing was introduced in Redpanda v25.3 and update the prerequisite if needed.
- Add a link to the official v25.3 release notes or product specification where this requirement is defined.
557-576: Confirm shadow-link metrics are documented and standardize type/units
Verify that each redpanda_shadow_link_* metric appears in modules/reference/pages/public-metrics-reference.adoc and update every description to explicitly specify the Prometheus type (counter vs gauge) and units (bytes, records, offsets).
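The cross-check between the two pages could be sketched in Python as follows. The function collects every name with the `redpanda_shadow_link_` prefix from the Shadowing page text and lists those that do not occur in the metrics reference text. The `undocumented_metrics` helper is hypothetical, and the metric names in the usage lines are placeholders, not real Redpanda metrics.

```python
import re

# Full metric names only; the bare glob "redpanda_shadow_link_*" is not matched
# because at least one character must follow the prefix.
METRIC_PATTERN = re.compile(r"\bredpanda_shadow_link_[a-z0-9_]+\b")


def undocumented_metrics(page_text, reference_text):
    """Return metric names used in page_text that are absent from reference_text."""
    used = set(METRIC_PATTERN.findall(page_text))
    documented = set(METRIC_PATTERN.findall(reference_text))
    return sorted(used - documented)


if __name__ == "__main__":
    # Placeholder names for illustration only.
    page = "`redpanda_shadow_link_example_a` and `redpanda_shadow_link_example_b`"
    reference = "=== redpanda_shadow_link_example_a\n"
    print(undocumented_metrics(page, reference))
```

In practice the two strings would be read from shadowing.adoc and public-metrics-reference.adoc; checking the Prometheus type and units in each description remains a manual step.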
231-237: Verify rpk shadow config generate exists and --output flag
Confirm this subcommand and its --output flag are implemented in the CLI; update the docs if they're missing.
@paulohtb6 I'm having a hard time finding these changes in https://deploy-preview-1381--redpanda-docs-preview.netlify.app/current/get-started/intro-to-events/. Can you please point me to the exact URL?
@bharathv Hey Bharath. The changes are in the Page previews section of the PR description. Copying them here too.
@@ -0,0 +1,212 @@
= Shadowing Runbook
@paulohtb6 I think the page should be renamed too, so "emergency" is not in the URL. Also, the term runbook feels internal to me. What do you think about Failover for Disaster Recovery or Disaster Recovery Guide? cc @Feediver1
Changed to "Shadowing Guide". Let me know if it's ok or if I should change more.
Co-authored-by: Joyce Fee <[email protected]> Co-authored-by: Michele Cyran <[email protected]>
Co-authored-by: Trevor Blackford <[email protected]>
Co-authored-by: Michele Cyran <[email protected]> Co-authored-by: Joyce Fee <[email protected]>
Redpanda v25.3 introduces xref:deploy:redpanda/manual/resilience/shadowing.adoc[Shadowing], an Enterprise-licensed disaster recovery solution that provides asynchronous, offset-preserving replication between distinct Redpanda clusters. Shadowing enables cross-region data protection by replicating topic data, configurations, consumer group offsets, ACLs, and Schema Registry data with byte-level fidelity.
The shadow cluster operates in read-only mode while continuously receiving updates from the source cluster. During a disaster, you can fail over individual topics or an entire shadow link to make resources fully writable for production traffic. See xref:deploy:redpanda/manual/resilience/shadowing-guide.adoc[Emergency Shadowing Guide] for emergency procedures.
suggest shadowing-guide.adoc[] since this keeps getting renamed
Also suggest failover (one word) here, since most doc uses that. I still think we should discuss overall usage, but for now I'd keep it all consistent.
Description
Adds Shadowing docs.
Adds emergency runbook.
Resolves https://redpandadata.atlassian.net/browse/DOC-1665
Review deadline: Oct 17th
Page previews
Shadowing
Shadowing guide