Fix MongoDB plugin early exit on secondary nodes #850
base: master
The plugin was unconditionally exiting when connected to secondary MongoDB nodes, causing monitoring failures with Azure Private Endpoints and load balancers that route to secondary nodes. Fix: Only exit if there is truly no primary node available.
All contributors have signed the CLA ✍️ ✅
I have read the CLA Document and I hereby sign the CLA or my organization already has a signed CLA.
Thanks for your contribution! I'm trying to fully understand your situation: if I read the code and the comment correctly, the primary node vanishes completely, is this correct? So it's not just that the load balancer switches to another node, but the primary node is removed from the cluster?
The node does not vanish from the cluster. The load balancer switches to a secondary node, which causes the MongoDB plugin to exit early. It therefore does not generate piggyback data for my MongoDB dummy host, which results in the loss of all CheckMK services that this plugin provides.
Thanks for the clarification. We will have an internal discussion about this, and then come back to you.
if "primary" in repl_info and not repl_info.get("primary"):
_write_section_replica(None)
return
# Fixed: Only return if there is truly no primary
if "primary" in repl_info and not repl_info.get("primary"):
return
Instead of duplicating the if check, how about just indenting the return statement?
Suggested change:
if "primary" in repl_info and not repl_info.get("primary"):
    _write_section_replica(None)
    return
The plugin was unconditionally exiting when connected to secondary MongoDB nodes, so no piggyback data was generated and all MongoDB services on the dummy host were lost.
Fix: Only exit if there is truly no primary node available.
Expected vs Observed Behavior
Expected: Plugin should generate piggybacking data for MongoDB monitoring services even when connected to secondary nodes, as long as a primary exists in the replica set.
Observed: Plugin exits early with an unconditional return statement when connected to secondary nodes, preventing any piggybacking data generation and causing complete loss of all MongoDB services on the dummy host.
Operating System
Ubuntu 20.04/22.04 LTS (CheckMK agent host)
CheckMK version: 2.4.0p7.cce
Local Setup
Reproduce (routing is managed by Mongo, so it's only reproducible if you are being routed to a secondary node)
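Because the routing cannot be forced, it helps to confirm which kind of node a connection actually landed on. The following is a minimal sketch, not part of the plugin: `node_role` is a hypothetical helper, and it assumes you pass it the reply of MongoDB's `hello` command (for example obtained with pymongo via `client.admin.command("hello")`), using the documented fields `setName`, `isWritablePrimary` (or the legacy `ismaster`) and `secondary`.

```python
def node_role(hello):
    """Classify a node from a `hello` reply dict (hypothetical helper).

    Assumes the documented reply fields: `setName` is present only for
    replica set members, `isWritablePrimary` (legacy: `ismaster`) marks a
    writable primary, and `secondary` marks a secondary member.
    """
    if "setName" not in hello:
        # No replica set name reported: treat as a standalone instance.
        return "standalone"
    if hello.get("isWritablePrimary") or hello.get("ismaster"):
        return "primary"
    if hello.get("secondary"):
        return "secondary"
    # E.g. a member that is starting up or recovering.
    return "other"
```

If this returns "secondary" for the connection the agent uses, the early exit described in this PR should be reproducible.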
Root Cause
Line 985 contains return without checking if a primary exists in the replica set.
Solution
Replace unconditional return with conditional logic that only exits if no primary is available.
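The intended control flow can be sketched as follows. This is a simplified illustration under stated assumptions, not the plugin's actual code: `should_stop_early` is a hypothetical wrapper, the body of `_write_section_replica` is a placeholder for the plugin's helper of the same name, and `repl_info` is assumed to be the replica set dictionary the plugin already inspects.

```python
import sys


def _write_section_replica(primary):
    # Placeholder for the plugin's real helper; the output format here is
    # made up for illustration only.
    sys.stdout.write("replica primary: %s\n" % (primary or "n/a"))


def should_stop_early(repl_info):
    """Return True only when the replica set reports no primary at all."""
    if "primary" in repl_info and not repl_info.get("primary"):
        # Special case: replica set without a primary. Report it and stop.
        _write_section_replica(None)
        return True
    # A primary exists (or the key is absent), so keep going even if the
    # connection landed on a secondary node.
    return False
```

With this shape, a secondary that still knows its primary, for example `{"primary": "db1:27017", "secondary": True}`, no longer triggers the early exit, while an empty `primary` value still does.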
Changes
agents/plugins/mk_mongodb.py
Testing
Impact
Fixes MongoDB Atlas monitoring for users with: