[DPE-3684] Implement DA139 #663
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## dpe-3684-reinitialise-raft #663 +/- ##
==============================================================
- Coverage 72.18% 71.85% -0.33%
==============================================================
Files 15 15
Lines 3426 3464 +38
Branches 528 535 +7
==============================================================
+ Hits 2473 2489 +16
- Misses 827 844 +17
- Partials 126 131 +5
self.framework.observe(
    self.charm.on.promote_to_primary_action, self._on_promote_to_primary
)
Moved to the main charm code, since it's no longer used only for async promotion.
try:
    health_status = self.get_patroni_health()
except Exception:
    logger.warning("Remove raft member: Unable to get health status")
    health_status = {}
if health_status.get("role") in ("leader", "master") or health_status.get(
    "sync_standby"
):
    logger.info(f"{self.charm.unit.name} is raft candidate")
    data_flags["raft_candidate"] = "True"
Wait for the action to start the reinitialisation.
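The candidate check in the snippet above can be pulled out into a small pure function, which makes the rule easy to exercise without a running Patroni. This is only a sketch, not the charm's code: `is_raft_candidate` is a hypothetical name, and the dictionary shape is assumed to match what `get_patroni_health()` returns in the diff (a `role` key and an optional `sync_standby` key).

```python
from typing import Any, Mapping


def is_raft_candidate(health_status: Mapping[str, Any]) -> bool:
    """Return True if the unit should be flagged as a raft candidate.

    Hypothetical helper mirroring the condition in the diff: the unit
    reports the "leader" or "master" role, or is a synchronous standby.
    """
    is_primary = health_status.get("role") in ("leader", "master")
    is_sync_standby = bool(health_status.get("sync_standby"))
    return is_primary or is_sync_standby


# An empty dict (the fallback used when the health call fails) is never a candidate.
print(is_raft_candidate({}))
print(is_raft_candidate({"role": "replica", "sync_standby": True}))
```

With the empty-dict fallback from the `except` branch, a unit whose health cannot be read is simply not flagged, which matches the behaviour of the original condition.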
@@ -746,15 +747,18 @@ def stop_patroni(self) -> bool:
             logger.exception(error_message, exc_info=e)
             return False
 
-    def switchover(self) -> None:
+    def switchover(self, candidate: str | None = None) -> None:
Pass a candidate when promoting a specific unit.
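How the optional `candidate` might reach Patroni is not shown in this hunk. As a rough sketch, assuming the method ends up posting to Patroni's REST `/switchover` endpoint (which accepts `leader` and `candidate` fields in its JSON body), the request body could be built like this. `build_switchover_payload` and the member names are hypothetical and not taken from the charm.

```python
from typing import Dict, Optional


def build_switchover_payload(
    current_primary: str, candidate: Optional[str] = None
) -> Dict[str, str]:
    """Build a JSON body for a Patroni switchover request.

    Hypothetical helper: "candidate" is only included when a specific
    member is requested, so the default call lets Patroni choose.
    """
    payload = {"leader": current_primary}
    if candidate is not None:
        payload["candidate"] = candidate
    return payload


# Default call: no candidate, Patroni picks the new primary.
print(build_switchover_payload("member-0"))
# Promoting a specific member.
print(build_switchover_payload("member-0", candidate="member-1"))
```

Keeping `candidate` optional with a `None` default means existing callers of `switchover()` keep working unchanged, while the promotion path can name the target explicitly.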
for unit in units:
    logger.info(f"Stopping unit {unit}")
    await stop_machine(ops_test, await get_machine_from_unit(ops_test, unit))
    await sleep(15)
Sleep to give the Juju leadership time to drift.
# Check if Patroni self healed
assert (
    left_unit.workload_status == "active"
    and left_unit.workload_status_message == "Primary"
)
logger.warning(f"Patroni self-healed without raft reinitialisation for roles {roles}")
Sometimes, when removing the primary and the async replica, Patroni manages to survive, so I'm adding an exception for this case. Should I nail it down further?
I think there is no need for that.
Great work, Dragomir!
9b0b37a to adb8ba9
Merged into #611 manually.
Implement DA139: promote-to-primary action to promote units and reinitialise RAFT