Prevent LoadBalancer updates on follower. #6872

sunjayBhatia merged 1 commit into projectcontour:main
Conversation
Force-pushed from 8b2af7d to 920d0ce
Codecov Report

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #6872      +/-   ##
==========================================
- Coverage   80.77%   80.73%    -0.05%
==========================================
  Files         131      131
  Lines       19936    19946       +10
==========================================
  Hits        16104    16104
- Misses       3540     3550       +10
  Partials      292      292
==========================================
```
internal/k8s/statusaddress.go (Outdated):

```go
// Excerpt from the first revision of the patch: the status send is
// gated on the leader flag (closing braces reconstructed from context).
func (s *ServiceStatusLoadBalancerWatcher) notify(lbstatus core_v1.LoadBalancerStatus) {
	if s.leader.Load() {
		s.LBStatus <- lbstatus
	}
}
```
Is it possible to lose the envoy service event in this way?

True. Currently the event processing logic relies on Added/Updated/Deleted events alone. This works for other resource types, since they are processed by all Contour instances, but I think a new approach is needed for the load balancer status updates.

Maybe we can handle it simply: notify the latest envoy service when it becomes the leader.

I've changed the approach to start the service watcher only after the instance becomes the leader.
The Contour project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:

You can:

Please send feedback to the #contour channel in the Kubernetes Slack.
Force-pushed from fc9581e to a2a792e
Start watching Service updates only after becoming the leader, when starting loadBalancerStatusWriter. This avoids piling up unprocessed Service updates on a follower instance, which consumes memory and eventually causes an out-of-memory condition that kills the Contour process.

Signed-off-by: Tero Saarni <tero.saarni@est.tech>
Force-pushed from a2a792e to 0ea6f5c
Some notes for manual testing:

- Prepare environment with modified Contour
- Then create
- Set IP address for service
- Check that the address is reflected on
- Scale down replicas to zero and change address while Contour is down
- Scale up and check that the leader picks up the latest address and updates the
sunjayBhatia left a comment:
Nice fix! The load balancer status writer already has other informers for updating resources that need the load balancer status, so it makes sense to start the load balancer service watcher there too, since it only needs to run on the leader instance.
This PR fixes a memory leak that was triggered by LoadBalancer status updates. Only the leader instance runs `loadBalancerStatusWriter`, and therefore a follower has no one reading from the channel that receives status updates. The LoadBalancer status updates were still watched by followers and sent to the channel, causing the goroutine calling `ServiceStatusLoadBalancerWatcher.notify()` to block. This led to `LoadBalancerStatus` updates piling up and consuming memory, eventually causing an out-of-memory condition and killing the Contour process.

With this change, `Service` updates are watched only after the instance becomes the leader and starts the `loadBalancerStatusWriter`.

Fixes #6860