Floating IP support for NodeDNSRecordSet #44
Merged: do-joe merged 5 commits into digitalocean:main from do-joe:nodedns_controller_reserved_ip-rebase on May 9, 2025
Changes from 4 commits
Commits (all by do-joe):
- 40f478b add annotation when IP assigned or change in assignments
- 9ed7db0 only ignore updates if both match status and reserved IP are not changed
- 7f339c2 ndoedns controller supports flipop.digitalocean.com/ipv4-reserved-ip …
- fa69f82 README update
- 79e2ec7 annotationUpdater now uses NodeInformer cache to validate current ann…
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,83 +1,200 @@ | ||
| # FLIPOP - Floating IP Operator | ||
| # Floating IP Operator (FLIPOP) | ||
|
|
||
| ## What? | ||
| This tool watches Kubernetes nodes and adjusts cloud network resources (currently floating IPs and DNS) to target matching nodes. Nodes can be targeted based on their labels and taints and on their pods (health, namespace, and labels). | ||
| FLIPOP is a Kubernetes operator that manages cloud-native Floating IPs (also referred to as Reserved IPs) and DNS records for targeted nodes and pods. It provides advanced traffic steering for workloads—especially latency-sensitive or UDP traffic—where built-in Kubernetes LoadBalancer services may not suffice. | ||
|
|
||
| ## Why? | ||
| Kubernetes nodes and the pods they host are ephemeral and are replaced in case of failure, update, or operational convenience. Kubernetes LoadBalancer type services are the traditional tool for pivoting cluster traffic in these cases, but they don't suit all workloads (e.g., latency-sensitive workloads, UDP). This tool aims to provide similar functionality through floating IPs and/or DNS. | ||
| --- | ||
|
|
||
| ## Config | ||
| ## Features | ||
|
|
||
| * Assign and unassign Floating IPs to Kubernetes nodes based on pod and node selectors. | ||
| * Manage DNS A records containing floating or node IPs. | ||
| * Support for multiple DNS providers (e.g., DigitalOcean, Cloudflare). | ||
| * Expose rich Prometheus metrics for observability. | ||
| * Graceful reconciliation loops with configurable retry/backoff. | ||
| * Leader election for high-availability. | ||
|
|
||
| --- | ||
|
|
||
| ## Architecture | ||
|
|
||
| 1. **CRD Watchers**: Informers monitor `FloatingIPPool` and `NodeDNSRecordSet` resources. | ||
| 2. **Match Controller** (`nodematch`): Evaluates pods and nodes against label/taint-based criteria. | ||
| 3. **IP Controller** (`ip_controller`): Reconciles Floating IP assignments and updates status & annotations. | ||
| 4. **DNS Enabler/Disabler** (`nodedns`): Updates DNS records for matching nodes. | ||
| 5. **Metrics Collector** (`metrics`): Implements Prometheus `Collector` interfaces for each controller. | ||
| 6. **Leader Election** (`leaderelection`): Ensures only one active control loop per cluster. | ||
|
|
||
| --- | ||
|
|
||
| ## Custom Resources | ||
|
|
||
| ### FloatingIPPool | ||
| ``` | ||
|
|
||
| Manage Floating IPs and optional DNS records for pods matching specified criteria. | ||
|
|
||
| ```yaml | ||
| apiVersion: flipop.digitalocean.com/v1alpha1 | ||
| kind: FloatingIPPool | ||
| metadata: | ||
| name: ingress-pool | ||
| spec: | ||
| provider: digitalocean | ||
| region: nyc3 | ||
| desiredIPs: 3 | ||
| assignmentCoolOffSeconds: 20 | ||
| ips: | ||
| - 192.168.1.1 | ||
| - 192.168.2.1 | ||
| dnsRecordSet: | ||
| recordName: hello-world.example.com | ||
| zone: abcdefghijklmnopqrstuvwxyz012345 | ||
| ttl: 30 | ||
| provider: cloudflare | ||
| match: | ||
| spec: | ||
| provider: digitalocean # IP provider | ||
| region: nyc3 # Cloud region | ||
| desiredIPs: 3 # Total IPs to allocate | ||
| assignmentCoolOffSeconds: 20 # Seconds to wait between IP assignments (defaults to 0 if not set) | ||
| ips: # Static IP list (optional) | ||
| - 192.168.1.1 | ||
| - 192.168.2.1 | ||
| dnsRecordSet: # Optional DNS configuration (defaults to digitalocean) | ||
| recordName: hello | ||
| zone: example.com | ||
| ttl: 30 | ||
| provider: digitalocean | ||
| match: # Node/pod matching criteria | ||
| podNamespace: ingress | ||
| podLabel: app=nginx-ingress,component=controller | ||
| nodeLabel: doks.digitalocean.com/node-pool=work | ||
| podLabel: app=nginx,component=controller | ||
| nodeLabel: doks.digitalocean.com/node-pool=work | ||
| tolerations: | ||
| - effect: NoSchedule | ||
| key: node.kubernetes.io/unschedulable | ||
| - key: node.kubernetes.io/unschedulable | ||
| effect: NoSchedule | ||
| ``` | ||
|
|
||
| **Behavior**: | ||
|
|
||
| * Allocates a number of Floating IPs equal to `desiredIPs`. | ||
| * By default, new Floating IPs are created. | ||
| * To use existing Floating IPs, list them under `ips`. | ||
| * Assigns IPs to matching nodes (see Matching section below) | ||
| * Updates DNS A record (if configured) using FloatingIPPool’s reserved IPs by default. | ||
| * Note that this differs slightly from how `NodeDNSRecordSet` works: `dnsRecordSet` always updates the DNS record with the node's Floating IP address, whereas `NodeDNSRecordSet` must be explicitly configured to use the Floating IP address. | ||
| * The annotation `flipop.digitalocean.com/ipv4-reserved-ip` is added to each node with the assigned Floating IP address as the value. | ||
|
|
||
| --- | ||
|
|
||
| ### NodeDNSRecordSet | ||
| ``` | ||
|
|
||
| Manage DNS A records for nodes matching specified criteria. | ||
|
|
||
| ```yaml | ||
| apiVersion: flipop.digitalocean.com/v1alpha1 | ||
| kind: NodeDNSRecordSet | ||
| metadata: | ||
| name: ingress-nodes | ||
| spec: | ||
| provider: digitalocean # DNS provider (defaults to digitalocean) | ||
| dnsRecordSet: | ||
| recordName: nodes | ||
| zone: example.com | ||
| ttl: 120 | ||
| provider: digitalocean | ||
| recordName: nodes.example.com | ||
| zone: example.com | ||
| ttl: 120 | ||
| addressType: flipop.digitalocean.com/ipv4-reserved-ip # Use the node’s reserved IPv4 address (via annotation) | ||
| match: | ||
| nodeLabel: doks.digitalocean.com/node-pool=work | ||
| podNamespace: ingress | ||
| podLabel: app=nginx-ingress,component=controller | ||
| nodeLabel: doks.digitalocean.com/node-pool=work | ||
| podLabel: app=nginx | ||
| tolerations: | ||
| - effect: NoSchedule | ||
| key: node.kubernetes.io/unschedulable | ||
| - key: node.kubernetes.io/unschedulable | ||
| effect: NoSchedule | ||
| ``` | ||
|
|
||
| **Field**: | ||
|
|
||
| * `addressType`: Specifies which node address to publish in DNS. Options: | ||
| * `ExternalIP` (default): Uses each node’s external/public IP. | ||
| * `flipop.digitalocean.com/ipv4-reserved-ip`: Uses the node's reserved IPv4 address assigned by a FloatingIPPool. Must be set explicitly when DNS should point to reserved IPs. When this `addressType` is specified, the controller reads the value of this annotation on each node to determine the node's reserved IP. | ||
| * `InternalIP`: Uses the node’s internal Kubernetes cluster IP. | ||
|
|
||
| **Behavior**: | ||
|
|
||
| * Watches nodes matching `match` criteria. | ||
| * Collects the specified address type from each node. | ||
| * Updates the DNS A record with the collected addresses. | ||
|
|
||
| --- | ||
|
|
||
| ## Matching Behavior | ||
|
|
||
| FLIPOP uses `spec.match` fields to determine which nodes receive Floating IPs: | ||
|
|
||
| 1. **Pod Matching**: The controller watches pods in the specified `podNamespace` with labels matching `podLabel`. Only nodes running at least one matching pod are candidates. | ||
| 2. **Node Matching**: Nodes are filtered by `nodeLabel` and `tolerations`. If a node’s labels and taints match, it passes the node filter. | ||
|
|
||
| **Assignment Logic**: | ||
|
|
||
| * On each reconciliation, the IP Controller collects all candidate nodes. | ||
| * If the number of assigned IPs is less than `desiredIPs`, it assigns IPs to the top candidates (sorted by name) until the quota is met. | ||
| * If nodes no longer host matching pods or no longer match node criteria, then the annotation is removed and any DNS records are updated. | ||
| * Note that the controller only unassigns a Floating IP address from a Droplet if that node no longer matches AND the Floating IP needs to be assigned to another node. This means that a Floating IP that is no longer needed stays attached to a Droplet, which avoids any costs associated with an unassigned Floating IP address. | ||
| * Reassignments respect `assignmentCoolOffSeconds` to avoid rapid churn. | ||
| * When assigning an IP, the controller: | ||
| 1. Requests an available IP from the provider or uses an assigned one from its list. | ||
| 2. Annotates the node with `flipop.digitalocean.com/ipv4-reserved-ip: <IP>`. | ||
| 3. Optionally updates DNS via `dnsRecordSet`. | ||
|
|
||
| --- | ||
|
|
||
| ## Metrics | ||
|
|
||
| FLIPOP exports Prometheus metrics for both controllers and underlying provider calls. | ||
|
|
||
| ### FloatingIPPool Controller Metrics | ||
|
|
||
| Collected by `pkg/floatingip/metrics.go`: | ||
|
|
||
| * `flipop_floatingippoolcontroller_node_status{namespace,name,provider,dns,status}`: Gauge of node counts by status (`available`, `assigned`). | ||
| * `flipop_floatingippoolcontroller_ip_assignment_errors{namespace,name,ip,provider,dns}`: Counter of IP assignment failures. | ||
| * `flipop_floatingippoolcontroller_ip_assignments{namespace,name,ip,provider,dns}`: Counter of successful assignments. | ||
| * `flipop_floatingippoolcontroller_ip_node{namespace,name,ip,provider,dns,provider_id,node}`: Gauge mapping IP to node. | ||
| * `flipop_floatingippoolcontroller_ip_state{namespace,name,ip,provider,dns,state}`: Gauge of each IP’s current state. | ||
| * `flipop_floatingippoolcontroller_unfulfilled_ips{namespace,name,provider,dns}`: Gauge of desired minus actual acquired IPs. | ||
|
|
||
| ### NodeDNSRecordSet Controller Metrics | ||
|
|
||
| Exposed via `pkg/nodedns/metrics.go`: | ||
|
|
||
| * `flipop_nodednsrecordset_records{namespace,name,provider,dns}`: Gauge of total DNS records managed. | ||
|
|
||
| ### Provider Call Metrics | ||
|
|
||
| Each provider instruments calls in `pkg/provider/metrics.go`: | ||
|
|
||
| * `flipop_<subsystem>_calls_total{provider,call,outcome,kind,namespace,name}`: Counter of provider API invocations, labeled by outcome (`success` or `error`). | ||
| * `flipop_<subsystem>_call_duration_seconds{provider,call,kind,namespace,name}`: Histogram of call latencies. | ||
|
|
||
| --- | ||
|
|
||
| ## Providers | ||
| Flipop supports DNS providers and Floating IP providers. FloatingIPPool resources require a Floating IP provider, and can optionally leverage an additional DNS provider. NodeDNSRecordSet resources require a DNS provider. | ||
| | Provider | IP Provider | DNS Provider | Config | | ||
| |--------------|:-----------:|:------------:|------------------------------------| | ||
| | digitalocean | X | X | env var: DIGITALOCEAN_ACCESS_TOKEN | | ||
| | cloudflare | | X | env var: CLOUDFLARE_TOKEN | | ||
|
|
||
| | Provider | IP Provider | DNS Provider | Configuration | | ||
| | ------------ | :---------: | :----------: | --------------------------- | | ||
| | digitalocean | ✅ | ✅ | `DIGITALOCEAN_ACCESS_TOKEN` | | ||
| | cloudflare | ❌ | ✅ | `CLOUDFLARE_TOKEN` | | ||
|
|
||
| Set credentials as environment variables in your operator namespace. | ||
|
|
||
| **Note:** For large clusters, it's recommended to request an increase in your API rate limit to mitigate API throttling caused by DNS updates. A large number of DNS updates can occur during events such as a cluster upgrade, when nodes' matching status changes frequently. | ||
|
|
||
| --- | ||
|
|
||
| ## Installation | ||
| ``` | ||
| kubectl create namespace flipop | ||
| kubectl create secret generic flipop -n flipop --from-literal=DIGITALOCEAN_ACCESS_TOKEN="CENSORED" | ||
| kubectl apply -n flipop -f k8s/* | ||
| ``` | ||
|
|
||
| ```bash | ||
| kubectl create namespace flipop | ||
| kubectl create secret generic flipop -n flipop --from-literal=DIGITALOCEAN_ACCESS_TOKEN="CENSORED" | ||
| kubectl apply -n flipop -f k8s | ||
| ``` | ||
| --- | ||
|
|
||
| ## Why not operator-framework/kubebuilder? | ||
|
|
||
| This operator is concerned with the relationships between FloatingIPPool, Node, and Pod resources. The controller-runtime (leveraged by kubebuilder) and operator-framework assume related objects are owned by the controller objects. OwnerReferences trigger garbage collection, which is a non-starter for this use case. Deleting a FloatingIPPool shouldn't delete the Pods and Nodes it's concerned with. The controller-runtime also assumes we're interested in all resources we "own". While controllers can be constrained with label selectors and namespaces, controllers can only be added to a manager, not removed. In the case of this controller, we're likely only interested in a small subset of pods and nodes, but those subscriptions may change based upon the definition in the FloatingIPPool resource. | ||
|
|
||
| --- | ||
|
|
||
| ## TODO | ||
| - __Grace-periods__ - Moving IPs has a cost. It breaks all active connections, has a momentary period where connections will fail, and risks errors. In some cases it may be better to give the node a chance to recover. | ||
|
|
||
| --- | ||
|
|
||
| ## Bugs / PRs / Contributing | ||
|
|
||
| At DigitalOcean we value and love our community! If you have any issues or would like to contribute, see [CONTRIBUTING.md](CONTRIBUTING.md). | ||
| At DigitalOcean we value and love our community! If you have any issues or would like to contribute, see [CONTRIBUTING.md](CONTRIBUTING.md). |
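The `addressType` options described in the README diff above can be illustrated with a minimal Go sketch. It is not the operator's code: the `Node` type, `addressFor`, and `collect` are hypothetical names, and skipping nodes that lack the requested address is an assumption made for illustration. Only the annotation key and the `ExternalIP` default come from the README text.

```go
package main

import "fmt"

// ReservedIPAnnotation is the annotation key named in the README.
const ReservedIPAnnotation = "flipop.digitalocean.com/ipv4-reserved-ip"

// Node is a hypothetical, simplified stand-in for a Kubernetes node.
// It is not the operator's real type.
type Node struct {
	Name        string
	Annotations map[string]string
	Addresses   map[string]string // address type (e.g. "ExternalIP") -> address
}

// addressFor returns the address that would be published for a node under
// the given addressType, and false when the node has no such address.
func addressFor(n Node, addressType string) (string, bool) {
	if addressType == "" {
		addressType = "ExternalIP" // the documented default
	}
	if addressType == ReservedIPAnnotation {
		// Reserved IPs are read from the node annotation, not node addresses.
		ip := n.Annotations[ReservedIPAnnotation]
		return ip, ip != ""
	}
	ip := n.Addresses[addressType]
	return ip, ip != ""
}

// collect gathers the publishable addresses for all given nodes.
// Assumption: nodes lacking the requested address type are skipped.
func collect(nodes []Node, addressType string) []string {
	var out []string
	for _, n := range nodes {
		if ip, ok := addressFor(n, addressType); ok {
			out = append(out, ip)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{
			Name:        "node-a",
			Annotations: map[string]string{ReservedIPAnnotation: "203.0.113.10"},
			Addresses:   map[string]string{"ExternalIP": "198.51.100.1", "InternalIP": "10.0.0.1"},
		},
		{
			Name:      "node-b", // no reserved IP assigned yet
			Addresses: map[string]string{"ExternalIP": "198.51.100.2", "InternalIP": "10.0.0.2"},
		},
	}
	fmt.Println(collect(nodes, ""))
	fmt.Println(collect(nodes, ReservedIPAnnotation))
}
```

In this sketch a node without the annotation simply contributes no address when the reserved-IP address type is selected; the real controller may handle that case differently.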
This is a small nit--only a matter of efficiency and pattern. Feel free to leave this as is if changing it turns into a mess.
It'd be ideal to pull this from the nodeinformer in the node match controller. That will pull the latest node resource from our local memory, rather than fetching from the Kubernetes API. I made a similar call in my (closed) first attempt at this, which might be relevant. In that case I was getting all nodes, but there is also a Get() option, I believe.
Thanks for the review and feedback. I agree that we should get the data from cache where we can. It wasn't very straightforward, as the node match controllers are properties of FloatingIPPools and the floating IP controller needs some way to find the node in the various node match controller instances.
I took the approach of just iterating through all the node match controllers. I assume that most deployments won't have many FloatingIPPools, so this iteration isn't a big deal. I felt this was the more straightforward approach.
An alternative is to keep a NodeNameToPool mapping in the floating IP controller. This would involve passing in a NodeNameToPoolUpdater function when creating the FloatingIPPool and node match controller. I didn't feel like that was as straightforward, and I ended up having to worry about ensuring the mapping was always up to date. Not that it's particularly hard, but it was more state to manage, whereas the other approach keeps all the state in the node match controller.
Happy to switch to use the mapping approach if you think it's best.
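The iteration approach discussed in this thread can be sketched roughly as follows. This is an illustrative Go sketch using only the standard library, not the PR's implementation: `matchController`, `getCachedNode`, and `findNode` are hypothetical names, and the in-memory map stands in for the local store that a node informer would maintain.

```go
package main

import (
	"fmt"
	"sync"
)

// Node is a hypothetical, simplified node record.
type Node struct {
	Name        string
	Annotations map[string]string
}

// matchController is a hypothetical stand-in for a per-pool node match
// controller. Its cache plays the role of the informer's local store.
type matchController struct {
	mu    sync.RWMutex
	cache map[string]Node
}

// getCachedNode reads from local memory only; no API request is made.
func (m *matchController) getCachedNode(name string) (Node, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	n, ok := m.cache[name]
	return n, ok
}

// findNode walks every pool's match controller and returns the first
// cached copy of the named node. The cost is linear in the number of
// pools, which is assumed to be small.
func findNode(controllers map[string]*matchController, name string) (Node, bool) {
	for _, mc := range controllers {
		if n, ok := mc.getCachedNode(name); ok {
			return n, true
		}
	}
	return Node{}, false
}

func main() {
	controllers := map[string]*matchController{
		"ingress-pool": {cache: map[string]Node{
			"node-a": {Name: "node-a", Annotations: map[string]string{
				"flipop.digitalocean.com/ipv4-reserved-ip": "203.0.113.10",
			}},
		}},
		"egress-pool": {cache: map[string]Node{}},
	}
	if n, ok := findNode(controllers, "node-a"); ok {
		fmt.Println(n.Name, n.Annotations["flipop.digitalocean.com/ipv4-reserved-ip"])
	}
	_, ok := findNode(controllers, "node-z")
	fmt.Println("node-z found:", ok)
}
```

The mapping alternative mentioned above would replace the loop in `findNode` with a single lookup keyed by node name, at the cost of keeping that extra map in sync with the match controllers.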