Merged
4 changes: 2 additions & 2 deletions .gitignore
@@ -28,5 +28,5 @@ go.work.sum
.env

# Editor/IDE
.idea/
.vscode/
161 changes: 4 additions & 157 deletions README.md
@@ -2,10 +2,6 @@

A flexible, memory efficient Prometheus `GaugeVec` wrapper for managing **sets** of metrics.

---

## GaugeVecSet

The `GaugeVecSet` is a high-performance wrapper around Prometheus `GaugeVec` that enables bulk operations on series
by specified index and grouping labels.

@@ -105,162 +101,13 @@ deleted := PodPhase.DeleteByIndex("prod")

### GaugeVecSet: DeleteByGroup

Delete all series that match the given (index, group). The method requires exactly as many index and group values
as the gauge was initialized with, so you cannot delete by partial values.

```go
deleted := PodPhase.DeleteByGroup(
	[]string{"prod"}, // index
	"nginx-6f4c",     // group
)
```
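
To make the "no partial values" rule concrete, here is a minimal, stdlib-only sketch of the kind of arity check such a method could perform. The names `seriesSet`, `deleteByGroup`, `newExampleSet`, and the `redis-1` series are hypothetical and exist only for illustration; this is not the library's implementation. The error return is also an assumption: the real `DeleteByGroup` returns only a count, and how it reacts to a wrong number of values is not stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// seriesSet is a hypothetical, simplified stand-in for a gauge set. It only
// tracks label values, ordered as index values, group values, extra values.
type seriesSet struct {
	indexLen int                // number of index labels configured at creation
	groupLen int                // number of group labels configured at creation
	series   map[string]float64 // key: all label values joined by "\x00"
}

// key joins label values into a single map key.
func key(values ...string) string { return strings.Join(values, "\x00") }

// deleteByGroup removes every series whose index and group values match.
// Partial values are rejected: both inputs must have the configured length.
func (s *seriesSet) deleteByGroup(index []string, group ...string) (int, error) {
	if len(index) != s.indexLen || len(group) != s.groupLen {
		return 0, fmt.Errorf("expected %d index and %d group values, got %d and %d",
			s.indexLen, s.groupLen, len(index), len(group))
	}
	base := key(append(append([]string{}, index...), group...)...)
	deleted := 0
	for k := range s.series {
		// Match the group itself or any variant (extra labels) below it.
		if k == base || strings.HasPrefix(k, base+"\x00") {
			delete(s.series, k)
			deleted++
		}
	}
	return deleted, nil
}

// newExampleSet builds a set with one index label, one group label,
// and one extra label (the pod phase).
func newExampleSet() *seriesSet {
	return &seriesSet{
		indexLen: 1,
		groupLen: 1,
		series: map[string]float64{
			key("prod", "nginx-6f4c", "Running"): 1,
			key("prod", "nginx-6f4c", "Pending"): 0,
			key("prod", "redis-1", "Running"):    1,
		},
	}
}

func main() {
	s := newExampleSet()
	n, err := s.deleteByGroup([]string{"prod"}, "nginx-6f4c")
	fmt.Println(n, err) // 2 <nil>

	// Omitting the group value is a partial match and is rejected.
	_, err = s.deleteByGroup([]string{"prod"})
	fmt.Println(err)
}
```

Validating the number of values up front keeps a group delete from silently turning into a broader index-wide delete when an argument is forgotten.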

## ConditionMetricsRecorder

The `ConditionMetricsRecorder` is an implementation of `GaugeVecSet` for Kubernetes operators. It enables
controllers to record metrics for the Kubernetes `metav1.Condition`s of their custom resources.

It is inspired by kube-state-metrics patterns for metrics such as `kube_pod_status_phase`. KSM exports one time series
per phase for each (namespace, pod), and marks exactly one as active (1) while the others are inactive (0). This metric
can be thought of as a `GaugeVecSet` with `namespace` as the index label, `pod` as the group label, and `phase` as the
`extra` label whose values are the variants within each group.

Example:

```
kube_pod_status_phase{namespace="default", pod="nginx", phase="Running"} 1
kube_pod_status_phase{namespace="default", pod="nginx", phase="Pending"} 0
kube_pod_status_phase{namespace="default", pod="nginx", phase="Failed"} 0
```

We adopt the same pattern for controller conditions, but export only one time series per group: the currently active
(status, reason) variant. Setting the metric deletes every other variant in the group, which keeps cardinality under
control. In addition, instead of a 1/0 value marking whether a series is active, the value is the condition's last
transition time as a Unix timestamp.

Example metric:

```
operator_controller_condition{
    controller="my_controller",
    resource_kind="MyCR",
    resource_name="my-cr",
    resource_namespace="default",
    condition="Ready",
    status="False",
    reason="FailedToProvision"
} 17591743210
```

- **Index**: controller, resource_kind, resource_name, resource_namespace
- **Group**: condition
- **Extra**: status, reason
- **Metric Value**: Unix timestamp of last transition of given condition
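
The "one series per group" behaviour can be modelled with a plain map, as in the rough sketch below. It uses only the standard library; `groupKey`, `variant`, `conditionStore`, `record`, `exampleStore`, and the `Provisioned` reason are hypothetical names chosen for illustration and are not taken from the package.

```go
package main

import (
	"fmt"
	"time"
)

// groupKey identifies one condition of one resource (index + group labels).
type groupKey struct {
	Controller, Kind, Name, Namespace, Condition string
}

// variant holds the extra labels and the gauge value of the active series.
type variant struct {
	Status, Reason string
	Value          float64 // Unix timestamp of the last transition
}

// conditionStore is a hypothetical in-memory model of the metric.
// It keeps exactly one variant per group.
type conditionStore map[groupKey]variant

// record stores the (status, reason) variant for the group. Overwriting the
// map entry mirrors "delete all other variants in the group, then set".
func (c conditionStore) record(k groupKey, status, reason string, lastTransition time.Time) {
	c[k] = variant{Status: status, Reason: reason, Value: float64(lastTransition.Unix())}
}

// exampleStore records two transitions of the same condition.
func exampleStore() (conditionStore, groupKey) {
	store := conditionStore{}
	k := groupKey{
		Controller: "my_controller",
		Kind:       "MyCR",
		Name:       "my-cr",
		Namespace:  "default",
		Condition:  "Ready",
	}
	store.record(k, "False", "FailedToProvision", time.Unix(1700000000, 0))
	store.record(k, "True", "Provisioned", time.Unix(1700000600, 0))
	return store, k
}

func main() {
	store, k := exampleStore()
	v := store[k]
	fmt.Println(len(store), v.Status, v.Reason, int64(v.Value))
	// prints: 1 True Provisioned 1700000600
}
```

After the second call only the `True`/`Provisioned` variant remains, so the number of series grows with the number of conditions per resource rather than with the number of (status, reason) combinations ever observed.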

### Initialization

The metric should be initialized and registered once.

You can embed the `ConditionMetricRecorder` in your controller's recorder.

```go
package my_metrics

import (
	controllermetrics "sigs.k8s.io/controller-runtime/pkg/metrics"
	ocg "github.com/sourcehawk/go-prometheus-gaugevecset/pkg/operator_condition_metrics"
)

// We need this variable later to create the ConditionMetricRecorder
var OperatorConditionsGauge *ocg.OperatorConditionsGauge

// Initialize the operator condition gauge once
func init() {
	OperatorConditionsGauge = ocg.NewOperatorConditionsGauge("my-operator")
	controllermetrics.Registry.MustRegister(OperatorConditionsGauge)
}

// Embed in existing metrics recorder
type MyControllerRecorder struct {
	ocg.ConditionMetricRecorder
}
```

When constructing your reconciler, initialize the condition metrics recorder with the
operator conditions gauge and a unique name for each controller.

_cmd/main.go_
```go
package main

import (
	mymetrics "path/to/pkg/my_metrics"
	ocg "github.com/sourcehawk/go-prometheus-gaugevecset/pkg/operator_condition_metrics"
)

func main() {
	// ...
	recorder := mymetrics.MyControllerRecorder{
		ConditionMetricRecorder: ocg.ConditionMetricRecorder{
			Controller:              "my-controller", // unique name per reconciler
			OperatorConditionsGauge: mymetrics.OperatorConditionsGauge,
		},
	}

	reconciler := &MyReconciler{
		Recorder: recorder,
	}
	// ...
}
```

### Usage

The easiest drop-in way to start using the metrics recorder is to create a `SetStatusCondition` wrapper and call it
wherever you would otherwise call `meta.SetStatusCondition`.

To delete the metrics for a given custom resource, call `RemoveConditionsFor` and pass the object.

```go
const (
	kind = "MyCR"
)

// SetStatusCondition utility function which replaces and wraps meta.SetStatusCondition calls
func (r *MyReconciler) SetStatusCondition(cr *v1.MyCR, cond metav1.Condition) bool {
	changed := meta.SetStatusCondition(&cr.Status.Conditions, cond)
	if changed {
		// refetch the condition to get the updated version
		updated := meta.FindStatusCondition(cr.Status.Conditions, cond.Type)
		if updated != nil {
			r.Recorder.RecordConditionFor(
				kind, cr, updated.Type, string(updated.Status), updated.Reason, updated.LastTransitionTime,
			)
		}
	}
	return changed
}

func (r *MyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Get the resource we're reconciling
	cr := new(v1.MyCR)
	if err := r.Get(ctx, req.NamespacedName, cr); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Remove the metrics when the CR is deleted
	if cr.DeletionTimestamp != nil {
		r.Recorder.RemoveConditionsFor(kind, cr)
	}

	// ...

	// Update the status conditions using the recorder (it records the metric if changed)
	if r.SetStatusCondition(cr, condition) {
		if err := r.Status().Update(ctx, cr); err != nil {
			return ctrl.Result{}, err
		}
	}

	return ctrl.Result{}, nil
}
```
202 changes: 0 additions & 202 deletions pkg/operator_condition_metrics/operator_condition_metrics.go

This file was deleted.
