From 7a62a819136f5344ef0f0c95d5691a00db03b0eb Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 08:02:36 +0530 Subject: [PATCH 1/9] Added content from pr 48 --- .../ROOT/pages/tutorial-avx2-scheduling.adoc | 503 ++++++++++++++++++ 1 file changed, 503 insertions(+) create mode 100644 modules/ROOT/pages/tutorial-avx2-scheduling.adoc diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc new file mode 100644 index 0000000..140a6dc --- /dev/null +++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc @@ -0,0 +1,503 @@ += AVX2-Aware Scheduling for Couchbase Server + +[abstract] +This tutorial covers how to detect AVX2 CPU extension / x86-64-v3 microarchitecture on Kubernetes nodes, label nodes accordingly, and configure CouchbaseCluster resources to schedule pods only on compatible nodes. + +include::partial$tutorial.adoc[] + +== Background and Motivation + +Starting with **Couchbase Server 8.0**, vector search performance (FTS/GSI) benefits significantly from **AVX2-capable CPUs** on x86-64 nodes. + +=== What is AVX2? + +AVX2 (Advanced Vector Extensions 2) is: + +* A SIMD instruction set available on modern Intel and AMD x86-64 CPUs +* Required for high-performance vectorized operations +* Part of the x86-64-v3 microarchitecture level (along with BMI1, BMI2, and FMA) +* **Not guaranteed** on all cloud VM types +* **Not automatically enforced** by Kubernetes scheduling + +[IMPORTANT] +==== +Kubernetes clusters *must explicitly detect CPU capabilities and constrain scheduling* to ensure Couchbase Server pods land on AVX2-capable nodes. +==== + +== Solution Overview + +This tutorial solves the problem in three layers: + +1. **Node labeling** — detect which nodes support AVX2 +2. **Scheduler constraints** — ensure pods only land on valid nodes +3. **Cloud provisioning** — ensure node pools contain AVX2-capable CPUs + +Two node-labeling approaches are covered: + +* A **simple custom DaemonSet** (lightweight, minimal dependencies) +* **Node Feature Discovery (NFD)** (recommended for production) + +== Method 1: Simple AVX2 Node Labeling via DaemonSet + +This is a lightweight solution when NFD is unavailable or when you prefer minimal dependencies. 
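
Before you deploy the labeler, you can spot-check a single node by hand.
A minimal check, assuming shell access to the node (for example, over SSH): the command prints `avx2` when the CPU advertises the extension and prints nothing otherwise.

[source,console]
----
grep -m1 -o avx2 /proc/cpuinfo
----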
+ +=== How It Works + +* Runs on every node as a DaemonSet +* Reads `/proc/cpuinfo` from the host +* Checks for the `avx2` flag +* Labels the node if AVX2 is present + +=== Label Applied + +[source] +---- +cpu.feature/AVX2=true +---- + +=== DaemonSet YAML + +Create a file named `avx2-node-labeler.yaml`: + +[source,yaml] +---- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: avx2-labeler-sa + namespace: kube-system +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: avx2-labeler-role +rules: +- apiGroups: [""] + resources: ["nodes"] + verbs: ["get", "patch", "update"] +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: avx2-labeler-binding +subjects: +- kind: ServiceAccount + name: avx2-labeler-sa + namespace: kube-system +roleRef: + kind: ClusterRole + name: avx2-labeler-role + apiGroup: rbac.authorization.k8s.io +--- +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: avx2-node-labeler + namespace: kube-system +spec: + selector: + matchLabels: + app: avx2-node-labeler + template: + metadata: + labels: + app: avx2-node-labeler + spec: + serviceAccountName: avx2-labeler-sa + containers: + - name: labeler + image: bitnami/kubectl:latest + command: + - /bin/bash + - -c + - | + if grep -qi "avx2" /host/proc/cpuinfo; then + kubectl label node "$NODE_NAME" cpu.feature/AVX2=true --overwrite + fi + sleep infinity + env: + - name: NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + volumeMounts: + - name: host-proc + mountPath: /host/proc + readOnly: true + volumes: + - name: host-proc + hostPath: + path: /proc +---- + +=== Apply the DaemonSet + +[source,console] +---- +kubectl apply -f avx2-node-labeler.yaml +---- + +=== Verify Labels + +[source,console] +---- +kubectl get nodes -L cpu.feature/AVX2 +---- + +== Method 2: Node Feature Discovery (NFD) — Recommended + +**Node Feature Discovery (NFD)** is a Kubernetes SIG project that automatically detects hardware features and labels nodes. + +=== NFD AVX2 Label + +NFD uses the following standardized label for AVX2: + +[source] +---- +feature.node.kubernetes.io/cpu-cpuid.AVX2=true +---- + +This label is standardized and safe to rely on across all environments. + +=== Install NFD Using kubectl + +[source,console] +---- +kubectl apply -k "https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.18.3" +---- + +Replace `v0.18.3` with the latest release tag from the https://github.com/kubernetes-sigs/node-feature-discovery/releases[NFD releases page]. + +=== Install NFD Using Helm + +[source,console] +---- +helm install nfd \ + oci://registry.k8s.io/nfd/charts/node-feature-discovery \ + --version 0.18.3 \ + --namespace node-feature-discovery \ + --create-namespace + +---- + +Replace `v0.18.3` with the latest release tag from the https://github.com/kubernetes-sigs/node-feature-discovery/releases[NFD releases page]. + +=== Verify NFD Labels + +[source,console] +---- +kubectl get nodes -L feature.node.kubernetes.io/cpu-cpuid.AVX2 +---- + +== Pod Scheduling with nodeAffinity + +Once nodes are labeled, configure your CouchbaseCluster to schedule pods only on AVX2-capable nodes. 
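
Before you add scheduling constraints, confirm which nodes the scheduler can actually use.
The following selector lists only the nodes that carry the NFD label; substitute `cpu.feature/AVX2=true` if you used the custom DaemonSet label instead.

[source,console]
----
kubectl get nodes -l feature.node.kubernetes.io/cpu-cpuid.AVX2=true
----
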
+ +=== Strict AVX2 Scheduling (Recommended) + +Use `requiredDuringSchedulingIgnoredDuringExecution` to enforce AVX2 requirements: + +[source,yaml] +---- +spec: + servers: + - name: data-nodes + size: 3 + services: + - data + - index + - query + pod: + spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: feature.node.kubernetes.io/cpu-cpuid.AVX2 + operator: In + values: + - "true" +---- + +=== Soft Preference (Fallback Allowed) + +Use `preferredDuringSchedulingIgnoredDuringExecution` if you want AVX2 to be preferred but not required: + +[source,yaml] +---- +spec: + servers: + - name: data-nodes + size: 3 + services: + - data + pod: + spec: + affinity: + nodeAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 100 + preference: + matchExpressions: + - key: feature.node.kubernetes.io/cpu-cpuid.AVX2 + operator: In + values: + - "true" +---- + +== Google Kubernetes Engine (GKE) + +GKE requires special care because node pools may use mixed CPU generations and AVX2 is not guaranteed by default. + +=== GKE AVX2 Guarantees + +[cols="1,1"] +|=== +|Guarantee |Status + +|AVX2 by machine type +|Not guaranteed + +|AVX2 by region +|Not guaranteed + +|AVX2 by default +|Not guaranteed + +|AVX2 via min CPU platform +|Guaranteed +|=== + +=== Creating a GKE Node Pool with AVX2 + +**Step 1:** Choose a modern machine family (`n2`, `c2`, `c3`, `n4`, `m2`, `m3`, ...) + +**Step 2:** Enforce minimum CPU platform: + +[source,console] +---- +gcloud container node-pools create avx2-pool \ + --cluster=my-cluster \ + --region=us-central1 \ + --machine-type=n2-standard-4 \ + --min-cpu-platform="Intel Cascade Lake" \ + --num-nodes=3 \ + --node-labels=cpu=avx2 +---- + +Pin min-cpu-platform ≥ Intel Haswell or AMD Rome +Verify online for a comprehensive list of AVX2-capable VM series. + +This guarantees AVX2 at the infrastructure level. + +=== GKE Automatic Node Labels + +GKE automatically applies the following label: + +[source] +---- +cloud.google.com/gke-nodepool= +---- + +=== GKE nodeAffinity Pattern + +[source,yaml] +---- +spec: + servers: + - name: data-nodes + size: 3 + services: + - data + - index + - query + pod: + spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: cloud.google.com/gke-nodepool + operator: In + values: + - avx2-pool + +---- + +== Amazon EKS + +=== AVX2-Capable Instance Types + +The following EC2 instance families support AVX2: + +* **Intel**: M5, C5, R5, M6i, C6i, R6i, M7i, C7i (and newer) +* **AMD**: M5a, C5a, R5a, M6a, C6a, R6a (and newer) + +Verify online for a comprehensive list of AVX2-capable instance types. 
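
As a quick audit of an existing cluster, the well-known `node.kubernetes.io/instance-type` label (set automatically on recent Kubernetes versions) shows which instance type backs each worker node.

[source,console]
----
kubectl get nodes -L node.kubernetes.io/instance-type
----
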
+ +=== Creating an EKS Node Group + +[source,console] +---- +eksctl create nodegroup \ + --cluster my-cluster \ + --name avx2-ng \ + --node-type c6i.large \ + --nodes 3 \ + --node-labels cpu=avx2 +---- + +=== EKS nodeAffinity Pattern + +[source,yaml] +---- +spec: + servers: + - name: data-nodes + size: 3 + services: + - data + - index + - query + pod: + spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: cpu + operator: In + values: + - avx2 +---- + +You can also use the automatic instance type label: + +[source,yaml] +---- +- key: node.kubernetes.io/instance-type + operator: In + values: + - c6i.large + - c6i.xlarge +---- + +== Azure AKS + +=== AVX2-Capable VM Series + +The following Azure VM series support AVX2: + +* **Dv3, Ev3** (Haswell/Broadwell) +* **Dv4, Ev4** (Cascade Lake) +* **Dv5, Ev5** (Ice Lake) + +Verify online for a comprehensive list of AVX2-capable VM series. + +=== Creating an AKS Node Pool + +[source,console] +---- +az aks nodepool add \ + --resource-group rg \ + --cluster-name my-aks \ + --name avx2pool \ + --node-vm-size Standard_D8s_v5 \ + --node-count 3 \ + --labels cpu=avx2 +---- + +=== AKS nodeAffinity Pattern + +[source,yaml] +---- +spec: + servers: + - name: data-nodes + size: 3 + services: + - data + - index + - query + pod: + spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: cpu + operator: In + values: + - avx2 +---- + +== Complete CouchbaseCluster Example + +Here is a complete example combining all best practices: + +[source,yaml] +---- +apiVersion: v1 +kind: Secret +metadata: + name: cb-example-auth +type: Opaque +data: + username: QWRtaW5pc3RyYXRvcg== + password: cGFzc3dvcmQ= +--- +apiVersion: couchbase.com/v2 +kind: CouchbaseCluster +metadata: + name: cb-example +spec: + image: couchbase/server:8.0.0 + security: + adminSecret: cb-example-auth + buckets: + managed: true + servers: + - name: data-nodes + size: 3 + services: + - data + - index + - query + pod: + spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: feature.node.kubernetes.io/cpu-cpuid.AVX2 + operator: In + values: + - "true" + # Alternative using custom DaemonSet label: + # - key: cpu.feature/AVX2 + # operator: In + # values: + # - "true" +---- + +== Troubleshooting + + +=== Verify Node Labels + +[source,console] +---- +# For NFD labels +kubectl get nodes -o custom-columns=\ +NAME:.metadata.name,\ +AVX2:.metadata.labels."feature\.node\.kubernetes\.io/cpu-cpuid\.AVX2" + +# For custom labels (Using the DaemonSet) +kubectl get nodes -L cpu.feature/AVX2 +---- + From c6302a22c545f93a16c0a518fc8ef75bd3e9c182 Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 09:07:14 +0530 Subject: [PATCH 2/9] Updated nav n prerequisite-and-setup files --- modules/ROOT/nav.adoc | 2 ++ modules/ROOT/pages/prerequisite-and-setup.adoc | 2 ++ 2 files changed, 4 insertions(+) diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index 3771de2..2cc92aa 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -147,6 +147,8 @@ include::partial$autogen-reference.adoc[] ** xref:tutorial-kubernetes-network-policy.adoc[Kubernetes Network Policies Using Deny-All Default] * Persistent Volumes ** xref:tutorial-volume-expansion.adoc[Persistent Volume Expansion] +* Scheduling + ** xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase 
Server] * Sync Gateway ** xref:tutorial-sync-gateway.adoc[Connecting Sync-Gateway to a Couchbase Cluster] ** xref:tutorial-sync-gateway-clients.adoc[Exposing Sync-Gateway to Couchbase Lite Clients] diff --git a/modules/ROOT/pages/prerequisite-and-setup.adoc b/modules/ROOT/pages/prerequisite-and-setup.adoc index 3d4354a..44674c5 100644 --- a/modules/ROOT/pages/prerequisite-and-setup.adoc +++ b/modules/ROOT/pages/prerequisite-and-setup.adoc @@ -177,6 +177,8 @@ The architecture of each node must be uniform across the cluster as the use of m NOTE: The official Couchbase docker repository contains multi-arch images which do not require explicit references to architecture tags when being pulled and deployed. However, when pulling from a private repository, or performing intermediate processing on a machine with a different architecture than the deployed cluster, the use of explicit tags may be required to ensure the correct images are deployed. +IMPORTANT: For optimal performance with Couchbase Server 8.0+, especially for vector search (FTS/GSI) workloads, ensure your nodes support AVX2 CPU instructions (x86-64-v3 microarchitecture). Refer to xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase Server] for detailed guidance on detecting and scheduling pods on AVX2-capable nodes. + == RBAC and Networking Requirements Preparing the Kubernetes cluster to run the Operator may require setting up proper RBAC and network settings in your Kubernetes cluster. From 0633d9c408c28ed6bee0970a34b83803fc6a98ec Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 09:20:47 +0530 Subject: [PATCH 3/9] Added preview yml --- modules/ROOT/pages/prerequisite-and-setup.adoc | 3 ++- preview/HEAD.yml | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/modules/ROOT/pages/prerequisite-and-setup.adoc b/modules/ROOT/pages/prerequisite-and-setup.adoc index 44674c5..6bea8fe 100644 --- a/modules/ROOT/pages/prerequisite-and-setup.adoc +++ b/modules/ROOT/pages/prerequisite-and-setup.adoc @@ -177,7 +177,8 @@ The architecture of each node must be uniform across the cluster as the use of m NOTE: The official Couchbase docker repository contains multi-arch images which do not require explicit references to architecture tags when being pulled and deployed. However, when pulling from a private repository, or performing intermediate processing on a machine with a different architecture than the deployed cluster, the use of explicit tags may be required to ensure the correct images are deployed. -IMPORTANT: For optimal performance with Couchbase Server 8.0+, especially for vector search (FTS/GSI) workloads, ensure your nodes support AVX2 CPU instructions (x86-64-v3 microarchitecture). Refer to xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase Server] for detailed guidance on detecting and scheduling pods on AVX2-capable nodes. +IMPORTANT: For optimal performance with Couchbase Server 8.0 and later versions, in particular for vector search (FTS and GSI) workloads, use nodes that support AVX2 CPU instructions (x86-64-v3 Microarchitecture). +For guidance on detecting AVX2 support and scheduling pods on AVX2-capable nodes, see xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase Server]. 
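+
+TIP: For a quick manual check on an individual Linux host, run `grep -m1 -o avx2 /proc/cpuinfo`; the command prints `avx2` when the CPU supports the extension and prints nothing otherwise.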
== RBAC and Networking Requirements diff --git a/preview/HEAD.yml b/preview/HEAD.yml index 3736c35..a29fd69 100644 --- a/preview/HEAD.yml +++ b/preview/HEAD.yml @@ -3,4 +3,4 @@ sources: branches: [release/8.0] docs-operator: - branches: [DOC-13656-Create-release-note-for-Couchbase-Operator-2.9.0, release/2.8] \ No newline at end of file + branches: [DOC-13857-tutorial-to-detect-avx2, release/2.8] \ No newline at end of file From 201029468508ed6802a1c00d70bfc6d393d1ba57 Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 14:14:10 +0530 Subject: [PATCH 4/9] Edited-structured-added-lead-in-and-then-rewrote --- modules/ROOT/nav.adoc | 2 +- .../ROOT/pages/tutorial-avx2-scheduling.adoc | 317 +++++++++++------- 2 files changed, 203 insertions(+), 116 deletions(-) diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index 2cc92aa..6698900 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -148,7 +148,7 @@ include::partial$autogen-reference.adoc[] * Persistent Volumes ** xref:tutorial-volume-expansion.adoc[Persistent Volume Expansion] * Scheduling - ** xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase Server] + ** xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase Server] * Sync Gateway ** xref:tutorial-sync-gateway.adoc[Connecting Sync-Gateway to a Couchbase Cluster] ** xref:tutorial-sync-gateway-clients.adoc[Exposing Sync-Gateway to Couchbase Lite Clients] diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc index 140a6dc..e29b9c5 100644 --- a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc +++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc @@ -1,63 +1,141 @@ = AVX2-Aware Scheduling for Couchbase Server [abstract] -This tutorial covers how to detect AVX2 CPU extension / x86-64-v3 microarchitecture on Kubernetes nodes, label nodes accordingly, and configure CouchbaseCluster resources to schedule pods only on compatible nodes. +This tutorial explains how to detect the AVX2 CPU extension and x86-64-v3 Microarchitecture on Kubernetes nodes, label nodes accordingly, and configure CouchbaseCluster resources to schedule pods only on compatible nodes. include::partial$tutorial.adoc[] -== Background and Motivation +== Background -Starting with **Couchbase Server 8.0**, vector search performance (FTS/GSI) benefits significantly from **AVX2-capable CPUs** on x86-64 nodes. +Starting with Couchbase Server 8.0, Vector Search (FTS and GSI) performance benefits from AVX2-capable CPUs on x86-64 nodes. -=== What is AVX2? +=== What's Advanced Vector Extensions 2 (AVX2) -AVX2 (Advanced Vector Extensions 2) is: +AVX2 is: -* A SIMD instruction set available on modern Intel and AMD x86-64 CPUs +* An SIMD instruction set available on modern Intel and AMD x86-64 CPUs * Required for high-performance vectorized operations -* Part of the x86-64-v3 microarchitecture level (along with BMI1, BMI2, and FMA) -* **Not guaranteed** on all cloud VM types -* **Not automatically enforced** by Kubernetes scheduling +* Part of the x86-64-v3 Microarchitecture level, along with BMI1, BMI2, and FMA +* Not guaranteed on all cloud VM types +* Not enforced by default in Kubernetes scheduling -[IMPORTANT] -==== -Kubernetes clusters *must explicitly detect CPU capabilities and constrain scheduling* to ensure Couchbase Server pods land on AVX2-capable nodes. 
-==== +IMPORTANT: Kubernetes clusters must explicitly detect CPU capabilities and restrict scheduling to make sure Couchbase Server pods run on AVX2-capable nodes. -== Solution Overview +== AVX2-Aware Scheduling Approach -This tutorial solves the problem in three layers: +This tutorial approaches the problem through the following layers: -1. **Node labeling** — detect which nodes support AVX2 -2. **Scheduler constraints** — ensure pods only land on valid nodes -3. **Cloud provisioning** — ensure node pools contain AVX2-capable CPUs +* <<#node-labeling-methods,*Node labeling*>>: Detect nodes that support AVX2. +* <<#pod-scheduling-with-nodeaffinity,*Scheduler constraints*>>: Schedule pods only on compatible nodes. +* <<#cloud-specific-node-provisioning,*Cloud provisioning*>>: Make sure node pools use AVX2-capable CPUs. -Two node-labeling approaches are covered: +[#node-labeling-methods] +== Node Labeling Methods -* A **simple custom DaemonSet** (lightweight, minimal dependencies) -* **Node Feature Discovery (NFD)** (recommended for production) +Use one of the following methods to label Kubernetes nodes that support AVX2: -== Method 1: Simple AVX2 Node Labeling via DaemonSet +* <<#node-labeling-via-nfd, *Node Feature Discovery (NFD)*>>: Recommended for production environments +* <<#node-labeling-via-daemonset, *A custom DaemonSet*>>: Provides a direct, lightweight option with minimal dependencies -This is a lightweight solution when NFD is unavailable or when you prefer minimal dependencies. +[#node-labeling-via-nfd] +=== Method 1: Node Feature Discovery (Recommended) -=== How It Works +Node Feature Discovery (NFD) is a Kubernetes SIG project that detects hardware features and labels nodes automatically. -* Runs on every node as a DaemonSet +IMPORTANT: Couchbase recommends this method for production environments. + +Use the following steps to label Kubernetes nodes that support AVX2 using NFD: + +. <<#avx2-node-label-used-by-nfd, NFD to detect AVX2 support>> +. Install NFD by using your preferred method +** <<#install-nfd-kubectl, Install NFD by Using kubectl>> +** <<#install-nfd-helm, Install NFD by Using Helm>> +. <<#verify-nfd-node-labels, Verify NFD Node Labels>> + +[#avx2-node-label-used-by-nfd] +==== AVX2 Node Label Used by NFD + +NFD applies the following standardized node label to indicate AVX2 support. + +[source] +---- +feature.node.kubernetes.io/cpu-cpuid.AVX2=true +---- + +This label follows a standard format and is safe to use across environments. + +[#install-nfd-kubectl] +==== Install NFD by Using kubectl + +Install NFD on the cluster by using `kubectl`. +Replace `v0.18.3` with the latest release tag from the https://github.com/kubernetes-sigs/node-feature-discovery/releases[NFD releases page]. + +[source,console] +---- +kubectl apply -k "https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.18.3" +---- + +[#install-nfd-helm] +==== Install NFD by Using Helm + +Install NFD on the cluster by using Helm. +Replace `v0.18.3` with the latest release tag from the https://github.com/kubernetes-sigs/node-feature-discovery/releases[NFD releases page]. + +[source,console] +---- +helm install nfd \ + oci://registry.k8s.io/nfd/charts/node-feature-discovery \ + --version 0.18.3 \ + --namespace node-feature-discovery \ + --create-namespace + +---- + +[#verify-nfd-node-labels] +==== Verify NFD Node Labels + +Verify that NFD applies the AVX2 label to supported nodes. 
+ +[source,console] +---- +kubectl get nodes -L feature.node.kubernetes.io/cpu-cpuid.AVX2 +---- + +[#node-labeling-via-daemonset] +=== Method 2: AVX2 Node Labeling via DaemonSet + +This approach provides a lightweight option when NFD is unavailable or when you want to limit dependencies. + +==== AVX2 Node Labeling Process + +The DaemonSet uses the following process to detect AVX2 support and label nodes: + +* Runs as a DaemonSet on every node * Reads `/proc/cpuinfo` from the host * Checks for the `avx2` flag -* Labels the node if AVX2 is present +* Labels the node when AVX2 support is present -=== Label Applied +Use the following steps to label Kubernetes nodes that support AVX2: + +. <<#define-avx2-label, Define the AVX2 node label>> +. <<#create-daemonset-manifest, Create the DaemonSet manifest>> +. <<#deploy-daemonset, Deploy the DaemonSet>> +. <<#verify-node-labels, Verify node labels>> + +[#define-avx2-label] +==== Define the AVX2 Node Label + +Define the AVX2 node label to identify nodes that support the AVX2 CPU extension. [source] ---- cpu.feature/AVX2=true ---- -=== DaemonSet YAML +[#create-daemonset-manifest] +==== Create the DaemonSet Manifest -Create a file named `avx2-node-labeler.yaml`: +Create a DaemonSet manifest named `avx2-node-labeler.yaml` with the following content that detects AVX2 support and applies the node label. [source,yaml] ---- @@ -130,72 +208,38 @@ spec: path: /proc ---- -=== Apply the DaemonSet +[#deploy-daemonset] +==== Deploy the DaemonSet -[source,console] ----- -kubectl apply -f avx2-node-labeler.yaml ----- - -=== Verify Labels - -[source,console] ----- -kubectl get nodes -L cpu.feature/AVX2 ----- - -== Method 2: Node Feature Discovery (NFD) — Recommended - -**Node Feature Discovery (NFD)** is a Kubernetes SIG project that automatically detects hardware features and labels nodes. - -=== NFD AVX2 Label - -NFD uses the following standardized label for AVX2: - -[source] ----- -feature.node.kubernetes.io/cpu-cpuid.AVX2=true ----- - -This label is standardized and safe to rely on across all environments. - -=== Install NFD Using kubectl +Deploy the DaemonSet to run the AVX2 detection process on all nodes. [source,console] ---- -kubectl apply -k "https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.18.3" +kubectl apply -f avx2-node-labeler.yaml ---- -Replace `v0.18.3` with the latest release tag from the https://github.com/kubernetes-sigs/node-feature-discovery/releases[NFD releases page]. +[#verify-node-labels] +==== Verify Node Labels -=== Install NFD Using Helm +Verify that Kubernetes correctly applies the AVX2 label to supported nodes. [source,console] ---- -helm install nfd \ - oci://registry.k8s.io/nfd/charts/node-feature-discovery \ - --version 0.18.3 \ - --namespace node-feature-discovery \ - --create-namespace - +kubectl get nodes -L cpu.feature/AVX2 ---- -Replace `v0.18.3` with the latest release tag from the https://github.com/kubernetes-sigs/node-feature-discovery/releases[NFD releases page]. - -=== Verify NFD Labels +[#pod-scheduling-with-nodeaffinity] +== Pod Scheduling by Using nodeAffinity -[source,console] ----- -kubectl get nodes -L feature.node.kubernetes.io/cpu-cpuid.AVX2 ----- - -== Pod Scheduling with nodeAffinity +After you label nodes, configure the CouchbaseCluster resource to restrict pod scheduling to AVX2-capable nodes in one of the following ways: -Once nodes are labeled, configure your CouchbaseCluster to schedule pods only on AVX2-capable nodes. 
+* <<#enforce-avx2-scheduling, *Enforce AVX2 Scheduling*>>: Recommended +* <<#prefer-avx2-scheduling, *Prefer AVX2 Scheduling*>>: Fallback allowed -=== Strict AVX2 Scheduling (Recommended) +[#enforce-avx2-scheduling] +=== Enforce AVX2 Scheduling (Recommended) -Use `requiredDuringSchedulingIgnoredDuringExecution` to enforce AVX2 requirements: +Use `requiredDuringSchedulingIgnoredDuringExecution` to enforce AVX2 requirements during pod scheduling. [source,yaml] ---- @@ -220,9 +264,10 @@ spec: - "true" ---- -=== Soft Preference (Fallback Allowed) +[#prefer-avx2-scheduling] +=== Prefer AVX2 Scheduling (Fallback Allowed) -Use `preferredDuringSchedulingIgnoredDuringExecution` if you want AVX2 to be preferred but not required: +Use `preferredDuringSchedulingIgnoredDuringExecution` to prefer AVX2-capable nodes while allowing scheduling on other nodes. [source,yaml] ---- @@ -246,11 +291,21 @@ spec: - "true" ---- -== Google Kubernetes Engine (GKE) +[#cloud-specific-node-provisioning] +== Cloud-Specific Node Provisioning + +Cloud providers expose CPU capabilities and node selection options differently. +Use the following cloud platform-specific guidance to provision nodes with AVX2 support. -GKE requires special care because node pools may use mixed CPU generations and AVX2 is not guaranteed by default. +[#google-gke] +=== Google Kubernetes Engine (GKE) -=== GKE AVX2 Guarantees +GKE requires additional consideration because node pools can include mixed CPU generations and do not guarantee AVX2 support by default. + +[#gke-avx2-guarantees] +==== AVX2 Support Guarantees in GKE + +The following table summarizes how GKE guarantees AVX2 support under different configurations. [cols="1,1"] |=== @@ -269,12 +324,16 @@ GKE requires special care because node pools may use mixed CPU generations and A |Guaranteed |=== -=== Creating a GKE Node Pool with AVX2 +[#creating-gke-node-pool-with-avx2] +==== Create a GKE Node Pool with AVX2 Support -**Step 1:** Choose a modern machine family (`n2`, `c2`, `c3`, `n4`, `m2`, `m3`, ...) +Use the following steps to create a GKE node pool that guarantees AVX2 support. -**Step 2:** Enforce minimum CPU platform: +. Select a compatible machine family, such as `n2`, `c2`, `c3`, `n4`, `m2`, `m3`, and so on. +. Enforce a minimum CPU platform that supports AVX2. ++ +-- [source,console] ---- gcloud container node-pools create avx2-pool \ @@ -285,22 +344,28 @@ gcloud container node-pools create avx2-pool \ --num-nodes=3 \ --node-labels=cpu=avx2 ---- +-- + +. Set the minimum CPU platform (`min-cpu-platform`) to Intel Haswell or AMD Rome, or a newer generation. -Pin min-cpu-platform ≥ Intel Haswell or AMD Rome -Verify online for a comprehensive list of AVX2-capable VM series. +. Verify the selected VM series supports AVX2 by referring to the provider documentation. -This guarantees AVX2 at the infrastructure level. +This configuration guarantees AVX2 support at the infrastructure level. -=== GKE Automatic Node Labels +[#gke-automatic-node-labels] +==== GKE Automatic Node Labels -GKE automatically applies the following label: +GKE automatically applies node labels that identify the node pool associated with each node. [source] ---- cloud.google.com/gke-nodepool= ---- -=== GKE nodeAffinity Pattern +[#gke-node-affinity-pattern] +==== GKE nodeAffinity Pattern + +Use node affinity to restrict pod scheduling to a specific GKE node pool. 
[source,yaml] ---- @@ -326,18 +391,25 @@ spec: ---- -== Amazon EKS +[#amazon-eks] +=== Amazon Elastic Kubernetes Service (EKS) + +Use the following sections to provision AVX2-capable nodes and configure pod scheduling in Amazon Elastic Kubernetes Service (EKS). + +[#eks-avx2-capable-instance-types] +==== AVX2-Capable EC2 Instance Types -=== AVX2-Capable Instance Types +The following EC2 instance families support AVX2 instructions: -The following EC2 instance families support AVX2: +* *Intel*: M5, C5, R5, M6i, C6i, R6i, M7i, C7i and newer generations +* *AMD*: M5a, C5a, R5a, M6a, C6a, R6a and newer generations -* **Intel**: M5, C5, R5, M6i, C6i, R6i, M7i, C7i (and newer) -* **AMD**: M5a, C5a, R5a, M6a, C6a, R6a (and newer) +Verify the selected instance type supports AVX2 by referring to the provider documentation. -Verify online for a comprehensive list of AVX2-capable instance types. +[#creating-eks-node-group-with-avx2] +==== Create an EKS Node Group with AVX2 Support -=== Creating an EKS Node Group +Create an EKS node group by using AVX2-capable instance types and apply a node label to identify supported nodes. [source,console] ---- @@ -349,7 +421,10 @@ eksctl create nodegroup \ --node-labels cpu=avx2 ---- -=== EKS nodeAffinity Pattern +[#eks-node-affinity-configuration] +==== EKS nodeAffinity Configuration + +Use node affinity to restrict pod scheduling to AVX2-capable nodes. [source,yaml] ---- @@ -374,7 +449,7 @@ spec: - avx2 ---- -You can also use the automatic instance type label: +You can also restrict scheduling by using the automatic instance type label: [source,yaml] ---- @@ -385,19 +460,26 @@ You can also use the automatic instance type label: - c6i.xlarge ---- -== Azure AKS +[#azure-aks] +=== Azure Kubernetes Service (AKS) + +Use the following sections to provision AVX2-capable nodes and configure pod scheduling in Azure AKS. + +[#aks-avx2-capable-vm-series] +==== AVX2-Capable Azure VM Series -=== AVX2-Capable VM Series +The following Azure VM series support AVX2 instructions: -The following Azure VM series support AVX2: +* Dv3 and Ev3 VM series, based on Intel Haswell and Broadwell processors +* Dv4 and Ev4 VM series, based on Intel Cascade Lake processors +* Dv5 and Ev5 VM series, based on Intel Ice Lake processors -* **Dv3, Ev3** (Haswell/Broadwell) -* **Dv4, Ev4** (Cascade Lake) -* **Dv5, Ev5** (Ice Lake) +Verify the selected VM series supports AVX2 by referring to the Azure documentation. -Verify online for a comprehensive list of AVX2-capable VM series. +[#creating-aks-node-pool-with-avx2] +==== Create an AKS Node Pool with AVX2 Support -=== Creating an AKS Node Pool +Create an AKS node pool by using an AVX2-capable VM series and apply a node label to identify supported nodes. [source,console] ---- @@ -410,7 +492,10 @@ az aks nodepool add \ --labels cpu=avx2 ---- -=== AKS nodeAffinity Pattern +[#aks-node-affinity-pattern] +==== AKS nodeAffinity Configuration + +Use node affinity to restrict pod scheduling to AVX2-capable nodes. [source,yaml] ---- @@ -435,9 +520,9 @@ spec: - avx2 ---- -== Complete CouchbaseCluster Example +== A Complete CouchbaseCluster Example -Here is a complete example combining all best practices: +Here's a complete example combining all best practices. [source,yaml] ---- @@ -487,8 +572,11 @@ spec: == Troubleshooting +Use the following checks to confirm that Kubernetes applies AVX2 node labels as expected. + +=== Verify AVX2 Node Labels -=== Verify Node Labels +Verify that nodes expose the expected AVX2 labels, based on the labeling method you use. 
[source,console] ---- @@ -500,4 +588,3 @@ AVX2:.metadata.labels."feature\.node\.kubernetes\.io/cpu-cpuid\.AVX2" # For custom labels (Using the DaemonSet) kubectl get nodes -L cpu.feature/AVX2 ---- - From bda3ddf7601381e312af0f23acd2f1dcf420af87 Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 14:36:52 +0530 Subject: [PATCH 5/9] Set toc levels --- modules/ROOT/pages/tutorial-avx2-scheduling.adoc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc index e29b9c5..328edc1 100644 --- a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc +++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc @@ -1,5 +1,8 @@ = AVX2-Aware Scheduling for Couchbase Server +:page-toclevels: 2 +:page-category: Tutorials + [abstract] This tutorial explains how to detect the AVX2 CPU extension and x86-64-v3 Microarchitecture on Kubernetes nodes, label nodes accordingly, and configure CouchbaseCluster resources to schedule pods only on compatible nodes. @@ -115,7 +118,7 @@ The DaemonSet uses the following process to detect AVX2 support and label nodes: * Checks for the `avx2` flag * Labels the node when AVX2 support is present -Use the following steps to label Kubernetes nodes that support AVX2: +Use the following steps to label Kubernetes nodes that support AVX2 by using a custom DaemonSet: . <<#define-avx2-label, Define the AVX2 node label>> . <<#create-daemonset-manifest, Create the DaemonSet manifest>> From 21a68b5c878d4c5445410582ea87765df2547cfa Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 14:46:03 +0530 Subject: [PATCH 6/9] Fixed the header --- modules/ROOT/nav.adoc | 2 +- modules/ROOT/pages/tutorial-avx2-scheduling.adoc | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index 6698900..6a13c23 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -148,7 +148,7 @@ include::partial$autogen-reference.adoc[] * Persistent Volumes ** xref:tutorial-volume-expansion.adoc[Persistent Volume Expansion] * Scheduling - ** xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling for Couchbase Server] + ** xref:tutorial-avx2-scheduling.adoc[AVX2-Aware Scheduling] * Sync Gateway ** xref:tutorial-sync-gateway.adoc[Connecting Sync-Gateway to a Couchbase Cluster] ** xref:tutorial-sync-gateway-clients.adoc[Exposing Sync-Gateway to Couchbase Lite Clients] diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc index 328edc1..42fedd4 100644 --- a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc +++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc @@ -1,5 +1,4 @@ = AVX2-Aware Scheduling for Couchbase Server - :page-toclevels: 2 :page-category: Tutorials From 207ddb493ae08118eb8fe80645fe6cb798ea9fe4 Mon Sep 17 00:00:00 2001 From: Shwetha Rao Date: Thu, 18 Dec 2025 15:45:35 +0530 Subject: [PATCH 7/9] Removed page category variable --- modules/ROOT/pages/tutorial-avx2-scheduling.adoc | 1 - 1 file changed, 1 deletion(-) diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc index 42fedd4..320a347 100644 --- a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc +++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc @@ -1,6 +1,5 @@ = AVX2-Aware Scheduling for Couchbase Server :page-toclevels: 2 -:page-category: Tutorials [abstract] This tutorial explains how to detect the AVX2 CPU extension and x86-64-v3 
Microarchitecture on Kubernetes nodes, label nodes accordingly, and configure CouchbaseCluster resources to schedule pods only on compatible nodes.

include::partial$tutorial.adoc[]

From 345cd678b11765e1d5582c4b0d6301e127b7f725 Mon Sep 17 00:00:00 2001
From: Shwetha Rao
Date: Thu, 18 Dec 2025 15:53:23 +0530
Subject: [PATCH 8/9] Minor edit

---
 modules/ROOT/pages/tutorial-avx2-scheduling.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc
index 320a347..2597c0d 100644
--- a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc
+++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc
@@ -10,7 +10,7 @@ include::partial$tutorial.adoc[]

Starting with Couchbase Server 8.0, Vector Search (FTS and GSI) performance benefits from AVX2-capable CPUs on x86-64 nodes.

-=== What's Advanced Vector Extensions 2 (AVX2)
+=== What's an Advanced Vector Extensions 2 (AVX2)

AVX2 is:

From c679ca1478b3adc9129efcda11ee6f2f0c3d0b97 Mon Sep 17 00:00:00 2001
From: Shwetha Rao
Date: Thu, 18 Dec 2025 20:49:05 +0530
Subject: [PATCH 9/9] Implemented peer review comments

---
 .../ROOT/pages/tutorial-avx2-scheduling.adoc  | 39 ++++++++++---------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc
index 2597c0d..2fb0cc9 100644
--- a/modules/ROOT/pages/tutorial-avx2-scheduling.adoc
+++ b/modules/ROOT/pages/tutorial-avx2-scheduling.adoc
@@ -10,15 +10,15 @@ include::partial$tutorial.adoc[]

Starting with Couchbase Server 8.0, Vector Search (FTS and GSI) performance benefits from AVX2-capable CPUs on x86-64 nodes.

-=== What's an Advanced Vector Extensions 2 (AVX2)
+=== What is Advanced Vector Extensions 2 (AVX2)?

AVX2 is:

-* An SIMD instruction set available on modern Intel and AMD x86-64 CPUs
-* Required for high-performance vectorized operations
-* Part of the x86-64-v3 Microarchitecture level, along with BMI1, BMI2, and FMA
-* Not guaranteed on all cloud VM types
-* Not enforced by default in Kubernetes scheduling
+* A SIMD instruction set available on modern Intel and AMD x86-64 CPUs.
+* Required for high-performance vectorized operations.
+* Part of the x86-64-v3 Microarchitecture level, along with BMI1, BMI2, and FMA.
+* Not guaranteed on all cloud VM types.
+* Not enforced by default in Kubernetes scheduling.

IMPORTANT: Kubernetes clusters must explicitly detect CPU capabilities and restrict scheduling to make sure Couchbase Server pods run on AVX2-capable nodes.

@@ -35,8 +35,8 @@ This tutorial approaches the problem through the following layers:

Use one of the following methods to label Kubernetes nodes that support AVX2:

-* <<#node-labeling-via-nfd, *Node Feature Discovery (NFD)*>>: Recommended for production environments
-* <<#node-labeling-via-daemonset, *A custom DaemonSet*>>: Provides a direct, lightweight option with minimal dependencies
+* <<#node-labeling-via-nfd, *Node Feature Discovery (NFD)*>>: Recommended for production environments.
+* <<#node-labeling-via-daemonset, *A custom DaemonSet*>>: Provides a direct, lightweight option with minimal dependencies.
[#node-labeling-via-nfd] === Method 1: Node Feature Discovery (Recommended) @@ -111,10 +111,10 @@ This approach provides a lightweight option when NFD is unavailable or when you The DaemonSet uses the following process to detect AVX2 support and label nodes: -* Runs as a DaemonSet on every node -* Reads `/proc/cpuinfo` from the host -* Checks for the `avx2` flag -* Labels the node when AVX2 support is present +* Runs as a DaemonSet on every node. +* Reads `/proc/cpuinfo` from the host. +* Checks for the `avx2` flag. +* Labels the node when AVX2 support is present. Use the following steps to label Kubernetes nodes that support AVX2 by using a custom DaemonSet: @@ -234,8 +234,8 @@ kubectl get nodes -L cpu.feature/AVX2 After you label nodes, configure the CouchbaseCluster resource to restrict pod scheduling to AVX2-capable nodes in one of the following ways: -* <<#enforce-avx2-scheduling, *Enforce AVX2 Scheduling*>>: Recommended -* <<#prefer-avx2-scheduling, *Prefer AVX2 Scheduling*>>: Fallback allowed +* <<#enforce-avx2-scheduling, *Enforce AVX2 Scheduling*>>: Recommended. +* <<#prefer-avx2-scheduling, *Prefer AVX2 Scheduling*>>: Fallback allowed. [#enforce-avx2-scheduling] === Enforce AVX2 Scheduling (Recommended) @@ -333,6 +333,7 @@ Use the following steps to create a GKE node pool that guarantees AVX2 support. . Select a compatible machine family, such as `n2`, `c2`, `c3`, `n4`, `m2`, `m3`, and so on. . Enforce a minimum CPU platform that supports AVX2. +For example: + -- [source,console] @@ -402,8 +403,8 @@ Use the following sections to provision AVX2-capable nodes and configure pod sch The following EC2 instance families support AVX2 instructions: -* *Intel*: M5, C5, R5, M6i, C6i, R6i, M7i, C7i and newer generations -* *AMD*: M5a, C5a, R5a, M6a, C6a, R6a and newer generations +* *Intel*: M5, C5, R5, M6i, C6i, R6i, M7i, C7i and newer generations. +* *AMD*: M5a, C5a, R5a, M6a, C6a, R6a and newer generations. Verify the selected instance type supports AVX2 by referring to the provider documentation. @@ -471,9 +472,9 @@ Use the following sections to provision AVX2-capable nodes and configure pod sch The following Azure VM series support AVX2 instructions: -* Dv3 and Ev3 VM series, based on Intel Haswell and Broadwell processors -* Dv4 and Ev4 VM series, based on Intel Cascade Lake processors -* Dv5 and Ev5 VM series, based on Intel Ice Lake processors +* Dv3 and Ev3 VM series, based on Intel Haswell and Broadwell processors. +* Dv4 and Ev4 VM series, based on Intel Cascade Lake processors. +* Dv5 and Ev5 VM series, based on Intel Ice Lake processors. Verify the selected VM series supports AVX2 by referring to the Azure documentation.
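+
+As a sanity check after the pool is created, you can confirm AVX2 from inside a running node.
+The following sketch assumes the cluster allows `kubectl debug` node sessions; because `/proc/cpuinfo` is not virtualized per container, the command reports the host CPU and prints the number of logical CPUs that advertise the `avx2` flag (`0` means the node is not AVX2-capable).
+
+[source,console]
+----
+kubectl debug node/<node-name> -it --image=busybox -- grep -c avx2 /proc/cpuinfo
+----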