Commit 2d574e2

banjoh and claude authored
feat: add k0s 1.36 support and refresh 1.34/1.35 patches (#3771)
* feat: add k0s 1.36 support, refresh 1.34/1.35 patches

Adds k0s 1.36 to the supported window (default 1.36.1+k0s.0; 1.32-1.35 keep rolling via the relative CI matrix) and bumps 1.34->v1.34.8 / 1.35->v1.35.5. Regenerates go.mod/go.sum (+kinds) k8s.io replace directives -> v0.36.1, CRDs, and the metadata-1_3{4,5,6}.yaml image lists. The load-bearing change is containerd v1 -> v2 (k0s 1.36 ships containerd 2.x), which rejects the legacy version=2 / io.containerd.grpc.v1.cri drop-in schema.

- Image metadata pinned to the mirrored proxy.replicated.com tags that match each k0s release (e.g. kube-proxy tracks the cluster's Kubernetes version) rather than the greatest available patch.
- k0s airgap.GetImageURIs signature changed in 1.35 (GetImageURIs(spec, all) -> GetImageURIs(TargetEnv{Platform, Spec}, all)). EC builds one binary per supported minor and 1.34/1.33 still use the old API, so the call is isolated in a build-tagged allK0sImageURIs (images_targetenv.go default / images_legacy.go behind k0s_legacy_airgap, gated by the Makefile on K0S_MINOR_VERSION < 35). The new path also skips k0s 1.36's Traefik NLLB image.
- Containerd v2 -> v3: AddInsecureRegistry emits the v3 schema (config_path drop-in + hosts.toml skip_verify) for k0s 1.36+, legacy below. Because k0s is upgraded via autopilot (no per-node Plan hook), the stale v2 embedded-registry.toml on airgap nodes is migrated by a new idempotent `local-artifact-mirror migrate-containerd-config` command run by the per-node copy-artifacts job (now mounting /etc/k0s) before k0s 1.36 starts. V2-only paths tagged TODO(k0s-1.36-oldest).
- Host preflights: add a kernel >= 4.5 floor and strict cgroup v2 for k8s 1.35+; legacy 3.10 / "v1 or v2" checks gated to older minors.
- Remove canDisableUpdateProber: the update prober is disableable on all supported k0s versions (k0sproject/k0s#6326).

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: add cleanup TODO for etc-k0s upgrade-job mount

Note that the /etc/k0s mount and the migrate-containerd-config step in the airgap copy-artifacts job can be dropped once the oldest supported k0s minor is >= 1.36 (no v2 containerd drop-ins left to migrate).

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(build): gate k0s_legacy_airgap on the go.mod module, not K0S_MINOR_VERSION

The installer (and buildtools during a normal build) always compile against the committed go.mod (k0s 1.36, new airgap.GetImageURIs API); K0S_MINOR_VERSION only selects the embedded k0s binary, not the Go module. Gating the legacy build tag on K0S_MINOR_VERSION therefore wrongly tagged the previous-k0s-2/-3 CI legs (1.34/1.33) as legacy, compiling the old-API images_legacy.go against the 1.36 module -> build failure. Key the tag off the k0s version actually pinned in go.mod instead. The legacy path now only engages when the k0s-update scripts temporarily re-pin go.mod to a < 1.35 module while regenerating that minor's metadata.

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(config): drop k0s_legacy_airgap build tag, always use the new API

ListK0sImages now calls airgap.GetImageURIs with the k0s 1.35+ TargetEnv signature directly. The installer and buildtools always compile against the committed go.mod (current k0s minor), so the legacy build-tag split (images_legacy.go / images_targetenv.go and the Makefile gate) was unnecessary and broke the previous-k0s CI legs. Removes the split, the build tag, and the Makefile machinery.

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Use constant

Signed-off-by: Evans Mungai <evans@replicated.com>

* test(e2e): shorten containerd v3 migration assertion comment

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(operator): shorten etc-k0s mount comment

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(operator): document autopilot containerd-migration timing window

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix formatting

Signed-off-by: Evans Mungai <evans@replicated.com>

* Update dockerfiles to 1.26.4

Signed-off-by: Evans Mungai <evans@replicated.com>

* fix(config): restore k0s_legacy_airgap build tag, gated at k0s < 1.36

Only k0s 1.36 has the new airgap.GetImageURIs(TargetEnv, all) signature; 1.33/1.34/1.35 still use GetImageURIs(spec, all). The binary build re-pins go.mod to K0S_GO_VERSION (via build-deps -> go.mod) for the selected K0S_MINOR_VERSION, so each per-minor build leg compiles against that minor's k0s module. Using the new API unconditionally broke the < 1.36 build legs. Isolate the version-specific call in a build-tagged allK0sImageURIs (images_targetenv.go for 1.36+ / images_legacy.go behind k0s_legacy_airgap for <= 1.35); the Makefile applies the tag when K0S_MINOR_VERSION < 36. The Traefik skip stays in the shared ListK0sImages via a "/traefik" string match since the typed constant only exists in the 1.36 module.

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(build): apply k0s_legacy_airgap gate in shared versions.mk

The gate lived only in the root Makefile, so the operator image build (operator/Makefile `build`, run via melange with `-tags ...,$(GO_BUILD_TAGS)`) never received k0s_legacy_airgap. On the previous-k0s legs (go.mod re-pinned to < 1.36) it compiled the new-API images_targetenv.go and failed with `undefined: airgap.TargetEnv`. Move the gate to versions.mk, which the root, operator, and local-artifact-mirror Makefiles all include (after common.mk), so every build that compiles pkg/config picks up the tag. dagger propagates K0S_MINOR_VERSION into the operator/LAM melange envs, so the gate evaluates correctly there. LAM doesn't import pkg/config, so it's unaffected.

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ci): fall back to oldest previous-minor release in find-previous-stable

When the previous k0s minor is newly added it may have fewer than previous_release_gap (5) stable releases, so `.[5]` resolved to null and EC_VERSION came back empty; ci-release-app.sh then fell back to `git describe` on the shallow checkout and failed with "No names found". Adding 1.36 makes the previous minor 1.35, which currently has a single stable release. Fall back to the oldest available release (`.[-1]`) when `.[gap]` is absent.

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: pin k0s 1.36 sandbox image by plain tag for containerd 2.x

The pause/sandbox image was pinned as <tag>-<arch>@<digest>. containerd 2.x (k0s 1.36) pulls it by digest but then resolves the sandbox image by its full reference; the synthetic arch-suffixed tag does not exist in the registry, so the lookup fails ("failed to get sandbox image ...: not found") and no pod sandbox can start (node never becomes ready). Earlier k0s/containerd tolerated the phantom tag. Emit the sandbox image tag as-is (no arch/digest suffix) via a usePlainTag flag on the buildtools pause component, and update metadata-1_36.yaml to pause:3.10.2 to match.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Pin helm to same version as embedded-cluster

Signed-off-by: Evans Mungai <evans@replicated.com>

* Fix tests

Signed-off-by: Evans Mungai <evans@replicated.com>

* Fix tests

Signed-off-by: Evans Mungai <evans@replicated.com>

* Add logging

Signed-off-by: Evans Mungai <evans@replicated.com>

* Add missing k0s version

Signed-off-by: Evans Mungai <evans@replicated.com>

* Remove embedded-registry.toml in online installs

Signed-off-by: Evans Mungai <evans@replicated.com>

* Do not create embedded-registry.toml for online installs

Signed-off-by: Evans Mungai <evans@replicated.com>

* Update pause image before k0s upgrade

Signed-off-by: Evans Mungai <evans@replicated.com>

* Fix config loading

Signed-off-by: Evans Mungai <evans@replicated.com>

* Update docs strings

Signed-off-by: Evans Mungai <evans@replicated.com>

---------

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

1 parent bec5d94 commit 2d574e2

41 files changed

Lines changed: 936 additions & 767 deletions

.github/workflows/ci.yaml

Lines changed: 12 additions & 4 deletions
@@ -25,6 +25,7 @@ jobs:
     outputs:
       git_sha: ${{ steps.git_sha.outputs.git_sha }}
       ec_version: ${{ steps.output_vars.outputs.ec_version }}
+      helm_version: ${{ steps.output_vars.outputs.helm_version }}
     steps:
       - uses: actions/checkout@v7
         with:
@@ -39,6 +40,10 @@ jobs:
           EC_VERSION=$(./scripts/print-ec-version.sh "$ec_version" "$k0s_minor_version")
           echo "EC_VERSION=\"$EC_VERSION\""
           echo "ec_version=$EC_VERSION" >> $GITHUB_OUTPUT
+          # Pin the helm binary used by tests to the same version embedded-cluster ships
+          # (versions.mk), instead of latest.
+          helm_version=$(make print-HELM_VERSION)
+          echo "helm_version=$helm_version" >> $GITHUB_OUTPUT
 
   sanitize:
     name: Sanitize
@@ -155,7 +160,7 @@ jobs:
       - name: Setup Helm
         uses: azure/setup-helm@v5
         with:
-          version: latest
+          version: ${{ needs.output-vars.outputs.helm_version }}
       - name: Run tests
         run: |
           export VERSION=${{ needs.output-vars.outputs.ec_version }}
@@ -231,7 +236,7 @@ jobs:
       - name: Setup Helm
         uses: azure/setup-helm@v5
         with:
-          version: latest
+          version: ${{ needs.output-vars.outputs.helm_version }}
       - name: Run tests
         run: |
           export VERSION=${{ needs.output-vars.outputs.ec_version }}
@@ -261,7 +266,7 @@ jobs:
       - name: Setup Helm
         uses: azure/setup-helm@v5
         with:
-          version: latest
+          version: ${{ needs.output-vars.outputs.helm_version }}
       - name: Run tests
         run: |
           export VERSION=${{ needs.output-vars.outputs.ec_version }}
@@ -528,9 +533,12 @@ jobs:
           k0s_minor_version=$(make print-K0S_MINOR_VERSION)
           previous_k0s_minor_version=$(($k0s_minor_version - 1))
           k0s_majmin_version="1.${previous_k0s_minor_version}"
+          # Pick the release `previous_release_gap` versions back, falling back to the
+          # oldest available when the previous minor has fewer releases than that (e.g.
+          # right after a new minor is added).
           EC_VERSION="$(gh release list --repo replicatedhq/embedded-cluster \
             --exclude-drafts --exclude-pre-releases --json name --order desc \
-            --jq "[.[] | select(.name | contains(\"k8s-${k0s_majmin_version}\"))] | .[${previous_release_gap}] | .name")"
+            --jq "[.[] | select(.name | contains(\"k8s-${k0s_majmin_version}\"))] | (.[${previous_release_gap}] // .[-1]) | .name")"
 
           gh release download "$EC_VERSION" --repo replicatedhq/embedded-cluster --pattern 'metadata.json'
           K0S_VERSION="$(jq -r '.Versions.Kubernetes' metadata.json)"
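
The `(.[gap] // .[-1])` fallback in the last hunk can be read as the following selection rule. This is a minimal Go sketch for illustration only; `pickPreviousRelease` and the release names are hypothetical and not part of the repository, and the gap is assumed to be non-negative.

```go
package main

import "fmt"

// pickPreviousRelease mirrors the jq expression `(.[gap] // .[-1])`.
// releases is ordered newest first. It returns the entry `gap` positions
// back, or the oldest entry when the list is too short to have one.
// The boolean is false only when there are no releases at all.
func pickPreviousRelease(releases []string, gap int) (string, bool) {
	if len(releases) == 0 {
		return "", false
	}
	if gap >= 0 && gap < len(releases) {
		return releases[gap], true
	}
	// Fewer than gap+1 releases: fall back to the oldest one.
	return releases[len(releases)-1], true
}

func main() {
	// Hypothetical release names, newest first.
	single := []string{"v1.0.0+k8s-1.35"}
	name, ok := pickPreviousRelease(single, 5)
	fmt.Println(name, ok)
}
```

With a single release and a gap of 5, the old expression produced null; the fallback returns that single release instead.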

Makefile

Lines changed: 1 addition & 1 deletion
@@ -267,7 +267,7 @@ embedded-cluster-darwin-arm64: embedded-cluster
 .PHONY: embedded-cluster
 embedded-cluster: build-deps
 	CGO_ENABLED=0 GOOS=$(OS) GOARCH=$(ARCH) go build \
-		-tags osusergo,netgo \
+		-tags $(GO_INSTALLER_BUILD_TAGS) \
 		-ldflags="-s -w $(LD_FLAGS) -extldflags=-static" \
 		-o ./build/embedded-cluster-$(OS)-$(ARCH) \
 		./cmd/installer
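
The definition of `GO_INSTALLER_BUILD_TAGS` is not shown in this diff. Going by the commit message (the `k0s_legacy_airgap` tag is applied when `K0S_MINOR_VERSION < 36`), the gate might behave roughly like the Go sketch below. The function name is hypothetical, and the assumption that the variable is the previous `osusergo,netgo` list plus the gated tag is a guess, not the actual versions.mk logic.

```go
package main

import (
	"fmt"
	"strings"
)

// installerBuildTags sketches the assumed gate: start from the tags the
// Makefile previously hard-coded and append k0s_legacy_airgap for k0s
// minors older than 1.36, which still use the old airgap.GetImageURIs API.
func installerBuildTags(k0sMinor int) string {
	tags := []string{"osusergo", "netgo"}
	if k0sMinor < 36 {
		tags = append(tags, "k0s_legacy_airgap")
	}
	return strings.Join(tags, ",")
}

func main() {
	for _, minor := range []int{34, 35, 36} {
		fmt.Printf("1.%d: -tags %s\n", minor, installerBuildTags(minor))
	}
}
```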

cmd/buildtools/addon.go

Lines changed: 14 additions & 5 deletions
@@ -17,6 +17,11 @@ type addonComponent struct {
 	getWolfiPackageVersion func(opts addonComponentOptions) string
 	upstreamVersionInputOverride string
 	useUpstreamImage bool
+	// usePlainTag emits the image tag as-is (no "-<arch>@<digest>" suffix). Required for
+	// the sandbox/pause image: containerd 2.x (k0s >= 1.36) pulls it by digest but then
+	// resolves the sandbox image by its full reference, and a synthetic arch-suffixed tag
+	// that does not exist in the registry fails that lookup ("failed to get sandbox image").
+	usePlainTag bool
 }
 
 type addonComponentOptions struct {
@@ -105,13 +110,17 @@ func (c *addonComponent) resolveCustomImageRepoAndTag(ctx context.Context, image
 		return "", "", fmt.Errorf("failed to get image name for %s: %w", c.name, err)
 	}
 
-	digest, err := GetImageDigest(ctx, customImage, arch)
-	if err != nil {
-		return "", "", fmt.Errorf("failed to get image %s digest: %w", customImage, err)
+	var tag string
+	if c.usePlainTag {
+		tag = TagFromImage(customImage)
+	} else {
+		digest, err := GetImageDigest(ctx, customImage, arch)
+		if err != nil {
+			return "", "", fmt.Errorf("failed to get image %s digest: %w", customImage, err)
+		}
+		tag = fmt.Sprintf("%s-%s@%s", TagFromImage(customImage), arch, digest)
 	}
 
-	tag := fmt.Sprintf("%s-%s@%s", TagFromImage(customImage), arch, digest)
-
 	repo := FamiliarImageName(RemoveTagFromImage(customImage))
 	repo = addProxyAnonymousPrefix(repo)
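
To make the effect of `usePlainTag` concrete, here is a small standalone Go sketch of the two tag shapes. `imageTag` is a hypothetical helper that only reproduces the `fmt.Sprintf` format from the hunk above; the arch and digest values are made up, and `3.10.2` is the pause version named in the commit message.

```go
package main

import "fmt"

// imageTag reproduces the tag formatting from resolveCustomImageRepoAndTag:
// the plain tag when usePlainTag is set, otherwise "<tag>-<arch>@<digest>".
func imageTag(baseTag, arch, digest string, usePlainTag bool) string {
	if usePlainTag {
		return baseTag
	}
	return fmt.Sprintf("%s-%s@%s", baseTag, arch, digest)
}

func main() {
	// Made-up digest, for illustration only.
	digest := "sha256:0123abcd"
	fmt.Println(imageTag("3.10.2", "amd64", digest, true))
	fmt.Println(imageTag("3.10.2", "amd64", digest, false))
}
```

The second form carries an arch-suffixed tag that, per the commit message, does not exist in the registry, which is why containerd 2.x fails to resolve it as the sandbox image.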

cmd/buildtools/k0s.go

Lines changed: 3 additions & 0 deletions
@@ -88,6 +88,9 @@ var k0sImageComponents = map[string]addonComponent{
 
 var pauseComponent = addonComponent{
 	name: "pause",
+	// Pin the sandbox image by plain tag (no arch suffix / digest) so containerd 2.x can
+	// resolve it; see addonComponent.usePlainTag.
+	usePlainTag: true,
 	getCustomImageName: func(opts addonComponentOptions) (string, error) {
 		k0sConfig := k0sv1beta1.DefaultClusterConfig()
 		pauseVersion := k0sConfig.Spec.Images.Pause.Version

cmd/installer/cli/install.go

Lines changed: 11 additions & 7 deletions
@@ -673,13 +673,17 @@ func runInstall(ctx context.Context, flags installFlags, installCfg *installConf
 
 	// TODO (@salah): update installation status to reflect what's happening
 
-	logrus.Debugf("adding insecure registry")
-	registryIP, err := registry.GetRegistryClusterIP(rc.ServiceCIDR())
-	if err != nil {
-		return fmt.Errorf("failed to get registry cluster IP: %w", err)
-	}
-	if err := hostutils.AddInsecureRegistry(fmt.Sprintf("%s:5000", registryIP)); err != nil {
-		return fmt.Errorf("failed to add insecure registry: %w", err)
+	// Only airgap installs use the in-cluster registry; online installs don't
+	// need the insecure-registry drop-in (and k0s 1.36+ rejects its legacy v1 format).
+	if installCfg.isAirgap {
+		logrus.Debugf("setup internal registry config for containerd to pull from the in-cluster registry")
+		registryIP, err := registry.GetRegistryClusterIP(rc.ServiceCIDR())
+		if err != nil {
+			return fmt.Errorf("failed to get registry cluster IP: %w", err)
+		}
+		if err := hostutils.AddInsecureRegistry(fmt.Sprintf("%s:5000", registryIP)); err != nil {
+			return fmt.Errorf("failed to add insecure registry: %w", err)
+		}
 	}
 
 	helmOpts := buildHelmClientOptions(installCfg, rc)

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+package main
+
+import (
+	"fmt"
+
+	"github.com/replicatedhq/embedded-cluster/pkg-new/hostutils"
+	"github.com/spf13/cobra"
+)
+
+// MigrateContainerdConfigCmd reconciles the containerd registry drop-in with the
+// k0s 1.36+ schema (see hostutils.MigrateContainerdConfigToV3): airgap migrates it
+// to the containerd 2.x schema, online deletes it if present. Run per node on upgrade.
+func MigrateContainerdConfigCmd(cli *CLI) *cobra.Command {
+	cmd := &cobra.Command{
+		Use:   "migrate-containerd-config",
+		Short: "Migrate the containerd registry drop-in to the containerd 2.x (k0s 1.36+) schema",
+		PreRunE: func(cmd *cobra.Command, args []string) error {
+			cli.bindFlags(cmd.Flags())
+			return nil
+		},
+		RunE: func(cmd *cobra.Command, args []string) error {
+			if err := hostutils.MigrateContainerdConfigToV3(cli.V.GetBool("airgap")); err != nil {
+				return fmt.Errorf("migrate containerd config: %w", err)
+			}
+			return nil
+		},
+	}
+
+	cmd.Flags().Bool("airgap", false, "Whether this is an airgap install: airgap migrates the registry drop-in to the containerd 2.x (k0s 1.36+) schema; online deletes it if present (online installs don't use the in-cluster registry)")
+
+	return cmd
+}
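
The body of `hostutils.MigrateContainerdConfigToV3` is not part of the visible diff. As a rough illustration of the schema change the commit message describes (legacy `version = 2` / `io.containerd.grpc.v1.cri` versus a `config_path` drop-in plus `hosts.toml` with `skip_verify`), the Go sketch below renders both shapes. The function names, the registry address, and the certs directory are hypothetical; the TOML layout follows the upstream containerd registry configuration format and is an assumption about what the repository actually writes.

```go
package main

import "fmt"

// legacyDropIn renders a config-version-2 style insecure registry entry,
// the shape the commit message says containerd 2.x rejects.
func legacyDropIn(registry string) string {
	return fmt.Sprintf(`version = 2

[plugins."io.containerd.grpc.v1.cri".registry.configs."%s".tls]
  insecure_skip_verify = true
`, registry)
}

// v3DropIn renders a config-version-3 drop-in that only points containerd
// at a directory of per-host configuration (certsDir is hypothetical).
func v3DropIn(certsDir string) string {
	return fmt.Sprintf(`version = 3

[plugins."io.containerd.cri.v1.images".registry]
  config_path = "%s"
`, certsDir)
}

// hostsTOML renders the per-registry hosts.toml that carries skip_verify.
func hostsTOML(registry string) string {
	return fmt.Sprintf(`server = "https://%[1]s"

[host."https://%[1]s"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
`, registry)
}

func main() {
	registry := "10.0.0.1:5000" // hypothetical in-cluster registry address
	fmt.Print(legacyDropIn(registry))
	fmt.Println("---")
	fmt.Print(v3DropIn("/etc/example/certs.d"))
	fmt.Println("---")
	fmt.Print(hostsTOML(registry))
}
```

In the newer layout the TLS exception moves out of the main drop-in into a per-host file, so a migration would need to rewrite one file and create another, which fits the commit's description of an idempotent per-node command.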

cmd/local-artifact-mirror/root.go

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ func RootCmd(cli *CLI) *cobra.Command {
 
 	cmd.AddCommand(ServeCmd(cli))
 	cmd.AddCommand(PullCmd(cli))
+	cmd.AddCommand(MigrateContainerdConfigCmd(cli))
 
 	cmd.PersistentFlags().String("data-dir", ecv1beta1.DefaultDataDir, "Path to the data directory")

dev/dockerfiles/local-artifact-mirror/Dockerfile.ttlsh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FROM golang:1.26.2 AS build
+FROM golang:1.26.4 AS build
 
 WORKDIR /app

dev/dockerfiles/operator/Dockerfile.local

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FROM golang:1.26.2-alpine AS build
+FROM golang:1.26.4-alpine AS build
 
 RUN apk add --no-cache ca-certificates curl git make bash helm

dev/dockerfiles/operator/Dockerfile.ttlsh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FROM golang:1.26.2 AS build
+FROM golang:1.26.4 AS build
 
 WORKDIR /app
