CLOUDP-377831: Rewrite prepare local e2e run script #773
Julien-Ben wants to merge 15 commits into master
Conversation
| mtls:
|   mode: STRICT`, namespace)
|
| cmd := exec.Command("kubectl", "--context", cluster, "-n", namespace, "apply", "-f", "-")
what the hack! 😅
Let's use GroupVersionKind and Unstructured data with the YAML payload, essentially mimicking kubectl apply but talking to the API server directly. We don't need to fall back to kubectl.
Oh, I did it initially to avoid initializing a second (untyped) client, out of laziness.
But here is the fix: 318e91e
| orgId: "${OM_ORGID:-}"
| EOF
|
| # Note: create_image_registries_secret is already called by configure_operator.sh
Do you think this comment will still be useful later, assuming this gets merged?
| # We always create the image pull secret from the docker config.json which gives access to all necessary image repositories
| create_image_registries_secret
| # In local runs, configure_container_auth.sh already creates the pull secret in all clusters
| # before this script runs. Skip the redundant call to avoid a ~0.9s delete+recreate cycle
I am not sure these comments make sense here rather than as inline comments in the PR. All of those changes are tied to the prior implementation.
| # across all clusters. In CI (Evergreen), this script is the only place the pull secret is
| # created (e2e.sh sources it directly without configure_container_auth.sh).
| if [[ "${RUNNING_IN_EVG:-false}" == "true" ]]; then
|   create_image_registries_secret
Can't we instead update create_image_registries_secret to skip if the secret already exists? That removes the RUNNING_IN_EVG check.
| }
|
| // Phase 7: Copy or build kubectl-mongodb binary
| func setupKubectlMongodb(cfg config) error {
nit: I think we could split the function into 2 subfunctions for readability:
- checkExists()
- buildFromSource()
basically your comments but as subfunctions.
| }
|
| // Write central_cluster name
| writeConfigFile(cfg.multiClusterConfigDir, "central_cluster", secret.Data["central_cluster"], collectError)
Where do we write to here? Do we even save anything by doing this concurrently?
Summary
The local e2e preparation script (prepare_local_e2e_run.sh) was taking ~45-51s on multi-cluster setups, mostly due to sequential kubectl subprocess spawning. This PR brings it down to ~15-19s.
Every local e2e run starts with this script, so saving 30s in each iteration adds up fast during development.
What changed
Go rewrite of multi-cluster prep: The biggest win. Replaced the bash functions prepare_multi_cluster_e2e_run and configure_multi_cluster_environment with a Go binary (scripts/dev/prepare-multi-cluster/main.go) that uses client-go directly instead of shelling out to kubectl. Operations across clusters run in parallel via goroutines. This alone took the multi-cluster prep step from ~11s to ~1.4s.
The bash version used a helm chart to template RBAC resources (a ServiceAccount and a ClusterRoleBinding). The Go version creates these directly via the typed API with idempotent get-or-create-or-update logic. The helm chart is unchanged and still used by CI; only the local dev path bypasses it.
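The two ideas in this item, idempotent get-or-create-or-update and one goroutine per cluster, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in main.go: the store type is a hypothetical in-memory stand-in for a cluster's typed client, and ensure, ensureEverywhere, and the resource names are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNotFound = errors.New("not found")

// store is a hypothetical stand-in for one cluster's API (name -> spec).
// Each store is touched by a single goroutine here, so it needs no lock.
type store struct {
	objects map[string]string
}

func (s *store) get(name string) (string, error) {
	spec, ok := s.objects[name]
	if !ok {
		return "", errNotFound
	}
	return spec, nil
}

func (s *store) create(name, spec string) { s.objects[name] = spec }
func (s *store) update(name, spec string) { s.objects[name] = spec }

// ensure is get-or-create-or-update: it is safe to call repeatedly and
// reports which action it took.
func ensure(s *store, name, spec string) (string, error) {
	current, err := s.get(name)
	switch {
	case errors.Is(err, errNotFound):
		s.create(name, spec)
		return "created", nil
	case err != nil:
		return "", err
	case current != spec:
		s.update(name, spec)
		return "updated", nil
	default:
		return "unchanged", nil
	}
}

// ensureEverywhere runs ensure against every cluster concurrently.
// Each goroutine writes only its own slot of errs, so no mutex is needed.
func ensureEverywhere(clusters map[string]*store, name, spec string) error {
	var wg sync.WaitGroup
	errs := make([]error, len(clusters))
	slot := 0
	for clusterName, s := range clusters {
		wg.Add(1)
		go func(slot int, clusterName string, s *store) {
			defer wg.Done()
			if _, err := ensure(s, name, spec); err != nil {
				errs[slot] = fmt.Errorf("cluster %s: %w", clusterName, err)
			}
		}(slot, clusterName, s)
		slot++
	}
	wg.Wait()
	return errors.Join(errs...) // nil when every slot is nil
}

func main() {
	clusters := map[string]*store{
		"central":  {objects: map[string]string{}},
		"member-1": {objects: map[string]string{"e2e-sa": "stale"}},
	}
	if err := ensureEverywhere(clusters, "e2e-sa", "desired"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("all clusters converged")
}
```

Because ensure converges to the desired spec whatever the starting state, rerunning the preparation step against an already-prepared cluster is a no-op rather than a delete and recreate.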
Bash parallelization: make install and delete_om_projects.sh are backgrounded early in the script and waited on before the deploy step.
Redundant work elimination: create_image_registries_secret was called twice per run (once in configure_container_auth.sh, again in configure_operator.sh). Gated the second call behind RUNNING_IN_EVG so it only runs in CI. Similarly, my-project/my-credentials creation is skipped in local multi-cluster runs since the Go binary handles it.
Smaller wins: kubectl delete + create converted to kubectl apply everywhere. Makefile sentinel file skips CRD regeneration when api/*.go hasn't changed. go build -o bin/X instead of go run (build cache makes repeated runs instant).
CI impact
All changes are gated to only affect local runs. CI paths (e2e.sh, single_e2e.sh) are unchanged in behavior: RUNNING_IN_EVG guards ensure CI still creates pull secrets and config resources through the existing shell path. The Go binary is only invoked from prepare_local_e2e_run.sh.
Follow-ups
Two more changes can be considered to further reduce the total time (by ~3-5s):
- … --parallel flag).
- … reset.go (not only across clusters, but also within each cluster).
Proof of Work
Manually ran different e2e tests in different variants to assert correctness. This script doesn't run in CI.
Performance
Ran locally against a 4-cluster KIND setup (1 central + 3 members), where the changes matter most. Timed 3 runs of each version, both from a cold start (no compilation cache, CRDs need regenerating) and warm (caches populated).
Checklist