wanda shortcut #307

aslonnie · 2025-10-08T07:23:24Z

No description provided.

gemini-code-assist · 2025-10-08T07:23:40Z

Summary of Changes

Hello @aslonnie, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This PR enhances wanda steps in the CI pipeline by adding cache hit prediction and step-level priority configuration. It introduces a new predict_cache.go file with logic to determine if a wanda build will hit the cache by checking registry tags. The raycicmd package is updated to integrate this prediction, marking cached steps in Buildkite and allowing priority to be set per wanda step, overriding global settings. Relevant tests have been added for both new features.

Highlights

Intent: This pull request introduces a 'wanda shortcut' by implementing cache hit prediction for wanda steps in the CI pipeline. This allows the system to determine if a wanda build would result in a cache hit without actually executing the build, potentially speeding up CI runs by marking steps as cached. Additionally, it adds support for specifying a step-level priority for wanda steps, which can override the global builder priority.
Key Changes:
1. Cache Hit Prediction Logic: A new file wanda/predict_cache.go is added, containing PredictCacheHit and resolveBases functions. PredictCacheHit calculates the build input digest for a given wanda spec file and checks for the existence of the corresponding cache tag in the container registry using go-containerregistry. It handles scenarios like disabled caching, forced rebuilds, or local-only builds.
2. Integration into raycicmd: The wandaStep struct in raycicmd/wanda.go now includes priority (as an optional integer) and cacheHit (boolean) fields. The convert function is updated to parse the priority from the step configuration and call the new predictCacheHit function. The buildkiteStep function is modified to apply the step-level priority if provided, and to append [cache hit] to the Buildkite step label if cacheHit is true.
3. Configuration Update: The wandaStepAllowedKeys list in raycicmd/rayci_pipeline.go is updated to include "priority", enabling its use in wanda step definitions.
4. Testing: New unit tests are added in raycicmd/wanda_test.go to verify the priority handling and in wanda/predict_cache_test.go to cover various PredictCacheHit scenarios.
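The prediction flow described in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in wanda/predict_cache.go: `predictConfig`, `tagChecker`, `memRegistry`, `inputDigest`, and the `cache-` tag prefix are all hypothetical, the digest is a plain SHA-256 over sorted inputs rather than wanda's real build input digest, and the registry is replaced by an in-memory stand-in.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// predictConfig is a hypothetical stand-in for the forge configuration.
type predictConfig struct {
	DisableCache bool   // caching is turned off for this spec
	Rebuild      bool   // a forced rebuild was requested
	WorkRepo     string // empty means a local-only build
}

// tagChecker abstracts the question "does this tag exist in the registry?".
type tagChecker interface {
	HasTag(repo, tag string) (bool, error)
}

// memRegistry is an in-memory tagChecker used only for this sketch.
type memRegistry map[string]bool

func (m memRegistry) HasTag(repo, tag string) (bool, error) {
	return m[repo+":"+tag], nil
}

// inputDigest hashes the build inputs in a stable (sorted) order.
func inputDigest(inputs map[string]string) string {
	keys := make([]string, 0, len(inputs))
	for k := range inputs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%s=%s\n", k, inputs[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// predictCacheHit short-circuits the cases where a hit is impossible,
// then asks the registry whether the digest-derived tag exists.
func predictCacheHit(cfg predictConfig, inputs map[string]string, reg tagChecker) (bool, error) {
	if cfg.DisableCache || cfg.Rebuild || cfg.WorkRepo == "" {
		return false, nil
	}
	return reg.HasTag(cfg.WorkRepo, "cache-"+inputDigest(inputs))
}

func main() {
	inputs := map[string]string{"Dockerfile": "FROM ubuntu", "arg": "1"}
	reg := memRegistry{"repo:cache-" + inputDigest(inputs): true}
	hit, err := predictCacheHit(predictConfig{WorkRepo: "repo"}, inputs, reg)
	fmt.Println(hit, err)
}
```

Putting the registry behind a small interface keeps the short-circuit cases testable without network access, which appears to be what the new tests for disabled caching, local-only builds, and forced rebuilds exercise.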

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copilot

Pull Request Overview

Adds predictive cache hit functionality for Wanda builds and introduces step-level priority overrides with corresponding Buildkite label annotation for cache hits. Key changes include implementing PredictCacheHit with base image resolution and integrating cache prediction into pipeline conversion, plus allowing per-step priority to override global builder priority.

Add PredictCacheHit logic to pre-check remote cache availability.
Add step-level priority and cache hit label decoration to Buildkite steps.
Extend allowed wanda step keys and tests for priority and negative cache prediction scenarios.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

File	Description
wanda/predict_cache.go	Adds cache prediction logic and base image resolution utilities.
wanda/predict_cache_test.go	Adds tests for disabled caching, local-only builds, and forced rebuild behavior.
raycicmd/wanda.go	Integrates cache prediction and adds step-level priority & cache-hit labeling.
raycicmd/wanda_test.go	Tests step-level priority override behavior.
raycicmd/rayci_pipeline.go	Allows priority key in wanda step configuration.
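The step-level priority override and the cache-hit label decoration listed above might look roughly like this. The function names `stepPriority` and `stepLabel` are hypothetical, and the exact spacing of the ` [cache hit]` suffix is an assumption; only the override rule and the marker text come from the PR description.

```go
package main

import "fmt"

// stepPriority returns the step-level priority when one is set,
// and falls back to the global builder priority otherwise.
func stepPriority(global int, step *int) int {
	if step != nil {
		return *step
	}
	return global
}

// stepLabel appends the cache-hit marker to a Buildkite step label.
func stepLabel(label string, cacheHit bool) string {
	if cacheHit {
		return label + " [cache hit]"
	}
	return label
}

func main() {
	p := 5
	fmt.Println(stepPriority(1, &p), stepPriority(1, nil))
	fmt.Println(stepLabel("wanda: base", true))
}
```

Using a pointer for the optional priority lets an explicit step priority of 0 override a non-zero global value, which a plain int with a zero-value check could not express.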

Copilot · 2025-10-08T07:25:09Z

wanda/predict_cache.go

+	for _, from := range froms {
+		// Local images (prefixed with @) - skip for prediction
+		// We can't reliably predict local image digests without docker
+		if from[0] == '@' {


Indexing from[0] without checking length will panic if an empty string is present in froms. Add a length guard (e.g., if len(from) > 0 && from[0] == '@').

Suggested change

-		if from[0] == '@' {
+		if len(from) > 0 && from[0] == '@' {
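An alternative to the explicit length guard is `strings.HasPrefix`, which is safe on an empty string. A minimal sketch, where `isLocalImage` is a hypothetical helper name that does not appear in the PR:

```go
package main

import (
	"fmt"
	"strings"
)

// isLocalImage reports whether a base reference uses the "@" local-image prefix.
// strings.HasPrefix never indexes past the end, so "" is handled without a guard.
func isLocalImage(from string) bool {
	return strings.HasPrefix(from, "@")
}

func main() {
	for _, from := range []string{"", "@local", "ubuntu:22.04"} {
		fmt.Printf("%q local=%v\n", from, isLocalImage(from))
	}
}
```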

Copilot · 2025-10-08T07:25:10Z

wanda/predict_cache.go

+		// We can't reliably predict local image digests without docker
+		if from[0] == '@' {
+			log.Printf("skipping local image %s for cache prediction", from)
+			return nil, fmt.Errorf("cannot predict cache for local images: %s", from)


Comment says 'skip for prediction' but the code returns an error, aborting prediction. Either update the comment to reflect the behavior or change the code to actually skip (e.g., continue) without returning an error.

Suggested change

-			return nil, fmt.Errorf("cannot predict cache for local images: %s", from)
+			continue
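A third option, besides fixing the comment or switching to `continue`, is to keep the early return but make the condition identifiable, so the caller can treat it as "no prediction" rather than as a failure. The sketch below assumes a hypothetical sentinel `errLocalBase` and hypothetical helpers `checkBases` and `predictable`; none of these exist in the PR. Whether silently skipping a local base is safe depends on whether its digest feeds the input digest, which this sketch does not attempt to answer.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errLocalBase is a hypothetical sentinel meaning "prediction is not possible".
var errLocalBase = errors.New("local base image")

// checkBases rejects any local ("@"-prefixed) base, wrapping the sentinel.
func checkBases(froms []string) error {
	for _, from := range froms {
		if strings.HasPrefix(from, "@") {
			return fmt.Errorf("cannot predict cache for %q: %w", from, errLocalBase)
		}
	}
	return nil
}

// predictable reports whether prediction can proceed. A local base is an
// expected condition and yields (false, nil); any other error is returned.
func predictable(froms []string) (bool, error) {
	err := checkBases(froms)
	if errors.Is(err, errLocalBase) {
		return false, nil
	}
	return err == nil, err
}

func main() {
	fmt.Println(predictable([]string{"ubuntu:22.04"}))
	fmt.Println(predictable([]string{"ubuntu:22.04", "@base"}))
}
```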

Copilot · 2025-10-08T07:25:10Z

wanda/predict_cache.go

+}
+
+// resolveBases resolves base images for cache prediction
+func resolveBases(froms []string, config *ForgeConfig, workDir string) (map[string]*imageSource, error) {


The workDir parameter is never used in this function. Remove it if unnecessary, or incorporate it if it was intended for resolving relative references.

Suggested change

-func resolveBases(froms []string, config *ForgeConfig, workDir string) (map[string]*imageSource, error) {
+func resolveBases(froms []string, config *ForgeConfig) (map[string]*imageSource, error) {

Copilot · 2025-10-08T07:25:11Z

raycicmd/wanda.go

+	// Set environment variables for the prediction
+	// This allows the wanda package to expand variables in the spec file
+	for k, v := range envs {
+		_ = setEnvIfNotSet(k, v)


Ignoring the error from setEnvIfNotSet discards potential OS-level failures. Capture and log (or propagate) the error instead of using the blank identifier.

Suggested change

-		_ = setEnvIfNotSet(k, v)
+		if err := setEnvIfNotSet(k, v); err != nil {
+			log.Printf("failed to set environment variable %s=%s: %v", k, v, err)
+		}

Copilot · 2025-10-08T07:25:11Z

raycicmd/wanda.go

+func setEnvIfNotSet(key, value string) error {
+	if os.Getenv(key) == "" {
+		return os.Setenv(key, value)
+	}
+	return nil
+}


This function returns an error that all callers currently ignore. Either handle the error at call sites or document that failures are intentionally suppressed.

gemini-code-assist

Code Review

This pull request introduces a feature to predict wanda cache hits and allow setting step-level priorities. The implementation is mostly solid, with good test coverage for the new functionality. I have a few suggestions to improve robustness and maintainability, mainly around avoiding global state mutation and improving error handling. Please see my detailed comments below.

gemini-code-assist · 2025-10-08T07:25:45Z

raycicmd/wanda.go

+func (c *wandaConverter) predictCacheHit(file string, envs map[string]string) bool {
+	// Only predict cache hits if we have the necessary config
+	if c.config.CIWorkRepo == "" {
+		return false
+	}
+
+	// Set environment variables for the prediction
+	// This allows the wanda package to expand variables in the spec file
+	for k, v := range envs {
+		_ = setEnvIfNotSet(k, v)
+	}
+
+	forgeConfig := &wanda.ForgeConfig{
+		WorkDir:    ".",
+		WorkRepo:   c.config.CIWorkRepo,
+		NamePrefix: c.config.ForgePrefix,
+		BuildID:    c.info.buildID,
+		Epoch:      wanda.DefaultCacheEpoch(),
+		RayCI:      true,
+		Rebuild:    false,
+	}
+
+	cacheHit, err := wanda.PredictCacheHit(file, forgeConfig)
+	if err != nil {
+		// If prediction fails, log the error but don't fail the build
+		log.Printf("failed to predict cache hit for %s: %v", file, err)
+		return false
+	}
+
+	return cacheHit
+}


Modifying the process-wide environment using os.Setenv is not thread-safe and can lead to race conditions and unpredictable behavior if rayci ever runs converters concurrently. The environment is a global state.

A safer approach is to pass the environment information explicitly without mutating the global state. The wanda.PredictCacheHit function (also introduced in this PR) can be modified to accept a lookup function for environment variables.

Suggested Refactoring:

Export lookupFunc in wanda/spec.go:
Rename lookupFunc to LookupFunc to make it public.

Update wanda.PredictCacheHit in wanda/predict_cache.go:
Modify its signature to accept the lookup function.

// In wanda/predict_cache.go
// The wanda.LookupFunc type needs to be exported from the wanda package.
func PredictCacheHit(specFile string, config *ForgeConfig, lookupEnv wanda.LookupFunc) (bool, error) {
	// ...
	spec, err := parseSpecFile(specFile)
	if err != nil {
		return false, fmt.Errorf("parse spec file: %w", err)
	}

	if lookupEnv == nil {
		lookupEnv = os.LookupEnv
	}

	// Expand env variables just like the actual build does
	spec = spec.expandVar(lookupEnv)
	// ...
}

Update predictCacheHit in this file (raycicmd/wanda.go):
This suggestion replaces the function body to create a lookup function and pass it to wanda.PredictCacheHit, removing the need for setEnvIfNotSet.

func (c *wandaConverter) predictCacheHit(file string, envs map[string]string) bool {
	// Only predict cache hits if we have the necessary config
	if c.config.CIWorkRepo == "" {
		return false
	}

	// Create a lookup function for environment variables to avoid mutating global state.
	// This allows the wanda package to expand variables in the spec file safely.
	lookup := func(key string) (string, bool) {
		if val, ok := envs[key]; ok {
			return val, true
		}
		return os.LookupEnv(key)
	}

	forgeConfig := &wanda.ForgeConfig{
		WorkDir:    ".",
		WorkRepo:   c.config.CIWorkRepo,
		NamePrefix: c.config.ForgePrefix,
		BuildID:    c.info.buildID,
		Epoch:      wanda.DefaultCacheEpoch(),
		RayCI:      true,
		Rebuild:    false,
	}

	// Assumes PredictCacheHit is modified to accept a lookup function.
	cacheHit, err := wanda.PredictCacheHit(file, forgeConfig, lookup)
	if err != nil {
		// If prediction fails, log the error but don't fail the build
		log.Printf("failed to predict cache hit for %s: %v", file, err)
		return false
	}

	return cacheHit
}
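The layered lookup idea can be tried in isolation with the standard library. In this sketch `layeredLookup` and `expand` are hypothetical helpers, and `os.Expand` stands in for wanda's own variable expansion, which may behave differently (for example in how it treats unknown variables).

```go
package main

import (
	"fmt"
	"os"
)

// layeredLookup prefers the step-level envs and falls back to the
// process environment, without ever calling os.Setenv.
func layeredLookup(envs map[string]string) func(string) (string, bool) {
	return func(key string) (string, bool) {
		if v, ok := envs[key]; ok {
			return v, true
		}
		return os.LookupEnv(key)
	}
}

// expand substitutes $VAR and ${VAR} using lookup; unknown variables become "".
func expand(s string, lookup func(string) (string, bool)) string {
	return os.Expand(s, func(key string) string {
		v, _ := lookup(key)
		return v
	})
}

func main() {
	lookup := layeredLookup(map[string]string{"PYTHON_VERSION": "3.11"})
	fmt.Println(expand("ray-py$PYTHON_VERSION", lookup))
}
```

One behavioral difference is worth checking before adopting this shape: the lookup above gives the step's envs precedence over the process environment, whereas setEnvIfNotSet only filled in variables that were not already set, so the process environment won.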

gemini-code-assist · 2025-10-08T07:25:45Z

raycicmd/wanda.go

+func setEnvIfNotSet(key, value string) error {
+	if os.Getenv(key) == "" {
+		return os.Setenv(key, value)
+	}
+	return nil
+}


This function will no longer be needed after refactoring predictCacheHit as suggested in my other comment to avoid using global environment variables.

As a side note, os.Getenv(key) == "" is not a reliable way to check if an environment variable is unset, as it could be explicitly set to an empty string. The correct way is to use _, ok := os.LookupEnv(key).
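The difference between the two checks shows up when a variable is explicitly set to the empty string. A minimal sketch, where `ensureEnv` is a hypothetical replacement helper and the `RAYCI_SKETCH_EMPTY` variable name is made up for the demonstration:

```go
package main

import (
	"fmt"
	"os"
)

// ensureEnv sets key to fallback only when key is truly unset,
// and returns the effective value. An explicit empty value is kept.
func ensureEnv(key, fallback string) (string, error) {
	if v, ok := os.LookupEnv(key); ok {
		return v, nil
	}
	if err := os.Setenv(key, fallback); err != nil {
		return "", fmt.Errorf("set %s: %w", key, err)
	}
	return fallback, nil
}

func main() {
	// The variable is set, but to an empty string.
	if err := os.Setenv("RAYCI_SKETCH_EMPTY", ""); err != nil {
		fmt.Println("setenv failed:", err)
		return
	}
	v, err := ensureEnv("RAYCI_SKETCH_EMPTY", "fallback")
	fmt.Printf("%q %v\n", v, err)
}
```

With an `os.Getenv(key) == ""` check, the same call would have overwritten the explicit empty value with the fallback.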

gemini-code-assist · 2025-10-08T07:25:45Z

wanda/predict_cache.go

+	_, err = remote.Get(ct, remoteOpts...)
+	if err != nil {
+		// Cache miss or error checking
+		return false, nil
+	}


This logic swallows all errors from remote.Get, treating them as a cache miss. This can hide underlying issues like authentication failures or network problems, making them hard to debug. It would be better to distinguish between a 'not found' error (which is a cache miss) and other unexpected errors.

You can achieve this by inspecting the error type and status code. You will need to add the following imports:

import (
	"errors"
	"net/http"

	"github.com/google/go-containerregistry/pkg/v1/remote/transport"
)

_, err = remote.Get(ct, remoteOpts...)
if err != nil {
	var terr *transport.Error
	if errors.As(err, &terr) && terr.StatusCode == http.StatusNotFound {
		// This is a cache miss, not an error.
		return false, nil
	}
	// Any other error is unexpected and should be returned to be logged by the caller.
	return false, fmt.Errorf("failed to check remote cache tag %q: %w", cacheTag, err)
}
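The same classification can be exercised without a registry by substituting a local error type. In the sketch below `statusError` is a hypothetical stand-in for the registry transport error and `classifyLookup` is a hypothetical helper; only the `errors.As` pattern and the 404-means-miss rule are taken from the suggestion above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical stand-in for a registry transport error.
type statusError struct{ StatusCode int }

func (e *statusError) Error() string {
	return fmt.Sprintf("registry responded with status %d", e.StatusCode)
}

// classifyLookup maps a tag lookup result to (hit, err):
// nil means a hit, 404 means a miss, anything else is surfaced.
func classifyLookup(err error) (bool, error) {
	if err == nil {
		return true, nil
	}
	var se *statusError
	if errors.As(err, &se) && se.StatusCode == http.StatusNotFound {
		return false, nil
	}
	return false, fmt.Errorf("check remote cache tag: %w", err)
}

func main() {
	// errors.As also sees through wrapping added with %w.
	notFound := fmt.Errorf("get manifest: %w", &statusError{StatusCode: http.StatusNotFound})
	denied := &statusError{StatusCode: http.StatusUnauthorized}
	fmt.Println(classifyLookup(nil))
	fmt.Println(classifyLookup(notFound))
	fmt.Println(classifyLookup(denied))
}
```

Since the caller in raycicmd already logs prediction errors and falls back to "no cache hit", surfacing authentication or network failures this way would not fail the build; it would only make them visible in the log.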

aslonnie added 2 commits October 8, 2025 07:07

[rayci] allow wanda field to specify priority

40d2512

[rayci] predict wanda cache result

95109ab

Copilot AI review requested due to automatic review settings October 8, 2025 07:23

Copilot AI reviewed Oct 8, 2025

View reviewed changes

gemini-code-assist bot reviewed Oct 8, 2025

View reviewed changes

aslonnie marked this pull request as draft October 8, 2025 07:26
