test: add tart VM lifecycle integration tests #55

Open
STRRL wants to merge 4 commits into main from feat/tart-integration-tests
STRRL commented Apr 7, 2026

Summary

  • Add BDD integration tests for tart VM lifecycle (pull, clone, start, IP, exec, stop, delete, cleanup)
  • Use ghcr.io/cirruslabs/macos-tahoe-base:latest as a smaller test image (no Xcode)
  • Tart steps auto-skip when the tart binary is not available
  • CI matrix adds a "Tart" job on a self-hosted macOS arm64 runner with nested virtualization

Test plan

  • Verify tart tests pass on the self-hosted macOS arm64 runner
  • Verify non-tart integration tests still pass
  • Verify tart tests skip gracefully on runners without tart installed

cloudflare-workers-and-pages bot commented Apr 7, 2026

Deploying elastic-fruit-runner with Cloudflare Pages

Latest commit: 90f9e15
Status: ✅  Deploy successful!
Preview URL: https://571fd2af.elastic-fruit-runner.pages.dev
Branch Preview URL: https://feat-tart-integration-tests.elastic-fruit-runner.pages.dev

Add BDD scenarios for tart manager operations with nested virtualization:
- Pull, clone, start, IP discovery, SSH exec, stop, delete
- List and cleanup orphaned VMs with prefix matching

Uses ghcr.io/cirruslabs/macos-tahoe-base:latest as a smaller test image
(configurable via EFR_TEST_TART_IMAGE). Tart steps auto-skip via
godog.ErrPending when tart binary is not available.

CI matrix gains a "Tart" job running on self-hosted macOS arm64 runner
with nested virtualization enabled.
STRRL force-pushed the feat/tart-integration-tests branch from 3a115bb to 660b772 on April 7, 2026 05:52
@chatgpt-codex-connector

💡 Codex Review


P1: Run Tart integration matrix on a macOS runner

The integration matrix now runs entirely on [self-hosted, linux, arm64], including the new Tart case, so the Tart lifecycle scenario is not exercised on the intended platform. In this repo Tart is the macOS VM path (README), and the new step "a tart manager" explicitly marks the scenario pending when tart is unavailable (steps_test.go), so Linux runners can report a green job without validating Tart behavior. Use a macOS label for the Tart matrix entry (or a matrix-specific runs-on) so that this test provides real coverage.


sc.Step(`^I stop and delete the VM$`, func() error {
	if err := state.tartMgr.Stop(context.Background(), state.tartVMName); err != nil {
		return fmt.Errorf("stop VM: %w", err)
	}
	return state.tartMgr.Delete(context.Background(), state.tartVMName)
})

P2: Ensure Tart VMs are torn down on step failure

VM cleanup is only performed in this explicit scenario step, so if any earlier Tart step fails (pull/clone/start/ip/exec), the remaining steps are skipped and stop/delete never runs. Because these tests run on persistent self-hosted machines, leaked VMs can accumulate and interfere with subsequent jobs. Add unconditional teardown in sc.After (or equivalent) that attempts to stop/delete state.tartVMName whenever it was created.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

codecov bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/api/server.go | 66.66% | 1 Missing ⚠️ |

| Flag | Coverage Δ |
|---|---|
| integration-GitHub App | 42.50% <66.66%> (+5.07%) ⬆️ |
| integration-PAT | 41.21% <66.66%> (-0.72%) ⬇️ |
| integration-Tart | ? |
| unittests | 20.42% <0.00%> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| internal/management/jobs.go | 73.91% <ø> (ø) |
| internal/api/server.go | 78.90% <66.66%> (ø) |

... and 1 file with indirect coverage changes

STRRL added 2 commits April 6, 2026 22:55
Use per-matrix runs-on so PAT/GitHub App tests run on Linux arm64
while Tart tests run on macOS arm64 (required for Apple
Virtualization.framework and nested virtualization).
chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 45b7b78937

Scenario: list and cleanup orphaned VMs
Given a tart manager
When I clone a VM with a random name
P2: Pull image before cloning in orphan-cleanup scenario

The "list and cleanup orphaned VMs" scenario clones immediately without first ensuring that the test image is present locally, so on a clean runner (or when this scenario is run in isolation) Clone can fail before any cleanup assertions run. This makes the scenario depend on the "pull, clone, start, exec, and cleanup a VM" scenario having run first, instead of being self-contained.


… API

GitHub Actions API sends job results in lowercase ("succeeded", "failed")
but the mapping code expected Title Case ("Succeeded", "Failed"), causing
all completed jobs to show as FAIL in the dashboard (issue #68).
"failed": {},
"Succeeded": {},
"Failed": {},
}
STRRL (Member, Author) commented:

😮‍💨 slop
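
A possible alternative to listing both casings in the set is to normalize the incoming value once before the lookup. This is only a sketch under that assumption: knownResults and normalizeResult are hypothetical names, and the set contains just the two values quoted in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// knownResults holds the lowercase job results named in the commit message.
// The real mapping may contain more entries.
var knownResults = map[string]struct{}{
	"succeeded": {},
	"failed":    {},
}

// normalizeResult lowercases and trims the raw value, then reports whether
// it is a known result. Title Case and lowercase inputs map to the same key.
func normalizeResult(raw string) (string, bool) {
	r := strings.ToLower(strings.TrimSpace(raw))
	_, ok := knownResults[r]
	return r, ok
}

func main() {
	for _, raw := range []string{"succeeded", "Succeeded", "Failed", "cancelled"} {
		r, ok := normalizeResult(raw)
		fmt.Printf("%q -> %q known=%v\n", raw, r, ok)
	}
}
```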

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90f9e155d2

Comment on lines +762 to +763
_ = state.tartMgr.Stop(context.Background(), name)
_ = state.tartMgr.Delete(context.Background(), name)
P2: Propagate tart cleanup failures instead of discarding them

This step ignores both Stop and Delete errors, so the scenario can still pass even when cleanup fails for prefixed VMs (especially stale orphans from prior runs). Because the next assertion only verifies the current cloned VM name, failures on other matched VMs are silently masked, which can leave orphaned Tart VMs behind and hide regressions in the cleanup path.
