chore(ci): Prepare macOS workflows for dual Cirrus/Bitrise runners #6274
itaybre wants to merge 1 commit
Add runner_provider matrix dimension to all macOS CI jobs so they can run on both Cirrus and Bitrise. Bitrise jobs use continue-on-error so they won't block CI. Jobs won't actually run on Bitrise until the pool is provisioned; this prepares the ground to enable it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Semver Impact of This PR
⚪ None (no version bump detected)

📋 Changelog Preview
This is how your changes will appear in the changelog.

🤖 This preview updates automatically when you update the PR.
iOS (legacy) Performance metrics 🚀
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 7ac3378+dirty | 1213.37 ms | 1218.15 ms | 4.78 ms |
| 1122a96+dirty | 3823.10 ms | 1218.64 ms | -2604.46 ms |
| 5748023+dirty | 3840.49 ms | 1227.43 ms | -2613.05 ms |
| b0d3373+dirty | 3831.75 ms | 1227.29 ms | -2604.46 ms |
| 6177334+dirty | 3834.85 ms | 1217.58 ms | -2617.28 ms |
| 3ce5254+dirty | 1219.93 ms | 1221.90 ms | 1.96 ms |
| 5569641+dirty | 3839.22 ms | 1231.30 ms | -2607.91 ms |
| 5257d80+dirty | 3854.39 ms | 1234.28 ms | -2620.11 ms |
| 882f8ae+dirty | 3840.30 ms | 1224.41 ms | -2615.88 ms |
| 5c1e987+dirty | 1204.30 ms | 1222.15 ms | 17.85 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 7ac3378+dirty | 3.38 MiB | 4.76 MiB | 1.38 MiB |
| 1122a96+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| 5748023+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| b0d3373+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| 6177334+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| 3ce5254+dirty | 3.38 MiB | 4.76 MiB | 1.38 MiB |
| 5569641+dirty | 5.15 MiB | 6.67 MiB | 1.51 MiB |
| 5257d80+dirty | 5.15 MiB | 6.69 MiB | 1.54 MiB |
| 882f8ae+dirty | 5.15 MiB | 6.70 MiB | 1.54 MiB |
| 5c1e987+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
📲 Install Builds: Android
    rn-architecture: ['legacy', 'new']
    platform: ["ios", "android"]
    runner_provider: ["cirrus", "bitrise"]
    include:
max-parallel: 2 can be consumed by queued unprovisioned bitrise jobs, starving cirrus work
The metrics job sets max-parallel: 2 to stay under Sauce Labs' 3-session limit. With the new runner_provider dimension, the iOS matrix now creates 2 bitrise jobs (one per rn-architecture) that target the unprovisioned bitrise_pool_name:tahoe runner. GitHub Actions counts jobs that are dispatched-but-queued (waiting for a runner) against max-parallel. If GitHub schedules the 2 bitrise iOS jobs first, they can occupy both parallel slots while sitting in the queue indefinitely (no runner exists), preventing the executable cirrus iOS/Android jobs from starting until the queued jobs time out. continue-on-error: true handles eventual failure but does not relieve the wall-clock blocking while the bitrise jobs wait. Note: scheduling order is non-deterministic, so this starvation will not occur on every run.
Evidence
- `e2e-v2.yml` `metrics` job sets `max-parallel: 2` with matrix `runner_provider: ["cirrus", "bitrise"]` and excludes only `android`+`bitrise`, yielding 2 bitrise iOS jobs (one per `rn-architecture`).
- `runs-on` resolves to `["bitrise_pool_name:tahoe"]` for bitrise jobs, an unprovisioned pool per the PR description, so those jobs queue with no runner to pick them up.
- The per-platform skip is implemented as step-level `if:` conditions (e.g. `platform-check`), not a job-level gate, so all matrix jobs (including bitrise) are still created and occupy scheduling slots.
- GitHub Actions counts dispatched-but-queued jobs against `max-parallel`; if the 2 bitrise jobs are scheduled first they fill both slots until they time out, blocking cirrus jobs that could actually run.
Identified by Warden code-review · CY2-CUC
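
To make the slot arithmetic in this finding concrete, here is a minimal sketch of the `metrics` matrix as the evidence describes it. It ignores whatever the real `include:` entries add, and the workflow name, the Cirrus-side `runs-on` label, and the single step are placeholders rather than the workflow's actual contents.

```yaml
name: metrics-matrix-sketch   # hypothetical workflow name
on: pull_request

jobs:
  metrics:
    strategy:
      max-parallel: 2         # at most 2 matrix jobs of this job are in flight at once
      matrix:
        rn-architecture: ['legacy', 'new']
        platform: ["ios", "android"]
        runner_provider: ["cirrus", "bitrise"]
        exclude:
          # Android never runs on Bitrise, so drop those combinations.
          - platform: android
            runner_provider: bitrise
    # 2 x 2 x 2 = 8 combinations, minus 2 excluded = 6 jobs:
    # 4 on cirrus (ios/android x legacy/new) and 2 on bitrise (ios x legacy/new).
    continue-on-error: ${{ matrix.runner_provider == 'bitrise' }}
    # 'macos-latest' stands in for the real Cirrus runner label (assumption).
    runs-on: ${{ matrix.runner_provider == 'bitrise' && 'bitrise_pool_name:tahoe' || 'macos-latest' }}
    steps:
      - run: echo "metrics for ${{ matrix.platform }} on ${{ matrix.runner_provider }}"
```

With only two slots and two bitrise entries, the starvation case described above is exactly the one where both slots go to the bitrise entries first.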
Unprovisioned bitrise matrix entries in build jobs may delay or block downstream cirrus test jobs that need them
The PR adds runner_provider: bitrise matrix entries to build-ios (sample-application.yml:64) and react-native-build (e2e-v2.yml:259), targeting runner labels (bitrise_pool_name:tahoe/{macos_version}) for pools the PR states are not provisioned in this repo. Downstream test-ios (needs: [..., build-ios], sample-application.yml:333) and react-native-test (needs: [react-native-build, ...], e2e-v2.yml:426) wait for the ENTIRE upstream job — all matrix combinations — to reach a terminal state before any downstream matrix entry starts. continue-on-error: true on the bitrise entries only takes effect after a runner picks up the job and it fails; it does nothing while a job sits in the queued/waiting-for-runner state. If the bitrise entries queue waiting for a runner that never appears (rather than failing fast on an unmatched label), the cirrus test-ios/react-native-test jobs are delayed until those entries hit their runner-acquisition timeout, potentially stalling iOS/e2e CI on every PR. The exact behavior depends on how GitHub/Cirrus/Bitrise runner-group label matching handles labels with no registered runner (fail-fast vs. queue-and-timeout), which cannot be confirmed from the repo.
Evidence
- `test-ios` declares `needs: [diff_check, detect-changes, build-ios]` (sample-application.yml:333); GitHub Actions waits for every matrix combination of `build-ios` (including the new `bitrise` entries from line 64) before starting any `test-ios` entry.
- `react-native-test` declares `needs: [react-native-build, diff_check, detect-changes]` (e2e-v2.yml:426); same all-matrix wait applies to the `bitrise` entries added at line 259.
- `continue-on-error: ${{ matrix.runner_provider == 'bitrise' }}` (e.g. sample-application.yml:48, e2e-v2.yml:240) only suppresses failure after a runner runs the job; a job still waiting for a runner is not terminal and cannot satisfy `needs`.
- The PR description explicitly states the Bitrise runner pools are not provisioned for this repo, so no runner matches `bitrise_pool_name:*`, meaning those entries cannot run normally and rely on a queue/timeout to become terminal.
- Whether an unmatched runner label fails fast or queues until a timeout is infra-dependent and not determinable from the repo, so the magnitude of the delay (and whether downstream is meaningfully blocked) is uncertain.
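
One possible way to avoid creating unprovisioned entries at all is to drive the provider list from a repository variable, so the `bitrise` leg only exists once someone opts in. This is a sketch under assumptions, not something the PR does: `MACOS_RUNNER_PROVIDERS` is a hypothetical variable name, and the Cirrus-side runner label and the build step are placeholders.

```yaml
name: build-ios-sketch   # hypothetical workflow name
on: pull_request

jobs:
  build-ios:
    strategy:
      matrix:
        # MACOS_RUNNER_PROVIDERS is a hypothetical repository variable holding a JSON
        # array, e.g. '["cirrus", "bitrise"]' once the pool exists. When it is unset,
        # the fallback keeps only the cirrus leg, so nothing waits on a missing runner.
        runner_provider: ${{ fromJSON(vars.MACOS_RUNNER_PROVIDERS || '["cirrus"]') }}
    continue-on-error: ${{ matrix.runner_provider == 'bitrise' }}
    # 'macos-latest' stands in for the real Cirrus runner label (assumption).
    runs-on: ${{ matrix.runner_provider == 'bitrise' && 'bitrise_pool_name:tahoe' || 'macos-latest' }}
    steps:
      - run: echo "building on ${{ matrix.runner_provider }}"

  test-ios:
    # Still waits for every build-ios matrix entry, but now every entry has a runner.
    needs: [build-ios]
    runs-on: macos-latest   # placeholder label
    steps:
      - run: echo "testing"
```

The `vars` context is available inside `strategy`, which is why the matrix can be computed there; a job-level `if:` cannot read `matrix`, so it could not filter individual legs.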
New actions/cache@v4 steps use floating version tag instead of pinned commit SHA (.github/workflows/native-tests.yml:59)
The newly added Cache Ruby steps reference actions/cache@v4 without a commit SHA, while every other action reference in these workflows (e.g. actions/cache@27d5ce7f…, actions/checkout@df4cb1c…) uses a pinned SHA. A tag mutation or compromise of the v4 ref could execute arbitrary code on the runner with access to the environment, including secrets like SENTRY_AUTH_TOKEN and signing credentials (MATCH_PASSWORD, MATCH_GIT_PRIVATE_KEY).
Evidence
- `testflight.yml:28` (a changed file) adds `- uses: actions/cache@v4` for the new `Cache Ruby` step, while the adjacent `actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6` and `ruby/setup-ruby@afeafc3d1ab54a631816aba4c914a0081c12ff2f # v1` are pinned to SHAs.
- The same unpinned `actions/cache@v4` pattern is added across changed workflows: `sample-application.yml:78,263`, `size-analysis.yml:116`, `e2e-v2.yml:121,379`, `sample-application-expo.yml:69`, contrasting with the pre-existing pinned `e2e-v2.yml:153` (`actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5`).
- These workflows handle signing/upload secrets (`testflight.yml` runs `bundle install` and TestFlight upload), so a compromised floating tag runs in the same context as those secrets.
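
A pinned version of the `Cache Ruby` step might look like the sketch below. The SHA is the one the evidence quotes from `e2e-v2.yml:153`, which is tagged `v5`, so reusing it also moves the step from `v4` to `v5`; pinning `v4` instead would require looking up that tag's commit SHA. The `path` and `key` values are assumptions for illustration, not the PR's actual cache configuration.

```yaml
steps:
  - name: Cache Ruby
    # Pinned to a full commit SHA; the trailing comment records the human-readable tag.
    uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5
    with:
      # Hypothetical cache location for rbenv-installed Rubies.
      path: ~/.rbenv/versions
      # Hypothetical key; including runner_provider keeps the two providers' caches apart.
      key: ${{ runner.os }}-${{ matrix.runner_provider }}-rbenv-${{ hashFiles('.ruby-version') }}
```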
Both runner providers upload the same TestFlight build number, causing a duplicate-build conflict (.github/workflows/sample-application.yml:69)
In testflight.yml the upload_to_testflight job runs a runner_provider: ['cirrus', 'bitrise'] matrix. Both matrix legs execute yarn set-build-number ${{ github.run_number }} (identical across all matrix combinations in one workflow run) and then bundle exec fastlane ios upload_react_native_sample_to_testflight. Once Bitrise pools are provisioned, both legs will attempt to upload a build with the same build number; Apple App Store Connect rejects a build number that has already been processed, so the second upload will fail. Because continue-on-error is true only for bitrise, whichever provider uploads second determines the failure: if bitrise wins the race, the cirrus leg fails and breaks the workflow. There is no if: guard designating a single authoritative uploader.
Evidence
- testflight.yml line 24: matrix `runner_provider: ['cirrus', 'bitrise']` produces two parallel jobs from one workflow run.
- Line 69: `yarn set-build-number ${{ github.run_number }}` uses `github.run_number`, which is identical across all matrix legs of a single run, so both jobs set the same build number.
- Line 94: `bundle exec fastlane ios upload_react_native_sample_to_testflight` runs unconditionally in both legs with no `if: matrix.runner_provider == 'cirrus'` guard to pick one uploader.
- `continue-on-error: ${{ matrix.runner_provider == 'bitrise' }}` masks only the bitrise failure; if bitrise uploads first, the cirrus leg's duplicate upload fails and surfaces as a workflow failure.
- Currently latent: the PR notes Bitrise pools are not yet provisioned, so the conflict only manifests once both providers are active.
Both cirrus and bitrise TestFlight jobs upload with the same build number, risking duplicate-build conflicts and cirrus flakiness
The upload_to_testflight job now runs across a runner_provider: ['cirrus', 'bitrise'] matrix. Both matrix entries run yarn set-build-number ${{ github.run_number }} (identical value per run) and then unconditionally call bundle exec fastlane ios upload_react_native_sample_to_testflight. Once the Bitrise pools are provisioned, both jobs will attempt to upload a build with the same build number for the same app version, which Apple rejects as a duplicate. Because the two jobs run in parallel, whichever loses the race fails its upload. The bitrise failure is tolerated via continue-on-error: ${{ matrix.runner_provider == 'bitrise' }}, but the cirrus job has no such guard, so if cirrus loses the race the whole workflow fails. The actual upload (and arguably the build-number bump) should be restricted to a single provider via an if: matrix.runner_provider == 'cirrus' guard.
Evidence
- `testflight.yml` adds `runner_provider: ['cirrus', 'bitrise']` to the `upload_to_testflight` matrix with no per-provider `if:` guard on the build/upload steps.
- `Set Build Number` runs `yarn set-build-number ${{ github.run_number }}` (maps to `react-native-version --set-build` in `samples/react-native/package.json`); `github.run_number` is identical for both matrix entries in the same run.
- `Run Fastlane` runs `bundle exec fastlane ios upload_react_native_sample_to_testflight` unconditionally for both providers.
- `continue-on-error` is set only for `bitrise`, so a cirrus upload that loses the duplicate-build race would fail the workflow.
- Impact is deferred: the PR states Bitrise pools are not yet provisioned, so today only cirrus runs and the conflict is dormant.
Identified by Warden find-bugs
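
The single-uploader guard suggested in the TestFlight findings could be sketched as follows. The step names and commands come from the evidence above; treating `cirrus` as the authoritative uploader follows the findings' suggestion, and the rest of the job (checkout, Ruby setup, signing) is omitted.

```yaml
steps:
  - name: Set Build Number
    # Only the cirrus leg bumps the build number and uploads.
    if: ${{ matrix.runner_provider == 'cirrus' }}
    run: yarn set-build-number ${{ github.run_number }}

  - name: Run Fastlane
    # The bitrise leg skips this step, so it cannot race cirrus for the same build number.
    if: ${{ matrix.runner_provider == 'cirrus' }}
    run: bundle exec fastlane ios upload_react_native_sample_to_testflight
```

Step-level `if:` can read the `matrix` context, so the bitrise leg would still exercise the build environment while leaving the upload to one provider.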
Android (legacy) Performance metrics 🚀
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 15d4514+dirty | 406.77 ms | 428.06 ms | 21.29 ms |
| 038a6d7+dirty | 524.82 ms | 531.92 ms | 7.10 ms |
| 4b87b12+dirty | 421.82 ms | 413.60 ms | -8.22 ms |
| 5ee78d6+dirty | 551.80 ms | 568.27 ms | 16.47 ms |
| 853723c+dirty | 405.54 ms | 440.08 ms | 34.54 ms |
| 4966363+dirty | 400.04 ms | 431.08 ms | 31.04 ms |
| 7ff4d0f+dirty | 413.81 ms | 450.64 ms | 36.83 ms |
| bc0d8cf+dirty | 412.37 ms | 466.26 ms | 53.89 ms |
| ef27341+dirty | 412.94 ms | 443.98 ms | 31.04 ms |
| 2c735cc+dirty | 414.09 ms | 438.47 ms | 24.38 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 15d4514+dirty | 48.30 MiB | 53.60 MiB | 5.30 MiB |
| 038a6d7+dirty | 48.30 MiB | 53.60 MiB | 5.30 MiB |
| 4b87b12+dirty | 43.75 MiB | 48.14 MiB | 4.39 MiB |
| 5ee78d6+dirty | 48.30 MiB | 53.58 MiB | 5.28 MiB |
| 853723c+dirty | 48.30 MiB | 53.58 MiB | 5.28 MiB |
| 4966363+dirty | 48.30 MiB | 53.54 MiB | 5.24 MiB |
| 7ff4d0f+dirty | 48.30 MiB | 53.60 MiB | 5.30 MiB |
| bc0d8cf+dirty | 48.30 MiB | 53.48 MiB | 5.18 MiB |
| ef27341+dirty | 48.30 MiB | 53.54 MiB | 5.24 MiB |
| 2c735cc+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |
iOS (new) Performance metrics 🚀
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 7ac3378+dirty | 1202.35 ms | 1198.31 ms | -4.04 ms |
| 1122a96+dirty | 3839.17 ms | 1219.23 ms | -2619.93 ms |
| 5748023+dirty | 3844.74 ms | 1225.49 ms | -2619.26 ms |
| b0d3373+dirty | 3842.49 ms | 1218.49 ms | -2624.00 ms |
| 6177334+dirty | 3851.52 ms | 1226.23 ms | -2625.29 ms |
| 3ce5254+dirty | 1217.70 ms | 1224.69 ms | 6.99 ms |
| 5569641+dirty | 3824.35 ms | 1210.78 ms | -2613.57 ms |
| 5257d80+dirty | 3845.40 ms | 1226.21 ms | -2619.19 ms |
| 882f8ae+dirty | 3842.51 ms | 1230.40 ms | -2612.11 ms |
| 5c1e987+dirty | 1208.43 ms | 1220.72 ms | 12.29 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 7ac3378+dirty | 3.38 MiB | 4.76 MiB | 1.38 MiB |
| 1122a96+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| 5748023+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| b0d3373+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| 6177334+dirty | 5.15 MiB | 6.68 MiB | 1.53 MiB |
| 3ce5254+dirty | 3.38 MiB | 4.76 MiB | 1.38 MiB |
| 5569641+dirty | 5.15 MiB | 6.67 MiB | 1.51 MiB |
| 5257d80+dirty | 5.15 MiB | 6.69 MiB | 1.54 MiB |
| 882f8ae+dirty | 5.15 MiB | 6.70 MiB | 1.54 MiB |
| 5c1e987+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
Android (new) Performance metrics 🚀
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 15d4514+dirty | 413.63 ms | 449.62 ms | 35.99 ms |
| df5d108+dirty | 434.82 ms | 447.39 ms | 12.57 ms |
| 038a6d7+dirty | 499.02 ms | 527.68 ms | 28.66 ms |
| 5ee78d6+dirty | 411.18 ms | 437.83 ms | 26.65 ms |
| 4953e94+dirty | 398.80 ms | 431.81 ms | 33.01 ms |
| 853723c+dirty | 415.82 ms | 460.94 ms | 45.12 ms |
| 4966363+dirty | 415.67 ms | 448.60 ms | 32.93 ms |
| 7ff4d0f+dirty | 403.38 ms | 427.06 ms | 23.68 ms |
| ef27341+dirty | 519.02 ms | 553.42 ms | 34.40 ms |
| a50b33d+dirty | 353.21 ms | 398.48 ms | 45.27 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 15d4514+dirty | 48.30 MiB | 53.60 MiB | 5.30 MiB |
| df5d108+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
| 038a6d7+dirty | 48.30 MiB | 53.60 MiB | 5.30 MiB |
| 5ee78d6+dirty | 48.30 MiB | 53.58 MiB | 5.28 MiB |
| 4953e94+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
| 853723c+dirty | 48.30 MiB | 53.58 MiB | 5.28 MiB |
| 4966363+dirty | 48.30 MiB | 53.54 MiB | 5.24 MiB |
| 7ff4d0f+dirty | 48.30 MiB | 53.60 MiB | 5.30 MiB |
| ef27341+dirty | 48.30 MiB | 53.54 MiB | 5.24 MiB |
| a50b33d+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
📢 Type of change
📜 Description
Adds a `runner_provider` matrix dimension (`cirrus`/`bitrise`) to all macOS CI jobs, replicating the approach from sentry-cocoa#7971.

Jobs won't actually run on Bitrise yet: the GitHub app still needs to be installed into the repo, which is pending.
Changes per job
| Workflow | Jobs |
|---|---|
| `native-tests.yml` | `test-ios` |
| `sample-application.yml` | `build-ios`, `build-macos`, `test-ios` |
| `sample-application-expo.yml` | `build-ios` |
| `e2e-v2.yml` | `metrics`, `react-native-build`, `react-native-test` |
| `size-analysis.yml` | `ios` |
| `testflight.yml` | `upload_to_testflight` |

Pattern applied
- `runner_provider` matrix: `["cirrus", "bitrise"]` added to every macOS job. For mixed-platform jobs (e2e), Android + Bitrise combinations are excluded.
- `runs-on`: Bitrise uses `bitrise_pool_name:<macos_version>`, Cirrus keeps the original runner image.
- `continue-on-error`: Bitrise jobs won't block CI.
- […] `rbenv` with caching; Cirrus keeps the `ruby/setup-ruby` action.
- […] `runner_provider` to avoid collisions.

💡 Motivation and Context
Preparing to evaluate Bitrise as an alternative/additional macOS CI runner provider, matching the effort in sentry-cocoa.
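
For readers evaluating the change, the items listed under "Pattern applied" could look roughly like this on a single job. This is a sketch, not the PR's actual diff: the Cirrus runner label is a placeholder, `tahoe` is taken from the review comments as one example of `<macos_version>`, and the step is illustrative.

```yaml
jobs:
  build-ios:
    strategy:
      matrix:
        runner_provider: ["cirrus", "bitrise"]
    # Bitrise failures are tolerated; Cirrus failures still fail the job.
    continue-on-error: ${{ matrix.runner_provider == 'bitrise' }}
    # 'macos-latest' stands in for the original Cirrus runner image (assumption).
    runs-on: ${{ matrix.runner_provider == 'bitrise' && 'bitrise_pool_name:tahoe' || 'macos-latest' }}
    steps:
      - run: echo "running on ${{ matrix.runner_provider }}"
```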
💚 How did you test it?
Running workflows on this PR. Bitrise jobs are expected to be skipped (no pool provisioned) or fail gracefully (`continue-on-error`).

📝 Checklist
- […] `sendDefaultPII` is enabled

🔮 Next steps