hot: GC-protect pending_internal_promise on every store path#32572
hot: GC-protect pending_internal_promise on every store path#32572robobun wants to merge 4 commits into
Conversation
The promise returned by JSModuleLoader::loadModule is not retained by JSC once it settles; the --hot run loop polls pending_internal_promise every tick via report_exception_in_hot_reloaded_module_if_needed and reload(), so the stored cell pointer must be a GC root. Only the preload path protected it; the common loadAndEvaluateModule / resolvedPromise / setOverrideModuleRunMainPromise paths stored the raw pointer with is_protected = false, leaving a window where a collection between ticks frees the cell and the next status() read dereferences a dead JSPromise (BUN-38CT: SIGSEGV in JSPromise::status -> WriteBarrierBase::get). Route every store through a set_pending_internal_promise helper that unprotects the outgoing value and protects the incoming one. The protect count is balanced across reloads (stays at 1).
WalkthroughThe PR centralizes GC-root management for ChangesCentralize GC rooting for pending_internal_promise
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
Updated 11:58 AM PT - Jun 21st, 2026
❌ @robobun, your commit 08fce46 has 3 failures in
🧪 To try this PR locally: bunx bun-pr 32572That installs a local version of the PR into your bun-32572 --bun |
|
Found 2 issues this PR may fix:
🤖 Generated with Claude Code |
|
On the suggested linked issues:
|
…till_alive; make test assertion non-vacuous
…field Every store goes through set_pending_internal_promise, so the invariant is_protected == pending_internal_promise.is_some() always holds and the bool mirrors recoverable information. The previous readers in reload() and swap_global_for_test_isolation() were removed in 9ea8279, and VirtualMachine is repr(Rust) so there is no layout contract.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/jsc/VirtualMachine.rs`:
- Around line 2275-2282: The set_pending_internal_promise method protects a
JSValue when storing a promise but this protection is never released during VM
teardown. In the destroy() method of VirtualMachine, add a call to
set_pending_internal_promise(None) at the very beginning while the JSC VM and
global objects are still alive. This will trigger the unprotect logic in
set_pending_internal_promise and ensure the protect reference count is balanced
on the terminal destruction path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 813c15d9-794f-4472-99d8-4f06056e06b4
📒 Files selected for processing (5)
src/jsc/JSPromise.rssrc/jsc/VirtualMachine.rssrc/runtime/hw_exports.rssrc/runtime/jsc_hooks.rstest/cli/hot/hot.test.ts
There was a problem hiding this comment.
All prior feedback addressed and no new issues found — deferring to a human since this reworks GC protect/unprotect bookkeeping for a raw cell pointer across eight unsafe store sites in the core runtime.
Extended reasoning...
Overview
Centralizes GC rooting of VirtualMachine.pending_internal_promise into a new set_pending_internal_promise setter that unprotects the previous cell and protects the new one, routes all eight store sites through it (reload_entry_point, reload_entry_point_for_test_runner, load_preloads, set_override_module_run_main_promise, swap_global_for_test_isolation, the patched-runMain and HTML/no-transpile paths), drops the now-redundant pending_internal_promise_is_protected bool field, tightens the status_ptr doc contract, and adds a --hot test that asserts at least one Promise is in protectedObjectTypeCounts during and after entry-point evaluation.
Review history
I left four findings across two prior rounds; all are addressed in 3a0ad44 and 9129879:
- The scoped
_protectedguard inload_preloadsis retained alongside the setter (with a comment explaining the HMR-swap rationale), so the post-loopstatus()deref stays rooted. - The five now-redundant
ensure_still_alive()calls were removed. - The vacuous
expect.any(...)assertion was rewritten so the combined-objecttoEqualactually fails (and prints the counts map) on regression. - The
pending_internal_promise_is_protectedfield was deleted and the setter branches on theOptiondirectly; grep confirms no remaining.rsreferences.
Security risks
No new attack surface — this is internal GC-root bookkeeping, no parsing/auth/crypto. The risk class is memory safety: an unbalanced unprotect() would reintroduce the very use-after-free this fixes, and an unbalanced protect() would leak one promise per reload. The setter's invariant (Some ⇔ protected) looks correct on every path I traced, and the protect-count-across-reloads check in the PR description shows it stays at 1 over 10 reloads.
Level of scrutiny
High. This is core --hot run-loop code in VirtualMachine.rs/jsc_hooks.rs with pervasive unsafe raw-pointer derefs of GC-managed cells, the underlying crash is non-deterministic (conservative stack scanning masks it), and the test asserts an indirect invariant rather than reproducing the SIGSEGV. The reentrant load_preloads-inside-reload() path in particular is subtle enough that a maintainer familiar with the hot-reload state machine should sign off.
Other factors
Removing a public struct field is a small layout change (though repr(Rust), so no ABI contract). CI build #63812 is in progress. hot.test.ts, watch.test.ts, bun-main-entry-point.test.ts, and preload.test.ts reportedly pass locally.
There was a problem hiding this comment.
All four points from my earlier reviews are addressed (scoped guard restored in load_preloads, redundant ensure_still_alive dropped, test assertion made non-vacuous, _is_protected field removed) and I found nothing new — deferring only because GC-root lifecycle changes across the reentrant --hot reload path are the kind of thing a human should sign off on.
Extended reasoning...
Overview
This PR centralizes GC-root management for VirtualMachine.pending_internal_promise by introducing set_pending_internal_promise(), which atomically unprotects the outgoing pointer and protects the incoming one. All store sites in reload_entry_point, reload_entry_point_for_test_runner, reload, swap_global_for_test_isolation, set_override_module_run_main_promise, and load_preloads are routed through it. The now-redundant pending_internal_promise_is_protected bool is dropped, the status_ptr doc is tightened to state the caller's GC-root obligation, and a regression test in hot.test.ts asserts via heapStats().protectedObjectTypeCounts that at least one Promise is protected both during entry-module evaluation and on the next tick.
Security risks
None identified. This is internal GC-root bookkeeping for the --hot run loop; it does not touch auth, crypto, permissions, parsing of untrusted input, or any externally-reachable surface. The risk profile is memory-safety (use-after-free of a JSC heap cell), which the change strictly tightens relative to main.
Level of scrutiny
High. The change is small and well-reasoned, but it sits at the intersection of JSC's protect/unprotect refcounting, conservative stack scanning, and the reentrant --hot reload path (where tick() can re-enter reload_entry_point and swap the field underneath an in-flight load_preloads loop). My first-pass review caught exactly that reentrancy edge (the scoped guard removal in load_preloads), which the author fixed; that the edge existed at all is why I'd want a human familiar with the hot-reload state machine to confirm the protect/unprotect balance on every path before this ships against a 191-event Sentry crash.
Other factors
- All four of my earlier inline comments are addressed in commits 3a0ad44 and 9129879, and the bug-hunting pass on the current revision found nothing.
- CodeRabbit's note about clearing the slot in
destroy()was resolved by the author without a code change; that seems reasonable (worker VMs don't run--hotso the field is neverSomethere, and the main VM exits the process rather than callingdestroy()), but it's worth a second opinion. - The new test asserts the invariant rather than reproducing the race (which the PR description explains is not deterministically reproducible), and the author confirmed the protect count stays balanced at 1 across 10 reloads.
- No CODEOWNERS entries cover the touched files.
|
CI status: the diff itself is green. The new test in The hard failures on both build 63812 and 63820 are puppeteer's which is an infra/cache issue on that agent, not a result of this change. The remaining warnings are known flakes that passed on retry ( |
Crash
Sentry BUN-38CT (191 events, Windows, bun 1.3.14):
Cause
VirtualMachine.pending_internal_promise: Option<*mut JSInternalPromise>is a raw GC-cell pointer that the--hotrun loop polls every tick (report_exception_in_hot_reloaded_module_if_neededandreload). It was onlyprotect()'d on the preload-return path; every other store site (loadAndEvaluateModulefor the main module,JSC__JSInternalPromise__resolvedPromiseon the patched-runMainpath,Bun__VirtualMachine__setOverrideModuleRunMainPromise, and the per-preload store inload_preloads) left it unprotected.The promise returned by
JSModuleLoader::loadModuleis not retained by the module loader once it settles: it is a freshJSPromise::createwhose only inbound GC edge is a microtask reaction that detaches when it fires (JSModuleLoader.cpp:resultPromiseat the tail of bothloadModuleoverloads; never written tom_moduleMap/m_loadedModules/ModuleGraphLoadingState). So once the entry module finishes evaluating, the only thing keeping the cell alive is conservative stack scanning of the caller's dead locals. When the optimizer or a later call chain clobbers that slot, the nextBun.gc/incremental collection frees the cell and the per-tickstatus()read dereferences a deadJSPromise.Fix
Add
VirtualMachine::set_pending_internal_promisewhich unprotects the previously stored promise (if any) and protects the new one, and route all store sites through it:reload_entry_point(preload, patched-runMain,loadAndEvaluateModule, HTML-entry, no-transpile)reload_entry_point_for_test_runner(preload,loadAndEvaluateModule)load_preloadsper-iteration store (keeps an additional scoped.protected()guard so the loop-entry local survives an HMR field swap insidetick())set_override_module_run_main_promiseswap_global_for_test_isolationclearreload()'s explicit unprotect block is removed since the next store now balances itWith every store routed through the helper the field is protected exactly when
Some, so the separatepending_internal_promise_is_protectedbool is dropped. The now-redundantensure_still_alive()calls that followed each store are also removed.Verification
The race itself is not deterministically reproducible in-process: conservative stack scanning kept the cell alive in every GC-stress scenario tried (
useZombieMode,scribbleFreeCells,collectContinuously, aggressiveBun.gc(true)across hot reloads), confirmed by instrumentingreport_exception_in_hot_reloaded_module_if_neededto callcell->inherits<JSC::JSPromise>()after each collection. The test instead asserts the invariant the fix establishes: under--hot,heapStats().protectedObjectTypeCountsmust show at least one protectedPromiseboth during module evaluation and on the next tick.The protect is balanced across reloads (stays at 1 over 10 reloads).
protect-count across reloads
Full
hot.test.ts,watch.test.ts,bun-main-entry-point.test.ts, andpreload.test.tspass locally.