sql(postgres,mysql): guard ref_and_close against re-entrant TLS close dispatch#32573
sql(postgres,mysql): guard ref_and_close against re-entrant TLS close dispatch#32573robobun wants to merge 5 commits into
Conversation
… dispatch Closing a TLS socket from ref_and_close synchronously dispatches the on_handshake callback (when the handshake never completed) and then on_close. Both re-enter fail_with_js_value. When the outer entry was disconnect() the status is Disconnected, not Failed, so the status == Failed guard in fail_with_js_value does not trip and a nested ref_and_close runs on the same us_socket_t. The on_handshake dispatch happens before the C layer flips is_closed, so is_closed() alone does not prevent the second close. Add a monotonic CLOSE_INITIATED flag that ref_and_close sets before the first socket.close() and checks on every entry. Detach the stored socket handle in the socket-level on_close so nothing can read through it after the us_socket_t is freed at the end of the loop iteration. Apply the same copy-and-detach pattern to MySQLConnection::close, which had no is_closed() guard at all. Sentry BUN-3KKD / BUN-3GMS (29 events, macOS, 1.3.14): segfault at 0x0 in us_internal_ssl_close with two nested refAndClose frames on the stack.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughAdds a ChangesSQL TLS close re-entrancy guard
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
Updated 11:55 AM PT - Jun 21st, 2026
❌ @robobun, your commit c321423 has 1 failures in
🧪 To try this PR locally: bunx bun-pr 32573That installs a local version of the PR into your bun-32573 --bun |
|
Found 1 issue this PR may fix:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/js/sql/sql-postgres-tls-close-reentry.test.ts`:
- Around line 48-50: The listen() function currently only handles the successful
case where the server starts listening, but does not handle errors emitted
during server startup. If server.listen() emits an error event, the Promise
never settles, causing the test to hang indefinitely until timeout. Wire the
Promise to reject when the server emits an error event, or refactor to use await
once(server, "listening") which automatically handles both success and failure
cases for server startup, ensuring errors are properly propagated instead of
hanging.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: cb23b864-0633-45b7-8b20-b62c94d48eac
📒 Files selected for processing (4)
src/sql/shared/ConnectionFlags.rssrc/sql_jsc/mysql/MySQLConnection.rssrc/sql_jsc/postgres/PostgresSQLConnection.rstest/js/sql/sql-postgres-tls-close-reentry.test.ts
There was a problem hiding this comment.
I didn't find any issues, but this touches native TLS socket-lifecycle and re-entrancy invariants across two drivers, so it's worth a human look.
Extended reasoning...
Overview
This PR adds a ConnectionFlags::CLOSE_INITIATED bitflag and uses it to make the socket-close path idempotent in both PostgresSQLConnection::ref_and_close and MySQLConnection::close, plus detaches the stored us_socket_t handle in SocketHandler::on_close so it can't dangle after the C layer frees it. A new mock-server regression test exercises both re-entry shapes (disconnect-after-Connected and timeout-during-pending-handshake).
Security risks
None introduced. The change is purely defensive — it narrows the set of states in which socket.close() can be called and clears a stored raw pointer earlier. No new inputs, parsing, auth, or trust boundaries.
Level of scrutiny
High. This is native Rust touching raw us_socket_t* lifetime, synchronous TLS callback re-entrancy, and the hasPendingActivity GC-keepalive invariant across two database drivers. The fix is small and the reasoning in the description is thorough, but the correctness depends on subtle ordering between us_internal_ssl_close's on_handshake/on_close dispatch and the is_closed flip in the C layer — and the author notes the regression test does not deterministically reproduce the crash on Linux CI (all 29 Sentry events were macOS/kqueue). A reviewer familiar with the uSockets close path should confirm the detach-in-on_close doesn't break any other reader of self.socket between close-initiation and on_close firing.
Other factors
- My earlier nit (per-test timeouts) and CodeRabbit's
listen()error-handling suggestion were both addressed; all inline comments are resolved. - CI on the latest commit shows musl build failures and an unrelated-looking S3 test SIGABRT — likely infra/flake, but worth confirming before merge.
- The MySQL
close()change also fixes a pre-existing missingis_closed()guard, which is a behavior change beyond the immediate crash fix.
|
CI status: the diff is green. Both CI runs failed on unrelated tests my change does not touch:
Neither failure mentions |
Crash
Sentry BUN-3KKD / BUN-3GMS (29 events, macOS, bun 1.3.14): segfault at address 0x0 in
us_internal_ssl_closewith two nestedrefAndCloseframes on the stack:Cause
ref_and_closecallssocket.close()on a TLS socket.us_internal_ssl_closesynchronously dispatcheson_handshake(when the handshake never completed) and thenon_close. Both re-enterfail_with_js_value. That function's re-entry guard isif status == Failed return, but when the outer entry wasdisconnect()the status isDisconnected, so the guard does not trip and a nestedref_and_closeruns on the sameus_socket_t.The
on_handshakedispatch happens beforeus_internal_socket_close_rawsetsis_closed = 1, so the nestedis_closed()check is not sufficient to prevent a secondclose().Traced with debug logging on Linux (the nested
ref_and_closeis reached; on Linux the inneris_closed()happens to be true by the time it runs so the second close is skipped):self.socketis also never detached after close, so onceus_internal_free_closed_socketsfrees theus_socket_tat the end of the loop iteration the stored handle dangles.Fix
ConnectionFlags::CLOSE_INITIATED: set once before the firstsocket.close(), checked on everyref_and_closeentry. Monotonic; never cleared. Combined with the existingis_closed()check this makes the close path idempotent regardless of which callback re-enters.SocketHandler::on_closedetachesself.socketso the stored handle never outlives theus_socket_t. ThehasPendingActivityinvariant for deferred TLS closes is preserved:self.socketstaysConnected(ptr)untilon_closeactually fires, so a graceful-close connection still reports pending activity until the peer'sclose_notifyarrives.MySQLConnection::closehad nois_closed()guard at all and thescopeguard::defer!infail_with_js_valueruns it on every exit path including the early return; it now detaches the stored handle first, closes through a local copy, and skips whenCLOSE_INITIATEDis already set.Tests
test/js/sql/sql-postgres-tls-close-reentry.test.tsexercises both re-entry shapes (disconnect on a Connected TLS connection; connection timeout during a stalled TLS handshake) against a Docker-free mock Postgres-over-TLS server, 30/10 iterations under ASAN, and verifies onclose fires exactly once and the wrappers are collectable afterwards.Gate note: the double-close only reaches
us_internal_ssl_closea second time when theon_handshakedispatch runs withstatus == Disconnectedandssl_handshake_state == PENDINGsimultaneously. On Linux/epoll those never overlap (by the time the connection isConnectedthe handshake state isCOMPLETED, andclose_rawflipsis_closedbefore dispatchingon_close); all 29 Sentry events are macOS/kqueue. A 30-iteration probe of both paths under ASAN on the unfixed build did not fire. The regression test is coverage for the re-entry path rather than a deterministic fail-before.Existing coverage (all green with the change):
test/js/sql/sql-close-pending-connection.test.tstest/js/sql/postgres-tls-ctx-leak.test.tstest/js/sql/sql-connect-error-reporting.test.tstest/js/sql/sql-onconnect-onclose-throw.test.tstest/js/sql/sql-mysql-tls-plaintext-injection.test.tstest/js/sql/sql-mysql-clean-reentry.test.ts