Skip to content

Conversation

@dprotaso
Copy link
Member

  • have the websocket server test image send a close message when shutting down gracefully
  • include an e2e test that asserts websockets connections are gracefully handled
  • include a special handler to track hijacked connections and create a drain method

@knative-prow knative-prow bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 27, 2026
@dprotaso dprotaso changed the title Shutdown Websocket Connections Gracefully [wip] Shutdown Websocket Connections Gracefully Jan 27, 2026
@knative-prow
Copy link

knative-prow bot commented Jan 27, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jan 27, 2026
@dprotaso dprotaso changed the title [wip] Shutdown Websocket Connections Gracefully [wip] Shutdown Websocket Connections Gracefully in the queue-proxy Jan 27, 2026
@codecov
Copy link

codecov bot commented Jan 27, 2026

Codecov Report

❌ Patch coverage is 53.57143% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.13%. Comparing base (4004491) to head (8edef60).

Files with missing lines Patch % Lines
pkg/queue/sharedmain/main.go 0.00% 7 Missing ⚠️
pkg/queue/sharedmain/handlers.go 0.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16362      +/-   ##
==========================================
- Coverage   80.17%   80.13%   -0.04%     
==========================================
  Files         216      217       +1     
  Lines       13440    13461      +21     
==========================================
+ Hits        10775    10787      +12     
- Misses       2300     2309       +9     
  Partials      365      365              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dprotaso
Copy link
Member Author

One observation I have is that the activator will drop hijacked connections similar to the queue proxy.

Hijacking happens implicitly with the httputil.ReverseProxy - https://cs.opensource.google/go/go/+/refs/tags/go1.25.6:src/net/http/httputil/reverseproxy.go;l=504

The solution in the queue-proxy is straight forward because we know K8s will send a TERM signal to the user container - thus we can expect apps to return a close frame when that happens. Those are the changes I made to our websocket server. This isn't necessary but just maybe it's a good practice

With the activator though - there's no way to signal downstream to end the connection. We maybe could send a close frame on the connection but it makes the activator protocol aware and I don't think we can write a message on an upstream connection like this (it could be interleaved with another).

Given this - I don't think the activator issue should prevent this change from merging - but just the solution there isn't straighforward.

@dprotaso
Copy link
Member Author

/retest

hmm

websocket_test.go:162: Successfully wrote: "Hello, websocket"
websocket_test.go:190: error running test failed to read message: read tcp 10.250.69.110:60956->34.138.223.27:80: read: connection reset by peer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant