@gavin-jeong

publish-envoy bot and others added 5 commits September 3, 2025 01:25
Created by Envoy publish bot for @yanavlasov

**Summary of changes**:

* Security fixes:
  - Fix for OAuth cookie issue
    [CVE-2025-55162](GHSA-95j4-hw7f-v2rh).
  - Fix UAF in DNS resolution
    [CVE-2025-54588](GHSA-g9vw-6pvx-7gmw).

**Docker images**:
    https://hub.docker.com/r/envoyproxy/envoy/tags?page=1&name=v1.34.6
**Docs**:
    https://www.envoyproxy.io/docs/envoy/v1.34.6/
**Release notes**:
    https://www.envoyproxy.io/docs/envoy/v1.34.6/version_history/v1.34/v1.34.6
**Full changelog**:
    envoyproxy/envoy@v1.34.5...v1.34.6

Include code formatting improvements for consistent style in trace test files.

This commit introduces QUIC/HTTP3 keylog functionality in Envoy, enabling generation of NSS Key Log Format files for Wireshark and other debugging tools.

- Keylog callback registration in OnNewSslCtx()
- Implementation of EnvoyQuicProofSource::setupQuicKeylogCallback() and quicKeylogCallback()
- TLS context-based keylog configuration with per-filter-chain caching and thread safety
- Address filtering via local/remote IP lists
- Fallback to SSLKEYLOGFILE environment variable for compatibility with existing workflows
- QuicKeylogBridge integration with Envoy’s existing TLS keylog infrastructure
- RawBufferSocket fallback fix in QuicServerTransportSocketFactory::createDownstreamTransportSocket()
- Comprehensive unit tests including edge cases
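A minimal sketch of two of the behaviors listed above: the SSLKEYLOGFILE environment-variable fallback and the line-oriented NSS Key Log Format. The helper names are hypothetical; the real implementation registers a keylog callback on the SSL_CTX in OnNewSslCtx() rather than formatting lines by hand.

```cpp
#include <cstdlib>
#include <string>

// Pick the configured keylog path, falling back to the SSLKEYLOGFILE
// environment variable for compatibility with existing workflows.
std::string keylogPath(const std::string& configured_path) {
  if (!configured_path.empty()) {
    return configured_path;
  }
  const char* env = std::getenv("SSLKEYLOGFILE");
  return env != nullptr ? std::string(env) : std::string();
}

// NSS Key Log Format is line-oriented:
// "<LABEL> <client_random_hex> <secret_hex>"
std::string nssKeylogLine(const std::string& label,
                          const std::string& client_random_hex,
                          const std::string& secret_hex) {
  return label + " " + client_random_hex + " " + secret_hex;
}
```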

Signed-off-by: Chanhun Jeong <[email protected]>
@gavin-jeong force-pushed the update_to_1_34_6 branch 2 times, most recently from 46ebf51 to d230b12 on September 30, 2025 07:52
keyolk and others added 2 commits October 1, 2025 09:52
Protect all async callbacks from accessing deallocated cluster members
during destruction by adding is_destroying_ atomic flag checks.

Affected callbacks:
- ClusterRefreshManager callbacks
- DNS resolution callbacks
- Connection event callbacks
- Timer callbacks
- Redis client response callbacks (onResponse, onFailure, onUnexpectedResponse)
- Hostname resolution callbacks

The race condition occurred when callbacks were already queued in the
event loop when cluster destruction began, causing use-after-free access
to parent cluster members like info_, redis_discovery_session_, and
resolve_timer_.

All callbacks now check is_destroying_ with memory_order_acquire before
accessing any parent members, ensuring safe termination during destruction.

Fixes segfaults that occurred when removing Redis service entries.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
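A minimal compilable sketch of the guard this commit adds (class and member names are stand-ins, not Envoy's actual types; later commits in this PR refine the pattern further):

```cpp
#include <atomic>
#include <functional>

// The destructor publishes is_destroying_ with release semantics; every
// queued callback re-checks the flag with acquire semantics before touching
// any parent member.
class RedisClusterLike {
public:
  ~RedisClusterLike() { is_destroying_.store(true, std::memory_order_release); }

  std::function<void()> makeResponseCallback() {
    return [this]() {
      if (is_destroying_.load(std::memory_order_acquire)) {
        return; // teardown has begun; do not touch members
      }
      ++responses_handled_; // stands in for touching info_, timers, etc.
    };
  }

  int responses_handled_{0};

private:
  std::atomic<bool> is_destroying_{false};
};
```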
…ter destruction

Problem: Segmentation faults occur when accessing member pointers in async
callbacks during Redis cluster destruction, even with is_destroying_ flag
checks. This happens because there's a race window between checking the
flag and accessing the pointers.

Solution: Add defensive null checks for all pointer accesses that could
become invalid during destruction:

1. ClusterInfo pointer (info_):
   - Add null checks before all configUpdateStats() calls
   - Use safe access pattern for name() in log statements
   - Locations: startResolveRedis(), updateDnsStats(), DNS callbacks,
     onResponse(), onUnexpectedResponse(), onFailure()

2. DNS Resolver pointer (dns_resolver_):
   - Add null checks in startResolveDns()
   - Add checks in resolveClusterHostnames() and resolveReplicas()
   - Prevents crashes when DNS resolution is initiated during teardown

3. Timer pointer (resolve_timer_):
   - Add null checks before enableTimer() calls
   - Locations: finishClusterHostnameResolution(), onResponse(),
     onUnexpectedResponse(), onFailure()

4. Consistency fix:
   - Line 714: Changed parent_.info() to parent_.info_ to match
     null-checked pattern used elsewhere

The pattern applied throughout:
1. Check is_destroying_ flag with memory_order_acquire
2. Verify each pointer is non-null before dereferencing
3. This dual-check handles the race window safely

This prevents use-after-free crashes during Redis cluster teardown when
async callbacks execute after partial destruction has begun.
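The flag-then-null dual check described above, as a compilable sketch (type names are illustrative, not Envoy's actual classes):

```cpp
#include <atomic>
#include <memory>
#include <string>

struct ClusterInfo {
  std::string name() const { return "redis_cluster"; }
};

struct ClusterLike {
  std::atomic<bool> is_destroying_{false};
  std::shared_ptr<ClusterInfo> info_;

  // 1. Check is_destroying_ with acquire semantics.
  // 2. Verify the pointer is non-null before dereferencing.
  std::string safeClusterName() {
    if (is_destroying_.load(std::memory_order_acquire)) {
      return ""; // teardown has begun; bail out early
    }
    if (!info_) {
      return ""; // pointer already released during partial destruction
    }
    return info_->name();
  }
};
```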
The previous fix with null checks still had a race condition window between
checking the pointer and using it. Even with the null check, the shared_ptr
could be reset to null by another thread between the check and use.

Solution: Make local copies of shared_ptr before use. This ensures the
pointer remains valid throughout its usage in the current scope.

Changes:
1. startResolveRedis(): Copy info_ to local variable before use
2. updateDnsStats(): Use local copy of info_
3. DNS callbacks: Use local copy for stats updates
4. onResponse(), onUnexpectedResponse(), onFailure(): Use local copies
5. client_factory_.create(): Check and use local copy of info_

The pattern applied:
  auto info = parent_.info_;  // Make local copy (ref count++)
  if (!info) {                // Check if null
    return;
  }
  info->method();             // Safe to use - won't become null

This prevents the crash at line 376 where info_ was becoming null
between the check and the access, even with memory_order_acquire.
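The local-copy pattern above, expanded into a compilable sketch (Parent and ClusterInfo are illustrative stand-ins for the real members):

```cpp
#include <memory>
#include <string>

struct ClusterInfo {
  std::string name() const { return "redis_cluster"; }
};

struct Parent {
  std::shared_ptr<ClusterInfo> info_;
};

// Copy the shared_ptr first; the copy keeps the ClusterInfo alive for the
// rest of this scope even if parent.info_ is reset by the destructor.
std::string safeName(const Parent& parent) {
  auto info = parent.info_; // local copy (ref count++)
  if (!info) {
    return ""; // already released during destruction
  }
  return info->name(); // safe to use: our copy cannot become null
}
```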
…ing timer callbacks

The 5% crash rate was caused by timer callbacks executing after the
RedisDiscoverySession was destroyed. Even though we checked is_destroying_,
there was a race where:

1. Timer callback fires and enters the lambda
2. Destructor runs and deletes the session (unique_ptr reset)
3. Callback tries to access parent_.is_destroying_ → CRASH (use-after-free)

Solution:
- Move timer creation from constructor to initialize() method
- Capture shared_from_this() in timer lambda instead of raw 'this'
- Call initialize() after RedisDiscoverySession construction completes

This ensures the session object stays alive as long as any timer callback
is queued or executing, preventing the use-after-free.

Pattern changed from:
  resolve_timer_ = dispatcher_.createTimer([this]() { ... });

To:
  auto self = shared_from_this();
  resolve_timer_ = dispatcher_.createTimer([self]() { ... });

This should eliminate the remaining 5% crash rate during Redis cluster
destruction.
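The before/after above can be sketched as follows; FakeTimer stands in for Envoy's dispatcher timer API, which cannot be compiled standalone:

```cpp
#include <functional>
#include <memory>

struct FakeTimer {
  std::function<void()> cb; // stands in for the real dispatcher timer
};

class Session : public std::enable_shared_from_this<Session> {
public:
  // shared_from_this() is unusable inside the constructor, which is why the
  // commit moves timer creation into a post-construction initialize() step.
  void initialize(FakeTimer& timer) {
    auto self = shared_from_this(); // the capture keeps the session alive
    timer.cb = [self]() { self->fired_ = true; };
  }
  bool fired_{false};
};
```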
…nt reference

CRITICAL FIX: The previous approach had a fatal flaw - callbacks with
shared_from_this() kept the session alive, but the session holds a
reference to the parent RedisCluster. When the parent was destroyed,
accessing parent_.is_destroying_ became use-after-free.

The race condition:
1. Timer callback fires with shared_ptr<Session> (session kept alive)
2. RedisCluster destructor runs and completes
3. Callback tries to check parent_.is_destroying_
4. CRASH - parent object destroyed, reference is dangling

Solution:
- Add parent_destroyed_ atomic flag IN THE SESSION
- Parent sets this flag BEFORE destroying session
- Callbacks check session-owned flag, never access parent directly
- Also simplify all safety checks into helper methods

This is the correct fix for the 5% crash rate when removing Redis services.
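A compilable sketch of this final design, under the same caveat as before (illustrative stand-in types, not Envoy's actual classes): the flag the callbacks consult is owned by the session itself, so a callback that keeps the session alive never has to dereference the parent.

```cpp
#include <atomic>
#include <memory>

class Session : public std::enable_shared_from_this<Session> {
public:
  void markParentDestroyed() {
    parent_destroyed_.store(true, std::memory_order_release);
  }
  bool parentAlive() const {
    return !parent_destroyed_.load(std::memory_order_acquire);
  }

private:
  std::atomic<bool> parent_destroyed_{false};
};

struct Cluster {
  std::shared_ptr<Session> session_{std::make_shared<Session>()};
  ~Cluster() {
    session_->markParentDestroyed(); // set BEFORE dropping the session
    session_.reset();
  }
};
```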