Conversation

@jlucaso1
Collaborator

This pull request refactors the noise-handler implementation to improve frame encryption/decryption and frame parsing, and to address concurrency and race conditions in frame processing. It introduces a new TransportState class to manage encryption state, simplifies buffer and frame management, and adds comprehensive tests to ensure correctness and robustness, especially in concurrent scenarios.

Core improvements to encryption and transport state:

  • Introduced the TransportState class in src/Utils/noise-handler.ts to encapsulate encryption/decryption counters and IV management, providing a cleaner separation between handshake and transport phases.
  • Refactored encryption and decryption logic to use TransportState after handshake, ensuring correct counter and IV usage for each frame.

Frame parsing and buffer management enhancements:

  • Rewrote frame parsing logic to use a while loop that processes all complete frames in the buffer, improving handling of multiple frames in a single buffer and frames split across multiple calls.
  • Unified intro header handling and frame encoding, and improved buffer allocation and data copying for efficiency and correctness.
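As an illustration of the length-prefixed framing the while loop processes, here is a minimal standalone sketch (not the actual Baileys source; `parseFrames` is a hypothetical name):

```typescript
// Each frame is a 3-byte big-endian length prefix followed by `size`
// payload bytes; the loop drains every complete frame and returns the
// leftover bytes to carry into the next call.
function parseFrames(buffer: Buffer, onFrame: (frame: Buffer) => void): Buffer {
	while (buffer.length >= 3) {
		// 24-bit big-endian length: (b0 << 16) | (b1 << 8) | b2
		const size = (buffer[0]! << 16) | (buffer[1]! << 8) | buffer[2]!
		if (buffer.length < size + 3) {
			break // incomplete frame: wait for more data
		}

		onFrame(buffer.subarray(3, size + 3))
		buffer = buffer.subarray(size + 3)
	}

	return buffer
}
```

A frame split across two WebSocket messages simply leaves its bytes in the returned buffer until the rest arrives.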

Concurrency and race condition fixes:

  • Added logic to serialize concurrent decodeFrame calls, preventing race conditions and ensuring correct order and integrity of frame processing.
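The serialization described above could be built on a simple promise-chain lock. The sketch below is illustrative only (`makeSerializer` is a hypothetical name; the project's own makeMutex utility is a similar idea):

```typescript
// A minimal promise-chain mutex: each task waits for the previous one
// to settle, so tasks run one at a time in submission order.
function makeSerializer() {
	let tail: Promise<unknown> = Promise.resolve()
	return <T>(task: () => T | Promise<T>): Promise<T> => {
		const result = tail.then(task)
		tail = result.catch(() => undefined) // keep the chain alive on errors
		return result
	}
}
```

A decodeFrame wrapper would then run its body inside the lock, e.g. `const lock = makeSerializer(); const decodeFrame = (data, onFrame) => lock(() => processData(data, onFrame))` (names hypothetical).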

Testing improvements:

  • Added a comprehensive test suite in src/__tests__/Utils/noise-handler.test.ts to cover multiple scenarios, including multiple frames per buffer, split frames, concurrent calls, encrypted frame handling, and counter correctness.

Other minor improvements:

  • Cleaned up unused variables and improved code readability, such as removing unused leaf handling and improving type usage in HKDF key derivation. [1] [2]

@jlucaso1 jlucaso1 requested a review from purpshell December 14, 2025 22:53
@whiskeysockets-bot
Contributor

whiskeysockets-bot commented Dec 14, 2025

Thanks for opening this pull request and contributing to the project!

The next step is for the maintainers to review your changes. If everything looks good, it will be approved and merged into the main branch.

In the meantime, anyone in the community is encouraged to test this pull request and provide feedback.

✅ How to confirm it works

If you’ve tested this PR, please comment below with:

Tested and working ✅

This helps us speed up the review and merge process.

📦 To test this PR locally:

# NPM
npm install @whiskeysockets/baileys@WhiskeySockets/Baileys#fix-noise-handler-race-condition

# Yarn (v2+)
yarn add @whiskeysockets/baileys@WhiskeySockets/Baileys#fix-noise-handler-race-condition

# PNPM
pnpm add @whiskeysockets/baileys@WhiskeySockets/Baileys#fix-noise-handler-race-condition

If you encounter any issues or have feedback, feel free to comment as well.

@jlucaso1 jlucaso1 requested a review from Copilot December 14, 2025 22:54
@purpshell
Member

Everyone thank @jlucaso1 for this PR. He had to leave his girlfriend for a minute to make this PR 😆.

Been waiting for ages for this one 😅

Member

@purpshell purpshell left a comment


Passes preliminary review, will test on bartender and real server. Thanks again


Copilot AI left a comment


Pull request overview

This pull request attempts to address race conditions in the noise-handler implementation and improve frame encryption/decryption handling. However, the PR has critical security and concurrency issues that must be addressed before merging.

Key Changes:

  • Introduces TransportState class to manage encryption/decryption counters and IV generation separately from handshake state
  • Refactors frame parsing to use a while loop for processing multiple frames in a single buffer
  • Adds test suite covering multiple frames, split frames, and concurrent scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 17 comments.

Reviewed files:

  • src/Utils/noise-handler.ts: Refactors encryption state management with the TransportState class and rewrites frame parsing logic, but fails to implement actual mutex-based serialization for concurrent calls
  • src/__tests__/Utils/noise-handler.test.ts: Adds a comprehensive test suite for frame handling scenarios, but the race-condition tests don't actually verify proper synchronization

Critical Issues Found:

  1. No actual race condition fix: Despite the PR description claiming to serialize concurrent decodeFrame calls, no mutex or locking mechanism is implemented. Concurrent calls will still race on shared state (inBytes, transport, pendingOnFrame).

  2. TransportState thread safety: The encrypt() and decrypt() methods are not thread-safe. Concurrent calls can read the same counter value before incrementing, resulting in IV reuse which completely breaks GCM encryption security.

  3. Lost callbacks: The pendingOnFrame variable can only store one callback, so concurrent decodeFrame calls during transport initialization will lose earlier callbacks.

  4. Misleading tests: Tests claim to verify concurrent behavior but don't actually prove synchronization works since JavaScript's Promise.all doesn't guarantee true concurrent execution and no mutex exists.

The PR requires significant rework to properly implement mutual exclusion using the existing makeMutex utility before it can safely address the race conditions it claims to fix.



Comment on lines 224 to 266
decodeFrame: async (newData: Buffer | Uint8Array, onFrame: (buff: Uint8Array | BinaryNode) => void) => {
	// the binary protocol uses its own framing mechanism
	// on top of the WS frames
	// so we get this data and separate out the frames
	const getBytesSize = () => {
		if (inBytes.length >= 3) {
			return (inBytes.readUInt8() << 16) | inBytes.readUInt16BE(1)
		}
	}

	if (isWaitingForTransport) {
		inBytes = Buffer.concat([inBytes, newData])
		pendingOnFrame = onFrame
		return
	}

	inBytes = Buffer.concat([inBytes, newData])

	logger.trace(`recv ${newData.length} bytes, total recv ${inBytes.length} bytes`)

	let size = getBytesSize()
	while (size && inBytes.length >= size + 3) {
		let frame: Uint8Array | BinaryNode = inBytes.slice(3, size + 3)
		inBytes = inBytes.slice(size + 3)

		if (isFinished) {
			const result = decrypt(frame)
			frame = await decodeBinaryNode(result)
		}

		logger.trace({ msg: (frame as BinaryNode)?.attrs?.id }, 'recv frame')

		onFrame(frame)
		size = getBytesSize()
	}

	if (inBytes.length === 0) {
		inBytes = Buffer.isBuffer(newData) ? newData : Buffer.from(newData)
	} else {
		inBytes = Buffer.concat([inBytes, newData])
	}

	await processData(onFrame)
}

Copilot AI Dec 14, 2025


The decodeFrame function does not actually serialize concurrent calls as claimed in the PR description. Multiple concurrent calls to decodeFrame will still race on shared state (inBytes, transport, pendingOnFrame). The isWaitingForTransport check does not prevent race conditions - if two calls arrive simultaneously while not waiting for transport, both will modify inBytes and call processData concurrently, leading to corrupted buffer state and incorrect frame processing. To fix this, implement proper mutual exclusion using a mutex (similar to makeMutex utility) to serialize all decodeFrame calls.

Comment on lines +24 to +48
private readonly iv = new Uint8Array(IV_LENGTH)

constructor(
	private readonly encKey: Buffer,
	private readonly decKey: Buffer
) {}

encrypt(plaintext: Uint8Array): Uint8Array {
	const c = this.writeCounter++
	this.iv[8] = (c >>> 24) & 0xff
	this.iv[9] = (c >>> 16) & 0xff
	this.iv[10] = (c >>> 8) & 0xff
	this.iv[11] = c & 0xff

	return aesEncryptGCM(plaintext, this.encKey, this.iv, EMPTY_BUFFER)
}

decrypt(ciphertext: Uint8Array): Buffer {
	const c = this.readCounter++
	this.iv[8] = (c >>> 24) & 0xff
	this.iv[9] = (c >>> 16) & 0xff
	this.iv[10] = (c >>> 8) & 0xff
	this.iv[11] = c & 0xff

	return aesDecryptGCM(ciphertext, this.decKey, this.iv, EMPTY_BUFFER) as Buffer

Copilot AI Dec 14, 2025


The TransportState encrypt and decrypt methods are not thread-safe. The counter increment (readCounter++, writeCounter++) and IV modification operations are separate steps that can be interleaved by concurrent calls. If two encrypt calls execute concurrently, they could both read the same counter value before either increments it, resulting in the same IV being used for two different messages. This violates the security requirements of GCM mode (IV must be unique per message) and could lead to complete compromise of encryption. The same issue exists for concurrent decrypt calls, which would cause decryption failures and incorrect counter state.

Suggested change
private readonly iv = new Uint8Array(IV_LENGTH)
constructor(
	private readonly encKey: Buffer,
	private readonly decKey: Buffer
) {}
encrypt(plaintext: Uint8Array): Uint8Array {
	const c = this.writeCounter++
	this.iv[8] = (c >>> 24) & 0xff
	this.iv[9] = (c >>> 16) & 0xff
	this.iv[10] = (c >>> 8) & 0xff
	this.iv[11] = c & 0xff
	return aesEncryptGCM(plaintext, this.encKey, this.iv, EMPTY_BUFFER)
}
decrypt(ciphertext: Uint8Array): Buffer {
	const c = this.readCounter++
	this.iv[8] = (c >>> 24) & 0xff
	this.iv[9] = (c >>> 16) & 0xff
	this.iv[10] = (c >>> 8) & 0xff
	this.iv[11] = c & 0xff
	return aesDecryptGCM(ciphertext, this.decKey, this.iv, EMPTY_BUFFER) as Buffer

// Removed shared iv buffer; IV will be generated per call
constructor(
	private readonly encKey: Buffer,
	private readonly decKey: Buffer
) {}
private encryptMutex = new Mutex();
private decryptMutex = new Mutex();
async encrypt(plaintext: Uint8Array): Promise<Uint8Array> {
	return this.encryptMutex.runExclusive(() => {
		const c = this.writeCounter++;
		const iv = generateIV(c);
		return aesEncryptGCM(plaintext, this.encKey, iv, EMPTY_BUFFER);
	});
}
async decrypt(ciphertext: Uint8Array): Promise<Buffer> {
	return this.decryptMutex.runExclusive(() => {
		const c = this.readCounter++;
		const iv = generateIV(c);
		return aesDecryptGCM(ciphertext, this.decKey, iv, EMPTY_BUFFER) as Buffer;
	});
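The `generateIV` helper referenced in the suggestion is not defined in the diff; one possible (hypothetical) definition that reproduces the IV layout the in-place code writes would be:

```typescript
// Hypothetical helper: build a fresh 12-byte GCM IV per call, with the
// 32-bit counter big-endian in bytes 8..11 and the rest zero, matching
// the layout the in-place code writes into its shared iv buffer.
const IV_LENGTH = 12

function generateIV(counter: number): Uint8Array {
	const iv = new Uint8Array(IV_LENGTH) // bytes 0..7 remain zero
	iv[8] = (counter >>> 24) & 0xff
	iv[9] = (counter >>> 16) & 0xff
	iv[10] = (counter >>> 8) & 0xff
	iv[11] = counter & 0xff
	return iv
}
```

Allocating a fresh IV per call removes the shared mutable buffer, which is the root of the concurrency concern here.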

Comment on lines +260 to +302
it('should serialize concurrent decodeFrame calls (fix for race condition)', async () => {
	// This test verifies that the lock mechanism correctly serializes
	// concurrent decodeFrame calls, preventing race conditions

	const keyPair = Curve.generateKeyPair()
	const logger = createMockLogger()

	const handler = makeNoiseHandler({
		keyPair,
		NOISE_HEADER: NOISE_WA_HEADER,
		logger: logger as any
	})

	const payload1 = Buffer.from('first')
	const payload2 = Buffer.from('second')
	const payload3 = Buffer.from('third')

	const frame1 = createFrame(payload1)
	const frame2 = createFrame(payload2)
	const frame3 = createFrame(payload3)

	const receivedOrder: string[] = []

	const onFrame = (frame: Uint8Array | BinaryNode) => {
		const content = Buffer.from(frame as Uint8Array).toString()
		receivedOrder.push(content)
	}

	// Start all three decodeFrame calls "simultaneously"
	// With the lock fix, they should be processed in order
	const p1 = handler.decodeFrame(frame1, onFrame)
	const p2 = handler.decodeFrame(frame2, onFrame)
	const p3 = handler.decodeFrame(frame3, onFrame)

	await Promise.all([p1, p2, p3])

	// With serialization, frames should be received in the order
	// the decodeFrame calls were made
	expect(receivedOrder).toHaveLength(3)
	expect(receivedOrder[0]).toBe('first')
	expect(receivedOrder[1]).toBe('second')
	expect(receivedOrder[2]).toBe('third')
})

Copilot AI Dec 14, 2025


This test claims to verify serialization of concurrent decodeFrame calls, but it has a fundamental flaw: Promise.all does not guarantee that the decodeFrame functions will execute concurrently at the JavaScript level. Since JavaScript is single-threaded, the promises will be scheduled but may execute sequentially. More importantly, even if they did execute concurrently, the current implementation doesn't actually serialize them (there's no mutex), so this test passing doesn't prove the race condition is fixed. The test may pass due to lucky timing rather than correct synchronization.
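Copilot's point about interleaving can be made concrete: in single-threaded JavaScript, async functions interleave only across await points. A standalone sketch (illustrative, not from the PR):

```typescript
// Two async "critical sections" interleave only because each one
// yields to the event loop mid-way; purely synchronous code between
// awaits cannot be interrupted.
async function criticalSection(id: string, log: string[]) {
	log.push(`${id}:start`)
	await new Promise(resolve => setImmediate(resolve)) // yield point
	log.push(`${id}:end`)
}

const log: string[] = []
Promise.all([criticalSection('a', log), criticalSection('b', log)]).then(() => {
	console.log(log.join(' ')) // interleaved: a:start b:start a:end b:end
})
```

So a convincing test would place an await inside the section under test (as the real decrypt path does) and assert ordering anyway; without that yield, passing proves nothing about synchronization.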

Comment on lines +225 to +256
if (isWaitingForTransport) {
	inBytes = Buffer.concat([inBytes, newData])
	pendingOnFrame = onFrame
	return

Copilot AI Dec 14, 2025


The pendingOnFrame variable can only store a single callback, but if multiple decodeFrame calls occur while isWaitingForTransport is true, each subsequent call will overwrite the previous pendingOnFrame callback. This means only the last callback will be invoked when finishInit completes, and frames buffered by earlier calls will be lost. The implementation should either queue all callbacks or prevent concurrent calls during transport initialization.
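One way to avoid losing callbacks is to queue every pending (data, callback) pair and drain the queue once the transport is ready. The sketch below uses hypothetical names; the PR's actual fields are `pendingOnFrame` and friends:

```typescript
type OnFrame = (frame: Uint8Array) => void

// Queue every call that arrives before the transport keys are ready,
// then replay them in arrival order; no callback is overwritten.
const pending: Array<{ data: Uint8Array; onFrame: OnFrame }> = []
let transportReady = false

function decodeFrame(data: Uint8Array, onFrame: OnFrame) {
	if (!transportReady) {
		pending.push({ data, onFrame })
		return
	}

	onFrame(data) // stand-in for the real decrypt-and-parse path
}

function finishInit() {
	transportReady = true
	for (const p of pending.splice(0)) {
		decodeFrame(p.data, p.onFrame)
	}
}
```

`splice(0)` empties the queue before replaying, so a callback that itself enqueues data cannot cause the same entry to run twice.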

Contributor


Does this make sense? @jlucaso1

Collaborator Author


This specific one no. Javascript is single thread so we are safe with this

let counter = 0
let sentIntro = false

let inBytes: Buffer<ArrayBufferLike> = Buffer.alloc(0)

Copilot AI Dec 14, 2025


The Buffer type annotation 'Buffer<ArrayBufferLike>' is unusual and unnecessary. The generic parameter for Buffer is not commonly used and ArrayBufferLike is the default. Simply use 'Buffer' for the type annotation to follow standard TypeScript conventions.

Suggested change
let inBytes: Buffer<ArrayBufferLike> = Buffer.alloc(0)
let inBytes: Buffer = Buffer.alloc(0)

Comment on lines +148 to +153
let size: number | undefined

while (true) {
	if (inBytes.length < 3) return

	size = (inBytes[0]! << 16) | (inBytes[1]! << 8) | inBytes[2]!

Copilot AI Dec 14, 2025


The variable 'size' is declared outside the while loop but only assigned inside. This is unnecessary and potentially confusing. Declare 'size' inside the loop where it's used to improve code clarity and scope management.

Suggested change
let size: number | undefined
while (true) {
if (inBytes.length < 3) return
size = (inBytes[0]! << 16) | (inBytes[1]! << 8) | inBytes[2]!
while (true) {
if (inBytes.length < 3) return
const size = (inBytes[0]! << 16) | (inBytes[1]! << 8) | inBytes[2]!

if (inBytes.length < 3) return

size = (inBytes[0]! << 16) | (inBytes[1]! << 8) | inBytes[2]!


Copilot AI Dec 14, 2025


The frame size calculation uses bitwise OR operations but doesn't validate that the result is within reasonable bounds. A malicious or corrupted frame could specify a size larger than available memory (e.g., 0xFFFFFF = ~16MB). Add validation to ensure 'size' is within acceptable limits (e.g., < 10MB) before attempting to process the frame, to prevent memory exhaustion attacks.

Suggested change
// Validate that size is within reasonable bounds (e.g., < 10MB)
const MAX_FRAME_SIZE = 10 * 1024 * 1024; // 10MB
if (size <= 0 || size > MAX_FRAME_SIZE) {
	logger.error({ size }, 'Frame size out of bounds, dropping frame');
	throw new Boom('Frame size out of bounds', { statusCode: 400 });
}

Comment on lines +31 to +36
encrypt(plaintext: Uint8Array): Uint8Array {
	const c = this.writeCounter++
	this.iv[8] = (c >>> 24) & 0xff
	this.iv[9] = (c >>> 16) & 0xff
	this.iv[10] = (c >>> 8) & 0xff
	this.iv[11] = c & 0xff

Copilot AI Dec 14, 2025


The writeCounter and readCounter in TransportState will overflow after 2^32 messages (approximately 4 billion). While this is unlikely in practice, the code should handle overflow gracefully. When the counter reaches MAX_UINT32, it will wrap to 0, potentially reusing IVs and breaking GCM security. Consider adding overflow detection or documenting the maximum message limit.
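A minimal guard against the wraparound described above might look like this (names hypothetical; the PR itself does not include such a check):

```typescript
// Fail closed before a 32-bit counter wraps to 0, which would reuse
// an IV under the same key and void GCM's security guarantees.
const MAX_COUNTER = 0xffffffff // 2^32 - 1

function nextCounter(current: number): number {
	if (current >= MAX_COUNTER) {
		throw new Error('noise counter exhausted: rekey or reconnect required')
	}

	return current + 1
}
```

Throwing (and forcing a reconnect/rekey) is preferable to silently wrapping, since IV reuse is unrecoverable once it happens.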

const frame = Buffer.alloc(introSize + 3 + data.byteLength)
const dataLen = data.byteLength
const introSize = sentIntro ? 0 : introHeader.length
const frame = Buffer.allocUnsafe(introSize + 3 + dataLen)

Copilot AI Dec 14, 2025


Buffer.allocUnsafe is used here for performance, but it doesn't zero-initialize the buffer. While the entire buffer is subsequently filled (introHeader, frame length bytes, and data), if there's any logic error that leaves gaps, uninitialized memory could leak. Consider using Buffer.alloc for security-sensitive operations, or add a comment explaining why allocUnsafe is safe here (all bytes are explicitly set).

Suggested change
const frame = Buffer.allocUnsafe(introSize + 3 + dataLen)
const frame = Buffer.alloc(introSize + 3 + dataLen)

Comment on lines +203 to +230
it('should maintain counter integrity with many frames in single buffer', async () => {
	const keyPair = Curve.generateKeyPair()
	const logger = createMockLogger()

	const handler = makeNoiseHandler({
		keyPair,
		NOISE_HEADER: NOISE_WA_HEADER,
		logger: logger as any
	})

	// Create 10 frames to stress test the while loop
	const payloads = Array.from({ length: 10 }, (_, i) => Buffer.from(`frame-${i}-payload-data`))

	const combinedBuffer = Buffer.concat(payloads.map(createFrame))

	const receivedFrames: Buffer[] = []
	const onFrame = (frame: Uint8Array | BinaryNode) => {
		receivedFrames.push(Buffer.from(frame as Uint8Array))
	}

	await handler.decodeFrame(combinedBuffer, onFrame)

	expect(receivedFrames).toHaveLength(10)
	payloads.forEach((payload, i) => {
		expect(receivedFrames[i]).toEqual(payload)
	})
})

Copilot AI Dec 14, 2025


Missing test coverage for the edge case where a frame's size header indicates 0 bytes of payload. While unlikely, this edge case should be tested to ensure the code handles it correctly (extracting a 3-byte header with size 0, then processing a zero-length frame). Add a test case for this scenario.

Member

@purpshell purpshell left a comment


could be interesting to mutex the write

@purpshell
Member

Passes bartender test type 1 and 2 with minimal performance change, also passes real web test also known as pairing for 3 seconds and then unpairing

@github-actions
Contributor

github-actions bot commented Jan 1, 2026

This PR is stale because it has been open for 14 days with no activity. Remove the stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the Stale label Jan 1, 2026
@purpshell
Member

Fix conflict and will merge

@jlucaso1 jlucaso1 force-pushed the fix-noise-handler-race-condition branch from 9665a38 to c46e8a1 Compare January 8, 2026 20:30
@jlucaso1 jlucaso1 force-pushed the fix-noise-handler-race-condition branch from c46e8a1 to 1c7f376 Compare January 8, 2026 20:33
@github-actions github-actions bot removed the Stale label Jan 9, 2026

5 participants