
@majiayu000

Summary

  • Fixed YAML parser incorrectly treating --- and ... as document separators when inside scalar values
  • Added check to ensure separators are only recognized at the start of a value

Test plan

  • Added regression test in test/regression/issue/25660.test.ts
  • All 191 existing YAML tests pass

Fixes #25660

…eparators

The YAML parser was incorrectly treating '---' and '...' as document
separators even when they appeared inside scalar values (e.g.,
'name: some-text---'). This happened because the check for document
separators only verified line_indent == .none but didn't check if
we had already scanned content for the current scalar.

Added ctx.str_builder.len() == 0 check to ensure document separators
are only recognized at the start of a value, not in the middle.

Fixes oven-sh#25660

Signed-off-by: lif <[email protected]>
@coderabbitai
Contributor

coderabbitai bot commented Dec 25, 2025

Walkthrough

Adds a line-start guard (nl) across the YAML scanner so document boundary markers (---, ...) and other boundary-sensitive tokens are only recognized at the start of a line; updates scanner state toggling and EOF checks accordingly. Adds regression tests for separator behavior inside scalar values and real line-start separators.

Changes

| Cohort / File(s) | Summary |
|---|---|
| YAML parser boundary logic: src/interchange/yaml.zig | Adds a local nl flag to track line-start state; gates recognition of --- and ... (and other boundary checks) on nl; updates branches for -, ., :, #, punctuation, whitespace/newline/comment handling to set/reset nl; adjusts EOF/boundary-evaluation helpers used by the scanner. |
| Regression tests for separator handling: test/regression/issue/25660.test.ts | Adds tests ensuring ---/... inside scalar values do not split documents, verifies true line-start separators create multiple documents, and covers edge cases (trailing/middle dashes/dots, trailing dots preserved). |

Possibly related PRs

Suggested reviewers

  • dylan-conway

Pre-merge checks

✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing YAML document separator detection to avoid treating --- or ... inside scalar values as separators. |
| Description check | ✅ Passed | The description covers both required template sections: what the PR does (YAML parser fix) and how it was verified (regression test added, 191 tests pass). |
| Linked Issues check | ✅ Passed | The changes fully address the requirements from issue #25660: the parser now correctly handles --- and ... inside scalar values by tracking line-start position, preventing mis-detection as document boundaries. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: YAML parser logic updates to fix document separator detection and a regression test file directly addressing the reported bug. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc9895 and 17ab57a.

📒 Files selected for processing (2)
  • src/interchange/yaml.zig
  • test/regression/issue/25660.test.ts
🧰 Additional context used
📓 Path-based instructions (7)
test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}

📄 CodeRabbit inference engine (test/CLAUDE.md)

test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}: Use bun:test with files that end in *.test.{ts,js,jsx,tsx,mjs,cjs}
Do not write flaky tests. Never wait for time to pass in tests; always wait for the condition to be met instead of using an arbitrary amount of time
Never use hardcoded port numbers in tests. Always use port: 0 to get a random port
Prefer concurrent tests over sequential tests using test.concurrent or describe.concurrent when multiple tests spawn processes or write files, unless it's very difficult to make them concurrent
When spawning Bun processes in tests, use bunExe and bunEnv from harness to ensure the same build of Bun is used and debug logging is silenced
Use -e flag for single-file tests when spawning Bun processes
Use tempDir() from harness to create temporary directories with files for multi-file tests instead of creating files manually
Prefer async/await over callbacks in tests
When callbacks must be used and it's just a single callback, use Promise.withResolvers to create a promise that can be resolved or rejected from a callback
Do not set a timeout on tests. Bun already has timeouts
Use Buffer.alloc(count, fill).toString() instead of 'A'.repeat(count) to create repetitive strings in tests, as ''.repeat is very slow in debug JavaScriptCore builds
Use describe blocks for grouping related tests
Always use await using or using to ensure proper resource cleanup in tests for APIs like Bun.listen, Bun.connect, Bun.spawn, Bun.serve, etc
Always check exit codes and test error scenarios in error tests
Use describe.each() for parameterized tests
Use toMatchSnapshot() for snapshot testing
Use beforeAll(), afterEach(), beforeEach() for setup/teardown in tests
Track resources (servers, clients) in arrays for cleanup in afterEach()

Files:

  • test/regression/issue/25660.test.ts
test/regression/issue/**/*.test.ts

📄 CodeRabbit inference engine (test/CLAUDE.md)

Regression tests for specific issues go in /test/regression/issue/${issueNumber}.test.ts. Do not put tests without issue numbers in the regression directory

Files:

  • test/regression/issue/25660.test.ts
**/*.test.ts?(x)

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.test.ts?(x): Never use bun test directly - always use bun bd test to run tests with debug build changes
For single-file tests, prefer -e flag over tempDir
For multi-file tests, prefer tempDir and Bun.spawn over single-file tests
Use normalizeBunSnapshot to normalize snapshot output of tests
Never write tests that check for 'panic', 'uncaught exception', or similar strings in test output
Use tempDir from harness to create temporary directories - do not use tmpdirSync or fs.mkdtempSync
When spawning processes in tests, expect stdout before expecting exit code for more useful error messages on test failure
Do not write flaky tests - do not use setTimeout in tests; instead await the condition to be met
Verify tests fail with USE_SYSTEM_BUN=1 bun test <file> and pass with bun bd test <file> - tests are invalid if they pass with USE_SYSTEM_BUN=1
Test files must end with .test.ts or .test.tsx
Avoid shell commands like find or grep in tests - use Bun's Glob and built-in tools instead

Files:

  • test/regression/issue/25660.test.ts
test/regression/issue/*.test.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Place regression tests for specific GitHub issues in test/regression/issue/${issueNumber}.test.ts with real issue numbers only

Files:

  • test/regression/issue/25660.test.ts
test/**/*.test.ts?(x)

📄 CodeRabbit inference engine (CLAUDE.md)

Always use port: 0 in tests - do not hardcode ports or use custom random port number functions

Files:

  • test/regression/issue/25660.test.ts
src/**/*.zig

📄 CodeRabbit inference engine (src/CLAUDE.md)

src/**/*.zig: Private fields in Zig are fully supported using the # prefix: struct { #foo: u32 };
Use decl literals in Zig for declaration initialization: const decl: Decl = .{ .binding = 0, .value = 0 };
Prefer @import at the bottom of the file (auto formatter will move them automatically)

Files:

  • src/interchange/yaml.zig
**/*.zig

📄 CodeRabbit inference engine (CLAUDE.md)

In Zig code, be careful with allocators and use defer for cleanup

Files:

  • src/interchange/yaml.zig
🧠 Learnings (14)
📓 Common learnings
Learnt from: cirospaciari
Repo: oven-sh/bun PR: 22946
File: test/js/sql/sql.test.ts:195-202
Timestamp: 2025-09-25T22:07:13.851Z
Learning: PR oven-sh/bun#22946: JSON/JSONB result parsing updates (e.g., returning parsed arrays instead of legacy strings) are out of scope for this PR; tests keep current expectations with a TODO. Handle parsing fixes in a separate PR.
📚 Learning: 2025-11-24T18:36:59.706Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: src/bun.js/bindings/v8/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:36:59.706Z
Learning: Applies to src/bun.js/bindings/v8/test/v8/v8.test.ts : Add corresponding test cases to test/v8/v8.test.ts using checkSameOutput() function to compare Node.js and Bun output

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to **/*.test.ts?(x) : Verify tests fail with `USE_SYSTEM_BUN=1 bun test <file>` and pass with `bun bd test <file>` - tests are invalid if they pass with USE_SYSTEM_BUN=1

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/regression/issue/**/*.test.ts : Regression tests for specific issues go in `/test/regression/issue/${issueNumber}.test.ts`. Do not put tests without issue numbers in the regression directory

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*-fixture.ts : Test files that spawn Bun processes should end in `*-fixture.ts` to identify them as test fixtures and not tests themselves

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-10-19T02:44:46.354Z
Learnt from: theshadow27
Repo: oven-sh/bun PR: 23798
File: packages/bun-otel/context-propagation.test.ts:1-1
Timestamp: 2025-10-19T02:44:46.354Z
Learning: In the Bun repository, standalone packages under packages/ (e.g., bun-vscode, bun-inspector-protocol, bun-plugin-yaml, bun-plugin-svelte, bun-debug-adapter-protocol, bun-otel) co-locate their tests with package source code using *.test.ts files. This follows standard npm/monorepo patterns. The test/ directory hierarchy (test/js/bun/, test/cli/, test/js/node/) is reserved for testing Bun's core runtime APIs and built-in functionality, not standalone packages.

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to test/regression/issue/*.test.ts : Place regression tests for specific GitHub issues in `test/regression/issue/${issueNumber}.test.ts` with real issue numbers only

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to **/*.test.ts?(x) : Use `normalizeBunSnapshot` to normalize snapshot output of tests

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-14T16:07:01.064Z
Learnt from: RiskyMH
Repo: oven-sh/bun PR: 24719
File: docs/bundler/executables.mdx:527-560
Timestamp: 2025-11-14T16:07:01.064Z
Learning: In the Bun repository, certain bundler features like compile with code splitting (--compile --splitting) are CLI-only and not supported in the Bun.build() JavaScript API. Tests for CLI-only features use backend: "cli" flag (e.g., test/bundler/bundler_compile_splitting.test.ts). The CompileBuildConfig interface correctly restricts these with splitting?: never;. When documenting CLI-only bundler features, add a note clarifying they're not available via the programmatic API.

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `bun:test` with files that end in `*.test.{ts,js,jsx,tsx,mjs,cjs}`

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `-e` flag for single-file tests when spawning Bun processes

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-09-07T08:20:47.215Z
Learnt from: RiskyMH
Repo: oven-sh/bun PR: 22258
File: src/cli/test_command.zig:1258-1281
Timestamp: 2025-09-07T08:20:47.215Z
Learning: For Bun's test line filtering feature, the parseFileLineArg function should only handle the specific cases of "file:line" and "file:line:col" formats. It should not try to be overly tolerant of other patterns, as components like ":col" or other non-numeric segments could legitimately be part of filenames. The current conservative approach that checks for numeric segments in expected positions is appropriate.

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-09-30T22:53:19.887Z
Learnt from: pfgithub
Repo: oven-sh/bun PR: 23117
File: src/bun.js/test/snapshot.zig:265-276
Timestamp: 2025-09-30T22:53:19.887Z
Learning: In Bun's snapshot testing (src/bun.js/test/snapshot.zig), multiple inline snapshots at the same line and column (same call position) must have identical values. However, multiple inline snapshots on the same line at different columns are allowed to have different values. The check is position-specific (line+col), not line-wide.

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Unit tests for specific features are organized by module (e.g., `/test/js/bun/`, `/test/js/node/`)

Applied to files:

  • test/regression/issue/25660.test.ts
🔇 Additional comments (14)
src/interchange/yaml.zig (6)

2285-2291: LGTM! Proper line-start tracking initialization.

The initialization of nl correctly handles both the start of input (self.pos == .zero) and positions immediately after a newline, establishing the foundation for restricting document boundary markers to line-start positions.
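As a rough illustration of the line-start rule described here, the following TypeScript sketch computes the same condition for a plain string. `isLineStart` is a hypothetical helper, not the Zig code from the PR; it assumes only that "line start" means offset 0 or an offset immediately after `\n` or `\r`.

```typescript
// Hypothetical helper (not from the PR): true when `pos` is at the start of
// the input or immediately follows a line break character.
function isLineStart(input: string, pos: number): boolean {
  if (pos === 0) return true;
  const prev = input.charAt(pos - 1);
  return prev === "\n" || prev === "\r";
}

const sample = "name: some-text---\n---\nnext: 1";
console.log(isLineStart(sample, 15)); // false: the "---" inside the value
console.log(isLineStart(sample, 19)); // true: the "---" on its own line
```

Under this rule only the second `---` in `sample` is a candidate document marker; the first one follows ordinary scalar text.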


2299-2300: Correctly guards document boundary markers at line start.

The checks properly implement the YAML specification requirement that document markers (--- and ...) must appear at the start of a line with no indentation and be followed by whitespace or EOF. The nl guard ensures these markers are not recognized when they appear mid-value.

Also applies to: 2317-2318


2303-2303: Proper nl flag management throughout the scan loop.

The nl flag is correctly maintained throughout scanPlainScalar: reset to false after consuming any non-newline token and set to true after processing a newline. This ensures accurate line-start tracking for the entire scalar parsing process.

Also applies to: 2321-2321, 2348-2348, 2379-2379, 2403-2403, 2424-2424, 2465-2465, 2470-2470
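The set/reset discipline for the flag can be sketched outside the parser. The TypeScript below is a simplified, hypothetical scan loop (the name `findDocumentMarkers` and its return shape are assumptions made for illustration, not part of yaml.zig): it keeps `nl` true only at offset 0 or right after a line break, and records the offsets where `---` or `...` would qualify as a marker.

```typescript
// Hypothetical sketch (not the PR's Zig code): walk the input once, keeping an
// `nl` flag that is true only at offset 0 or right after a line break.
function findDocumentMarkers(input: string): number[] {
  const whitespace = " \t\n\r";
  const offsets: number[] = [];
  let nl = true; // the start of input counts as a line start
  let pos = 0;
  while (pos < input.length) {
    const ch = input.charAt(pos);
    if (ch === "\n" || ch === "\r") {
      nl = true; // the next character begins a new line
      pos += 1;
      continue;
    }
    const head = input.slice(pos, pos + 3);
    if (nl && (head === "---" || head === "...")) {
      // The marker must be followed by whitespace or the end of input.
      const atEof = pos + 3 >= input.length;
      if (atEof || whitespace.indexOf(input.charAt(pos + 3)) !== -1) {
        offsets.push(pos);
        pos += 3;
        nl = false;
        continue;
      }
    }
    nl = false; // any other character means we are no longer at a line start
    pos += 1;
  }
  return offsets;
}

console.log(JSON.stringify(findDocumentMarkers("name: some-text---\n---\nnext: 1"))); // [19]
console.log(JSON.stringify(findDocumentMarkers("hello\n..."))); // [6]
```

Because any non-newline character clears `nl`, an indented `---` is never reported, which loosely mirrors the zero-indentation requirement mentioned in this review.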


2940-2944: Line-start detection correctly implemented for literal scalars.

The line_start computation mirrors the nl initialization logic in scanPlainScalar, correctly identifying positions at the start of input or after newlines. Note that this uses an on-demand computation pattern (checking the previous character each time) rather than maintaining a tracked flag, which is a valid alternative approach.

Also applies to: 2963-2967


3671-3675: Scanner-level line-start guards correctly restrict boundary tokens.

The line-start checks in the scan function ensure that document_start and document_end tokens are only generated when --- or ... appear at the beginning of a line with no indentation. This is the correct level of enforcement for the YAML document boundary specification.

Also applies to: 3755-3759


4311-4311: Improved EOF semantics for boundary detection.

Changing isAnyOrEofAt to return true when beyond the end of input correctly implements the "or EOF" semantics. This ensures that document boundary markers (--- and ...) are properly recognized when followed by EOF, aligning with YAML specification requirements.

test/regression/issue/25660.test.ts (8)

10-20: LGTM! Core regression test covers the reported issue.

This test directly addresses the issue reported in #25660 where --- appearing at the end of a scalar value was incorrectly treated as a document separator.


22-32: Good coverage of separator sequences within scalar values.

These tests verify that --- and ... are preserved as literal content when they appear in the middle of scalar values, complementing the end-of-value test case.


34-41: Correctly tests document markers as scalar values.

This test ensures that --- and ... are properly treated as string values when they appear after a colon, as they don't meet the line-start requirement for document boundary markers.


43-53: Addresses edge case from previous review feedback.

This test verifies that --- at the start of an indented continuation line within a multiline plain scalar is treated as part of the value, not as a document separator. This addresses the suggestion from previous review comments.


55-67: Essential positive test for legitimate document separators.

This test confirms that actual document separators (at line start with no indentation) still correctly split the input into multiple documents, ensuring the fix doesn't break valid multi-document YAML parsing.


69-85: Thorough coverage of trailing boundary marker sequences.

These tests ensure that --- and ... at the end of scalar values are correctly preserved as literal content, even when they could be visually confused with document boundary markers.


87-105: Validates correct boundary marker recognition for top-level scalars.

These tests ensure that document boundary markers are properly recognized when they appear at line start following top-level plain scalars, confirming the fix is precise and doesn't over-correct by breaking valid multi-document parsing scenarios. These cases were specifically mentioned in the PR objectives as important edge cases to verify.


1-106: Well-structured regression test suite with comprehensive coverage.

The test file correctly follows all coding guidelines:

  • ✓ Uses bun:test framework
  • ✓ Located at test/regression/issue/25660.test.ts matching the issue number
  • ✓ Tests are deterministic with no flaky timeouts or delays
  • ✓ Proper describe/test structure

The test coverage is comprehensive for the reported issue, including both negative cases (where ---/... should be treated as content) and positive cases (where they should be recognized as document boundaries). While additional edge cases like block scalars or quoted strings with these sequences could be tested, the current coverage appropriately addresses the specific bug fix.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2247c38 and b56bb1f.

📒 Files selected for processing (2)
  • src/interchange/yaml.zig
  • test/regression/issue/25660.test.ts
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: cirospaciari
Repo: oven-sh/bun PR: 22946
File: test/js/sql/sql.test.ts:195-202
Timestamp: 2025-09-25T22:07:13.851Z
Learning: PR oven-sh/bun#22946: JSON/JSONB result parsing updates (e.g., returning parsed arrays instead of legacy strings) are out of scope for this PR; tests keep current expectations with a TODO. Handle parsing fixes in a separate PR.
📚 Learning: 2025-11-24T18:36:59.706Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: src/bun.js/bindings/v8/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:36:59.706Z
Learning: Applies to src/bun.js/bindings/v8/test/v8/v8.test.ts : Add corresponding test cases to test/v8/v8.test.ts using checkSameOutput() function to compare Node.js and Bun output

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to **/*.test.ts?(x) : Verify tests fail with `USE_SYSTEM_BUN=1 bun test <file>` and pass with `bun bd test <file>` - tests are invalid if they pass with USE_SYSTEM_BUN=1

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/regression/issue/**/*.test.ts : Regression tests for specific issues go in `/test/regression/issue/${issueNumber}.test.ts`. Do not put tests without issue numbers in the regression directory

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to test/regression/issue/*.test.ts : Place regression tests for specific GitHub issues in `test/regression/issue/${issueNumber}.test.ts` with real issue numbers only

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*-fixture.ts : Test files that spawn Bun processes should end in `*-fixture.ts` to identify them as test fixtures and not tests themselves

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to **/*.test.ts?(x) : Use `normalizeBunSnapshot` to normalize snapshot output of tests

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-10-19T02:44:46.354Z
Learnt from: theshadow27
Repo: oven-sh/bun PR: 23798
File: packages/bun-otel/context-propagation.test.ts:1-1
Timestamp: 2025-10-19T02:44:46.354Z
Learning: In the Bun repository, standalone packages under packages/ (e.g., bun-vscode, bun-inspector-protocol, bun-plugin-yaml, bun-plugin-svelte, bun-debug-adapter-protocol, bun-otel) co-locate their tests with package source code using *.test.ts files. This follows standard npm/monorepo patterns. The test/ directory hierarchy (test/js/bun/, test/cli/, test/js/node/) is reserved for testing Bun's core runtime APIs and built-in functionality, not standalone packages.

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-14T16:07:01.064Z
Learnt from: RiskyMH
Repo: oven-sh/bun PR: 24719
File: docs/bundler/executables.mdx:527-560
Timestamp: 2025-11-14T16:07:01.064Z
Learning: In the Bun repository, certain bundler features like compile with code splitting (--compile --splitting) are CLI-only and not supported in the Bun.build() JavaScript API. Tests for CLI-only features use backend: "cli" flag (e.g., test/bundler/bundler_compile_splitting.test.ts). The CompileBuildConfig interface correctly restricts these with splitting?: never;. When documenting CLI-only bundler features, add a note clarifying they're not available via the programmatic API.

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `bun:test` with files that end in `*.test.{ts,js,jsx,tsx,mjs,cjs}`

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `-e` flag for single-file tests when spawning Bun processes

Applied to files:

  • test/regression/issue/25660.test.ts
📚 Learning: 2025-09-07T08:20:47.215Z
Learnt from: RiskyMH
Repo: oven-sh/bun PR: 22258
File: src/cli/test_command.zig:1258-1281
Timestamp: 2025-09-07T08:20:47.215Z
Learning: For Bun's test line filtering feature, the parseFileLineArg function should only handle the specific cases of "file:line" and "file:line:col" formats. It should not try to be overly tolerant of other patterns, as components like ":col" or other non-numeric segments could legitimately be part of filenames. The current conservative approach that checks for numeric segments in expected positions is appropriate.

Applied to files:

  • test/regression/issue/25660.test.ts
🔇 Additional comments (3)
src/interchange/yaml.zig (2)

2291-2294: LGTM! The fix correctly prevents false document separator detection.

The added guard ctx.str_builder.len() == 0 ensures that --- is only recognized as a document separator when:

  1. No scalar content has been accumulated yet
  2. We're at the start of a line (self.line_indent == .none)
  3. The sequence is followed by whitespace or EOF

This correctly handles cases like name: some-text--- where --- is part of the scalar value.


2310-2313: Consistent fix applied for document end marker.

The same guard condition is correctly applied to the ... document end marker, ensuring symmetrical behavior with the --- fix.

test/regression/issue/25660.test.ts (1)

1-65: Well-structured regression test with good coverage.

The test file correctly:

  • Uses bun:test as required
  • Follows the naming convention test/regression/issue/${issueNumber}.test.ts
  • Covers the main fix scenario plus several edge cases
  • Tests both --- and ... markers in various positions
  • Verifies that legitimate document separators still work correctly (lines 34-46)

@Jarred-Sumner
Collaborator

@dylan-conway can you review? you have more context on YAML than me

Member

@dylan-conway dylan-conway left a comment


This change will break top-level plain scalars (unquoted values). For example, `...` will be included in the value of

hello
...

when it should be recognized as a document end marker.

I suggest adding the var nl = false; pattern that's used in scanSingleQuotedScalar and scanDoubleQuotedScalar. It's a little messy but works well.
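To make the difference between the two guards concrete, here is a small TypeScript sketch. Both functions are hypothetical stand-ins for the Zig conditions discussed in this review, with `scanned` playing the role of ctx.str_builder and `nl` the role of the suggested flag; neither is the parser's actual code.

```typescript
// "---" or "..." followed by whitespace or the end of input.
function isMarker(rest: string): boolean {
  const head = rest.slice(0, 3);
  if (head !== "---" && head !== "...") return false;
  return rest.length === 3 || " \t\n\r".indexOf(rest.charAt(3)) !== -1;
}

// First attempt: only accept a marker when no scalar content has been scanned.
function markerByBuilderLength(scanned: string, rest: string): boolean {
  return scanned.length === 0 && isMarker(rest);
}

// Suggested approach: accept a marker when the scanner is at a line start.
function markerByLineStart(nl: boolean, rest: string): boolean {
  return nl && isMarker(rest);
}

// Input "hello\n...": the scanner has already collected "hello" and now sits
// at the start of the second line.
console.log(markerByBuilderLength("hello", "...")); // false: marker is missed
console.log(markerByLineStart(true, "...")); // true: marker is recognized

// Input "name: some-text---": the dashes are mid-line, so nl is false.
console.log(markerByLineStart(false, "---")); // false: stays part of the value
```

The builder-length guard conflates "nothing scanned yet" with "at a line start", which is why it misses the marker once a top-level scalar has begun.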

majiayu000 and others added 2 commits December 30, 2025 11:44
Replace the str_builder.len() == 0 check with a proper nl (newline) flag
to correctly detect document separators (--- and ...).

The previous approach would fail for top-level plain scalars like:
```yaml
hello
...
```
Where `...` should be recognized as a document end marker but wasn't
because str_builder already contained "hello".

The nl flag pattern (already used in scanSingleQuotedScalar and
scanDoubleQuotedScalar) correctly tracks whether we're at the start
of a new line, allowing proper detection of:
- `name: text---more` → `---` not at line start, not a separator
- `hello\n...` → `...` at line start, is a separator

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Add test cases that verify the fix handles Dylan's feedback correctly:
- Top-level plain scalar followed by `...` document end marker
- Top-level plain scalar followed by `---` document separator

These tests ensure the nl flag approach correctly recognizes document
markers at line start even when str_builder already has content.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@majiayu000 majiayu000 force-pushed the claude/fix-yaml-document-separator branch from f5bc5ab to 9dc9895 Compare December 30, 2025 05:36
@majiayu000
Author

Thanks for the feedback! I've updated the implementation to use the nl flag pattern as you suggested.

The fix now tracks whether we're at the start of a new line, matching the approach in scanSingleQuotedScalar and scanDoubleQuotedScalar.

Also added test cases for the top-level plain scalar scenario you mentioned:

  • hello\n... → correctly recognizes ... as document end
  • first\n---\nsecond → correctly recognizes --- as separator
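The expected outcomes for these cases can be illustrated with a toy splitter. The TypeScript below is a hypothetical, line-based approximation and not Bun's parser: it treats a line as a boundary only when it begins with `---` or `...` at column 0 followed by a space, a tab, or the end of the line, and it ignores any content that shares a line with a marker.

```typescript
// Hypothetical line-based splitter (a simplification, not Bun's YAML parser).
function splitDocuments(input: string): string[] {
  const docs: string[] = [];
  let current: string[] = [];
  for (const line of input.split(/\r?\n/)) {
    const head = line.slice(0, 3);
    const isBoundary =
      (head === "---" || head === "...") &&
      (line.length === 3 || line.charAt(3) === " " || line.charAt(3) === "\t");
    if (isBoundary) {
      // Close the document collected so far, if any, and start a new one.
      if (current.length > 0) docs.push(current.join("\n"));
      current = [];
    } else {
      current.push(line);
    }
  }
  if (current.length > 0) docs.push(current.join("\n"));
  return docs;
}

console.log(JSON.stringify(splitDocuments("hello\n..."))); // ["hello"]
console.log(JSON.stringify(splitDocuments("first\n---\nsecond"))); // ["first","second"]
console.log(JSON.stringify(splitDocuments("name: some-text---"))); // ["name: some-text---"]
```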

'\n', '\r' => true,
else => false,
};
if (line_start and self.line_indent == .none and self.remainStartsWith("---") and self.isAnyOrEofAt(" \t\n\r", 3)) {
Member


oh I think I see the bug! isAnyOrEofAt returns false instead of true if the position is out of bounds

fn isAnyOrEofAt(self: *const @This(), values: []const enc.unit(), n: usize) bool {
    const pos = self.pos.add(n);
    if (pos.isLessThan(self.input.len)) {
        return std.mem.indexOfScalar(enc.unit(), values, self.input[pos.cast()]) != null;
    }
-   return false;
+   return true;
}
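For readers less familiar with Zig, the corrected behavior of the helper above can be approximated in TypeScript. This is a hypothetical rendering that takes the input and position as explicit parameters (the Zig version reads them from `self`); it is meant only to show the "or EOF" semantics.

```typescript
// Hypothetical TypeScript rendering of the helper, with the out-of-bounds
// branch returning true as proposed in this comment.
function isAnyOrEofAt(input: string, pos: number, values: string, n: number): boolean {
  const at = pos + n;
  if (at < input.length) {
    return values.indexOf(input.charAt(at)) !== -1;
  }
  return true; // past the end of input counts as EOF
}

const ws = " \t\n\r";
console.log(isAnyOrEofAt("...", 0, ws, 3)); // true: marker followed by EOF
console.log(isAnyOrEofAt("... ", 0, ws, 3)); // true: marker followed by a space
console.log(isAnyOrEofAt("...x", 0, ws, 3)); // false: "...x" is not a marker
```

With the earlier `return false`, the first call would report false, so a `...` or `---` that ends the input would not be treated as a marker.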


Development

Successfully merging this pull request may close these issues.

YAML.parse() incorrectly splits on --- inside values

3 participants