Skip to content

fix: preserve LiteralNode content indentation in Replace operations#864

Open
babarot wants to merge 1 commit intogoccy:masterfrom
babarot:fix/literal-node-replace
Open

fix: preserve LiteralNode content indentation in Replace operations#864
babarot wants to merge 1 commit intogoccy:masterfrom
babarot:fix/literal-node-replace

Conversation

@babarot
Copy link
Copy Markdown

@babarot babarot commented Apr 8, 2026

Purpose

Fix MappingValueNode.Replace and SequenceNode.Replace corrupting literal block scalar (|) content when replacing nodes. Characters were eaten from the beginning of each content line because AddColumn adjusts Position.Column but LiteralNode.String() uses the raw Origin text, which was never updated to reflect the new position.

This was originally reported in #636.

Root cause

MappingValueNode.Replace computes a column delta and calls value.AddColumn(delta). For most node types, String() uses Position.Column to reconstruct output, so this works correctly. However, LiteralNode.String() ignores Position.Column entirely and returns the raw Origin text from the scanner. The Origin retains the indentation from the source document (the replacement value), not the destination, causing the mismatch.

Fix

  • In MappingValueNode.Replace and SequenceNode.Replace, recalculate the column delta for MappingNode, SequenceNode, multiline StringNode, and LiteralNode based on the key/parent column position rather than the value token position.
  • For LiteralNode, rebuild the Origin text from the actual Value string with the correct target indentation. This avoids cumulative corruption when the same node is replaced into multiple locations (e.g. via [*] or .. paths).

Tests added

  • TestPath_ReplaceWithNode_ValueToNodeIndent: MappingNode replacement via ValueToNode (Wrong indentation when path.ReplaceWithNode with an ast.Node created by the function yaml.ValueToNode #636 scenario)
  • TestPath_ReplaceWithNode_LiteralNodeIndent: LiteralNode replacement via ValueToNode
  • TestPath_ReplaceWithParsedLiteralNodeIndent: LiteralNode replacement via parser.ParseBytes
  • TestPath_ReplaceWithParsedLiteralNodeIndent_MultipleMatches: [*] path replacing same node into multiple locations
  • TestPath_ReplaceWithParsedLiteralNodeIndent_SpecialCharacters: content with YAML-special characters (*, /, :) at line beginnings
  • TestPath_ReplaceSequenceElementWithParsedLiteralNodeIndent: SequenceNode.Replace with literal block scalar

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 8, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.67%. Comparing base (edee2f9) to head (737662c).
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #864      +/-   ##
==========================================
+ Coverage   80.65%   80.67%   +0.01%     
==========================================
  Files          22       22              
  Lines        6845     6845              
==========================================
+ Hits         5521     5522       +1     
+ Misses        905      903       -2     
- Partials      419      420       +1     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

MappingValueNode.Replace and SequenceNode.Replace corrupted literal
block scalar content because AddColumn adjusts Position.Column but
LiteralNode.String() uses the raw Origin text, which was never
updated to reflect the new position.

Recalculate column deltas for structured node types (MappingNode,
SequenceNode, multiline StringNode, LiteralNode) based on the
key/parent column position. For LiteralNode, rebuild Origin from
the actual Value string with correct indentation to avoid cumulative
corruption when the same node is replaced into multiple locations.

Fixes goccy#636
@babarot babarot force-pushed the fix/literal-node-replace branch from 6e7454e to 737662c Compare April 8, 2026 03:01
babarot added a commit to babarot/gh-infra that referenced this pull request Apr 8, 2026
Link to goccy/go-yaml#864 and #636 in SetLiteral, replaceLiteralContent,
and replaceWithLiteralBlock so the workaround can be removed once the
upstream fix is merged.
goccy added a commit that referenced this pull request Apr 11, 2026
MappingValueNode.Replace and SequenceNode.Replace had two related defects
that produced wrong indentation in the resulting document:

1. Literal block scalars (`|`) ignored their content getting moved to a
   new column, because LiteralNode.String() and the multiline path of
   StringNode.String() bypass Position.Column. Replacing such a value via
   parser.ParseBytes (yields *LiteralNode) or yaml.ValueToNode with
   UseLiteralStyleIfMultiline (yields *StringNode) left the new content
   indented at the source's column rather than the destination's.

2. Replacing a block-style mapping or sequence used the value's
   GetToken().Position.Column as the source/destination anchor, but
   *MappingNode.GetToken() returns the parser-internal Start token (often
   the first colon) whose column varies between parser-built and
   ValueToNode-built nodes. The result was an off-by-one indent for any
   replacement that mixed those two construction paths.

The fix introduces helpers in ast/ast.go:

  - detectLiteralContentIndent / applyLiteralContentIndent rebuild a
    *LiteralNode's inner Origin from its Value, or pin a multiline
    *StringNode's Position.Column / IndentNum so its String() output
    lands at the destination indent. The rebuild path is idempotent
    across repeated replacements (e.g. `[*]`), so applying the same
    parsed node into multiple slots no longer accumulates corruption.

  - replaceColumns / anchorColumn / isBlockStructure compute the column
    delta from each node's first content position rather than from its
    Start token, so block-style replacements work consistently regardless
    of how the value was constructed. When a scalar slot is being
    replaced by a block structure (no source-side indent to inherit),
    the caller-supplied "parent_key + 2" convention default is used.

Adds TestPath_ReplaceWithNode_PreserveIndent in path_test.go covering
both construction paths, 2-space and 4-space and 3-space source indents,
multi-slot `[*]` replacements, YAML-special characters at line starts,
and sequence-element literal replacement.

Fixes #636. Supersedes #864.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants