@knutwannheden (Contributor) commented Nov 27, 2025

Replace regex-based XPath parsing with a proper ANTLR grammar for improved correctness, maintainability, and performance.

Key Changes

  • ANTLR Grammar: New XPathLexer.g4 and XPathParser.g4 provide a formal grammar for the supported XPath subset
  • XPathCompiler: Compiles XPath expressions into an intermediate representation lazily, on first match, using double-checked locking (DCL)
  • Bottom-up Matching: Walks the cursor chain from the current element toward the root instead of building path arrays, which is more efficient for deep trees
  • Allocation-free Evaluation: Unified expression evaluation avoids allocations during matching
  • Fail-fast Validation: Unsupported XPath functions throw at parse time rather than silently failing at match time
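
As a rough illustration of the lazy compilation and bottom-up matching described above, here is a minimal sketch. It is not the code from this PR: `Node` and `SimplePathMatcher` are hypothetical stand-ins, the "compiled" form is just an array of step names, and only plain absolute paths with `*` wildcards are handled.

```java
public class BottomUpMatchSketch {
    /** Hypothetical stand-in for a cursor: a node name plus a link to its parent. */
    static final class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Hypothetical matcher for simple absolute paths such as /a/b/* (assumed to start with '/'). */
    static final class SimplePathMatcher {
        private final String expression;
        // Compiled form, built on first use; volatile is required for double-checked locking.
        private volatile String[] steps;

        SimplePathMatcher(String expression) {
            this.expression = expression;
        }

        private String[] compiled() {
            String[] result = steps;
            if (result == null) {                 // first check, no lock
                synchronized (this) {
                    result = steps;
                    if (result == null) {         // second check, under lock
                        result = expression.substring(1).split("/");
                        steps = result;
                    }
                }
            }
            return result;
        }

        /** Compares the last step with the node itself, then walks parent links upward. */
        boolean matches(Node node) {
            String[] s = compiled();
            Node current = node;
            for (int i = s.length - 1; i >= 0; i--) {
                if (current == null) {
                    return false;                 // ran out of ancestors before running out of steps
                }
                if (!s[i].equals("*") && !s[i].equals(current.name)) {
                    return false;
                }
                current = current.parent;
            }
            return current == null;               // absolute path: the first step must be the root
        }
    }

    public static void main(String[] args) {
        Node root = new Node("project", null);
        Node deps = new Node("dependencies", root);
        Node dep = new Node("dependency", deps);

        SimplePathMatcher matcher = new SimplePathMatcher("/project/dependencies/dependency");
        System.out.println(matcher.matches(dep));   // true
        System.out.println(matcher.matches(deps));  // false
    }
}
```

In this sketch a mismatch on the last step rejects the node immediately, and no per-node array of ancestor names is built.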

Supported XPath Features

  • Absolute/relative paths: /root/child, child/grandchild
  • Descendant-or-self: //element
  • Wildcards: /root/*
  • Attributes: /@attr, /@*
  • Predicates: [@attr='value'], [child='value'], [1], [last()]
  • Functions: local-name(), contains(), starts-with(), text(), position(), last(), not()
  • Logical operators: and, or
  • Axes: parent::, self::, .., .
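
To make the expressions above concrete, the following sketch evaluates a few of them and reports how many nodes each selects. It uses the JDK's built-in `javax.xml.xpath` engine (standard XPath 1.0 semantics) rather than `XPathMatcher`, so it runs without extra dependencies; the sample document and the `count` helper are made up for illustration, and this description does not state whether `XPathMatcher` agrees with the JDK engine in every case.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathFeatureSketch {
    // Hypothetical sample document, used only for this illustration.
    static final String XML =
            "<root>"
            + "<child id='a'>one</child>"
            + "<child id='b'>two</child>"
            + "<other><child id='c'>three</child></other>"
            + "</root>";

    /** Returns the number of nodes the expression selects in the sample document. */
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/root/child",                      // absolute path
            "//child",                          // descendant-or-self
            "/root/*",                          // wildcard
            "/root/child/@id",                  // attribute
            "/root/child[@id='a']",             // attribute predicate
            "/root/child[last()]",              // positional predicate
            "//child[contains(text(), 'o')]",   // function in a predicate
            "/root/child[@id='a' or @id='b']",  // logical operator
            "//child[starts-with(@id, 'c')]/.." // parent step
        };
        for (String expression : expressions) {
            System.out.println(expression + " selects " + count(expression) + " node(s)");
        }
    }
}
```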

Test Improvements

Tests now verify exact match counts instead of just boolean existence, providing stronger validation of XPath matching behavior.

@knutwannheden knutwannheden marked this pull request as ready for review December 1, 2025 09:48
@knutwannheden knutwannheden changed the title from "XPath ANTLR grammar" to "ANTLR-based XPathMatcher implementation" Dec 1, 2025
@knutwannheden knutwannheden merged commit 5224e77 into main Dec 1, 2025
1 check passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in OpenRewrite Dec 1, 2025
@knutwannheden knutwannheden deleted the xpath-antlr branch December 1, 2025 12:52

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

  • Add support for "contains" in XML XPath matching
  • Support positional elements in XPathMatcher

4 participants