Add native support for concatenated ClassAds in grammar by patrickbrophy · Pull Request #5 · PelicanPlatform/classad

patrickbrophy · 2025-12-12T20:02:33Z

Previously, concatenated ClassAds (e.g., "][" without whitespace) were handled by a wrapper reader that inserted newlines. This change modifies the grammar to natively support parsing multiple concatenated ClassAds.

Changes:

- Modified grammar (classad.y) to accept classad_list with multiple ClassAds
- Added ParseMultipleClassAds() function to parser package
- Added ParseMultiple() wrapper in classad package
- Refactored Reader to use grammar-based parsing instead of wrapper reader
- Removed classAdSeparatorReader wrapper implementation
- Added comprehensive tests for concatenated ClassAd parsing

This provides cleaner architecture, better performance, and native grammar support for the HTCondor format where ClassAds may be concatenated without whitespace between them.


codecov · 2025-12-12T20:07:06Z

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️

bbockelm

This buffers everything into memory -- I think we need to keep things more stream-oriented, processing ads one-by-one. Reasonable first attempt, just needs some tuning.

classad/reader.go

classad/classad.go

…fixes

- Add buffer size limit (10MB) to prevent unbounded memory growth
- Fix UTF-8 handling in findCompleteClassAd for proper multi-byte character support
- Refactor duplicate EOF handling logic into handleEOF() method
- Fix variable shadowing in findCompleteClassAd return values
- Add comprehensive edge case tests:
  * UTF-8 characters in attribute values and strings
  * Empty brackets handling
  * Buffer size limit enforcement
  * Unclosed brackets (malformed input)
  * Multiple ClassAds with UTF-8 content
- Optimize buffer size checks to occur before expensive scans
- Add constants for maxBufferSize and readChunkSize for maintainability

All existing tests pass, and new edge case tests verify robustness.

bbockelm

You're hitting a basic design issue in the current lexer - it wants to be fed a full ClassAd as a string.

That's a dumb design causing you to generate basically a mini-parser in the reader to buffer out a full ad.

I'll try a more fundamental change, allowing the lexer to work on streams.

bbockelm · 2025-12-13T18:11:54Z

Fixed in #6

patrickbrophy requested a review from bbockelm December 12, 2025 20:02

bbockelm requested changes Dec 12, 2025


patrickbrophy added 2 commits December 12, 2025 15:58

Remove ParseMultiple function

9de9ad0

patrickbrophy requested a review from bbockelm December 12, 2025 22:02

Fixed linter issues

5aae585

bbockelm requested changes Dec 13, 2025


bbockelm closed this Dec 13, 2025

patrickbrophy wants to merge 4 commits into PelicanPlatform:main from patrickbrophy:concat-ad-support
