@kozy4324
Contributor

This PR implements a content model–based validation approach as discussed in #267, focusing on a small and practical subset of checks derived from the HTML specification.

What this PR does

This implementation checks whether child elements are allowed under a given parent based on its content model, without aiming for full structural validation.

It currently supports the following patterns:

  1. Flow disallowed cases (simplest case)
    • Detects flow content placed inside an element whose content model does not permit it.
  2. Transparent content model (e.g. <a>)
    • Elements with a transparent content model delegate validation to their parent’s content model.
  3. Specific tag patterns (e.g. <ul><li>)
    • Handles elements that only allow specific child tags.
  4. Context-dependent patterns
    • Handles elements whose allowed content depends on context (e.g. <div>, <span>).
  5. Custom elements and unknown elements
    • Since custom and unknown elements do not have an explicit content model, they are treated as allowing any content.
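
To make the patterns above concrete, here is a minimal sketch of how such a check could be structured. It is not the code from this PR: the `ContentModel` type, the `MODELS` table, the tag lists, and `isChildAllowed` are hypothetical names, and the tables cover only a handful of tags in simplified form. It illustrates patterns 1, 2, 3, and 5, and leaves out the context-dependent case (pattern 4).

```typescript
// Hypothetical, heavily simplified content models (not the PR's real tables).
type ContentModel =
  | { kind: "flow" }                  // accepts any flow content
  | { kind: "phrasing" }              // accepts phrasing content only
  | { kind: "transparent" }           // defers to the nearest non-transparent ancestor
  | { kind: "only"; tags: string[] }; // accepts only the listed child tags

const MODELS = new Map<string, ContentModel>([
  ["div", { kind: "flow" }],
  ["li", { kind: "flow" }],
  ["p", { kind: "phrasing" }],
  ["span", { kind: "phrasing" }],
  ["a", { kind: "transparent" }],
  ["ul", { kind: "only", tags: ["li"] }],
]);

// Assumed category membership for this sketch; the HTML spec lists far more tags.
const PHRASING_TAGS = ["a", "span", "em", "strong"];
const FLOW_TAGS = [...PHRASING_TAGS, "div", "p", "ul"];
const KNOWN_TAGS = [...FLOW_TAGS, "li"];

// Unknown and custom child elements are never reported in this sketch.
function inCategory(child: string, category: string[]): boolean {
  return !KNOWN_TAGS.includes(child) || category.includes(child);
}

// `ancestors` is ordered from the outermost element to the direct parent.
function isChildAllowed(ancestors: string[], child: string): boolean {
  for (const tag of [...ancestors].reverse()) {
    const model = MODELS.get(tag);
    if (model === undefined) return true; // custom or unknown parent: allow anything
    switch (model.kind) {
      case "transparent":
        continue; // move up to the next ancestor's content model
      case "only":
        return model.tags.includes(child);
      case "phrasing":
        return inCategory(child, PHRASING_TAGS);
      case "flow":
        return inCategory(child, FLOW_TAGS);
    }
  }
  return true; // no non-transparent ancestor left to check against
}
```

With these assumed tables, `isChildAllowed(["p", "a"], "div")` is rejected because `<a>` defers to `<p>`, which accepts phrasing content only, while `isChildAllowed(["div", "a"], "div")` is accepted because the same `<a>` now inherits the flow content model of `<div>`.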

What this PR intentionally does not do

The following are intentionally out of scope for this PR:

  1. Child element ordering
    • The order of children defined by a content model is not validated.
  2. Descendant-level constraints
    • Rules that forbid specific elements anywhere in the descendant tree are not checked.
  3. Validation under <head>
    • These cases are already covered by the existing html-head-only-elements and html-body-only-elements rules.

I’d be happy to adjust this if you have any feedback or suggestions. Thank you.

@kozy4324
Contributor Author

This PR also seems to fix #186, #248, #255, #260, #272, and #291.

@kozy4324
Contributor Author

kozy4324 commented Jan 1, 2026

The commit ca3758b adds support for ERBBlockNode and ERBContentNode with the tag helper.

Extracting helper methods via regex is getting tricky.
Since a Prism-based approach isn’t available yet, should we wait for that, or keep extending the regex support for now?

@marcoroth
Owner

Great, thank you so much for exploring this @kozy4324!

> Since a Prism-based approach isn’t available yet, should we wait for that, or keep extending the regex support for now?

We can leave out the tag.div and content_tag and similar helpers in ERB for now. A lot of rules need to be updated once we have the tag detection working.

If you want, we can keep the tests and use test.fails("...", () => {}) so we know which ones to update later on.

@kozy4324
Contributor Author

kozy4324 commented Jan 1, 2026

Got it. I’ll keep the scope limited to the tag helper for now. Thank you!
