@brucemiller
Owner

This PR resolves the regression which broke make formats and removes one more blocker to an mhchem binding.

There are subtle distinctions in TeX's treatment, and LaTeXML's abuse, of newlines. In TeX, newlines in text generally end up either ignored or as a SPACE token, but LaTeXML attempts to preserve them as a special SPACE token with a newline as its contents. OTOH, a ^^J produces a linefeed, essentially equivalent to a newline, but it is generally an OTHER token, or occasionally made ACTIVE for tricky coding purposes. Misunderstanding & clumsiness led to these distinctions getting lost in LaTeXML, which caused problems. This PR reverts a previous change to Explode and ExplodeText (from #2646) which wrongly turned all newlines into SPACE tokens. \string now avoids Explode and more carefully preserves SPACE tokens while converting most others to OTHER (in particular non-space newlines like ^^J). The PR also fixes \meaning so that non-space newlines get properly decoded (typically ending up as Omega).
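
As a rough illustration of the distinctions described above, here is a toy Python model. It is not LaTeXML's Perl code: `tokenize_line`, `explode`, and their parameters are hypothetical names, the catcode of ^^J is simply assumed to be OTHER, and `\par` handling for empty lines is left out.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (category name, character contents)


def tokenize_line(line: str, preserve_newline: bool = False) -> List[Token]:
    """Toy tokenizer for one input line of letters, spaces and ^^J.

    preserve_newline=False: the end of line becomes SPACE(' '), as in TeX.
    preserve_newline=True: the end of line becomes a SPACE token whose
    contents is a newline, as this PR describes for LaTeXML.
    """
    tokens: List[Token] = []
    # Drop the line terminator and trailing spaces before scanning.
    text = line.rstrip("\n").rstrip(" ")
    i = 0
    while i < len(text):
        if text.startswith("^^J", i):
            # ^^J denotes character 10; assumed here to have catcode OTHER.
            tokens.append(("OTHER", "\n"))
            i += 3
        elif text[i] == " ":
            # Skip leading spaces and collapse runs of spaces.
            if tokens and tokens[-1][0] != "SPACE":
                tokens.append(("SPACE", " "))
            i += 1
        else:
            category = "LETTER" if text[i].isalpha() else "OTHER"
            tokens.append((category, text[i]))
            i += 1
    if tokens:
        # A non-empty line ends with a SPACE token.
        tokens.append(("SPACE", "\n" if preserve_newline else " "))
    return tokens


def explode(s: str, newline_as_space: bool = False) -> List[Token]:
    """Toy string-to-tokens conversion: spaces become SPACE, the rest OTHER.

    newline_as_space=True models the reverted behaviour, in which every
    newline character became a SPACE token.
    """
    out: List[Token] = []
    for ch in s:
        if ch == " " or (ch == "\n" and newline_as_space):
            out.append(("SPACE", ch))
        else:
            out.append(("OTHER", ch))
    return out
```

In this model, `tokenize_line("a^^Jb\n", preserve_newline=True)` contains both an OTHER token and a SPACE token whose contents are a newline. Passing a string with a linefeed through `explode(..., newline_as_space=True)` yields only the SPACE form, which is the kind of blurring the revert is meant to avoid.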

@brucemiller brucemiller requested a review from dginev as a code owner December 4, 2025 00:25
<p>[ ][blank space ][blank space ][ ].</p>
</para>
<para class="ltx_noindent" xml:id="p28">
<p>[
Collaborator

probably not essential, but this newline char in the XML for the p28 test is a regular space in the PDF output.

Owner Author

The intention was that they should have an equivalent effect in the xml/html, which seems to be the case.

I'd wondered whether the ones in blank space \n should also keep the newline, rather than normalize to space, but that would have required a lot of changes & complication in \meaning to keep the right newlines and decode the wrong ones, so I opted to revert Explode.

Collaborator

@dginev dginev left a comment

Very subtle and very encouraging! The new extended meaning.tex test is educational even - I checked that the PDF and XML versions match up (with 1 tiny nit difference).

It is great to have \string and \meaning get so close to their originals.

I see the make formats installation succeed with tests passing locally, LGTM.

@brucemiller brucemiller merged commit 7399c5a into master Dec 4, 2025
26 checks passed
dginev pushed a commit to dginev/LaTeXML that referenced this pull request Dec 5, 2025
* Revert change to Explode/ExplodeText from Spaces(brucemiller#2646) to convert newlines to OTHER rather than SPACE

* Make \string only use Explode for CS, avoiding blurring distinctions between newlines given as SPACE or OTHER or ACTIVE

* Have \meaning normalize SPACE tokens to ' ' to avoid blurring newlines with different catcodes

* Enhance meaning testcase to also probe \string & \detokenize; check newlines, ^^J
dginev pushed a commit to dginev/LaTeXML that referenced this pull request Dec 11, 2025