\pgfmathparse improvements #2485

xworld21 · 2025-01-11T22:05:26Z

Fix #2482 and lots of other small issues with the parser.

My intention was to simply 'linearise' the grammar, i.e., make sure it never backtracks when parsing the rules. This should improve performance quite a bit (the parser is now guaranteed to run in linear time). I am wondering how much of that can be done for the maths grammar, which has dramatic performance issues.
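The difference between a backtracking rule and a 'linearised' one can be sketched on a toy grammar. The Python below is illustrative only: the grammar (`expr := term '+' expr | term`), the function names and the `stats` counter are assumptions made for this demonstration, and it is not the LaTeXML grammar.

```python
def term(s, i, stats, expr):
    """term := '(' expr ')' | digit. Returns the next index, or None on failure."""
    if i < len(s) and s[i] == "(":
        j = expr(s, i + 1, stats)
        if j is not None and j < len(s) and s[j] == ")":
            return j + 1
        return None
    if i < len(s) and s[i].isdigit():
        stats["digits"] += 1  # count how often a digit is (re)parsed
        return i + 1
    return None


def expr_backtracking(s, i, stats):
    """expr := term '+' expr | term, tried as two separate alternatives."""
    j = term(s, i, stats, expr_backtracking)
    if j is not None and j < len(s) and s[j] == "+":
        k = expr_backtracking(s, j + 1, stats)
        if k is not None:
            return k
    # First alternative failed: parse the same term again from scratch.
    return term(s, i, stats, expr_backtracking)


def expr_linear(s, i, stats):
    """Left-factored form: parse one term, then consume any '+ term' tails."""
    j = term(s, i, stats, expr_linear)
    while j is not None and j < len(s) and s[j] == "+":
        j = term(s, j + 1, stats, expr_linear)
    return j


def digit_visits(parser, text):
    """Run a toy parser over text and report how many digit parses it needed."""
    stats = {"digits": 0}
    end = parser(text, 0, stats)
    assert end == len(text), "toy parser should consume the whole input"
    return stats["digits"]


print(digit_visits(expr_backtracking, "(((1)))"))
print(digit_visits(expr_linear, "(((1)))"))
```

In this toy case the backtracking version re-parses the innermost digit twice per nesting level, so the work doubles with each pair of parentheses, while the left-factored version visits it once.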

However, while writing some tests, I discovered that the parser is very broken. So I fixed the easy stuff (wrong functions, string handling) and added a new test to exercise the basic functionality (right now, just the functions).

There are still major issues, e.g. you can't even evaluate bin(185) without LaTeXML claiming an overflow. Strings should become tokens in case they contain unexpanded primitives. The blacklist is very fragile. Maybe the LaTeXML parser should fall back to the PGF one when it fails.
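To make the bin(185) case concrete, here is a small Python sketch. It assumes that the overflow arises because the binary digit string is read back as a decimal number and stored as a TeX dimension; that cause is a guess on my part and is not confirmed anywhere in this thread. The helper name `pgf_style_bin` is hypothetical.

```python
# Largest TeX dimension in points: (2**30 - 1) scaled points, 65536 sp per pt.
TEX_MAX_DIMEN_PT = (2**30 - 1) / 65536


def pgf_style_bin(n: int) -> str:
    """Return the binary digits of a non-negative integer as a string."""
    return format(n, "b")


digits = pgf_style_bin(185)
# Assumption: if the digits are treated as a decimal number, the value is huge.
as_decimal = int(digits)

print(digits)
print(as_decimal > TEX_MAX_DIMEN_PT)
```

If that assumption holds, keeping the result as a string (or as tokens) rather than as a number would sidestep the overflow.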

dginev · 2025-01-12T00:41:42Z

Thank you for looking into this! Could you also add the new test files to MANIFEST, so that we get CI working?

The one comment I will leave for the record is that, yes indeed, the Perl handling for pgf math has been fragile, buggy and incomplete all at once. But importantly, so has the raw interpretation of the TeX sources of that package. So incremental progress here is good (and I am especially grateful you took the time to add so many new tests).

xworld21 · 2025-01-12T14:27:02Z

The math parser is now slightly less broken, but still very broken. The operator precedences are all slightly wrong; they should have been set by reading the \pgfmathdeclareoperator commands in the sources.

Still wrong: the factorial operator binds higher than multiplication, but lower than power; the radians 'unit' is between addition and multiplication. Hopefully I got the rest right.

@dginev are there notable examples from arXiv that might need all of this?

dginev · 2025-01-12T15:00:24Z

> @dginev are there notable examples from arXiv that might need all of this?

Oh yes, certainly, pgf is a very commonly used package. Looking at the loaded files report for ar5iv, pgfmathparser.code.tex gets loaded in 436,976 articles.

If you want to fish for examples, I remember from #2237 that most/all pgfplots use the math parser, and there are ~59,000 articles using that library. That previous PR patch was based on arXiv:2104.00602, and you can find a few other reported pgf-related issues at the ar5iv tracker. But naturally arXiv readers do not distill the problems down to the unit test level. It is possible that converting the pgf/tikz showcase sites is a better way to fish out relevant examples.

But yes, this is very much a sizeable need in arXiv.

lib/LaTeXML/Package/pgfmath.code.tex.ltxml

xworld21 · 2025-01-12T19:16:36Z

Uhm, the <pagination role="newpage"></pagination> error changes with TeX Live version. Is there a standard solution for that particular problem?

dginev · 2025-01-13T14:11:07Z

> Is there a standard solution for that particular problem?

I am not really familiar with this kind of texlive version mismatch. So is the issue that there are two trailing \pars deposited just before \end{document}?

One simple workaround could be to replace the \par in your test macro with something like \newline, I suppose. But I would really need to go back to TeX Live 2021 to test it out.

dginev · 2025-01-13T17:41:59Z

@xworld21 OK, tried on a server with an old texlive, it looks like there is an extra \clearpage somehow added to the end of even the smallest pgf-loading document:

\documentclass{article}
\usepackage{pgf}
\begin{document}\end{document}

What you can do for the CI test is to add \let\clearpage\relax after \usepackage{pgf}, as I assume you won't be needing that feature in this particular math-oriented setup. That works in my setup.

xworld21 · 2025-01-13T20:30:19Z

> What you can do for the CI test is to add \let\clearpage\relax after \usepackage{pgf}

Done!

In the meantime, I changed my mind. I can definitely fix the radians operator at least, and implement gcd. I'll do more tweaks and throw in some long expressions without parentheses in the tests, to catch other precedence errors.

brucemiller · 2025-03-13T20:37:58Z

Interestingly, our raw TeX has gradually become good enough that using only the raw pgfmath.code.tex actually works (except for some shading errors), but it is sooooo slow.

xworld21 · 2025-03-13T20:42:59Z

By the way, this should be draft as well – I discovered that it breaks pgfplots somehow! Maybe because of ExpandedPartially? I don't know right now, I need to test it more carefully and squash the remaining bugs before calling it done.

xworld21 · 2025-11-22T18:40:37Z

I am afraid I didn't take note of which PGF plot was broken by this PR. I tried a few examples after rebasing to master and it looks fine. This needs to be tested in the wild.

dginev

I retested with arXiv:2104.00602, which is OK.

However, the dominoes test which is mentioned in the issue regresses - the arrangement seems to flip around. I half-remember this being an old problem before #2237 got merged.

Would be nice to get that back to working before merging here.

xworld21 · 2025-12-14T20:41:12Z

> However, the dominoes test which is mentioned in the issue regresses - the arrangement seems to flip around. I half-remember this being an old problem before #2237 got merged.

Ok, the problem comes from the line

\pgfpoint{sin(\yy)*(\xx)}{-((\xx)/75)^2+(\zz)/100*(\xx)}

Switching to

\pgfpoint{sin(\yy)*(\xx)}{-1*((\xx)/75)^2+(\zz)/100*(\xx)}

is fine. I guess the unary - is not associating properly and it becomes part of the left operand of ^. Probably.
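The two groupings of the y-coordinate can be compared numerically. This is a sketch with made-up sample values (`xx = 150`, `zz = 100` are assumptions, not taken from the dominoes example); the intended reading is taken to be the one the `-1*` workaround produces.

```python
def y_intended(xx: float, zz: float) -> float:
    """Unary minus applied to the whole power: -(a^2) + ..."""
    return -(((xx) / 75) ** 2) + (zz) / 100 * (xx)


def y_misparsed(xx: float, zz: float) -> float:
    """Unary minus absorbed into the left operand of ^: (-a)^2 + ..."""
    return (-((xx) / 75)) ** 2 + (zz) / 100 * (xx)


xx, zz = 150.0, 100.0  # hypothetical sample values
print(y_intended(xx, zz))
print(y_misparsed(xx, zz))
```

The squared term changes sign between the two readings, which is consistent with coordinates ending up mirrored when the minus is attached to the base.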

xworld21 · 2025-12-14T21:32:41Z

The dominoes look good now. Ideally, one should really rewrite the grammar to follow pgfmathparser.code.tex more closely rather than patching the existing grammar, but let's shelve that for later.

Further note: I am fairly convinced that back in March, I pasted a pgfplots example without adding a necessary usepackage at the beginning, which made me think that the new parser was breaking things. False alarm, most likely!

dginev

Nice, I can confirm the dominoes test works again.

I am also seeing some healthy (or at least healthier than before) plot conversions when testing on a handful of ar5iv issues that contain pgfplots reports:
dginev/ar5iv#147, dginev/ar5iv#335, dginev/ar5iv#402, dginev/ar5iv#493, dginev/ar5iv#557.

I also checked ar5iv:1804.07114, where the plots are not yet perfect, but the legend placement is better.

Lastly, I read through the new XML test values and they match the PDF values. The three classes of diffs which I see in that test are:

- minor precision differences, as with pi r computing 180.0 in latexml and 179.99962 in pdflatex;
- consistent use of floats in the binding, as in sign(0) being 0.0 instead of 0;
- width("Some Lovely Text") seems to show the full \textwidth (469.75) instead of the width of the box (83.99927).

So, to my limited expectations, this PR appears to be an improvement and is good to merge.
Thanks a lot for the expertise!

dginev · 2025-12-15T13:08:14Z

I have started testing this PR (with some of the others tucked in) on our arXiv sandboxes.

There seems to be a widespread regression (Fatals with 100 errors) due to no longer parsing some pgfmath expressions.

As a first example, arXiv:1511.02101 has errors of the kind:

Error:pgfparse:pgfparse Parse of '70.40544pt-\pgfdecoratedcompleteddistance-1sp' failed
	at inclusionvcd.tex; line 399 col 0 - line 399 col 178
	LTX: '70.40544'
	Left: -\pgfdecoratedcompleteddistance-1sp
	Next token is T_CS[\advance] ( == Core::Definition::Primitive[\advance Variable SkipKeyword:by])
	In Core::Definition::Primitive[\pgfmaths...

I will try to add more details here later today.

xworld21 · 2025-12-15T13:10:40Z

> So, to my limited expectations, this PR appears to be an improvement and is good to merge.

Let me add: there are more inconsistencies. For instance, PGF parses 2^3^2 as (2^3)^2 but 2^-3^2 as 2^(-3^2), and that is not accounted for yet here. However, I am in favour of merging as is because it should be a substantial net improvement, hopefully with negligible regressions. We can add tests and improvements later, and eventually replace the grammar with something that tracks pgfmathparser almost line by line.
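The two groupings mentioned for the power operator give very different values, as this short Python check shows. One assumption is labelled in the code: inside `2^(-3^2)` the exponent is read as `-(3^2)`; the thread does not spell that out.

```python
# 2^3^2 grouped to the left, as PGF is said to parse it.
left_grouped = (2 ** 3) ** 2

# 2^-3^2 grouped as 2^(-3^2); assumption: the exponent means -(3^2) = -9.
negative_exponent = 2 ** (-(3 ** 2))

# For contrast, Python's own ** is right-associative, so 2**3**2 is 2**(3**2).
python_default = 2 ** 3 ** 2

print(left_grouped, negative_exponent, python_default)
```

A test that only checks `2^3^2` would therefore not catch a parser that handles the signed-exponent case with the wrong grouping.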

By the way, should we merge the new test with the stress_pgfmath, that I somehow did not notice?

dginev · 2025-12-15T15:31:01Z

> By the way, should we merge the new test with the stress_pgfmath, that I somehow did not notice?

Oh, sure. I think reusing that simplifies admin a little, they are exercising the same functionality.

lib/LaTeXML/Package/pgfmath.code.tex.ltxml

dginev · 2025-12-16T13:18:41Z

With the latest fixes we are closer to (but still a bit weaker than) the previous results on the tikz-cd arXiv sandbox. That sandbox has just under 20,000 articles and, at time of writing, goes from 8.98% to 9.99% Fatal articles, as well as from 33.32% to 39.97% Error articles.
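For a sense of scale, the percentage changes can be turned into rough article counts. The sketch assumes exactly 20,000 articles, which slightly overstates the sandbox size ("just under 20,000"), so the counts are upper bounds; the `delta` helper is hypothetical.

```python
ARTICLES = 20_000  # assumption: the sandbox is described as "just under 20,000"


def delta(before_pct: float, after_pct: float, total: int = ARTICLES):
    """Return (percentage-point change, approximate number of articles)."""
    points = after_pct - before_pct
    return round(points, 2), round(total * points / 100)


print(delta(8.98, 9.99))    # Fatal
print(delta(33.32, 39.97))  # Error
```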

Also somewhat worryingly the conversions appear to be slower than before (anecdotally for now). That said, the changeset I am testing includes more than only this PR, so likely not all causes are rooted here. I will investigate some more today.

xworld21 force-pushed the pgfmathparse branch from 55f2867 to 34a7680 Compare January 12, 2025 14:20

xworld21 changed the title ~~Pgfmathparse~~ \pgfmathparse improvements Jan 12, 2025

xworld21 force-pushed the pgfmathparse branch from 34a7680 to 356b535 Compare January 12, 2025 15:13

xworld21 force-pushed the pgfmathparse branch from 356b535 to cbf2017 Compare January 12, 2025 16:36

dginev requested review from brucemiller and dginev January 13, 2025 15:43

xworld21 force-pushed the pgfmathparse branch from cbf2017 to 5977884 Compare January 13, 2025 20:23

xworld21 force-pushed the pgfmathparse branch from 5977884 to 542d3ee Compare January 18, 2025 18:24

xworld21 marked this pull request as ready for review January 18, 2025 18:24

xworld21 force-pushed the pgfmathparse branch from 542d3ee to 1985cd8 Compare January 18, 2025 18:41

xworld21 marked this pull request as draft March 13, 2025 20:43

xworld21 added 9 commits November 22, 2025 17:49

avoid recomputing subrules in PGF math parser

6579341

do not hardcode PGF function names in PGF math parser

1525e15

Hardcoding the function names in a regex causes the parser to misinterpret function names (e.g. 'cosec' is interpreted greedily as 'cos e' and fails.)

allow strings in result of \pgfmathparse

4345aca

perl hex() is not the same as PGF hex(), blacklist it

ebde966

interpret binary, octal, hexadecimal numbers correctly in PGF math parser

6ab305a

add correct implementations of bin(), frac(), int() to PGF math parser

da657d8

allow empty string as return value of PGF math parser

467a3bf

force comparison operators to return integers in PGF math parser

917c08f

distinguish meaning of ! in prefix and postfix position

e5c161b
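The 'cosec' problem described in the commit "do not hardcode PGF function names in PGF math parser" can be reproduced with any regex engine. The Python sketch below uses a deliberately short hardcoded list and a generic identifier pattern; both patterns and the helper name are assumptions for illustration and are not copied from pgfmath.code.tex.ltxml.

```python
import re

# A hardcoded alternation matches the first listed name that fits the prefix.
HARDCODED = re.compile(r"sin|cos|tan")

# A generic identifier pattern consumes the whole name before any lookup.
GENERIC = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def leading_name(pattern: re.Pattern, text: str):
    """Return the function name matched at the start of text, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None


print(leading_name(HARDCODED, "cosec(30)"))
print(leading_name(GENERIC, "cosec(30)"))
```

With the hardcoded list the parser is left holding `cos` and an unparsed `ec(30)`; reading the full identifier first and then checking whether it names a known function avoids the split.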

xworld21 added 2 commits November 22, 2025 17:49

do not modify $string when trying eval in PGF math parser

c501189

parse postfix r operator as a multiplication

6fab2ec
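The commit "perl hex() is not the same as PGF hex(), blacklist it" rests on the two functions going in opposite directions: Perl's hex() reads a hexadecimal string and returns a number, whereas PGF's hex() takes a base-10 integer and returns its hexadecimal digits. A Python sketch of the mismatch, with hypothetical helper names standing in for the two behaviours:

```python
def perl_like_hex(text: str) -> int:
    """Mimics Perl's hex(): interpret the argument as hexadecimal digits."""
    return int(text, 16)


def pgf_like_hex(n: int) -> str:
    """Mimics PGF's hex(): render a base-10 integer as lowercase hex digits."""
    return format(n, "x")


print(perl_like_hex("185"))
print(pgf_like_hex(185))
```

Evaluating a PGF expression by handing `hex(185)` to Perl would therefore silently produce a number instead of the expected digit string.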

xworld21 force-pushed the pgfmathparse branch from 1985cd8 to d1e00ad Compare November 22, 2025 18:35

xworld21 marked this pull request as ready for review November 22, 2025 18:39

xworld21 force-pushed the pgfmathparse branch from d1e00ad to d3d1af4 Compare November 22, 2025 18:53

dginev reviewed Dec 13, 2025

xworld21 added 4 commits December 14, 2025 21:25

parse factorial and powers with correct predecence

2931fc9

document predecence of PGF operators

2ceaa9b

implement PGF gcd

598c93b

new \pgfmathparse tests

cd0b52a

xworld21 force-pushed the pgfmathparse branch from d3d1af4 to cd0b52a Compare December 14, 2025 21:27

dginev added a commit to arXiv/LaTeXML that referenced this pull request Dec 15, 2025

Merge \pgfmathparse improvements (PR brucemiller#2485)

7408929

dginev approved these changes Dec 15, 2025

dginev requested changes Dec 15, 2025

dginev reviewed Dec 15, 2025

dginev mentioned this pull request Dec 15, 2025

\pgfmathparse sandbox testing patches xworld21/LaTeXML#1

Merged

dginev added 6 commits December 16, 2025 10:47

allow underscored named functions

1510293

add missing base case for pgfmath register rule

7dc5e56

avoid UnTeX line break for pgfmath grammar pass

9aadcb0

pgfmathparse: remove braces, especially trailing braces

f468a07

guard sqrt on undefined values to survive with Error instead of Fatal

ba11d53

perltidy on pgfmath.code.tex.ltxml

90b7151

dginev approved these changes Dec 16, 2025
