Skip to content

[Strings] Automate string lifting and lowering in the optimization pipeline #7540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

kripken
Copy link
Member

@kripken kripken commented Apr 22, 2025

-O2 and above now automatically lift strings at the start, allowing
optimization in the middle, and lower at the end. This lets people
use imported JS strings and have string optimizations "just work",
if they enable the strings feature.

The best way I could figure out how to do this is to make a "paired"
set of lifting and lowering passes. The pair work together, in that
the lifting notes if it found anything, and if not, then the lowering
is skipped. If we do not do this then the lowering always runs, which
adds work (scan the entire module for strings), and equally bad it
also always rewrites types (to replace string with extern; avoiding
that work would require intrusive changes to TypeUpdating, or
even more work in some pre-scan).

TODO: document in optimizer cookbook if we agree on this path

@kripken kripken requested a review from tlively April 22, 2025 18:42
Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is currently a weird discontinuity in behavior here where input that includes a bunch of stringref operations will be unaffected by these passes, but then if even a single string import is added, all of the stringref operations will start being lowered away.

Can we solve this by having the lifting pass set liftedStrings if it encounters any stringref operations (i.e. "pre-lifted" operations) as well?

Can we also make it so the paired passes are the only versions? I don't think it's useful to provide both the paired and unpaired versions to users. We can have the PassRunner conservatively default to having liftedStrings=true so explicitly running only lowering still works as expected.

@kripken
Copy link
Member Author

kripken commented Apr 22, 2025

There is currently a weird discontinuity in behavior here where input that includes a bunch of stringref operations will be unaffected by these passes, but then if even a single string import is added, all of the stringref operations will start being lowered away.

I agree that is a weird discontinuity, yeah. OTOH, another corner case is if the input has stringref but not imported strings. This PR leaves the stringref content alone in that case, which I think is natural (seems weird if you input stringrefs, optimize, and lose stringrefs), so changing things (to detect stringref regardless of imported strings) would regress that situation.

Overall I think those corner cases matter less than the goal of avoiding extra work in the common case. This PR atm adds practically no work for -O2 -all if there are no imported strings, and I think that's worth keeping. And, if someone has stringref in the input, or a mixture of stringref and imported strings, that user likely Knows What They're Doing.

Can we also make it so the paired passes are the only versions? I don't think it's useful to provide both the paired and unpaired versions to users.

I think the user-facing passes are useful for people that use stringref directly. Which, maybe is almost or actually zero atm, but I see little harm to keeping them around, and as you know, I do believe stringref makes sense for wasm in the long term... 😄

(The paired version is for internal use, I can't see users wanting those.)

@kripken
Copy link
Member Author

kripken commented Apr 22, 2025

After offline discussion, I added a nicer flag to disable the lowering --no-string-lowering and made the internal passes "test passes" so they don't clutter the --help.

if (options.optimizeLevel >= 2 && wasm->features.hasStrings()) {
// Gather strings to globals right before reorder-globals, which will then
// sort them properly.
addIfNoDWARFIssues("string-gathering");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need string-gathering at all any more, or can we remove its code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it for users, though it is nice for testing. I can make it a testing pass as a followup?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

kripken added a commit that referenced this pull request Apr 24, 2025
After we lift the strings section into stringref, we don't need the section any
more, and leaving it around could cause problems with repeated lifting/
lowering operations (which would be very much possible with #7540). In
particular, without erasing it, we'd accumulate such sections over time.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants