Optimize without regexes #38

Draft
wants to merge 2 commits into
base: cosmic-dev
from

Conversation

CosmicHorrorDev
Collaborator

Resolves #37

The fuzzer is still finding some fun edge cases. I'm currently chewing on {{ underline }}Underline{{ bold }}{{ underline_off }}Bold which outputs

<u>Underline<b></b></u><b>Bold</b>

without the optimization regexes running (note the empty <b></b> 🙃). It's getting there, though.

@Aloso
Owner

Aloso commented Jan 10, 2025

Here's an idea: Instead of an HTML string, we produce a Vec<TextOrTag>, which we can optimize more easily, and convert it to HTML in a second step. The enum would be

enum TextOrTag<'s> {
    Text(&'s str),
    OpenTag(Style),
    CloseTag,
    Removed,
}
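
A rough sketch of what the second step (turning the list into HTML) might look like. The `Style` variants, `tag_name`, and `to_html` are made-up names for illustration, not the crate's actual API, and text is assumed to be HTML-escaped already. Since `CloseTag` carries no style, the sketch tracks open tags on a stack.

```rust
// Hypothetical stand-in for the crate's real `Style` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Style {
    Bold,
    Underline,
}

impl Style {
    // Assumed mapping from a style to its HTML tag name.
    fn tag_name(self) -> &'static str {
        match self {
            Style::Bold => "b",
            Style::Underline => "u",
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TextOrTag<'s> {
    Text(&'s str),
    OpenTag(Style),
    CloseTag,
    Removed,
}

// Render the list to an HTML string, skipping `Removed` entries.
// `CloseTag` closes whichever tag was opened most recently.
fn to_html(list: &[TextOrTag<'_>]) -> String {
    let mut html = String::new();
    let mut open: Vec<Style> = Vec::new();
    for elem in list {
        match *elem {
            // Text is assumed to be escaped by an earlier stage.
            TextOrTag::Text(text) => html.push_str(text),
            TextOrTag::OpenTag(style) => {
                open.push(style);
                html.push('<');
                html.push_str(style.tag_name());
                html.push('>');
            }
            TextOrTag::CloseTag => {
                // An unbalanced close tag is silently ignored here.
                if let Some(style) = open.pop() {
                    html.push_str("</");
                    html.push_str(style.tag_name());
                    html.push('>');
                }
            }
            TextOrTag::Removed => {}
        }
    }
    html
}
```

With the fuzzer case from the PR description encoded as a list, this renders `<u>Underline<b></b></u><b>Bold</b>`; once the empty pair is marked `Removed`, it renders `<u>Underline</u><b>Bold</b>`.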

And the optimization pass could look something like

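// `find_end_empty` / `find_start_empty` are not defined here; they are assumed to
// return the index of the paired tag when only `Removed` entries sit in between.
let mut idx = 0;
while let Some(elem) = list.get(idx) {
    match elem {
        TextOrTag::OpenTag(_) => {
            // Open tag directly followed by its close tag: drop both.
            if let Some(end_idx) = find_end_empty(&list, idx) {
                list[idx] = TextOrTag::Removed;
                list[end_idx] = TextOrTag::Removed;
                // Skip past the close tag we just removed.
                idx = end_idx;
            }
        }
        TextOrTag::CloseTag => {
            // Close tag whose contents were all removed earlier: drop the pair.
            if let Some(start_idx) = find_start_empty(&list, idx) {
                list[start_idx] = TextOrTag::Removed;
                list[idx] = TextOrTag::Removed;
            }
        }
        _ => {}
    }
    idx += 1;
}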

This would even work recursively (e.g. <u><b></b></u>), which we currently don't remove completely.
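
To make the pass above concrete, here is a self-contained sketch with one possible definition of the two helpers. The bodies of `find_end_empty` and `find_start_empty`, the `Style` variants, and the wrapper name `remove_empty_tags` are all assumptions for illustration; the comment above only names the helpers. Here "empty" means that nothing but `Removed` entries sits between the open and close tag.

```rust
// Hypothetical stand-in for the crate's real `Style` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Style {
    Bold,
    Underline,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TextOrTag<'s> {
    Text(&'s str),
    OpenTag(Style),
    CloseTag,
    Removed,
}

// Assumed helper: index of the close tag that directly follows the open tag
// at `idx`, ignoring `Removed` entries. `None` if anything else comes first.
fn find_end_empty(list: &[TextOrTag<'_>], idx: usize) -> Option<usize> {
    for (i, elem) in list.iter().enumerate().skip(idx + 1) {
        match elem {
            TextOrTag::Removed => continue,
            TextOrTag::CloseTag => return Some(i),
            _ => return None,
        }
    }
    None
}

// Assumed helper: index of the open tag that directly precedes the close tag
// at `idx`, ignoring `Removed` entries. `None` if anything else comes first.
fn find_start_empty(list: &[TextOrTag<'_>], idx: usize) -> Option<usize> {
    for i in (0..idx).rev() {
        match list[i] {
            TextOrTag::Removed => continue,
            TextOrTag::OpenTag(_) => return Some(i),
            _ => return None,
        }
    }
    None
}

// Same loop as in the comment above, wrapped in a function.
fn remove_empty_tags(list: &mut [TextOrTag<'_>]) {
    let mut idx = 0;
    while idx < list.len() {
        match list[idx] {
            TextOrTag::OpenTag(_) => {
                if let Some(end_idx) = find_end_empty(&*list, idx) {
                    list[idx] = TextOrTag::Removed;
                    list[end_idx] = TextOrTag::Removed;
                    idx = end_idx;
                }
            }
            TextOrTag::CloseTag => {
                if let Some(start_idx) = find_start_empty(&*list, idx) {
                    list[start_idx] = TextOrTag::Removed;
                    list[idx] = TextOrTag::Removed;
                }
            }
            _ => {}
        }
        idx += 1;
    }
}
```

Under these assumptions, the nested case works because the outer close tag looks backwards past the `Removed` entries left by the inner pair and finds its own open tag. On the fuzzer case from the PR description, only the empty bold pair is marked `Removed` and the surrounding underline tags are kept.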

@CosmicHorrorDev
Collaborator Author

CosmicHorrorDev commented Jan 10, 2025

100% agree! Once more of the pending minifier changes shake out, I was planning on moving all of the optimizations to operate on the abstract HTML before we emit it, since it's way easier to handle optimizations there (as opposed to trying to massage the parsed ANSI values).

@CosmicHorrorDev
Collaborator Author

Gonna go ahead and move the current minifier to work on an abstract HTML representation, and then I'll revise this PR and #42.

@CosmicHorrorDev CosmicHorrorDev changed the base branch from main to cosmic-dev February 16, 2025 19:56
Development

Successfully merging this pull request may close these issues.

Converter minifier / optimizations overlap
2 participants