Handle decoding of input in html5ever
#590
Draft
+616
−195
These changes are an attempt to allow users of `html5ever` to respect the encodings specified with `<meta charset="...">` tags in a spec-compliant way.
The major change is that the input stream (https://html.spec.whatwg.org/#input-stream) now lives in the html5ever crates. As a result, the new API surface exposes a "pull" interface instead of the existing "push" interface.
The entry point to the new API is a `DecodingParser`, which wraps either an HTML or an XML parser.
After providing some amount of byte input to a `DecodingParser`, the user can call `DecodingParser::parse`, which returns an iterator over `ParserAction`s. A parser action is either a `<script>` tag that needs to be executed or a new encoding that the document should be re-parsed with. The caller can drive the parser by repeatedly advancing this iterator.
The old API is fully preserved, without breaking changes (that I'm aware of).
This is a draft because the design is not final and this needs a companion servo PR to verify the correctness of these changes. Initial feedback is welcome.
Depends on #591.