Open
Labels
snapshot quality: Improving fidelity/size/durability/etc of the output
Description
Like issue #29, but for subdocuments inside frames. As remarked here:
get blob() { return new Blob([this.string], { type: 'text/html' }) },
get string() {
// TODO Add <meta charset> if absent? Or html-encode characters as needed?
return documentOuterHTML(clonedDoc)
},
The same applies to crawl-subresources for frames whose inner document we cannot access directly.
It seems new Blob() always UTF-8-encodes the strings it is given (mdn). I suppose we should either add <meta charset="utf-8"> to the DOM before running documentOuterHTML, or change the blob’s MIME type to text/html;charset=utf-8. The latter is something we could not do for the top-level document; might it be ‘cleaner’ here?
Problem observed in the wild.
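For illustration, a minimal sketch of the two options discussed above. The helper names ensureMetaCharset and htmlStringToBlob are hypothetical and not part of the existing code; the first assumes an HTML document with a <head>, and neither has been tested against the actual snapshot pipeline.

```javascript
// Option 1 (hypothetical helper): make the cloned document declare UTF-8
// itself, before it is serialised with documentOuterHTML.
// Assumes `doc` is an HTML Document that has a <head> element.
function ensureMetaCharset(doc) {
  let meta = doc.querySelector('meta[charset]')
  if (!meta) {
    meta = doc.createElement('meta')
    // Put it first in <head>, so it appears early in the serialised output.
    doc.head.prepend(meta)
  }
  // The blob will contain UTF-8 bytes, so any other declared value would be wrong.
  meta.setAttribute('charset', 'utf-8')
}

// Option 2 (hypothetical helper): leave the DOM untouched and declare the
// encoding in the blob's MIME type instead.
function htmlStringToBlob(html) {
  // Strings passed to the Blob constructor are encoded as UTF-8.
  return new Blob([html], { type: 'text/html;charset=utf-8' })
}
```

Option 2 only helps wherever the blob's type is actually honoured (e.g. when a frame loads it through a blob: URL); if the string is later inlined some other way, the declaration inside the document from option 1 may still be needed.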