How to decode potential JavaScript #7

annevk · 2021-01-11T14:35:57Z

We might not always have an encoding, e.g., fetch(..., { mode: "no-cors" }). Is it reasonable to always use UTF-8 for this check?


annevk · 2021-01-22T11:32:18Z

Looking at this again, and in particular at https://html.spec.whatwg.org/#fetch-a-classic-script, I think the simplest option here is to pass the encoding along with the request. We then need to abstract or duplicate these steps (and maybe improve them while we're at it, especially getting the charset parameter from the Content-Type header):

If response's Content Type metadata, if any, specifies a character encoding, and the user agent supports that encoding, then set character encoding to that encoding (ignoring the passed-in value).

Let source text be the result of decoding response's body to Unicode, using character encoding as the fallback encoding.

Let script be the result of creating a classic script given source text, settings object, response's url, options, and muted errors.

And then, if script's record is null, parsing failed.

@domenic does that seem right to you?
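
For illustration, the steps quoted above can be roughly sketched as follows. This is a minimal sketch under stated assumptions, not the specified algorithm: the helper names (`extractCharset`, `isSupportedEncoding`, `bomSniff`, `decodePotentialScript`, `looksLikeScript`) are hypothetical, the charset extraction is a simplified regular expression rather than a full MIME type parser, the fallback encoding is assumed to be a label that `TextDecoder` supports, and `new Function` only approximates parsing as a classic script.

```javascript
// Sketch only: all helper names here are hypothetical, not taken from any spec.

// Simplified charset extraction; a real implementation would parse the
// Content-Type header with a proper MIME type parser.
function extractCharset(contentType) {
  if (!contentType) return null;
  const match = /;\s*charset\s*=\s*("?)([^";\s]+)\1/i.exec(contentType);
  return match ? match[2] : null;
}

// TextDecoder throws a RangeError for labels it does not support.
function isSupportedEncoding(label) {
  try {
    new TextDecoder(label);
    return true;
  } catch (e) {
    return false;
  }
}

// A byte order mark, if present, takes precedence over the chosen encoding.
function bomSniff(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return "utf-8";
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return "utf-16be";
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return "utf-16le";
  return null;
}

// Prefer a supported charset from Content-Type over the passed-in fallback,
// then decode the body, letting a BOM override both.
function decodePotentialScript(bytes, contentType, fallbackEncoding) {
  let encoding = fallbackEncoding;
  const charset = extractCharset(contentType);
  if (charset && isSupportedEncoding(charset)) encoding = charset;
  const decoder = new TextDecoder(bomSniff(bytes) || encoding);
  return { encoding: decoder.encoding, sourceText: decoder.decode(bytes) };
}

// Approximation: Function bodies use a slightly different grammar than
// classic scripts (for example, a top-level `return` is accepted).
// The source text is only parsed here, never executed.
function looksLikeScript(sourceText) {
  try {
    new Function(sourceText);
    return true;
  } catch (e) {
    return false;
  }
}
```

A caller would pass the response body bytes, the raw Content-Type header value (or `null`), and the encoding carried on the request, then hand `sourceText` to `looksLikeScript`.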

domenic · 2021-01-22T17:00:19Z

I don't have the full context on what security guarantees we're trying to preserve here (is it bad to leak information about the Content-Type header?) but in terms of a spec refactoring, that seems reasonable.

domenic · 2021-01-22T17:01:14Z

> and maybe improve them while we're at it, especially getting the charset parameter from the Content-Type header

Basically every usage of "Content-Type metadata" in HTML could be improved by using the new MIME type getter, I think.

annevk · 2021-10-04T08:46:29Z

One risk here is that the attacker has control over the encoding, so this technically gives them more opportunity to find a way to get something parsed as JavaScript. In practice it still seems hard to get something to parse as JavaScript that way, as the majority of significant bytes are in the ASCII range.
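
As a small illustration of why the choice of encoding matters, the same bytes can fail to parse under one encoding and parse under another. The sketch below is an assumption-laden example, not part of any proposal: `parses` is a hypothetical helper that uses `new Function` as an approximation of a JavaScript parse (it compiles the text without running it), and the byte sequence is chosen by hand.

```javascript
// Hypothetical helper: approximates "does this text parse as JavaScript?".
// new Function only compiles the text; it is never called.
function parses(sourceText) {
  try {
    new Function(sourceText);
    return true;
  } catch (e) {
    return false;
  }
}

// The bytes of "a=1" encoded as UTF-16LE, without a byte order mark.
const bytes = new Uint8Array([0x61, 0x00, 0x3d, 0x00, 0x31, 0x00]);

// Decoded as UTF-8, every other character is U+0000, which is not a valid token.
const asUtf8 = new TextDecoder("utf-8").decode(bytes);

// Decoded as UTF-16LE, the text is simply "a=1".
const asUtf16le = new TextDecoder("utf-16le").decode(bytes);

console.log(parses(asUtf8));
console.log(parses(asUtf16le));
```

Under these assumptions the UTF-8 decoding is rejected while the UTF-16LE decoding is accepted, which is the kind of extra opportunity an attacker-controlled encoding provides.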

annevk · 2022-05-17T13:04:38Z

I included a fix for this in whatwg/fetch#1442 which I think works. The HTML side will need to set it on requests, but that's a very straightforward change.

And while it is unfortunate that the fallback encoding is in the hands of the attacker, this is no different from the status quo.

annevk · 2022-06-01T13:02:32Z

I forgot that the response itself also carries encoding-related information. whatwg/fetch#1447 tackles the first part of that. Once that lands it should be easy to call from Fetch's ORB PR.

annevk mentioned this issue May 17, 2021

No size limit #22

Open

annevk added the mvp label May 17, 2022
