Prevent navigations to non-HTTP(S) URLs #247

domenic · 2020-08-25T19:22:35Z

Closes #145.

Test cases this should generate:

For each badURL of data:text/html,foo, javascript:'a string', javascript:undefined, about:blank, a blob: URL, and a real URL that is served with the Content-Disposition: attachment header, a real URL that has a 204 response, and a real URL that has a 205 response:
- <portal src="badURL"></portal> must end up closed
- <portal src="redirect"></portal> where redirect redirects (say, 302) to badURL must end up closed
- <portal src="web+test:foo"></portal> where web+test is mapped using registerProtocolHandler to badURL must end up closed. (Only possible for same-origin HTTPS bad URLs; you can't register the others.)
- <portal src="test"></portal> where test does location.href = badURL must stay on test and not end up closed
- <portal src="test"></portal> where test contains <meta http-equiv="refresh" content="0; URL=badURL"> must stay on test and not end up closed
- <portal src="test"></portal> where test contains <a href="badURL">a</a><script>document.querySelector("a").click() must stay on test and not end up closed
<portal src="404"></portal> must show the 404 page from the server, and not end up closed
<portal src="500"></portal> must show the 500 page from the server, and not end up closed
registerProtocolHandler("web+soup", "soup?url=%s") + <portal src="web+soup:chicken-kïwi"></portal> should load, and not end up closed, despite the non-HTTP(S) src="".
<portal src="unresolvable"></portal> must end up closed

jeremyroman

@kjmcnee I think Chromium presently just prevents the navigation rather than closing the portal; does that match your recollection? Any opinions one way or the other?

jeremyroman · 2020-08-27T21:38:06Z

index.bs

+
+    1. If |browsingContext| is a [=portal browsing context=], and <var ignore>response</var>'s
+        [=response/url=] is either null or is a [=URL=] whose [=url/scheme=] is not a [=HTTP(S)
+        scheme=], then:


IIUC this happens after all redirects etc. Do you know if there are any other fetch schemes that could do redirects, and thus would be missed here? Would it matter if they were?

Is this the registerProtocolHandler case? If so, I'm not sure what we do in Chromium and we should probably file a bug to figure out out.

domenic · 2020-08-28T16:53:07Z

I'm going to open a new issue for preventing navigation vs. error pages vs. closing the portal.

kjmcnee · 2020-08-28T17:07:16Z

@kjmcnee I think Chromium presently just prevents the navigation rather than closing the portal; does that match your recollection? Any opinions one way or the other?

Right, we currently ignore/cancel the navigation. We have https://crbug.com/1058489 which is consistent with case 1, I think. I'm not as sure about closing for case 2 instead of canceling, since that would cause the portal to close out from under its host, rather than due to the host's action. That said, these seem like exceptional cases, and if the portal apis encourage pages to consider portals being discarded, then consistency here sounds good. Also, I agree on case 5.

domenic · 2020-09-03T19:36:49Z

Alright, this is ready for re-review, and now intends to cover all of #248. The list of tests in the OP is outdated; after this gets approved, I'll reassemble such a list based on #248 and file a Chromium bug to create such tests.

domenic · 2020-09-03T19:38:58Z

Hmm, OK, one interesting question: should we remove the early-close on src assignment to non-HTTP(S) URLs, in favor of this late check?

In particular, consider if you did registerProtocolHandler("web+soup", "soup?url=%s") and then did <portal src="web+soup:chicken-kïwi"></portal>. With the early check still in place, this would fail to load. If we removed the early check, then that URL would get translated into https://example.com/soup?url=web+soup:chicken-k%C3%AFwi as appropriate, and then pass the late check, and things would load as expected. That seems kind of nice.

jeremyroman · 2020-09-14T18:13:49Z

Agreed it seems kind of nice, though I don't have enough familiarity with registerProtocolHandler to know whether this is a good idea or not (is this reliable enough that it's useful?)

domenic · 2020-09-16T15:31:36Z

registerProtocolHandler is pretty reliable; I don't know how high usage is, but I think it points toward an architectural principle here that we shouldn't be acting on src="" values directly, and instead doing the late check. So I'll switch to that.

Closes #145. Closes #248. Part of #185.

domenic · 2020-09-16T15:46:39Z

Updated the OP with a list of test cases. Review appreciated!

jeremyroman · 2020-09-17T18:58:32Z

index.bs

+    first step of the queued task:
+
+    1. If |browsingContext|'s [=portal state=] is not "`none`", then [=close a portal
+        element|close=] |browsingContext|'s [=host element=] and abort these steps.


Hmm, is this the right fix here? This might be silly but I'm not sure what's actually wrong with it:

 <a id="loadmore" href="javascript:loadMoreStuff()">load more</a> <script> onload = () => document.getElementById("loadmore").click(); </script>

I think I would have expected to ban javascript: from portal src, so you can't start there, and then otherwise permit it (since the document you would load still has a URL that's legal to load in a portal, even if the javascript: URL does replace the document -- and mostly it wouldn't).

This is a very silly edge case, though, so I don't feel strongly.

Great catch. I forgot javascript: in my case analysis, at least mentally, so I didn't properly harmonize it with our desired outcome for the others.

So for string-returning JavaScript snippets, the result gets the default 200 status code. I think we want that to close the portal. It's the non-string returning ones that we want to allow.

Hmm. Well, what falls out naturally from the existing spec text is a bit different.

If we delete this section entirely, we get the following:

If the JS returns a string, then it will create a 200 response with null url.

If we're in the initial-load case, this will close the portal

If we're in the non-initial-load case, then this will bail out and do nothing

If the JS returns a non-string, then it will create a 204 response with null url.

Exact same as in the string case.

I guess that's OK? Maybe this just reflects the bad existing JS spec; e.g. maybe some of those responses should be non-null URL. I can look into that, but perhaps we should just move ahead with this PR by deleting this section.

Final (?) reply-to-myself: it looks like the existing spec is correct in this; both string and non-string cases have null response URL.

http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=8481

http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=8482

I'm not sure whether the response URL is null or not for the generated response (I don't see anywhere that sets it), but document.URL is not -- it remains the same as the page it replaced (or possibly the initiator?):

http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Ciframe%20src%3D%22data%3Atext%2Fhtml%2C%3Ca%20href%3D%26quot%3Bjavascript%3A%27%3Cscript%3Eonload%20%3D%20()%20%3D%3E%20document.body.appendChild(document.createTextNode(document.URL))%3C%2Fscript%3E%27%26quot%3B%3Eclick%20me%3C%2Fa%3E%22%3E%3C%2Fiframe%3E

Your examples both print document.URL before the synthetic response is generated (because the expression is still being evaluated), and I think they're about:blank because it's the initial about:blank that's executing the javascript: URL in the case where you directly set it as iframe src.

I definitely care way more about non-string ones than string ones (does anyone actually use the string case?), but if it does get to keep the URL of the page it replaces then I would have expected to preserve that, odd as it is.

Thanks for correcting my bad examples :). Do you think there's anything we should modify in this PR though? Or is this good to merge?

index.bs

domenic · 2020-09-17T19:45:29Z

Alright, pushed to address comments. Yay for review!

jeremyroman

LGTM <3

The test case list is good. So we don't forget to turn it into WPT, mind filing a crbug against Blink>HTML>Portal if you haven't already?

domenic · 2020-09-21T16:17:16Z

Done! https://bugs.chromium.org/p/chromium/issues/detail?id=1130605

jeremyroman reviewed Aug 27, 2020

View reviewed changes

domenic mentioned this pull request Aug 28, 2020

Handling bad URLs, navigation errors, downloads, and 204s in a unified way #248

Closed

domenic force-pushed the no-bad-redirects branch from 9010024 to 75d3155 Compare September 3, 2020 19:35

domenic added 2 commits September 16, 2020 11:32

Prevent navigations to bad responses, and downloads

8ddf8ad

Closes #145. Closes #248. Part of #185.

Remove early HTTP(S) scheme check

42eaa47

jeremyroman reviewed Sep 17, 2020

View reviewed changes

domenic added 2 commits September 17, 2020 15:19

Fix editorial nit

19b7dd9

Expand the note

9f60901

domenic force-pushed the no-bad-redirects branch from 75d3155 to 9f60901 Compare September 17, 2020 19:27

Try fixing javascript: URLs

3e63416

Use request's current URL, which makes things clearer

bde9605

jeremyroman approved these changes Sep 21, 2020

View reviewed changes

domenic merged commit 703914f into master Sep 21, 2020

domenic deleted the no-bad-redirects branch September 21, 2020 16:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prevent navigations to non-HTTP(S) URLs #247

Prevent navigations to non-HTTP(S) URLs #247

domenic commented Aug 25, 2020 •

edited by pr-preview bot

Loading

jeremyroman left a comment

jeremyroman Aug 27, 2020

jeremyroman Aug 27, 2020

domenic commented Aug 28, 2020

kjmcnee commented Aug 28, 2020

domenic commented Sep 3, 2020

domenic commented Sep 3, 2020

jeremyroman commented Sep 14, 2020

domenic commented Sep 16, 2020

domenic commented Sep 16, 2020

jeremyroman Sep 17, 2020

domenic Sep 17, 2020

domenic Sep 17, 2020

domenic Sep 17, 2020

domenic Sep 17, 2020

jeremyroman Sep 17, 2020

domenic Sep 18, 2020

domenic commented Sep 17, 2020

jeremyroman left a comment

domenic commented Sep 21, 2020

Prevent navigations to non-HTTP(S) URLs #247

Prevent navigations to non-HTTP(S) URLs #247

Conversation

domenic commented Aug 25, 2020 • edited by pr-preview bot Loading

jeremyroman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

domenic commented Aug 28, 2020

kjmcnee commented Aug 28, 2020

domenic commented Sep 3, 2020

domenic commented Sep 3, 2020

jeremyroman commented Sep 14, 2020

domenic commented Sep 16, 2020

domenic commented Sep 16, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

domenic commented Sep 17, 2020

jeremyroman left a comment

Choose a reason for hiding this comment

domenic commented Sep 21, 2020

domenic commented Aug 25, 2020 •

edited by pr-preview bot

Loading