Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent navigations to non-HTTP(S) URLs #247

Merged
merged 6 commits into from
Sep 21, 2020
Merged

Prevent navigations to non-HTTP(S) URLs #247

merged 6 commits into from
Sep 21, 2020

Conversation

domenic
Copy link
Collaborator

@domenic domenic commented Aug 25, 2020

Closes #145.

Test cases this should generate:

  • For each badURL of data:text/html,foo, javascript:'a string', javascript:undefined, about:blank, a blob: URL, and a real URL that is served with the Content-Disposition: attachment header, a real URL that has a 204 response, and a real URL that has a 205 response:

    • <portal src="badURL"></portal> must end up closed

    • <portal src="redirect"></portal> where redirect redirects (say, 302) to badURL must end up closed

    • <portal src="web+test:foo"></portal> where web+test is mapped using registerProtocolHandler to badURL must end up closed. (Only possible for same-origin HTTPS bad URLs; you can't register the others.)

    • <portal src="test"></portal> where test does location.href = badURL must stay on test and not end up closed

    • <portal src="test"></portal> where test contains <meta http-equiv="refresh" content="0; URL=badURL"> must stay on test and not end up closed

    • <portal src="test"></portal> where test contains <a href="badURL">a</a><script>document.querySelector("a").click() must stay on test and not end up closed

  • <portal src="404"></portal> must show the 404 page from the server, and not end up closed

  • <portal src="500"></portal> must show the 500 page from the server, and not end up closed

  • registerProtocolHandler("web+soup", "soup?url=%s") + <portal src="web+soup:chicken-kïwi"></portal> should load, and not end up closed, despite the non-HTTP(S) src="".

  • <portal src="unresolvable"></portal> must end up closed


Preview | Diff

Copy link
Collaborator

@jeremyroman jeremyroman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kjmcnee I think Chromium presently just prevents the navigation rather than closing the portal; does that match your recollection? Any opinions one way or the other?

index.bs Outdated

1. If |browsingContext| is a [=portal browsing context=], and <var ignore>response</var>'s
[=response/url=] is either null or is a [=URL=] whose [=url/scheme=] is not a [=HTTP(S)
scheme=], then:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this happens after all redirects etc. Do you know if there are any other fetch schemes that could do redirects, and thus would be missed here? Would it matter if they were?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the registerProtocolHandler case? If so, I'm not sure what we do in Chromium and we should probably file a bug to figure out out.

@domenic
Copy link
Collaborator Author

domenic commented Aug 28, 2020

I'm going to open a new issue for preventing navigation vs. error pages vs. closing the portal.

@kjmcnee
Copy link
Collaborator

kjmcnee commented Aug 28, 2020

@kjmcnee I think Chromium presently just prevents the navigation rather than closing the portal; does that match your recollection? Any opinions one way or the other?

Right, we currently ignore/cancel the navigation. We have https://crbug.com/1058489 which is consistent with case 1, I think. I'm not as sure about closing for case 2 instead of canceling, since that would cause the portal to close out from under its host, rather than due to the host's action. That said, these seem like exceptional cases, and if the portal apis encourage pages to consider portals being discarded, then consistency here sounds good. Also, I agree on case 5.

@domenic
Copy link
Collaborator Author

domenic commented Sep 3, 2020

Alright, this is ready for re-review, and now intends to cover all of #248. The list of tests in the OP is outdated; after this gets approved, I'll reassemble such a list based on #248 and file a Chromium bug to create such tests.

@domenic
Copy link
Collaborator Author

domenic commented Sep 3, 2020

Hmm, OK, one interesting question: should we remove the early-close on src assignment to non-HTTP(S) URLs, in favor of this late check?

In particular, consider if you did registerProtocolHandler("web+soup", "soup?url=%s") and then did <portal src="web+soup:chicken-kïwi"></portal>. With the early check still in place, this would fail to load. If we removed the early check, then that URL would get translated into https://example.com/soup?url=web+soup:chicken-k%C3%AFwi as appropriate, and then pass the late check, and things would load as expected. That seems kind of nice.

@jeremyroman
Copy link
Collaborator

Agreed it seems kind of nice, though I don't have enough familiarity with registerProtocolHandler to know whether this is a good idea or not (is this reliable enough that it's useful?)

@domenic
Copy link
Collaborator Author

domenic commented Sep 16, 2020

registerProtocolHandler is pretty reliable; I don't know how high usage is, but I think it points toward an architectural principle here that we shouldn't be acting on src="" values directly, and instead doing the late check. So I'll switch to that.

@domenic
Copy link
Collaborator Author

domenic commented Sep 16, 2020

Updated the OP with a list of test cases. Review appreciated!

index.bs Outdated
first step of the queued task:

1. If |browsingContext|'s [=portal state=] is not "`none`", then [=close a portal
element|close=] |browsingContext|'s [=host element=] and abort these steps.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is this the right fix here? This might be silly but I'm not sure what's actually wrong with it:

<!-- in a portal -->
<a id="loadmore" href="javascript:loadMoreStuff()">load more</a>
<script>
onload = () => document.getElementById("loadmore").click();
</script>

I think I would have expected to ban javascript: from portal src, so you can't start there, and then otherwise permit it (since the document you would load still has a URL that's legal to load in a portal, even if the javascript: URL does replace the document -- and mostly it wouldn't).

This is a very silly edge case, though, so I don't feel strongly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. I forgot javascript: in my case analysis, at least mentally, so I didn't properly harmonize it with our desired outcome for the others.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So for string-returning JavaScript snippets, the result gets the default 200 status code. I think we want that to close the portal. It's the non-string returning ones that we want to allow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Well, what falls out naturally from the existing spec text is a bit different.

If we delete this section entirely, we get the following:

  • If the JS returns a string, then it will create a 200 response with null url.
    • If we're in the initial-load case, this will close the portal
    • If we're in the non-initial-load case, then this will bail out and do nothing
  • If the JS returns a non-string, then it will create a 204 response with null url.
    • Exact same as in the string case.

I guess that's OK? Maybe this just reflects the bad existing JS spec; e.g. maybe some of those responses should be non-null URL. I can look into that, but perhaps we should just move ahead with this PR by deleting this section.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final (?) reply-to-myself: it looks like the existing spec is correct in this; both string and non-string cases have null response URL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether the response URL is null or not for the generated response (I don't see anywhere that sets it), but document.URL is not -- it remains the same as the page it replaced (or possibly the initiator?):

http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Ciframe%20src%3D%22data%3Atext%2Fhtml%2C%3Ca%20href%3D%26quot%3Bjavascript%3A%27%3Cscript%3Eonload%20%3D%20()%20%3D%3E%20document.body.appendChild(document.createTextNode(document.URL))%3C%2Fscript%3E%27%26quot%3B%3Eclick%20me%3C%2Fa%3E%22%3E%3C%2Fiframe%3E

Your examples both print document.URL before the synthetic response is generated (because the expression is still being evaluated), and I think they're about:blank because it's the initial about:blank that's executing the javascript: URL in the case where you directly set it as iframe src.

I definitely care way more about non-string ones than string ones (does anyone actually use the string case?), but if it does get to keep the URL of the page it replaces then I would have expected to preserve that, odd as it is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for correcting my bad examples :). Do you think there's anything we should modify in this PR though? Or is this good to merge?

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
@domenic
Copy link
Collaborator Author

domenic commented Sep 17, 2020

Alright, pushed to address comments. Yay for review!

Copy link
Collaborator

@jeremyroman jeremyroman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM <3

The test case list is good. So we don't forget to turn it into WPT, mind filing a crbug against Blink>HTML>Portal if you haven't already?

@domenic domenic merged commit 703914f into master Sep 21, 2020
@domenic domenic deleted the no-bad-redirects branch September 21, 2020 16:14
@domenic
Copy link
Collaborator Author

domenic commented Sep 21, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URL scheme control on 'set the source URL' may be insufficient
3 participants