Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More CORB-protected MIME types - adding protected types one-by-one. #860

Closed
anforowicz opened this issue Jan 18, 2019 · 14 comments
Closed
Labels
security/privacy There are security or privacy implications topic: orb

Comments

@anforowicz
Copy link
Contributor

In #721 we have been discussing how to extend CORB to other types using a "safelist" approach (e.g. attempting to protect/block everything that is not a "safelisted" content type like an image, stylesheet, script, etc). The "safelist" (e.g. the 2 flavours pointed out in #721 (comment)) seems like the right long-term approach, but it seems very unfortunate that some sensitive content types are not protected in the short-term. For example - it seems very desirable to protect PDF documents as they are likely to contain sensitive information (e.g. financial statements).

I think I would like to just go ahead and add protection of "application/pdf" to Chromium's CORB implementation (with no sniffing). Does that seem reasonable to all of you? If I hear no objections, then I'll try to work on that (implementation + WPT coverage) and possibly adding that mime type to https://fetch.spec.whatwg.org/#corb later on.

Do you have any suggestions for how to decide on additional types that should be protected by CORB? PDF seems like an important one, but hopefully such types can be identified in a structured way (rather than unilaterally and possibly incorrectly deciding that PDF is important but ZIP and CSV maybe not so much). I wonder if there are public data sources that have statistics on the most commonly used MIME types on the web (I couldn't find this after a cursory glance at https://archive.org/).

cc @annevk @csreis @jakearchibald

@anforowicz
Copy link
Contributor Author

Not sure how to assign "topic:corb" and "security/privacy" labels to this issue... :-/

@ricea ricea added security/privacy There are security or privacy implications topic: orb labels Jan 21, 2019
@annevk
Copy link
Member

annevk commented Jan 21, 2019

Adding more types sounds good. Doing it based on the Content-Type header seems good, but as the result of sniffing seems acceptable too. Especially "binary" formats often have reliable signatures we could use to quickly block things.

We should definitely define how all of that works. I hope to have time later this year to revamp the MIME Sniffing specification a little so it's a bit more apparent what kind of architecture we need/have (though if someone else wants to take that on that'd be great too).

I'd basically try to add as many types as possible. We can add whatever cannot normally be loaded or parsed via no-cors (media, script, style).

I do think it would be nice if we could switch the error handling as per #727. Currently CORB sticks out a bit and I don't think it's needed, even though the change would be somewhat observable. At that point it wouldn't really be different from blocking text/csv for script loads as we do today.

@anforowicz
Copy link
Contributor Author

anforowicz commented Jan 23, 2019

I do think it would be nice if we could switch the error handling as per #727. Currently CORB sticks out a bit and I don't think it's needed, even though the change would be somewhat observable.

Hmmm... maybe this could be approached by changing how Fetch spec describes handling of "Cross-Origin-Resource-Policy" header (i.e. approaching this as tweaking CORP rather than tweaking CORB). CORP already results in errors. We could add new steps between steps 2 and 3 (after checking that this is a cross-origin, no-cors request) saying:

  • Extract "Content-Type"
  • If "Content-Type" is "PDF" (or X, Y, Z), then return "blocked".

Tweaking CORP might also be easier implementation-wise (at least from Chromium perspective) than tweaking CORB.

WDYT?

cc @johnwilander @youennf

I'd basically try to add as many types as possible. We can add whatever cannot normally be loaded or parsed via no-cors (media, script, style).

So, should we just use this issue to brainstorm which types to protect? Or should we just try application/pdf for now (and see if anything breaks once it reaches more users)?

Various notes on this subject:

  • On one hand, I am a bit worried that the list of types to protect might eventually grow large and unwieldy.
  • OTOH, I don't want to arbitrarily decide that PDFs are worthy of protection, but (hypothetically) CSV, DOCX and/or ODT are not.
  • Still, I want to start protecting "obvious" types (PDFs seem like a common format for financial statements) as soon as possible, without necessarily waiting until all potentially important types have been identified.

So, personally, I would like to just start with application/pdf unless others object. WDYT?

@annevk
Copy link
Member

annevk commented Jan 24, 2019

I meant changing the error handling for CORB in general. I think what you suggest works, but it's not clear to me that's the architecture we want to have for this long term.

@anforowicz
Copy link
Contributor Author

it's not clear to me that's the architecture we want to have for this long term.

Ack. Maybe this deserves its own, small section in the spec (i.e. something separate from both CORB and CORP). One question here would be what to call this hypothetical section - "Blocking of cross-origin responses with content types that are never useful in no-cors requests" (BoCORwCTTANUiNCR)?

@anforowicz
Copy link
Contributor Author

BTW: I have tried to look at HTTP Archive data to see which Content-Type values are most commonly found in http responses. I've shared the results in a sheet here.

I think we should consider adding more protection for the following types (ordered roughly in the order of how frequently they occur in the wild, with most common types listed first):

  • application/x-gzip
  • application/zip
  • application/x-protobuf
  • application/pdf
  • text/csv
  • application/msword

@annevk
Copy link
Member

annevk commented Jan 25, 2019

cc @evilpie

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue May 24, 2019
This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue May 28, 2019
This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <[email protected]>
Commit-Queue: Łukasz Anforowicz <[email protected]>
Cr-Commit-Position: refs/heads/master@{#663824}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue May 28, 2019
This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <[email protected]>
Commit-Queue: Łukasz Anforowicz <[email protected]>
Cr-Commit-Position: refs/heads/master@{#663824}
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jun 19, 2019
… x-www-form-urlencoded., a=testonly

Automatic update from web-platform-tests
CORB should block event-stream, gzip and x-www-form-urlencoded.

This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <[email protected]>
Commit-Queue: Łukasz Anforowicz <[email protected]>
Cr-Commit-Position: refs/heads/master@{#663824}

--

wp5At-commits: 4308b47e147274f6702a0c1add77488fbc75a309
wpt-pr: 17006
xeonchen pushed a commit to xeonchen/gecko that referenced this issue Jun 19, 2019
… x-www-form-urlencoded., a=testonly

Automatic update from web-platform-tests
CORB should block event-stream, gzip and x-www-form-urlencoded.

This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <[email protected]>
Commit-Queue: Łukasz Anforowicz <[email protected]>
Cr-Commit-Position: refs/heads/master@{#663824}

--

wp5At-commits: 4308b47e147274f6702a0c1add77488fbc75a309
wpt-pr: 17006
marcoscaceres pushed a commit to web-platform-tests/wpt that referenced this issue Jul 23, 2019
This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <[email protected]>
Commit-Queue: Łukasz Anforowicz <[email protected]>
Cr-Commit-Position: refs/heads/master@{#663824}
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 4, 2019
… x-www-form-urlencoded., a=testonly

Automatic update from web-platform-tests
CORB should block event-stream, gzip and x-www-form-urlencoded.

This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <creischromium.org>
Commit-Queue: Łukasz Anforowicz <lukaszachromium.org>
Cr-Commit-Position: refs/heads/master{#663824}

--

wp5At-commits: 4308b47e147274f6702a0c1add77488fbc75a309
wpt-pr: 17006

UltraBlame original commit: 2c70fb58e7267888692dce4121614c1240f470d4
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 4, 2019
… x-www-form-urlencoded., a=testonly

Automatic update from web-platform-tests
CORB should block event-stream, gzip and x-www-form-urlencoded.

This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <creischromium.org>
Commit-Queue: Łukasz Anforowicz <lukaszachromium.org>
Cr-Commit-Position: refs/heads/master{#663824}

--

wp5At-commits: 4308b47e147274f6702a0c1add77488fbc75a309
wpt-pr: 17006

UltraBlame original commit: 2c70fb58e7267888692dce4121614c1240f470d4
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 4, 2019
… x-www-form-urlencoded., a=testonly

Automatic update from web-platform-tests
CORB should block event-stream, gzip and x-www-form-urlencoded.

This CL adds CORB coverage for:

1) text/event-stream, application/x-www-form-urlencoded, based on the
code review discussion in a previous CL here:
https://chromium-review.googlesource.com/c/chromium/src/+/1604244/4/services/network/cross_origin_read_blocking.cc#227

2) application/gzip, which wasn't mentioned explicitly in the CR
discussion above, but which is ranked #212 in the spreadsheet
mentioned in whatwg/fetch#860 (comment)
and therefore probably should have been included in r659671 together
with x-gzip (ranked #54) and zip (ranked #71).

Bug: 802836
Change-Id: I8c10f900110a2cb471437a19425bfd5e38aed2fe
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1628809
Reviewed-by: Charlie Reis <creischromium.org>
Commit-Queue: Łukasz Anforowicz <lukaszachromium.org>
Cr-Commit-Position: refs/heads/master{#663824}

--

wp5At-commits: 4308b47e147274f6702a0c1add77488fbc75a309
wpt-pr: 17006

UltraBlame original commit: 2c70fb58e7267888692dce4121614c1240f470d4
@ghost
Copy link

ghost commented Jan 16, 2020

I've just discovered the work happening here and I must say I'm enamored. Securing the web and the user space/s outside of the web while ensuring user privacy is essential. If I may suggest an additional MIME type for consideration: application/pgp-signature. As an outsider to this prospectively daunting task, I believe that such keys both public and private (in the rare cases when private keys are mistakenly exposed online), deserve the same privacy considerations as those types listed by @anforowicz. Thank-you all.

@anforowicz
Copy link
Contributor Author

@mozdevcontrib, thanks for chiming in. Interestingly application/pgp* prefix does not show up in the HTTP-archive-based sheet. Still, adding application/pgp-signature (and application/pgp-encrypted and application/pgp-keys from rfc3156 or maybe even all of application/pgp*) sounds like a good idea. We probably should also consider multipart/signed (or maybe all of multipart/*) and application/pkcs7-signature. Let me track that from Chromium side in https://crbug.com/1042836.

@ghost
Copy link

ghost commented Jan 16, 2020 via email

@anforowicz
Copy link
Contributor Author

@mozdevcontrib, could you clarify why it is desirable to prevent disclosing application/pgp-signature resources? An example scenario might help.

I understand that in multipart/signed the unencrypted, signed body might contain some sensitive/personal information that might benefit from CORB protection. OTOH, I am not sure why application/pgp-signature might be sensitive - AFAIU it reveals very limited information:

  • the hash of the signed document (when signing a binary or text document - signature type 0x00 or 0x01)
  • user id and public key (e.g. for 0x12 signature type: Casual certification of a User ID and Public-Key packet
  • some signature metadata (like signature creation time).

@anforowicz
Copy link
Contributor Author

anforowicz commented Dec 3, 2020

@annevk, can you please provide feedback whether we might want to move forward with making spec changes here? (#860 (comment) seemed supportive if we treat error handling as an orthogonal discussion.) We could wait until we have tried experimenting with ORB / safelist-based approach, but maybe adding extra MIME types to the spec makes sense?

How would the spec changes look like?:

FWIW, WPT tests (at fetch/corb/script-resource-with-nonsniffable-types.tentative.sub.html) cover the following types that are recognized by CORB in Chromium:

  • application/gzip
  • application/pdf
  • application/x-gzip
  • application/x-protobuf (with plans to also cover application/x-protobuffer)
  • application/zip
  • multipart/byteranges
  • multipart/signed
  • text/csv
  • text/event-stream

Additionally, CORB in Chromium recognizes:

  • application/msexcel
  • application/mspowerpoint
  • application/msword
  • application/msword-template
  • application/vnd.ces-quickpoint
  • application/vnd.ces-quicksheet
  • application/vnd.ces-quickword
  • application/vnd.ms-excel
  • application/vnd.ms-excel.sheet.macroenabled.12
  • application/vnd.ms-powerpoint
  • application/vnd.ms-powerpoint.presentation.macroenabled.12
  • application/vnd.ms-word
  • application/vnd.ms-word.document.12
  • application/vnd.ms-word.document.macroenabled.12
  • application/vnd.msword
  • application/vnd.openxmlformats-officedocument.presentationml.presentation
  • application/vnd.openxmlformats-officedocument.presentationml.template
  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
  • application/vnd.openxmlformats-officedocument.spreadsheetml.template
  • application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • application/vnd.openxmlformats-officedocument.wordprocessingml.template
  • application/vnd.presentation-openxml
  • application/vnd.presentation-openxmlm
  • application/vnd.spreadsheet-openxml
  • application/vnd.wordprocessing-openxml

I am not sure if we'd want to include all of these types in the spec - we'd probably need to agree on the set of types to include, before proceeding with any spec changes.

@annevk
Copy link
Member

annevk commented Dec 4, 2020

@anforowicz I could see https://github.com/annevk/orb having an early step to check mimeType's essence for one of those types (edit: in my mind, the more types the merrier) and then return false, which would then result in a network error. I would be supportive of adding that behavior to the specification. It would be similar to the existing MIME type blocking but for opaque responses only.

@annevk
Copy link
Member

annevk commented May 17, 2022

As we've been discussing this as part of ORB (see annevk/orb@6268458 and annevk/orb@282c8ff), this will eventually be resolved through #1442. It doesn't seem worth it to keep this issue open separately at this point.

@annevk annevk closed this as completed May 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security/privacy There are security or privacy implications topic: orb
Development

No branches or pull requests

3 participants