Define size of all storage actions #110

Open
annevk opened this issue Jul 10, 2020 · 5 comments

@annevk
Member

annevk commented Jul 10, 2020

In order to give developers a more consistent experience across browsers, while allowing browsers to compress, deduplicate, and otherwise optimize the stored data, we should standardize the upper bound for each storage action and have all browsers enforce that.

E.g., the size of localStorage[key] = value could be (key's code unit length + value's code unit length) × 2 + 16 bytes of safety padding or some such. (I did not put a lot of thought into this. If we go down this path we'd need to do that.)
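As a rough sketch of what such an upper bound could look like in code: the function name and the padding constant below are illustrative only, and the formula is the off-the-cuff example above, not anything specified.

```typescript
// Hypothetical padding taken from the example above; the number is not settled.
const SAFETY_PADDING_BYTES = 16;

// Upper-bound charge for `localStorage[key] = value`.
// String.prototype.length counts UTF-16 code units, each of which is 2 bytes.
function localStorageSetItemCost(key: string, value: string): number {
  return (key.length + value.length) * 2 + SAFETY_PADDING_BYTES;
}

// A 1-code-unit key and a 2-code-unit value: (1 + 2) * 2 + 16 = 22.
const exampleCost = localStorageSetItemCost("a", "bc");
```

Because the charge depends only on the inputs, a browser would remain free to store the pair in any encoding it likes.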

(See 6 in #95 (comment) and reply for context.)

@asutherland
Collaborator

This seems desirable and has indeed come up before, specifically around allowing structured-serialized data to be stored on things like ServiceWorker registrations and related data (e.g., Notification.data). There it would be desirable to place an upper bound on storage, but doing so is an interop nightmare without this issue addressed.

I believe this would require the serialization steps for [Serializable] to also produce a size/upper-bound value?

It seems like the most complex issues are:

  1. Blob/File and any similarly immutable abstractions which allow implementations like IndexedDB to store a single copy of the data on disk. Firefox only stores a single copy of a given Blob/File (based on object identity, independent of contents). I presume the only course of action is to either standardize this or to tally each time the blob is used in a structured serialization (which will de-duplicate internally via its "memory"). If standardized, interesting and terrifying new possibilities are raised, such as the BlobStore being its own storage endpoint which can then be used by Notification.data and even ServiceWorker's Cache API storage.
  2. Compression. It would be unfortunate for implementations to be able to implement CPU/power/disk-efficient native storage of data but need to charge a high quota cost, resulting in content performing less efficient compression in JS/WASM in order to be charged a lower quota cost but actually use more disk space. Presumably the answer is Compression Streams? But this is still awkward because, for example, Firefox currently uses Snappy (for Cache API storage) and wants to use LZ4 (for Cache API storage and IndexedDB), and neither of those is yet specified. It would arguably be silly to run gzip against data just for the purposes of calculating a more generous quota charge while actually storing the data using LZ4.

@pwnall
Contributor

pwnall commented Jul 22, 2020

Thank you very much for opening a specific issue for this topic!

Reiterating here for clarity -- Chrome is supportive of this effort to come up with an abstract cost model for storage. We'd be willing to take on the (quite non-trivial) implementation costs if the model gains cross-browser acceptance.

I also really like that @asutherland brought up some of the complex issues early on. I'd be tempted to follow the solutions of other systems I'm aware of.

  1. Blobs: Charge a separate copy per item. I claim this approach is more intuitive to users -- you're charged for what you write, with decisions made locally. Implementers get the benefits from content de-duplication as operational cost reduction. I think this approach would also make the proposal more palatable, because we'd be avoiding asking browsers to implement content de-duplication to be compliant.

  2. Compression: Charge for uncompressed data. Same reasoning as above -- it's more intuitive to be charged for what you write. Also, unless we mandate that each object is compressed individually, compression ratios depend on adjacent data, so I think we'd end up with a lot of constraints around physical data layout. I'd strongly prefer that specs don't get into this business 😄
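To make the two rules above concrete, here is a minimal sketch. The `BlobWrite` shape and both function names are hypothetical; the second function only models what an identity-based de-duplicating implementation might physically keep, and is not a description of any browser.

```typescript
// Hypothetical record of one write: which blob (by identity) and its uncompressed size.
interface BlobWrite {
  blobId: string;
  uncompressedSize: number;
}

// Quota charge under the proposal: every write pays its full uncompressed size,
// even if the same blob was already written.
function chargedBytes(writes: BlobWrite[]): number {
  return writes.reduce((sum, w) => sum + w.uncompressedSize, 0);
}

// Bytes a de-duplicating implementation might actually keep: one copy per blob identity.
function dedupedBytes(writes: BlobWrite[]): number {
  const seen = new Map<string, number>();
  writes.forEach((w) => seen.set(w.blobId, w.uncompressedSize));
  let total = 0;
  seen.forEach((size) => (total += size));
  return total;
}

const writes: BlobWrite[] = [
  { blobId: "a", uncompressedSize: 100 },
  { blobId: "a", uncompressedSize: 100 }, // same blob stored a second time
  { blobId: "b", uncompressedSize: 50 },
];
// chargedBytes(writes) is 250, while dedupedBytes(writes) is 150.
```

The gap between the two numbers is the implementer's operational saving; the site sees only the first.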

On a brighter note, the zstd benchmarks suggest that the algorithms we'd consider have ratios within 2x of each other (and below 3x of uncompressed) for "typical" data. I claim this is well within the precision margin for the cost model we'd be building up here.

Along the same lines, I hope that we can avoid having apps play games (like manual compression) by being reasonably generous with quota. Ideally, apps without bugs should not run into quota problems.

@pwnall
Contributor

pwnall commented Jul 22, 2020

I found some notes from when I tried to sketch a storage cost model for IndexedDB. This was in 2018, and I knew a lot less about the implementation back then. So, the numbers are probably bad, but at least it's a list of things to consider.

Object cost:

  • primitives (number, Date, null, true, false): 10 -- accommodates (tag + 8 bytes or tag + <= 9 bytes of varint)
  • string: 8 + 2 * string length
  • object: 8 + sum of keys and values
  • array: 16 + sum of elements
  • native arrays: 16 + buffer length
  • ImageData: 32 + the cost of ImageData.data as a native array
  • Blob: 64 + cost of MIME type as string + length
  • File: Blob + cost of filename as string

I might have missed some other object. The idea is to assign a cost based on a straightforward representation for each clonable. The cost doesn't have to be exact, because we expect implementations to have their own overhead.
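A sketch of how the list above might translate into a recursive cost function. It makes several assumptions not stated in the notes: object keys are charged as strings, a typed-array view is charged by its own byte length, and Blob, File, and ImageData are left out to keep the example self-contained. The function names are illustrative.

```typescript
// string: 8 + 2 * string length (length is in UTF-16 code units)
function stringCost(s: string): number {
  return 8 + 2 * s.length;
}

function estimateValueCost(v: unknown): number {
  // primitives (number, Date, null, true, false): 10
  if (v === null || typeof v === "number" || typeof v === "boolean" || v instanceof Date) {
    return 10;
  }
  if (typeof v === "string") return stringCost(v);
  // array: 16 + sum of elements
  if (Array.isArray(v)) {
    return v.reduce((sum: number, el: unknown) => sum + estimateValueCost(el), 16);
  }
  // native arrays: 16 + buffer length (assumption: a view is charged its own byteLength)
  if (v instanceof ArrayBuffer) return 16 + v.byteLength;
  if (ArrayBuffer.isView(v)) return 16 + v.byteLength;
  // object: 8 + sum of keys and values (assumption: keys are charged as strings)
  if (typeof v === "object") {
    const obj = v as Record<string, unknown>;
    return Object.keys(obj).reduce(
      (sum, k) => sum + stringCost(k) + estimateValueCost(obj[k]),
      8,
    );
  }
  throw new TypeError("value type not covered by this sketch");
}

// { a: 1 } costs 8 (object) + 10 (key "a" as a string) + 10 (number) = 28.
const objectExample = estimateValueCost({ a: 1 });
```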

IndexedDB transaction costs (get refunded when the transaction completes):

  • 32 per open store and index in a transaction; write transactions open all indexes in their stores; versionchange transactions open all stores and indexes
  • write: 64 + inputs (key + value) + sum over indexes touched (16 + index key + primary key)
  • delete: like a write, but with zero value cost
  • store creation: 64 + store name and key path as strings
  • index creation: 64 + index name and key path as strings
  • store/index renames: same as creation
  • store/index deletion: 64; deleting a store implies deleting all its indexes
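The open, write, and delete rules above can be sketched as follows. The function names are hypothetical, the inputs are precomputed costs for keys and values, and the sketch assumes the "primary key" term in each index entry is the cost of the record's own key.

```typescript
// 32 per open store and index in a transaction.
function openCost(openStoresAndIndexes: number): number {
  return 32 * openStoresAndIndexes;
}

// write: 64 + inputs (key + value) + sum over indexes touched (16 + index key + primary key)
function writeCost(keyCost: number, valueCost: number, indexKeyCosts: number[]): number {
  const indexCost = indexKeyCosts.reduce(
    (sum, indexKeyCost) => sum + 16 + indexKeyCost + keyCost,
    0,
  );
  return 64 + keyCost + valueCost + indexCost;
}

// delete: like a write, but with zero value cost
function deleteCost(keyCost: number, indexKeyCosts: number[]): number {
  return writeCost(keyCost, 0, indexKeyCosts);
}

// A record with key cost 10 and value cost 100 that touches two indexes
// whose keys cost 10 and 14: 64 + 110 + (16 + 10 + 10) + (16 + 14 + 10) = 250.
const writeExample = writeCost(10, 100, [10, 14]);
```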

This isn't a complete list. I hope it's a good starting point if someone is itching to start an explainer 😄

@asutherland
Collaborator

asutherland commented Jul 22, 2020

@pwnall Your simplifying proposal in #110 (comment) sounds good to me. Also, it's very consistent with reality, as Mozilla's Servo project is an example of bringing up a browser from scratch-ish and they've found implementing IndexedDB non-trivial, so further complicating the standard and raising the bar to building a compliant browser engine would not be a win for the web.

@annevk
Member Author

annevk commented Oct 26, 2020

See also: whatwg/html#4914.

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 23, 2023
storagemanager-estimate and estimate-indexeddb are more or less the
same tests, the latter having been ported from the former to use
`async`/`await`. The former probably should have been deleted when the
latter was introduced.

Since some of the tests are related to IndexedDB and some are not, this
change keeps the IndexedDB tests in the file called `estimate-indexeddb`
and keeps the basic tests in `storagemanager-estimate` (with minor
updates).

One wrinkle from the Chromium side is that the behavior of
`storageManager.estimate()` is not actually specced, and as the Chromium
implementation uses LevelDB, which behaves in mysterious ways, adding
things to the database does NOT always increase the reported usage size.
Both of these tests operate on large things, which typically do increase
usage, however we noticed that for one reason or another, the test that
adds an *uninitialized* ArrayBuffer starts failing on Windows if
durability is set to relaxed. There are other ways to make the test fail
as well: using a shorter name for the database, or putting small
values, does not reliably increase the reported usage. This is all fine
in the sense that it isn't defined behavior, but it does suggest that
working in this area or on tests of this ilk is a bit of a minefield.
These tests probably should not exist as WPT, at least not until quota
behavior is specced (see whatwg/storage#110).
But it would also be sort of a shame to delete WPT that are passing, so
I've left them in place for now.

Bug: 1489517
Change-Id: I6619f504ce92e428054691ac6bf54a0e14e3ce5f
aarongable pushed a commit to chromium/chromium that referenced this issue Oct 24, 2023
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4968659
Commit-Queue: Evan Stade <[email protected]>
Reviewed-by: Ayu Ishii <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1214324}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 24, 2023
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Nov 7, 2023
Automatic update from web-platform-tests
Deduplicate some WPT.

--

wpt-commits: 446b9bce1ca9a174c11c64f4faa077a9e9d22e72
wpt-pr: 42700
vinnydiehl pushed a commit to vinnydiehl/mozilla-unified that referenced this issue Nov 7, 2023
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Nov 8, 2023
UltraBlame original commit: fe8c71b661e3cce79f0dabbf89fbe7982d82ee99
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Nov 8, 2023
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Nov 8, 2023