Consider tying background fetches to storage buckets #135
Comments
The all-or-nothing approach we have for storage feels easy to explain. If I use my own system to link together an entry in session storage, the cache API, and IndexedDB, it's nice that I can rely on those things existing together, or none existing. It seems weird that an entry could disappear from my 'users' IDB store just because that entry includes an avatar blob that came from a background fetch. It might not be obvious that the blob originated from a bg fetch, as it might have gone into the cache API, and later been returned and put into IDB by independent bits of code. I guess I'm not fully understanding the problem. Is it something to do with the limits on one origin giving clues to the storage of another? That said, I like the idea of creating storage with configurable automatic eviction, and it's something folks have asked for from the cache API before, but I think it should be explicitly requested by the developer.
What would it mean to evict background fetch storage? The lifetime of bgfetch storage is pretty limited. Once the notification is gone, storage can be freed as JS loses references to the associated requests and responses. If the developer stores items somewhere more persistent (e.g. the cache API), then it takes on the persistency of the new storage. So, if we create a way to make an instance/entry in the cache API auto-evictable, it would take on those rules. I guess an early eviction of background fetch storage would mean aborting the fetch. Is that something the developer would want to happen out-of-the-blue?
That's the motivation behind my thought process, yes. Right now Firefox says that the maximum amount of space a group (eTLD+1, shared amongst all origins in a group) is ~1/5 of free storage on the disk capped at 2GiB, inclusive of space already tracked by quota management. This is somewhat of a cop-out approach and Starting from first principles, one might propose a strategy for quota management like:
The eternal browser UX problems with prompting the user are of course:
File downloads are an interesting case where I expect users are more likely to understand what's happening. In the ideal case, the user understands that the site wants to store some specific data on their computer and that this will consume network data and use up disk space until deleted. Browsers have also begun adopting UX flows where the download starts automatically and the user is alerted to the fact that it's started, but isn't necessarily prompted (at least not after the first time, unless they want to be). File downloads also potentially map exactly to storage buckets, which we can then explain to the user, allowing them to revoke the quota grant and reclaim the storage space without completely wiping out the origin. My proposal about magically disappearing Blobs was hand-waving as to whether there's a way to take baby steps in terms of having background-sync use storage buckets but not require API consumers to be aware of the existence of other storage buckets than the origin default. It might be best to ignore that.
My question as it relates to this is that if background-fetch lets you download a 1GB file and guarantees it stays alive until the success event... how does that 1GB file interact with the quota system? It's straightforward if the SW was a bit-torrent app and the download is handed off as a Blob to the normal system-download mechanism where the downloaded file exists outside of the browser itself. But if the SW is part of a video streaming site that allows offlining and it tries to store it in the Cache API, how does that work? Assuming we boost the quota, how do we get that quota back if we can't tell what is part of that download once it goes into the opaque box of content JS and content storage APIs? Do we end up evicting every origin that was foolish enough not to immediately request Firefox's strategy is very naive LRU-based eviction at this point, so we can certainly become a lot more clever to avoid worst-case pathological eviction cycles, but there are limits to how clever we can be with all-or-nothing eviction. (And the reality is that sites will code against the cleverness of the browsers they develop/test against.)
I figured it would be part of the same quota as the cache API. Ideally, the
This does create potential for storage duplication when adding fetched items into the cache API. I guess browsers could be 'smart' and dedupe a blob stored in two places. Otherwise, I guess we'd need some kind of storage 'transfer'. Once the bgfetch operation is complete, the browser can delete its storage (unless it's doing the blob deduping above, in which case it just decrements the reference count or whatever).
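To make the dedupe idea a bit more concrete, here is a minimal sketch of reference-counted blob storage. It is purely illustrative: `DedupedBlobStore`, its methods, and the string content keys are hypothetical names, not anything from a spec or a browser implementation.

```typescript
// Hypothetical sketch only; no spec or browser defines these names.
class DedupedBlobStore {
  // Maps a content key (e.g. a hash of the bytes) to its size and reference count.
  private entries = new Map<string, { size: number; refs: number }>();

  // Record that one more storage location (bgfetch, cache API, IDB, ...) holds this blob.
  addRef(key: string, size: number): void {
    const entry = this.entries.get(key);
    if (entry) {
      entry.refs += 1;
    } else {
      this.entries.set(key, { size, refs: 1 });
    }
  }

  // Drop one reference; the bytes are freed only when the last reference goes away.
  release(key: string): void {
    const entry = this.entries.get(key);
    if (!entry) return;
    entry.refs -= 1;
    if (entry.refs === 0) this.entries.delete(key);
  }

  // Bytes charged against quota: each distinct blob is counted once.
  usage(): number {
    let total = 0;
    this.entries.forEach((entry) => {
      total += entry.size;
    });
    return total;
  }
}

const store = new DedupedBlobStore();
store.addRef("video-1", 1000); // held by the background fetch
store.addRef("video-1", 1000); // same bytes put into the cache API
store.release("video-1");      // bgfetch completes and drops its copy
console.log(store.usage());    // still 1000: the cache API entry keeps the blob alive
```

Under this model, completing the bgfetch never frees bytes that the cache API still references, and the duplicate is never double-counted against quota.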
Problem Statement: Managing storage quota for origins in a browser is problematic. Although the storage spec now has a concept of buckets as the atomic unit of storage for eviction purposes, there is still only a single bucket per origin. Browsers potentially have a very good understanding of how much a user uses a site which the user allows to be persisted in their history, but no understanding of how that usage maps to storage. Keeping an origin's storage is all or nothing. This results in Quota Management effectively being a combination of prompting (navigator.storage.persist(), Firefox's former 50 MB limit on IndexedDB unless the user said yes to a prompt) and LRU-style eviction with quotas set based on available disk space.
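As a rough illustration of why all-or-nothing LRU eviction is so coarse, here is a minimal sketch. The `OriginUsage` shape and `pickOriginsToEvict` are hypothetical and simplified; this is not how Firefox or any other browser actually implements eviction.

```typescript
// Hypothetical model of what a quota manager might track per origin.
interface OriginUsage {
  origin: string;
  bytes: number;       // total storage used by the origin
  lastAccess: number;  // timestamp of last use, in ms
  persisted: boolean;  // e.g. granted via navigator.storage.persist()
}

// Pick whole origins to wipe, least recently used first, until enough space is freed.
// Persisted origins are skipped. There is no way to free only part of an origin.
function pickOriginsToEvict(origins: OriginUsage[], bytesToFree: number): string[] {
  const candidates = origins
    .filter((o) => !o.persisted)
    .sort((a, b) => a.lastAccess - b.lastAccess);

  const evicted: string[] = [];
  let freed = 0;
  for (const o of candidates) {
    if (freed >= bytesToFree) break;
    evicted.push(o.origin);
    freed += o.bytes;
  }
  return evicted;
}
```

With this model, an origin holding one large download is either kept entirely or wiped entirely, which is the gap that per-download storage buckets would fill.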
Motivating Concern: While thinking about whatwg/storage#70 as it relates to Firefox, I recognized it's hard to cap quota for an origin to a reasonable size by default unless there's a way for the user to grant revocable portions of quota.
Modest Proposal: Tie background fetch downloads to storage buckets. The background fetch spec has already figured out how to expose long-running actions that can use a lot of disk space to the user via the browser's downloads UI. Downloads are something users are (more) able to reason about, both in the act and if explicitly cleaning things up.
Spec-wise, we need content to be prepared for data to disappear from disk and potentially able to detect and recover from it. While a browser could treat background-fetch granted downloads as explicit quota grants, there's little benefit if there's no way to actually reclaim that space related to those grants without breaking the origin.
Practical Considerations: Spec'ing a full storage bucket API at this time would be a lot of work and there are potential ergonomics issues that could spell doom for adoption of background-fetch if consumers have to jump through a ton of hoops. A baby step might be to support branding the returned background-fetch Response with a storage bucket that is also propagated to the Blob returned by response.blob(). The storage bucket is considered to exist as long as one of the branded objects exists, and it would have an explicit storage grant of the size requested for the download. In the event the user or browser chooses to evict the bucket, the following behaviors would occur:
The future plan would be that:
Discussion Hoped For: