HTTPFileSystem breaks when range requests are not supported #1626

Open
dholth opened this issue Jun 14, 2024 · 3 comments

@dholth
Contributor

dholth commented Jun 14, 2024

from fsspec.core import url_to_fs

ufs, url = url_to_fs("https://example.org/")
f = ufs.open("https://example.org/no-range-requests/data.txt")
f.seek(8192)
f.read(1)

This raises ValueError: The HTTP server doesn't appear to support range requests.

I was having trouble with the cache options, but it turns out I was putting simplecache:: in the wrong place. Using it does work, but I don't really want to keep the seekable file around after the file is closed. I'm also trying to be careful to open the file only once.
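One way to get a seekable file that does not outlive its handle, without relying on any fsspec cache layer, is to copy the body into a temporary file that is deleted on close. The following is a minimal standard-library sketch, not something fsspec provides: `spool_to_tempfile` is a hypothetical helper name, and `stream` stands in for whatever non-seekable file-like object the HTTP layer returns.

```python
import io
import shutil
import tempfile


def spool_to_tempfile(stream, chunk_size=64 * 1024):
    """Copy a read-only stream into an anonymous temporary file.

    Hypothetical helper: the returned file is seekable and is removed
    from disk as soon as it is closed.
    """
    tmp = tempfile.TemporaryFile()  # binary mode, deleted on close
    shutil.copyfileobj(stream, tmp, chunk_size)
    tmp.seek(0)  # rewind so the caller starts at the beginning
    return tmp


# Stand-in for a streamed HTTP body (an assumption, no network involved).
body = io.BytesIO(b"a" * 8192 + b"Z")
with spool_to_tempfile(body) as f:
    f.seek(8192)
    last = f.read(1)  # the byte at offset 8192
```

The source is read exactly once, front to back, so the remote file is opened only a single time. The cost is that the whole body is downloaded before the first seek can be served.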

@martindurant
Member

Specifically:

  • HTTPFile supports seeking, but it does so by sending range requests, which this server refuses.
  • HTTPStreamFile does not use range, but does not support seek().
  • HTTPStreamFile ought to support some caching modes, particularly "all" and "first", but is hard-coded to "none".
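For context on the first point, the generic HTTP rule is that a server honoring a `Range` header answers 206 Partial Content, while a server that ignores it answers 200 with the full body. The sketch below only illustrates that rule. It is not fsspec's actual detection logic, and `classify_range_response` is a hypothetical name.

```python
def classify_range_response(status, headers):
    """Classify the reply to a request that carried a Range header.

    Hypothetical helper; `headers` is a plain dict of response headers.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    names = {key.lower(): value for key, value in headers.items()}
    if status == 206:
        return "partial"        # the server sent only the requested slice
    if status == 416:
        return "unsatisfiable"  # the range lies outside the resource
    if status == 200:
        # Full body returned: the Range header was ignored, even if the
        # server advertises "Accept-Ranges: bytes" elsewhere.
        return "full"
    return "other"


honored = classify_range_response(206, {"Content-Range": "bytes 8192-8192/20000"})
ignored = classify_range_response(200, {"Content-Length": "20000"})
```

A client that sees "full" has two choices: give up on random access, or consume the body as a stream and cache what it has read.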

@dholth
Contributor Author

dholth commented Jun 14, 2024

IIUC, for my use case I could

  • use simplecache:: or filecache::
  • enhance those so that the cache only keeps a file or two if I'm worried about disk space, similar to the delete-on-close behavior of Python's TemporaryFile
  • Edit HTTPStreamFile so that it can fetch a range, if that range is right after whatever we've fetched so far, and use the file-level "all" cache instead of the filesystem wrappers simplecache::
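The last option can be sketched independently of fsspec: wrap a read-only stream, keep every byte fetched so far in memory, and serve `seek()` by reading forward as needed. This is only an illustration of the idea under stated assumptions. `ForwardCachingReader` is a hypothetical class, not HTTPStreamFile and not fsspec's "all" cache, and `raw` is assumed to be any object with a `read(n)` method.

```python
import io


class ForwardCachingReader:
    """Hypothetical wrapper that adds seek() to a read-only stream by
    buffering every byte fetched so far."""

    def __init__(self, raw, chunk_size=8192):
        self._raw = raw          # any object with read(n) -> bytes
        self._buf = bytearray()  # all bytes fetched so far
        self._pos = 0            # logical position seen by the caller
        self._eof = False
        self._chunk_size = chunk_size

    def _fill_to(self, target):
        # Pull from the stream until the buffer covers `target` or EOF.
        while not self._eof and len(self._buf) < target:
            want = max(self._chunk_size, target - len(self._buf))
            chunk = self._raw.read(want)
            if chunk:
                self._buf += chunk
            else:
                self._eof = True

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        else:
            raise ValueError("this sketch supports only SEEK_SET and SEEK_CUR")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._fill_to(new_pos)  # forward seeks read ahead; backward seeks hit the buffer
        self._pos = new_pos
        return self._pos

    def read(self, n=-1):
        if n is None or n < 0:
            while not self._eof:  # drain the rest of the stream
                self._fill_to(len(self._buf) + self._chunk_size)
            end = len(self._buf)
        else:
            self._fill_to(self._pos + n)
            end = min(self._pos + n, len(self._buf))
        data = bytes(self._buf[self._pos:end])
        self._pos += len(data)
        return data


# Stand-in for a streamed HTTP body (an assumption, no network involved).
reader = ForwardCachingReader(io.BytesIO(bytes(range(256)) * 64))
reader.seek(8192)
forward = reader.read(1)   # byte at offset 8192
reader.seek(10)
backward = reader.read(2)  # served from the buffer, no new reads
```

The trade-off is memory: the buffer grows with the furthest offset reached, so this suits small files or mostly sequential access.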

@martindurant
Member

cf #1629
