Can I use JuiceFS for storing sparse files? #5675
Comments
Sorry, I was in a bit of a rush when I posted the question. The issue mentions a similar problem in GlusterFS, which deals with exactly this: sparse files. So my question becomes: if #3898 is merged, will JuiceFS be able to upload sparse files efficiently to backend storage such as S3? I get the impression from this comment that GNU tools can be affected if we use cp on a file inside a mounted JuiceFS. But what I really care about is the path between JuiceFS and the backend storage, where bandwidth and storage efficiency matter. Any insights or suggestions? Thanks a lot in advance!
I also observed some differences between the local fs and S3 backends:
Why does the choice of backend make such a big difference?
Another finding: JuiceFS seems to handle a sparse file dramatically differently depending on whether the file is in qcow2 or raw format. What could cause such a difference?
For now, according to #2637, JuiceFS doesn't support SEEK_HOLE and SEEK_DATA for copying sparse files (e.g. with the cp command).
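For readers unfamiliar with SEEK_HOLE and SEEK_DATA, here is a minimal Python sketch of how a copy tool can use them to find the data regions of a sparse file. The helper name `data_extents` is made up for illustration, and this is not JuiceFS code. It assumes Linux and a Python build that exposes `os.SEEK_DATA` and `os.SEEK_HOLE`. On a filesystem that does not track holes, the whole file is typically reported as a single data extent, which is why a copy tool cannot skip the zero regions there.

```python
import errno
import os


def data_extents(path):
    """Return a list of (start, end) byte ranges that the filesystem reports as data.

    Hypothetical helper for illustration only. Filesystems without hole
    tracking usually report one extent covering the whole file.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next region that contains data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # only a hole remains up to the end of the file
                raise
            # Find where that data region ends (the next hole, or end of file).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

A sparse-aware copy would read and write only the returned ranges and leave the gaps between them unwritten in the destination.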
@jiefenghuang So #3898 does have value: even if tools like cp do not fully benefit, the load on the backend storage and the traffic between FUSE and the backend storage are significantly reduced. Why don't we just get it merged?
It is too expensive for general scenarios. FYI, see #3924.
I am wondering whether JuiceFS is a good fit for storing sparse files.
According to #2637, JuiceFS seems to have limited support for sparse-file features. If I put a sparse file into JuiceFS, will JuiceFS upload the file's logical size to the backend storage, or only its physical size? In other words, does JuiceFS fill the storage with a lot of zeros, or does it skip them for efficiency? If it is the latter, what happens to the usage accounting? Can JuiceFS correctly calculate the real usage?
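To make the logical-versus-physical distinction concrete, here is a small Python sketch that creates a sparse file on a local filesystem and compares the two sizes. The helper names `make_sparse_file` and `sparse_file_sizes` are hypothetical. It assumes a Unix-like system where `st_blocks` is counted in 512-byte units (as on Linux), and it says nothing about how JuiceFS itself accounts for usage.

```python
import os


def make_sparse_file(path, logical_size):
    """Create a file of logical_size bytes with only a few bytes actually written."""
    with open(path, "wb") as f:
        f.truncate(logical_size)     # extend the file without writing data
        f.seek(logical_size // 2)
        f.write(b"data")             # a small data region in the middle


def sparse_file_sizes(path):
    """Return (logical_bytes, allocated_bytes) for path.

    Assumption: st_blocks is in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    make_sparse_file("sparse.img", 1 << 30)  # 1 GiB logical size
    logical, allocated = sparse_file_sizes("sparse.img")
    print(f"logical: {logical} bytes, allocated: {allocated} bytes")
    os.remove("sparse.img")
```

On a filesystem that supports holes, the allocated figure should be far smaller than the logical one. Running the same comparison on a file inside a JuiceFS mount, and checking the object sizes in the backend bucket, is one way to observe the behavior asked about here.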