ext4: support bigger blkdev block size #241
base: master
Nice. Out of curiosity, do you think there's any value in device blocks even being exposed in the interface like that?
Do you mean Edit: Sorry, you probably mean all of the block devices. Got to think about that :-D
I mean the API interface between a file system and a block device. Like, why should ext4 driver even care what the physical block size is? Sorry I was being ambiguous. :)
Yes, I get you. That would indeed make a lot of things a lot simpler. And it doesn't actually sound that hard to realize. Just rewrite libblock (and convert everything) to use the classic file API. Or is there some pitfall I don't see? Edit: Well, probably the block caching will have to be hidden inside the file API, which might not be very easy to do, and it's quite nice that filesystems can manipulate the blocks on a lower level, set some flags or mark them dirty, etc.
I can't think of anything, but you seem to be more intimately familiar with the code right now, so thought I'd ask you. 😄 Anyway, right now I'm trying to finish implementing a completely new way to do IPC, and if that works out like I want, then we'll have the opportunity to rework some IPC protocols as they are being migrated to the new shiny thing. Good reason to think about how the seams between layers should work and where there are opportunities for improvement.
The comparison to files was just illustrative. Obviously it would still be its own protocol with its own nuances, but the ability to read/write with arbitrary granularity (and size) like files do would be nice. Though you make a good point. I'm not actually familiar with the extent of operations that can be done on blocks. But now that I think about it, "trim" would be a good example? Since that can only work on whole blocks, and definitely sounds like something we want to support (eventually). EDIT: So maybe the right interface would have flexible reads/writes, but the intrinsically blocky operations would still have a block-based interface that the file system can exploit if it wants to.
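The split suggested in the EDIT above (flexible reads on top, whole-block operations such as trim underneath) could be sketched roughly as follows. This is only an illustration over an in-memory device; `memdev_t`, `memdev_read_bytes`, `memdev_trim` and the tiny `BSIZE` are hypothetical names and values, not the libblock API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BSIZE   8u  /* hypothetical device block size, tiny for illustration */
#define NBLOCKS 4u  /* hypothetical device length in blocks */

/* Hypothetical in-memory block device. */
typedef struct {
	uint8_t data[NBLOCKS][BSIZE];
} memdev_t;

/* Block-granular primitive: the only way to actually get data out. */
int memdev_read_block(const memdev_t *dev, size_t ba, uint8_t *buf)
{
	if (ba >= NBLOCKS)
		return -1;
	memcpy(buf, dev->data[ba], BSIZE);
	return 0;
}

/*
 * Byte-granular read layered on top: the caller never sees BSIZE.
 * Each iteration reads one whole block and copies out the part
 * of it that overlaps the requested byte range.
 */
int memdev_read_bytes(const memdev_t *dev, size_t off, size_t len,
    uint8_t *dst)
{
	uint8_t tmp[BSIZE];

	while (len > 0) {
		size_t ba = off / BSIZE;   /* block holding the next byte */
		size_t in = off % BSIZE;   /* offset inside that block */
		size_t n = BSIZE - in;     /* bytes available in this block */
		if (n > len)
			n = len;
		if (memdev_read_block(dev, ba, tmp) != 0)
			return -1;
		memcpy(dst, tmp + in, n);
		dst += n;
		off += n;
		len -= n;
	}
	return 0;
}

/* Intrinsically blocky operation: trim stays expressed in whole blocks. */
int memdev_trim(memdev_t *dev, size_t first, size_t count)
{
	if (first > NBLOCKS || count > NBLOCKS - first)
		return -1;
	for (size_t i = first; i < first + count; i++)
		memset(dev->data[i], 0, BSIZE);
	return 0;
}
```

A file system that only wants bytes would call `memdev_read_bytes`, while one that wants to exploit the block structure could still use the block-based calls directly.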
I've been working on HelenRAID for a few months now, but in there I use the direct read/write of blocks, just forwarding the final writes or reads that come to the array. Just now I started to look at filesystems, because I would like to do some performance evaluation on a "real" workload, with files and whatnot. The filesystems use libblock more extensively, with caching and everything, which I don't fully understand yet, and the same goes for the filesystems themselves.
Sounds great. The "non-redundant" copying would also be great to have, at least for block devices and big IO. I am interested in the new IPC you are working on, do you have any notes or something I can read to try to understand your new proposal? I am not very experienced in that area, but there is a chance I can help :-D
No formal proposal as of yet, since I'm still figuring out how to implement what I want and details keep changing to accommodate. :) But currently, the broad strokes look like this: A task creates an IPC queue, then it creates a uniquely identified IPC endpoint on that queue (possibly one for each unique resource handled by the server, e.g. each open file). The endpoint makes its way to a client via IPC (similar to what IPC_CONNECT__ calls do now, but with more control and granularity). The client makes calls on the endpoint, providing its own endpoint as a return address. The main feature is that endpoints represent individual resources managed by servers and can be transferred between tasks arbitrarily. They basically act like references to OOP objects that live in another task. Anyway, the messages sent to an endpoint are fixed size, so to send a random bucket of data, you explicitly write the data to a kernel buffer (immutable once created), then send a reference to that buffer to the other party, who can then read chunks out of it. The IPC forward call function is replaced simply by being able to pass the received buffer reference to another task without touching it. Possibly more than once for different parts.
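To make the buffer part of that description a bit more concrete, here is a rough user-space sketch of "fixed-size message plus a reference to an immutable buffer". Nothing here is the actual design or any HelenOS API: `kbuf_t`, `kbuf_ref_t`, `msg_t` and all function names are hypothetical, and a real implementation would live in the kernel with proper handle management.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: contents are fixed at creation time. */
typedef struct {
	const uint8_t *bytes;
	size_t size;
} kbuf_t;

/* Hypothetical reference to a whole buffer or to a part of it. */
typedef struct {
	const kbuf_t *buf;
	size_t off;
	size_t len;
} kbuf_ref_t;

/* Hypothetical fixed-size message: bulk data travels only by reference. */
typedef struct {
	uint32_t method;
	uint64_t endpoint_id;        /* resource being called */
	uint64_t reply_endpoint_id;  /* caller's endpoint as return address */
	kbuf_ref_t data;
} msg_t;

/* Copy the caller's data once; after this the buffer is never modified. */
kbuf_t *kbuf_create(const void *src, size_t size)
{
	kbuf_t *b = malloc(sizeof(*b));
	if (b == NULL)
		return NULL;
	uint8_t *copy = malloc(size > 0 ? size : 1);
	if (copy == NULL) {
		free(b);
		return NULL;
	}
	if (size > 0)
		memcpy(copy, src, size);
	b->bytes = copy;
	b->size = size;
	return b;
}

void kbuf_destroy(kbuf_t *b)
{
	if (b != NULL) {
		free((void *)b->bytes);
		free(b);
	}
}

/*
 * Derive a reference to a sub-range without touching the data. This is
 * what forwarding "different parts" to other tasks would boil down to.
 */
int kbuf_ref_slice(kbuf_ref_t parent, size_t off, size_t len,
    kbuf_ref_t *out)
{
	if (off > parent.len || len > parent.len - off)
		return -1;
	out->buf = parent.buf;
	out->off = parent.off + off;
	out->len = len;
	return 0;
}

/* Receiver side: read a chunk out of the referenced range. */
size_t kbuf_read(kbuf_ref_t ref, size_t off, void *dst, size_t len)
{
	if (off >= ref.len)
		return 0;
	size_t n = ref.len - off;
	if (n > len)
		n = len;
	memcpy(dst, ref.buf->bytes + ref.off + off, n);
	return n;
}
```

In this sketch, forwarding amounts to copying a `kbuf_ref_t` (or a slice of it) into a new `msg_t` addressed to a different endpoint; the data itself is never copied again.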
Allows ext4 to be used with block devices that have block size > EXT4_SUPERBLOCK_SIZE (1K).
afe8ff5 to cb747b3
Currently, ext4 "supports" only block devices whose block size is smaller than or equal to the superblock size (quotes around "supports" because it gets EPERM when reading past the allocated buffer).
This currently fails when the blkdev bsize is set to 4K (with -b passed to file_bd).
The block size of the block device must still be at most the block size of the filesystem; the current maximum is 4096, which is also the default.
This will be useful when, for example, NVMe support is implemented in the future, or for anything else that uses bigger block sizes.
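As a rough illustration of why a device block size above 1K needs special handling, the arithmetic for locating the superblock can be sketched as below. It relies on the standard ext4 layout (superblock at byte offset 1024, 1024 bytes long); `sb_span_t` and `sb_span` are hypothetical helpers for this note, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SB_OFFSET 1024u  /* byte offset of the ext4 superblock on disk */
#define SB_SIZE   1024u  /* EXT4_SUPERBLOCK_SIZE from the description (1K) */

/* Hypothetical result type: where the superblock sits in device blocks. */
typedef struct {
	uint64_t first_block;    /* first device block to read */
	size_t block_count;      /* number of device blocks to read */
	size_t offset_in_first;  /* superblock offset inside the first block */
} sb_span_t;

/* Compute which device blocks cover the superblock for a given bsize. */
sb_span_t sb_span(size_t dev_bsize)
{
	assert(dev_bsize > 0);

	sb_span_t s;
	s.first_block = SB_OFFSET / dev_bsize;
	s.offset_in_first = SB_OFFSET % dev_bsize;
	/* Round up so that the tail of the superblock is covered too. */
	s.block_count =
	    (s.offset_in_first + SB_SIZE + dev_bsize - 1) / dev_bsize;
	return s;
}
```

With a 512-byte device block the superblock spans blocks 2 and 3; with 1K it is exactly block 1; with 4K it sits at offset 1024 inside block 0. In the last case a whole-block read returns 4096 bytes, so a buffer sized for the 1K superblock alone would be too small to receive it.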