Reserve large fixed-size flat address spaces as files #692
base: master
Conversation
No idea why the automated test is failing. This shouldn't affect anything existing since all the behavior is behind the file type flags. Unless there's some corrupt data existing in the test that happens to hit those flags? EDIT: Tried putting an LFS_ASSERT(false) in all the new code paths, and none of those got triggered in the existing tests.
This feature would also be interesting to MicroPython, where we need a filesystem which can hold contiguous (read-only) files that can be memory mapped. E.g. they can contain resources, bytecode and machine code which can be executed in place (XIP). Essentially we would use littlefs (with this PR) as a container format to store a set of assets in the layout of a filesystem.
Huh. The minor disk version does need to be bumped, as the new file record is not understood by library versions preceding this PR.
Hi @kaetemi, sorry about such a large delay in getting to this. This is really great work! Working around the lookahead buffer for large files is especially clever.

A bit of unfortunate timing, but I've been looking into some larger changes to the file data-structure for performance reasons. In particular I'm planning to move from the existing CTZ skip-list to a more traditional B-tree. Given that a B-tree would not require invasive pointers in the data-blocks, it seems a B-tree would allow you to create these contiguous files without needing a separate file structure. Thoughts?

It's worth highlighting that this is a ~11% code size increase (15722 B -> 17462 B on thumb). This isn't a deal-breaker, just that we should either put this behind a build-time macro or find ways to reduce the cost.
Ah, that test measures wear distribution and asserts that the standard deviation is within reasonable values. I've also noticed it's flaky on NAND: it writes a fixed amount of data, and this currently isn't enough for inter-block wear-leveling to kick in with NAND's large block sizes. It's up to chance whether the same block gets selected twice for metadata, which will make it fail the check. I wouldn't worry about it; we can disable the test for NAND for now and create an issue to make the test more robust (which probably just means throwing more data at it).
That sounds interesting. Would it be better in that case to mark the relevant files with a flag to specify them as "flat", or should it just run through the B-tree to check if the file qualifies whenever it is needed? (Performance / complexity trade off when opening the files.)
Yes, agreed. Better to put it behind a macro so it can be taken out when not needed.
I think a flag would be worth it in this case, but it depends on your use case. If you're going to read the whole file anyways it may not cost anything extra to scan the whole B-tree. In this hypothetical you could at least store if a file is contiguous in a custom attribute, without requiring in-filesystem changes.

That being said I've been considering including offset+length information in each pointer in the B-tree. This helps take advantage of unused-erased parts of the inner-nodes in the B-tree, and as a side effect would let you encode an entire flat file as a single "pointer". Though the intention would be that LittleFS itself remains block-based for allocation, etc.

Sorry I don't have more information; this is all just in the untested design phase at the moment.
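For illustration, the custom-attribute idea can already be sketched with the existing `lfs_setattr`/`lfs_getattr` API. `APP_ATTR_CONTIG` and the helper functions below are hypothetical application-side conventions, not anything provided by littlefs or this PR:

```c
#include <stdbool.h>
#include <stdint.h>
#include "lfs.h"

// Hypothetical, application-chosen attribute id meaning "this file's data
// is contiguous"; littlefs assigns no meaning to user attribute ids.
#define APP_ATTR_CONTIG 0x70

// Record that a file's data is contiguous, remembering its first block.
static int app_mark_contiguous(lfs_t *lfs, const char *path,
                               uint32_t first_block) {
    return lfs_setattr(lfs, path, APP_ATTR_CONTIG,
                       &first_block, sizeof(first_block));
}

// Returns true and fills *first_block if the file was marked contiguous.
static bool app_is_contiguous(lfs_t *lfs, const char *path,
                              uint32_t *first_block) {
    lfs_ssize_t res = lfs_getattr(lfs, path, APP_ATTR_CONTIG,
                                  first_block, sizeof(*first_block));
    return res == (lfs_ssize_t)sizeof(*first_block);
}
```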
A flag would be ideal then. In our case the flat files are generally not read by the microcontroller; they're used directly by the graphics chip. And preferably bootup times of the application should be fast. :) As long as we can create files that effectively just reserve a flat address space that can be accessed directly outside of the filesystem library, it suits our requirements.
Effectively being able to mmap a file is a great benefit, since it would allow saving a ton of precious RAM (a lot of code copies a file from flash into RAM just to parse the data in place; mmap would skip this). Is there any progress on this?
Mirroring @X-Ryl669, I'd find it very useful in one of my projects (adding LittleFS to the Game & Watch homebrew scene). We memory-map ROM files, allowing us to run games that we couldn't ever fit in RAM. Currently ROMs are just compiled into the firmware, allowing this memory mapping. But it would be so much nicer if these were part of the filesystem (and still be memory mapped).
@BrianPugh You can have a look at FrogFS, which supports memory-mapping files.
Thanks for the reference! I'd say FrogFS almost fits my use cases, but has a few drawbacks:
These are not absolute dealbreakers, but they definitely don't make it seem like the obvious choice. A nice part of using FrogFS is that it separates the ROM data from the compilation process, allowing us to make pre-compiled firmware releases (and thus simplifying the user experience).
This is not an issue page, but it might be interesting to note:
With the recent addition of WAMR for ESP32, you can now run XIP WASM binaries from flash, which you could update dynamically without having to reboot, a bit like a true OS. Without mmap, the code needs to be loaded into RAM, and this is usually a lot slower (since you'll read from flash into SPI RAM and then back from SPI RAM to execute it).
If you want bad ideas, you could set the block_size to 64KiB. Then 64KiB files would always be contiguous.

Sorry, I've been holding off on updating this issue until things are more actionable. This is currently blocked by me behind B-tree support, as with B-trees we should be able to provide contiguous files without an additional file structure. And the fewer file structures we have floating around, the less code we need for backwards compatibility. This work is mostly complete, but mostly complete is still a bit far from ready to stabilize.

Contiguous files are also messing with plans for future features in complicated and messy ways. I think the general issue is that the filesystem wants control of data location for filesystem purposes, and contiguous files limit that:
All of these special cases invalidate earlier comments of mine. Contiguous files are going to need a flag, and must be recognized by littlefs to at least avoid corrupting their contiguousness.

For a while I was considering tying contiguous files to the future-planned optional on-disk block-map. So instead of allowing filesystem-maintained contiguous files, allow users to reserve contiguous blocks that are tracked as reserved in the block-map. Users could then store info about the allocated blocks in a file if they wanted the file to appear in the filesystem tree. This would add more work for users, but it would make it clear that contiguous files don't benefit from other filesystem features.

But then I realized this creates an unavoidable race condition. There is no way for users to avoid leaking storage if power is lost between the block allocation and the writing of the tracking file to disk. So it seems contiguous files need to be implemented in the filesystem in order for the allocation + tracking to be atomic.

So the best plan to me still seems to be:
But this all depends on B-tree support, so I don't think it's actionable yet. There are some other questions, like whether we should even support write on contiguous files vs letting users call bd_erase/bd_prog directly when they have the physical address, but figuring out the details can probably wait until after B-trees...

And it's not really that this feature has low priority, I realize this is useful for users, just that other features have higher priority. Sorry about that.
Actually, the on-disk information you need is more like "pinned": that a file's data blocks should not be moved. That the file is contiguous is only needed at allocation time.
I think it is worth mentioning that if you only have one binary, you can always partition your disk to store that binary outside of the filesystem while still storing its metadata in the filesystem. The partitioning could be an actual partition table or hardcoded. You could extend this to a sort of storage pool for contiguous files, though this path could end up hitting rough edges.
For our use case, writing/updating contiguous files is not necessary. We would be happy with a purely read-only filesystem (after it's created), although the ability to append/add new files to this "read-only" filesystem would be nice.
I'm curious: if this is a read-only image, could you not allocate blocks for the contiguous files you need from the end before creating the filesystem, and store "fake" files in the filesystem that describe where they live on disk? This is pretty much equivalent to what you would get with contiguous files, since contiguous files also can't benefit from inlining. (Maybe contiguous files could be inlined, but that would force their metadata block to become sort of frozen, which would get really messy with littlefs's design.) This would also allow your contiguous files to share blocks, which unfortunately I don't think will be possible with contiguous files in littlefs.

From what I understand, the benefit of supporting contiguous files inside the filesystem is specifically to allow rewriting and dynamic allocation from littlefs's pool of blocks.
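A rough sketch of that "fake file" descriptor idea, using only the standard littlefs file API; `struct app_extern_desc` and the helper are hypothetical application code, and the flash layout is chosen at image-creation time:

```c
#include <stdint.h>
#include "lfs.h"

// Hypothetical descriptor stored as a regular littlefs file; the real,
// contiguous data lives on raw flash outside the littlefs region.
struct app_extern_desc {
    uint32_t flash_offset;  // absolute flash offset of the contiguous data
    uint32_t length;        // length of the data in bytes
};

// Write a "fake" file whose contents merely describe where the contiguous
// data was placed when the image was created.
static int app_write_extern_desc(lfs_t *lfs, const char *path,
                                 uint32_t flash_offset, uint32_t length) {
    struct app_extern_desc desc = {flash_offset, length};
    lfs_file_t file;
    int err = lfs_file_open(lfs, &file, path,
                            LFS_O_WRONLY | LFS_O_CREAT | LFS_O_TRUNC);
    if (err) {
        return err;
    }
    lfs_ssize_t res = lfs_file_write(lfs, &file, &desc, sizeof(desc));
    int err2 = lfs_file_close(lfs, &file);
    return res < 0 ? (int)res : err2;
}
```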
I've actually proposed a similar thing for my project, but was waiting a little bit to see how this PR would evolve. I'd be willing to volunteer a small C/python project that does this. There are a few complications, but nothing unsolvable. The primary issue with this is that we've essentially partitioned the storage, which could be inefficient. E.g. how do I balance how much of my flash I should allocate to LittleFS vs this secondary contiguous filesystem? This is especially imbalanced for my use-case (game ROMs in contiguous storage, compressed savestates and other smaller files in LittleFS). It would be much nicer if they could all share the same partition.
Thinking about this, there are numerous intermediate steps that could be taken that would closely fit the need:
Fortunately this goes away when your image is read-only. At creation time you know exactly how much storage you need for the contiguous files, so that's how much you allocate, and you can give LittleFS the rest. But I do see where in-filesystem support is useful for dynamic contiguous files. Unfortunately navigating how to fit it into the filesystem gets messy.

@X-Ryl669 I need to think on your comment. These are all interesting/valuable comments in this thread.
For my exact use-case (and I don't think this is super rare), my read-only assets are... "mostly" read-only. Basically, under all normal operating conditions, the data is mmap'd and read-only. However, the assets may infrequently be updated over a slow data-transfer medium. In other projects, this could be updating some graphics as part of an OS update or something. I certainly understand that this introduces complexities and may not be feasible. If so, I'll pursue the "put contiguous data in another partition and have a littlefs file describe it" idea.
Out of curiosity what µc are you using? Some devices I've seen have only ~1, 2, 3 XIP regions, which places a significant constraint on how you store things.
You're right this is easy on the B-Tree side, but it creates problems on the alloc side. If you alloc different sizes you risk fragmentation, where a small block sits in a large block and prevents it from being used in larger block allocations. You can mitigate this with defragmentation (lfs_fs_defrag?), which moves small pieces of data together, but this adds more janitorial work to the system.
An annoying issue with fragmentation is that it can force larger allocations to fail, even if there is in theory enough storage available. I guess you could make allocation failure trigger defragmentation, though this would quite severely increase the potential runtime spike of write operations.
But how do you create the contiguous file? Something somewhere needs to write? I suppose this is mostly an API question and not too important.
I was reaching the same conclusion: we really only need a "pin" bit on files to indicate that their data shouldn't be moved. This would allow for both contiguous files and more flexible MMU/virtual-mapped files. Though I don't know if mixed pinned/unpinned B-tree nodes are really useful.
Ah, this would be nice. But I think you need this bit to persist on flash so that contiguous files don't get broken up after a remount. Though this may not be a problem for systems with granular virtual mapping.
I think this is very common. I guess whether or not to completely rewrite the filesystem in this situation is a bit of a user decision. It would be nice to not need to reformat the filesystem, but with contiguous files this creates increasing fragmentation. Maybe lfs_fs_defrag is an unavoidable conclusion...
To be honest I think this feature is going to be a minimum of ~half a year out in terms of timescales.
Yeah, same for our use case. Hence, this PR is designed to simply allow reserving a large contiguous block in one shot and giving direct access to the storage address. I expect most updates are infrequent and will not change the size of assets either. Fragmentation may well happen if there are too many assets, but it's not a killer issue. We can still fall back to regular fragmented files and load them into memory as regular assets instead of accessing them directly from flash if that ever occurs.
I'm not sure fragmentation is induced here. In order to simplify the work, maybe "just" having the ability to create a littlefs partition with a super block size (that is, logical blocks that are a fixed concatenation of physical blocks and can't be split) would work. In that case, no special modifications are required, only the ability to pin & unpin a block (to prevent erasing or moving it) and a "give the physical addresses of the list of logical blocks for this file" function.

If that were possible, I think we could just have 2 partitions: one with logical block size = physical block size for regular files, and the other with large logical blocks. For large assets, the second partition would be used and mmap'ed into a linear address space via the MMU. Even if there isn't an MMU on the µc, I think it's completely possible to change the code using the asset to work in large pages via a "virtual" address array mapping. I'm using an ESP32 mainly, but also ARM Cortex-A chips that all support an MMU (albeit the former is rather dumb).
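To make the two-partition idea concrete, here is a sketch of two littlefs configurations over the same flash, one using the physical erase-block size and one using a 64 KiB logical block size. The geometry numbers and the `bd_*` callbacks are illustrative assumptions only; each partition's callbacks would need to offset into its own region of the device (e.g. via the `.context` field):

```c
#include "lfs.h"

// Hypothetical block-device callbacks provided by the application.
extern int bd_read(const struct lfs_config *c, lfs_block_t block,
                   lfs_off_t off, void *buffer, lfs_size_t size);
extern int bd_prog(const struct lfs_config *c, lfs_block_t block,
                   lfs_off_t off, const void *buffer, lfs_size_t size);
extern int bd_erase(const struct lfs_config *c, lfs_block_t block);
extern int bd_sync(const struct lfs_config *c);

// Partition A: regular files, logical block == 4 KiB physical erase block.
static const struct lfs_config cfg_small = {
    .read = bd_read, .prog = bd_prog, .erase = bd_erase, .sync = bd_sync,
    .read_size = 16, .prog_size = 16,
    .block_size = 4096, .block_count = 1024,
    .cache_size = 256, .lookahead_size = 32, .block_cycles = 500,
};

// Partition B: large assets, logical block = 16 physical blocks (64 KiB),
// so each asset occupies a small number of large, aligned chunks.
static const struct lfs_config cfg_large = {
    .read = bd_read, .prog = bd_prog, .erase = bd_erase, .sync = bd_sync,
    .read_size = 16, .prog_size = 16,
    .block_size = 64 * 1024, .block_count = 192,
    .cache_size = 256, .lookahead_size = 32, .block_cycles = 500,
};
```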
I don't think we need to write a contiguous file (it would be better if we could, but as you said, there are numerous obstacles like interleaved error-correction & checksum blocks and so on). Let's say we don't write a contiguous file, but a randomly stored file whose blocks are next to each other (with some granularity), so it reduces the load on the asset loader (no need to load into memory first).

I think there's a trade-off between the chunk size and the performance. If the chunk size is too small, then it's easier to load the asset into RAM to concatenate it (provided we have the RAM for this). If the chunk size is large enough, then the overhead of using a mapping to access it is lower than copying it. Some µc have a hardware MMU, so the overhead is only at file open time instead of file reading time. If there's no MMU but a possible mmap access, then the code reading the asset will need to be "wrapped" in a pseudo memory-mapping interaction. In that case, it won't be efficient to map each 512-byte block, since many sub-assets can't fit in such a small place, so this means having to copy them again to a contiguous memory area. On the opposite end, a 64kB block (or even larger) can avoid this copying.
The "pin" bit isn't there to tell if a file is contiguous or not. I was more thinking about maintaining a "pinned" open's file table in RAM. When a file is opened for being mmap'd, the table would store all the blocks address used for this file (or a bitmap for all block, I don't know what is the most efficient). Then when littlefs needs to work on any blocks, it would search this table and fail a block that is required for this transaction is used in the table (so that the memory mapping is kept working). So if someone needs to move/update/replace the file, he must first unmap it, move/update/replace (this hits the flash) and re-map it. |
This is an interesting point, and clever. But I don't think it works in the XIP use case, since you generally need the binary to be contiguous for the system to function. And it's not something you can just throw storage at to solve. You would need enough storage for every file in the filesystem to take up the same size as the contiguous file before you can guarantee that the contiguous allocation won't fail.
A purely in-RAM flag is a nice idea, and would minimize on-disk changes. But for contiguous files I think you still need to persist this info on-disk. It's not an issue for the MMU use-case, but for contiguous files I assume there would be some application logic that expects certain files to be contiguous (boot bios -> mount fs -> xip "/image.bin"). If some work is done before the contiguous file is opened and pinned, say unrelated writes that relocate blocks, the file may no longer be contiguous by the time it is needed.

The nice thing about the in-RAM pin is it can always be added as an optional feature without on-disk changes, but it only solves one of the use-cases presented here.
The ESP32 is a common target for littlefs. And they even get "dumber", heck, support for 8-bit MCUs is on the roadmap at some point : )
If I'm understanding this proposal correctly, can it mostly already be done? With the exception of pinning and getting the underlying block address?
The downside is you need to choose what size to make the pools, but AFAICT this is unavoidable without introducing fragmentation in some way.
I wonder if the answer for XIP is to include optional on-demand defrag during contiguous allocations. This would get expensive, but it would let contiguous allocations succeed as long as enough total storage is free.

Humorously, if contiguous files don't participate in error-correction or wear-leveling, I think contiguous allocations end up functionally isomorphic to a hypothetical "reserve raw blocks outside the filesystem" scheme like the ones discussed above.
I think you're correct. I'm a bit lost in this conversation, but here's how I imagine it would go:
Yes. And you're completely right about contiguous blocks and fragmentation issues. If I understand correctly, there are numerous steps that can be made toward the goal of a contiguous area in flash:

Step 1: (With physical random order for contiguous files)
Deal with it at the app level, only possible for µc with an MMU. Create 2 littlefs partitions, as described above.
Pros: If I understand correctly, it might also be the fastest option, since if the "logical" block size is the size of an erase block (or a multiple thereof), you can skip erasing each page sequentially.
Cons: Requires the overhead of 2 filesystems. Won't work on non-MMU systems (or requires some change in the app code to deal with a manual virtual mapping).

Step 2: (With physical random order for contiguous files)
Move this logic inside a single littlefs partition. littlefs deals with 2 areas in the partition, one that's made for small blocks and another that deals with large blocks only. There is no possible data exchange between those areas, so no fragmentation can occur.
Pros:
Cons:

Step 3: (Path toward contiguous files on the flash)
Start implementing a defragmentation process. I think having 2 pools here makes defragmenting way easier. In effect, the "small" pool doesn't actually require defragmentation, but any free space in it can be used to move the blocks from the "large" pool to reorder and sort them. If done correctly, the writing/updating process would be:
Pros:
Cons:
It's nowhere near a working state, but I began writing some python code implementing this idea. https://github.com/BrianPugh/gnwmanager/blob/rom-fs/gnwmanager/romfs.py This ends up being similar to FrogFS, but with:
The EVE graphics controller can render ASTC textures straight from SPI flash. We plan to integrate LittleFS to enable easy asset replacement. One requirement is that the ASTC texture is located in one contiguous area. The current file allocation strategy does not permit this due to the block header in files larger than one block.
To achieve our requirements, this PR adds an additional file record type `LFS_TYPE_FLATSTRUCT`, flagged by `LFS_F_FLAT`. This record simply stores the first block number and the file size. During block traversal, all the blocks starting from the first block until the last are simply visited (without accessing storage). This allows us to effectively reserve flat address spaces in flash within the filesystem.

A function `lfs_file_reserve` is added to turn a file into a reserved flat address space. Calling this function will discard the file and attempt to allocate the requested size as a contiguous block. Specifying size 0 will discard the reservation and make the file usable as a regular file again. As we are accessing the reserved storage areas directly from the GPU, no implementation of read or write is provided; the file is treated as an opaque storage area. Call `lfs_file_reserved` to get the first block, or to check whether a file is a reserved flat area.

To allocate the contiguous blocks, two strategies are used depending on the size of the file.
The first allocation strategy is to simply allocate blocks, and reset the allocation starting point whenever an expected block is skipped over, until the expected number of blocks has been allocated sequentially. This strategy is used when the file size is smaller than the lookahead cache, and is aborted when more blocks than are contained in the cache have been attempted.
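Illustrative pseudocode for this first strategy (not the PR's actual implementation; `alloc_next_block()` is a hypothetical stand-in for the internal block allocator, and `lookahead_blocks` is the number of blocks covered by the lookahead cache):

```c
static int reserve_sequential(lfs_t *lfs, lfs_size_t nblocks,
                              lfs_size_t lookahead_blocks,
                              lfs_block_t *first) {
    lfs_block_t expected = 0;
    lfs_size_t run = 0;

    for (lfs_size_t attempts = 0; attempts < lookahead_blocks; attempts++) {
        lfs_block_t b;
        int err = alloc_next_block(lfs, &b);  // hypothetical allocator call
        if (err) {
            return err;
        }

        if (run == 0 || b == expected) {
            // start or extend the contiguous run
            if (run == 0) {
                *first = b;
            }
            expected = b + 1;
            if (++run == nblocks) {
                return 0;  // enough sequential blocks allocated
            }
        } else {
            // an expected block was skipped; restart the run here
            *first = b;
            expected = b + 1;
            run = 1;
        }
    }

    // attempted more blocks than the lookahead cache contains; give up
    return LFS_ERR_NOSPC;
}
```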
The second allocation strategy allocates one block in the normal fashion, then traverses the filesystem and, any time a block is found that collides with the expected allocation space, advances the allocation starting point past the colliding block. The traversal is repeated until a complete traversal is done with no collisions, and aborted if the attempted starting point has looped around the address space.
When calling the reserve function on an existing, already reserved file, a new allocation is made. As the updated file record is only committed to storage upon closing the file, and only when no error is flagged, this mechanism seems suitable for atomically upgrading large assets on flash.
Usage
Writing a large file
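Roughly, writing could look like the sketch below. The exact signatures of `lfs_file_reserve`/`lfs_file_reserved` are defined by this PR and may differ from what is assumed here; `bd_prog_raw()` and the path name are hypothetical, standing in for programming the flash directly outside of littlefs:

```c
#include <stdint.h>
#include "lfs.h"

// Hypothetical: programs raw flash at an absolute address, outside littlefs.
extern int bd_prog_raw(uint32_t addr, const void *data, lfs_size_t size);

static int write_asset(lfs_t *lfs, const void *data, lfs_size_t size) {
    lfs_file_t file;
    int err = lfs_file_open(lfs, &file, "texture.astc",
                            LFS_O_WRONLY | LFS_O_CREAT);
    if (err) {
        return err;
    }

    // discard any previous contents and reserve a contiguous region
    err = lfs_file_reserve(lfs, &file, size, 0);       // assumed signature
    if (!err) {
        // look up the first block of the reservation; littlefs treats the
        // region as an opaque storage area, so we program it directly
        lfs_block_t block;
        err = lfs_file_reserved(lfs, &file, &block);   // assumed signature
        if (!err) {
            err = bd_prog_raw(block * lfs->cfg->block_size, data, size);
        }
        // on a write failure, LFS_R_ERRED (see Flags below) is meant to
        // mark the reservation so it is not committed on close
    }

    // the updated record is only committed if the file closes without error
    int err2 = lfs_file_close(lfs, &file);
    return err ? err : err2;
}
```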
Reading a large file
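Reading follows the same pattern, under the same assumed signatures: the reserved region is accessed directly rather than through `lfs_file_read()`. `flash_base_addr` and `gpu_use_texture()` are hypothetical, standing in for however the application or graphics chip consumes the flat region:

```c
#include <stdint.h>
#include "lfs.h"

// Hypothetical: hands a flash address and length to the graphics chip.
extern void gpu_use_texture(uint32_t addr, lfs_soff_t size);

static int use_asset(lfs_t *lfs, uint32_t flash_base_addr) {
    lfs_file_t file;
    int err = lfs_file_open(lfs, &file, "texture.astc", LFS_O_RDONLY);
    if (err) {
        return err;
    }

    lfs_block_t block;
    err = lfs_file_reserved(lfs, &file, &block);  // assumed signature
    if (!err) {
        // absolute address of the contiguous data in flash
        uint32_t addr = flash_base_addr + block * lfs->cfg->block_size;
        gpu_use_texture(addr, lfs_file_size(lfs, &file));
    }

    return lfs_file_close(lfs, &file);
}
```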
Flags
- `LFS_R_ERRED`: Flag a writing error so the reservation will not be committed on close
- `LFS_R_GOBBLE`: Force to only use the lookahead allocation strategy, for testing purposes
- `LFS_R_FRONT`: Use only the traversal allocation strategy, start allocating at the front, for tooling flash images
- `LFS_R_TRUNCATE`: Attempt to use truncate to shrink or extend the existing reservation, otherwise allocate normally
- `LFS_R_OVERWRITE`: Allow overwriting the previous reservation, useful with `LFS_R_FRONT` and `LFS_R_COPY` to repack
- `LFS_R_COPY`: Copy existing blocks from the previous area to the new one, no padding