Investigate using zstd with erofs now #31

@Sonicadvance1

Turns out that zstd with erofs, while still taking forever to compress, can get roughly as good or better compression than squashfs. This means we can use zstd and still get the better performance of erofs compared to squashfs.

Some stats for comparison: the previously compressed erofs and squashfs images from our CDN (Ubuntu_24_04.ero and Ubuntu_24_04.sqsh) versus newly compressed ones (the new_* files).

-rw-r--r-- 1 ryanh ryanh 1.2G Aug  8 15:20 new_lz4hc_128KB_blocks_Ubuntu_24_04.ero
-rw-r--r-- 1 ryanh ryanh 1.4G Aug  8 14:21 new_lz4hc_Ubuntu_24_04.ero
-rw-r--r-- 1 ryanh ryanh 848M Aug  8 15:14 new_zstd_128K_blocks_Ubuntu_24_04.ero
-rw-r--r-- 1 ryanh ryanh 996M Aug  8 15:22 new_zstd_default_blocks_Ubuntu_24_04.ero
-rw-r--r-- 1 ryanh ryanh 827M Aug  8 15:03 new_zstd_metadata_128K_blocks_Ubuntu_24_04.ero
-rw-r--r-- 1 ryanh ryanh 789M Aug  8 14:52 new_zstd_metadata_1MB_blocks_Ubuntu_24_04.ero
-rw-rw-r-- 1 ryanh ryanh 1.4G Aug  8 14:19 Ubuntu_24_04.ero
-rw-rw-r-- 1 ryanh ryanh 843M Aug  8 14:19 Ubuntu_24_04.sqsh

As seen from the file sizes, if we compress with a 128KB block size then erofs gets very close to squashfs (848M vs. 843M). If we add metadata compression on top of that, we beat the squashfs compression (827M).
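To make the comparison easier to read, here is a minimal sketch that expresses each image as a fraction of the squashfs size. The sizes are copied from the listing above; since `ls -h` rounds them, the ratios are only approximate. The helper names `to_mib` and `relative_to` are made up for this example.

```python
# Sizes copied from the `ls -lh` listing in this issue (rounded by ls).
SIZES = {
    "new_lz4hc_128KB_blocks_Ubuntu_24_04.ero": "1.2G",
    "new_lz4hc_Ubuntu_24_04.ero": "1.4G",
    "new_zstd_128K_blocks_Ubuntu_24_04.ero": "848M",
    "new_zstd_default_blocks_Ubuntu_24_04.ero": "996M",
    "new_zstd_metadata_128K_blocks_Ubuntu_24_04.ero": "827M",
    "new_zstd_metadata_1MB_blocks_Ubuntu_24_04.ero": "789M",
    "Ubuntu_24_04.ero": "1.4G",
    "Ubuntu_24_04.sqsh": "843M",
}

# `ls -h` uses powers of 1024, so 1G is 1024M.
UNIT_MIB = {"M": 1, "G": 1024}


def to_mib(text: str) -> float:
    """Convert a human-readable size such as '848M' or '1.2G' to MiB."""
    return float(text[:-1]) * UNIT_MIB[text[-1]]


def relative_to(baseline: str, sizes: dict) -> dict:
    """Return each image's size divided by the baseline image's size."""
    base = to_mib(sizes[baseline])
    return {name: to_mib(size) / base for name, size in sizes.items()}


if __name__ == "__main__":
    ratios = relative_to("Ubuntu_24_04.sqsh", SIZES)
    # Smallest image first; values below 1.00 are smaller than squashfs.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{ratio:5.2f}x  {name}")
```

Anything below 1.00x is smaller than the squashfs image, which in this listing is only the two metadata-compressed zstd images.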

We can even go a step beyond and make the block size significantly larger. At 1MB we get even better compression (789M), but we are throwing latency away. The primary reason /why/ erofs loses to squashfs in zstd compression is the default block size: squashfs defaults to 128KB, while erofs defaults to, I /believe/, 4096 bytes.
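A rough way to picture the latency cost of bigger blocks: if a small read has to decompress the whole block it lands in, the work per read grows linearly with the block size. The sketch below computes that worst-case ratio for a single 4096-byte read. This is a simplification and an assumption on my part (it ignores caching, readahead, and any partial decompression the filesystem may do), not a measurement.

```python
PAGE_SIZE = 4096  # assumed size of one small random read, in bytes


def worst_case_amplification(block_size: int, read_size: int = PAGE_SIZE) -> float:
    """Bytes decompressed per byte requested, assuming the whole
    compressed block must be decompressed to serve the read."""
    return block_size / read_size


if __name__ == "__main__":
    for label, size in [("4KB", 4096), ("128KB", 128 * 1024), ("1MB", 1024 * 1024)]:
        print(f"{label:>6} blocks: {worst_case_amplification(size):.0f}x")
```

Under that assumption, 128KB blocks cost up to 32x the decompression work of 4KB blocks for an isolated page read, and 1MB blocks up to 256x.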

For contrast, the listing also shows how little 128KB blocks help with lz4hc (1.4G down to 1.2G), and how zstd at the default block size makes up quite a bit of the difference to squashfs (996M vs. 843M) but doesn't quite get there.

Additionally, while metadata compression is cool, it can't really be used yet since it only got merged with kernel 6.16 (a week ago as of writing!).

Some more investigation is necessary, such as how much this affects performance, but it looks like we'll be able to enable zstd compression at a 128KB block size going forward, and maybe in a couple of years we can enable metadata compression.
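For reference, a hedged sketch of what the image build invocation might look like. It assumes the "block size" discussed here corresponds to the `-C` option of `mkfs.erofs` (maximum compressed physical cluster size in bytes) and that the installed erofs-utils was built with zstd support. The function name and paths are hypothetical, and the sketch only builds the argument list without running anything.

```python
def mkfs_erofs_argv(image: str, source_dir: str,
                    algorithm: str = "zstd",
                    cluster_bytes: int = 128 * 1024) -> list:
    """Build an argument list for mkfs.erofs (hypothetical helper).

    Assumption: -z selects the compression algorithm and -C sets the
    maximum physical cluster size, which is what this issue calls the
    "block size". Metadata compression is deliberately left out.
    """
    return [
        "mkfs.erofs",
        f"-z{algorithm}",
        f"-C{cluster_bytes}",
        image,
        source_dir,
    ]


if __name__ == "__main__":
    # Hypothetical paths; pass the list to subprocess.run() to actually build.
    print(" ".join(mkfs_erofs_argv("Ubuntu_24_04.ero", "rootfs/")))
```

Keeping the algorithm and cluster size as parameters makes it easy to rebuild the other variants from the listing (for example lz4hc, or a 1MB cluster size) for performance testing.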
