-
Notifications
You must be signed in to change notification settings - Fork 956
Fix: Recursively calculate directory size in size_of_dir
#8444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
|
Please rebase your change on |
michaelsproul
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little hesitant to merge this change, because:
- We don't need it. The structure of the beacon chain database directory is completely flat (files inside a single directory).
- I was concerned that this change could open the door to stack overflows or infinite loops.
Concern (2) is more important. Looking at the docs for DirEntry::metadata and Metadata::is_dir we see that it would not return true for a symlinked directory. This prevents the infinite loop attack.
The risk of a deeply nested directory creating a DoS via stack overflow (or very long run time) still exists, although it would require the datadir to be writable by an attacker, which is likely only possible if permissions are incorrectly set, or there is a remote code execution vulnerability (which creates more serious problems beyond a simple DoS).
I'm leaning towards saying that this slightly increased risk is acceptable, given that it can be mitigated by setting file permissions correctly. This is despite the positive impact being low (no bug in LH's current behaviour). I think fixing the behaviour for the general case is still valuable, as it removes a footgun and saves us from potentially introducing a bug in the case where the assumption about a flat data directory is no longer true.
|
@maradini77 Do you have any thoughts on the above? |
|
@michaelsproul Thanks for the careful review. Agreed: symlinks won’t cause an infinite loop because |
|
@maradini77 Yeah let's make it iterative and capped, sounds good! |
@michaelsproul updated |
|
Some required checks have failed. Could you please take a look @maradini77? 🙏 |
|
@michaelsproul fmt fuxed |
|
Some required checks have failed. Could you please take a look @maradini77? 🙏 |
macladson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed an inconsistency with the comments. Also, I think some tests would be useful to here to make sure that the behaviour we want is being enforced (particularly with regards to symlinks).
tempfile is a good crate to use for testing this and is already used elsewhere so it's already in our workspace dependencies. You can also create symlinks with the symlink function: https://doc.rust-lang.org/std/os/unix/fs/fn.symlink.html
Co-authored-by: Mac L <[email protected]>
Co-authored-by: Mac L <[email protected]>
|
@maradini77 are you able to write some tests for this function? Something like this: #[cfg(test)]
mod tests {
use super::*;
use tempfile::TempDir;
#[test]
fn size_of_dir_empty() {
let temp_dir = TempDir::new().unwrap();
let size = size_of_dir(temp_dir.path());
assert_eq!(size, 0);
}
// ... etc
} |
Fixed
size_of_dir_entryto recursively calculate the size of directories. Previously, the function only returned the metadata size of directory entries (typically a few KB), completely ignoring files within subdirectories. This causedsize_of_dirto return incorrect sizes for any directory tree with nested folders.The fix checks if an entry is a directory and recursively calls
size_of_dirto sum up all file sizes within the directory tree.