Prototype a new `Buffer` trait. #1290

sunfishcode · 2025-01-24T18:01:20Z

I'm experimenting with a `Buffer` trait similar to #908, but I've run into a few problems. See the questions in examples/new_read.rs for details.

@notgull @SUPERCILEX

SUPERCILEX

If we go with this approach, I feel like there's a pretty good chance compiler folks will improve the error messages for us if we file bug reports. The story around passing in a mutable array but needing a slice instead has annoyed me for quite some time (I usually just as_mut_slice everything). It was pretty confusing to figure out initially, so I'm sure better error messages would be welcome.
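To make the array-versus-slice friction concrete, here is a minimal sketch. `Fill`, `fill_bytes`, and `fill_all` are hypothetical stand-ins, not rustix APIs. It assumes the trait is implemented only for `&mut [u8]`, so a generic call site does not get the usual unsizing coercion from `&mut [u8; N]`.

```rust
/// Hypothetical stand-in for a buffer trait implemented only for slices.
trait Fill {
    fn fill_bytes(self, byte: u8) -> usize;
}

impl Fill for &mut [u8] {
    fn fill_bytes(self, byte: u8) -> usize {
        for b in self.iter_mut() {
            *b = byte;
        }
        self.len()
    }
}

/// Generic over the buffer type, so no array-to-slice coercion is applied
/// to the argument.
fn fill_all<B: Fill>(buf: B, byte: u8) -> usize {
    buf.fill_bytes(byte)
}

fn demo() -> ([u8; 3], usize) {
    let mut buf = [0u8; 3];
    // fill_all(&mut buf, 7); // rejected: `Fill` is not implemented for `&mut [u8; 3]`
    // Either explicit conversion to a slice works:
    let n = fill_all(buf.as_mut_slice(), 7);
    let m = fill_all(&mut buf[..], 7);
    assert_eq!(n, m);
    (buf, n)
}
```

With a non-generic parameter of type `&mut [u8]` the commented-out call would compile, which is part of why the error is surprising.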

SUPERCILEX · 2025-01-25T20:35:48Z

examples/new_read.rs

+    struct Wrapper<'a>(&'a mut [u8]);
+    impl<'a> Wrapper<'a> {
+        fn read(&mut self) {
+            let _x: usize = read(stdin(), self.0).unwrap();


I believe this is due to rust-lang/rust#35919. You need to reborrow manually: &mut *self.0.
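A minimal sketch of the manual reborrow, using hypothetical names (`ByteBuf`, `zero_all`) in place of the prototype's `Buffer` and `read`. It assumes the function is generic over a trait implemented for `&mut [u8]`, in which case the compiler moves the reference instead of reborrowing it.

```rust
/// Hypothetical trait implemented for mutable byte slices taken by value.
trait ByteBuf {
    fn zero(self) -> usize;
}

impl ByteBuf for &mut [u8] {
    fn zero(self) -> usize {
        for b in self.iter_mut() {
            *b = 0;
        }
        self.len()
    }
}

/// Generic over the buffer type, like the prototype `read`.
fn zero_all<B: ByteBuf>(buf: B) -> usize {
    buf.zero()
}

struct Wrapper<'a>(&'a mut [u8]);

impl<'a> Wrapper<'a> {
    fn zero(&mut self) -> usize {
        // zero_all(self.0) // rejected: cannot move `self.0` out from behind `&mut self`
        // An explicit reborrow creates a shorter-lived `&mut [u8]` instead.
        zero_all(&mut *self.0)
    }
}
```

Because the reborrow is temporary, `Wrapper::zero` can be called repeatedly on the same wrapper.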

One option is to implement the `Buffer` trait for owned types so that the `read` method borrows the buffer. Unfortunately, I believe this requires Rust 1.65 (it uses generic associated types). Also, it seems kinda ugly, so maybe not worth it.

    use core::mem::MaybeUninit;
    use core::slice;

    pub fn read<Fd: AsFd, Buf: Buffer<u8> + ?Sized>(
        fd: Fd,
        buf: &mut Buf,
    ) -> io::Result<Buf::Result<'_>> {
        let len = backend::io::syscalls::read(fd.as_fd(), buf.as_maybe_uninitialized())?;
        // SAFETY: `read` works.
        unsafe { Ok(buf.finish(len)) }
    }

    pub trait Buffer<T> {
        /// The result of the process operation.
        type Result<'a>
        where
            T: 'a,
            Self: 'a;

        /// Convert this buffer into a maybe-uninitialized view.
        fn as_maybe_uninitialized(&mut self) -> &mut [MaybeUninit<T>];

        /// Convert a finished buffer pointer into its result.
        ///
        /// # Safety
        ///
        /// At least `len` elements of the buffer must now be initialized.
        unsafe fn finish(&mut self, len: usize) -> Self::Result<'_>;
    }

    /// Implements [`Buffer`] around a slice of bytes.
    ///
    /// `Result` is a `usize` indicating how many bytes were written.
    impl<T> Buffer<T> for [T] {
        type Result<'a> = usize where T: 'a;

        #[inline]
        fn as_maybe_uninitialized(&mut self) -> &mut [MaybeUninit<T>] {
            // SAFETY: This just casts away the knowledge that the elements are
            // initialized.
            unsafe { core::mem::transmute::<&mut [T], &mut [MaybeUninit<T>]>(self) }
        }

        #[inline]
        unsafe fn finish(&mut self, len: usize) -> usize {
            len
        }
    }

    /// Implements [`Buffer`] around a slice of uninitialized bytes.
    ///
    /// `Result` is a pair of slices giving the initialized and uninitialized
    /// subslices after the new data is written.
    impl<T> Buffer<T> for [MaybeUninit<T>] {
        type Result<'a> = (&'a mut [T], &'a mut [MaybeUninit<T>]) where T: 'a;

        #[inline]
        fn as_maybe_uninitialized(&mut self) -> &mut [MaybeUninit<T>] {
            self
        }

        #[inline]
        unsafe fn finish(&mut self, len: usize) -> Self::Result<'_> {
            let (init, uninit) = self.split_at_mut(len);
            // SAFETY: The user asserts that the slice is now initialized.
            let init = slice::from_raw_parts_mut(init.as_mut_ptr().cast::<T>(), init.len());
            (init, uninit)
        }
    }

    /// Implements [`Buffer`] around the `Vec` type.
    ///
    /// This implementation fills the buffer, overwriting any previous data, with
    /// the new data and sets the length.
    #[cfg(feature = "alloc")]
    impl<T> Buffer<T> for Vec<T> {
        type Result<'a> = usize where T: 'a;

        #[inline]
        fn as_maybe_uninitialized(&mut self) -> &mut [MaybeUninit<T>] {
            self.clear();
            self.spare_capacity_mut()
        }

        #[inline]
        unsafe fn finish(&mut self, len: usize) -> usize {
            self.set_len(len);
            len
        }
    }

SUPERCILEX · 2025-01-25T20:37:07Z

examples/new_read.rs

+
+    // Why does this get two error messages?
+    let mut buf = [0, 0, 0];
+    let _x = read(stdin(), buf).unwrap();


Presumably because they're both valid. There's a hint assuming the type is an array and then the message that lists out all the valid types you can use.

SUPERCILEX · 2025-01-25T20:43:38Z

src/buffer.rs

 use core::mem::MaybeUninit;
 use core::slice;

+/// A memory buffer that may be uninitialized.
+pub trait Buffer<T> {


Not sure how much it's worth discussing the implementation yet, but there are a few concerns:

* I don't think this trait should be public (yet).

* `as_maybe_uninitialized` is unsound in general, but not for our purposes. The conversion from `u8 -> MaybeUninit<u8>` is only valid if you never uninitialize the `u8`. We don't do that since we're always passing this data to a syscall, but for a public trait (and probably for ourselves, just in case), the conversion to `MaybeUninit` needs to be unsafe.

* I don't think we should have an implementation for arrays (`[T; N]`). Maybe there's value I'm missing, but it seems simple enough to require you to pass in a slice instead.

* Same argument for `Vec<T>`: you should be required to pass in `&mut Vec<T>`, and it should use `spare_capacity_mut` (but not `clear`) plus `set_len`. Though maybe there's an argument for having the owned and borrowed versions, with the owned one doing the `clear`.
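The soundness concern can be sketched as follows. The names `as_maybe_uninit` and `write_bytes` are hypothetical, and this is only one way to express the contract: the conversion is an `unsafe fn` because safe code holding a `&mut [MaybeUninit<u8>]` could otherwise store `MaybeUninit::uninit()` into memory that the caller still treats as initialized `u8`s.

```rust
use core::mem::MaybeUninit;

/// View an initialized byte slice as possibly-uninitialized bytes.
///
/// # Safety
///
/// The caller must never store an uninitialized value through the returned
/// slice, because the original `&mut [u8]` is assumed initialized afterwards.
unsafe fn as_maybe_uninit(buf: &mut [u8]) -> &mut [MaybeUninit<u8>] {
    // SAFETY: `MaybeUninit<u8>` has the same layout as `u8`; the caller
    // upholds the "never de-initialize" requirement documented above.
    unsafe { &mut *(buf as *mut [u8] as *mut [MaybeUninit<u8>]) }
}

/// Stand-in for a syscall: it only ever writes initialized bytes.
fn write_bytes(dst: &mut [MaybeUninit<u8>], src: &[u8]) -> usize {
    let n = dst.len().min(src.len());
    for (d, s) in dst.iter_mut().zip(src) {
        d.write(*s);
    }
    n
}

fn demo() -> ([u8; 4], usize) {
    let mut buf = [0u8; 4];
    // SAFETY: `write_bytes` only stores initialized bytes into the view.
    let view = unsafe { as_maybe_uninit(&mut buf) };
    let n = write_bytes(view, b"hi");
    (buf, n)
}
```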

> Not sure how much it's worth discussing the implementation yet, but there are a few concerns:
>
> * I don't think this trait should be public (yet)

I've now made it `Sealed`.

> * `as_maybe_uninitialized` is unsound in general, but not for our purposes. The conversion from `u8 -> MaybeUninit<u8>` is only valid if you never uninitialize the `u8`. We don't do that since we're always passing this data to a syscall, but for a public trait (and probably for ourselves just in case), the conversion to maybeuninit needs to be unsafe.

Good point. I switched it back to a pointer+length now.

> * I don't think we should have an implementation for arrays (`[T; N]`). Maybe there's value I'm missing, but it seems simple enough to require you to pass in a slice instead.

Agreed; I just added that as an example to see what the error looked like.

> * Same argument for `Vec<T>`, you should be required to pass in `&mut Vec<T>` and it should use `spare_capacity_mut` (but not `clear`) plus `set_len`. Though maybe there's an argument for having the owned and borrowed versions with the owned one doing the `clear`.

Yeah, this is subtle because `Vec<T>` has a `DerefMut` to `&mut [T]`, so it'd mean that adding a `*` would switch whether we clear+set_len or not. So it seems to make sense to have the owned version do the clear+set_len and then have `&mut Vec<T>` behave like `&mut [T]`.

And even if there are use cases for `&mut Vec<T>` behaving like `&mut [T]`, I'm still not super comfortable with how subtle this is. Passing a `Vec` does one thing, while a `&mut Vec` does another.

Another option would be to omit the Vec<T> impl for this. But if we do that, then we're no longer encapsulating the set_len part of updating a Vec, and without that, I don't know if this whole Buffer trait is worth it.
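The `DerefMut` subtlety can be illustrated with a small sketch. `Describe` and `which` are hypothetical names; the assumption is a trait implemented separately for `&mut Vec<u8>` and `&mut [u8]`, so that adding a `*` silently selects a different impl.

```rust
/// Hypothetical trait with distinct impls for `&mut Vec<u8>` and `&mut [u8]`.
trait Describe {
    fn describe(self) -> &'static str;
}

impl Describe for &mut Vec<u8> {
    fn describe(self) -> &'static str {
        "vec: clear + set_len"
    }
}

impl Describe for &mut [u8] {
    fn describe(self) -> &'static str {
        "slice: overwrite in place"
    }
}

fn which<B: Describe>(buf: B) -> &'static str {
    buf.describe()
}

fn demo() -> (&'static str, &'static str) {
    let mut v = vec![0u8; 4];
    let by_vec = which(&mut v); // selects the `&mut Vec<u8>` impl
    let by_slice = which(&mut *v); // the extra `*` selects the `&mut [u8]` impl
    (by_vec, by_slice)
}
```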

Need to read your comment in full, but yeah my proposal was to remove the impl for Vec and skip the clear in the &mut Vec.

Actually, let's make it even simpler. Get rid of the owned `Vec` impl. Giving us a `&mut Vec` results in the full clear + set_len sequence. If you don't want the clear, all you have to do is use `spare_capacity_mut`. If we really think people don't want the clear, we could literally just add a wrapper type called `VecNoClear` or something and implement it with only the `set_len`. That's way easier to use than knowing the difference between taking a reference and not.

SUPERCILEX · 2025-01-25T20:45:23Z

src/fs/inotify.rs

-            match read_uninit(self.fd.as_fd(), self.buf).map(|(init, _)| init.len()) {
+            todo!("FIXME: see \"Why doesn't this work?\" in examples/new_read.rs");
+            /*
+            match read(self.fd.as_fd(), self.buf).map(|(init, _)| init.len()) {


&mut *self.buf or the GAT version of the Buffer trait fixes this, as per my other comment.

SUPERCILEX · 2025-01-31T02:23:50Z

src/buffer.rs

-///
-/// `Result` is a `usize` indicating how many bytes were written.
-impl<T, const N: usize> Buffer<T> for &mut [T; N] {
+impl<T, const N: usize> private::Sealed<T> for &mut [T; N] {


Pretty sure we can still implement the buffer trait here so we don't have to specify the impls twice.
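One possible reading of this suggestion, sketched with hypothetical method names: keep `Sealed` as an empty marker supertrait and leave the actual methods on `Buffer`, so each type needs a one-line `Sealed` impl plus a single `Buffer` impl. The `parts_mut` method is an assumption modeled on the pointer+length approach mentioned in the thread, not the PR's actual signature.

```rust
mod private {
    /// Empty marker; it is unreachable outside the crate, so downstream code
    /// cannot add `Buffer` impls.
    pub trait Sealed<T> {}
}

/// Hypothetical sealed buffer trait exposing a pointer and a capacity.
pub trait Buffer<T>: private::Sealed<T> {
    fn parts_mut(&mut self) -> (*mut T, usize);
}

impl<T> private::Sealed<T> for &mut [T] {}
impl<T, const N: usize> private::Sealed<T> for &mut [T; N] {}

impl<T> Buffer<T> for &mut [T] {
    fn parts_mut(&mut self) -> (*mut T, usize) {
        (self.as_mut_ptr(), self.len())
    }
}

impl<T, const N: usize> Buffer<T> for &mut [T; N] {
    fn parts_mut(&mut self) -> (*mut T, usize) {
        (self.as_mut_ptr(), N)
    }
}

/// Reports how many elements the buffer can hold.
pub fn capacity_of<T, B: Buffer<T>>(mut buf: B) -> usize {
    buf.parts_mut().1
}
```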

SUPERCILEX · 2025-01-31T02:33:19Z

src/buffer.rs

 use core::mem::MaybeUninit;
 use core::slice;

+/// A memory buffer that may be uninitialized.
+pub trait Buffer<T> {


Reading your comment properly now.

> I switched it back to a pointer+length now.

Yup, seems like the right move.

> So it seems to make sense to have the owned version to do the clear+set_len and then have &mut Vec behave like &mut [T].

I see, yeah it does seem pretty bad if `&mut Vec<T>` behaves subtly differently from `&mut *Vec<T>`. What if we required all `Vec`s to go through newtypes? How about having a `struct ClearableBuffer<T>(pub T)` and a `struct PreparedBuffer<T>(pub T)` (naming TBD)? Then we implement `Buffer` for `ClearableBuffer<&mut Vec<T>>` and the same for `PreparedBuffer`. Both call `set_len`, but only `ClearableBuffer` calls `clear` at the beginning.
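A rough sketch of how the two newtypes might look. `SpareBuffer`, `spare`, and `fake_read` are hypothetical names standing in for the real `Buffer` trait and a syscall, and the element type is fixed to `u8` for brevity.

```rust
use core::mem::MaybeUninit;

/// Clears the `Vec` first, so new data replaces the old contents.
pub struct ClearableBuffer<T>(pub T);
/// Leaves existing contents alone and appends into the spare capacity.
pub struct PreparedBuffer<T>(pub T);

/// Hypothetical stand-in for the `Buffer` trait.
trait SpareBuffer {
    fn spare(&mut self) -> &mut [MaybeUninit<u8>];
    /// # Safety
    /// The first `len` elements returned by `spare` must be initialized.
    unsafe fn finish(&mut self, len: usize) -> usize;
}

impl SpareBuffer for ClearableBuffer<&mut Vec<u8>> {
    fn spare(&mut self) -> &mut [MaybeUninit<u8>] {
        self.0.clear();
        self.0.spare_capacity_mut()
    }
    unsafe fn finish(&mut self, len: usize) -> usize {
        // SAFETY: the vector was cleared, and the caller initialized `len` elements.
        unsafe { self.0.set_len(len) };
        len
    }
}

impl SpareBuffer for PreparedBuffer<&mut Vec<u8>> {
    fn spare(&mut self) -> &mut [MaybeUninit<u8>] {
        self.0.spare_capacity_mut()
    }
    unsafe fn finish(&mut self, len: usize) -> usize {
        let new_len = self.0.len() + len;
        // SAFETY: the caller initialized `len` elements past the old length.
        unsafe { self.0.set_len(new_len) };
        len
    }
}

/// Stand-in for a syscall that copies from `src` into the buffer.
fn fake_read<B: SpareBuffer>(mut buf: B, src: &[u8]) -> usize {
    let spare = buf.spare();
    let n = spare.len().min(src.len());
    for (d, s) in spare.iter_mut().zip(src) {
        d.write(*s);
    }
    // SAFETY: the first `n` spare elements were just written.
    unsafe { buf.finish(n) }
}
```

The call site then spells out the intent, for example `fake_read(PreparedBuffer(&mut v), data)` to append or `fake_read(ClearableBuffer(&mut v), data)` to overwrite.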

SUPERCILEX reviewed Jan 25, 2025

sunfishcode mentioned this pull request Jan 27, 2025

1.0 release planning #753

Open

21 tasks

sunfishcode force-pushed the sunfishcode/new-read branch from 2b47fd2 to e455b94 Compare January 30, 2025 13:04

sunfishcode added the semver bump Issues that will require a semver-incompatible fix label Jan 30, 2025

SUPERCILEX reviewed Jan 31, 2025

sunfishcode added 3 commits January 30, 2025 20:55

Prototype a new Buffer trait.

0af89fe

I'm experimenting with a `Buffer` trait similar to #908, however I've run into a few problems. See the questions in examples/new_read.rs for details.

Address review comments, iterate.

28fc958

More iteration.

cce9330

sunfishcode force-pushed the sunfishcode/new-read branch from e455b94 to cce9330 Compare January 31, 2025 04:55
