Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding methods for accepting &mut [MaybeUninit<u8>] #1574

Open
erickt opened this issue May 13, 2022 · 13 comments
Open

Adding methods for accepting &mut [MaybeUninit<u8>] #1574

erickt opened this issue May 13, 2022 · 13 comments

Comments

@erickt
Copy link

erickt commented May 13, 2022

I'm doing an audit of Tokio, and as part of it, we noticed in tokio-rs/tokio#4685 there are a few places where tokio is casting a &mut [MaybeUninit<u8>] to a &mut [u8]. They then pass this slice to a variety of mio methods.

If I'm reading the unsafe guidelines correctly, this is undefined behavior, or at least considered potentially undefined behavior.

Would it be possible to add some methods that accept &mut [MaybeUninit<u8>] to mio? That would let us shrink down the audit scope to these methods to make sure that mio won't accidentally read from these bytes. It'd also give us a place to easily migrate the ecosystem once the standard library is able to take &mut [MaybeUninit<u8>], or std::io::ReadBuf once stabilized.

@Thomasdezeeuw
Copy link
Collaborator

It's actually not UB, but it's unsound. Mio doesn't read the buffer nor does it write invalid (MaybeUninit::uninit) bytes to the buffers, so we don't trigger any UB. However we don't make the guarantee that we don't do that. I've changed socket2 to be based on MaybeUninit, see rust-lang/socket2#161, so I'm open to a change, but that would be an API breaking change.

@erickt
Copy link
Author

erickt commented May 13, 2022

After doing a little digging, I found the Unsafe Code Guidelines WG thread about it in rust-lang/unsafe-code-guidelines#71. If I'm reading @RalfJung correctly in nix-rust/nix#1655 (comment), it sounds like this pattern is UB, not just unsound. According to @djkoloski, it sounds like in order to make this non-UB, we'd need to use an LLVM intrinsic freeze to freeze the memory first, but Rust doesn't expose this yet.

@Darksonn
Copy link
Contributor

UB and unsound is the same thing.

@erickt
Copy link
Author

erickt commented May 13, 2022

@Thomasdezeeuw would mio be interested in building on top of socket2, or would it make more sense to directly call the correct libc functions?

Also, while it might be cleaner to do a breaking change, we could avoid doing that if we add methods like TcpStream::peek_uninit(buf: &mut [MaybeUninit<u8>]) -> io::Result<u8>, or TcpStream::recv_uninit(buf: &mut [MaybeUninit<u8>]) -> io::Result<u8>, although that gets a little ugly.

We'd also have to do something to handle uninitialized IoVecs like https://docs.rs/socket2/latest/socket2/struct.MaybeUninitSlice.html. Would it be better to just directly build upon socket2, and expose socket2::MaybeUninitSlice as part of mio's public API?

@RalfJung
Copy link

UB and unsound is the same thing.

I'd say they are almost the same thing. ;) "unsound" is when using only safe code (but using libraries that internally might use unsafe code), I can use UB.

UB is a property of a program with a main function; unsound is a property of a library.

@RalfJung
Copy link

But coming to the issue at hand, indeed using [u8] for passing arbitrary data around (like even the stdlib does) is a technical debt we'll have to slowly work ourselves out of. On the stdlib side, the plan for that is ReadBuf. [MaybeUninit<u8>] also works but usually doesn't suffice to provide safe abstractions. (In particular, casting &mut [u8] to &mut [MaybeUninit<u8>] and exposing that to the outside world is unsound, since one could then assign MaybeUninit::uninit() to it, thereby de-initializing the buffer.

@Thomasdezeeuw
Copy link
Collaborator

@Thomasdezeeuw would mio be interested in building on top of socket2, or would it make more sense to directly call the correct libc functions?

No we decided that Mio won't depend on Socket2, but most users of Mio actually use both.

Also, while it might be cleaner to do a breaking change, we could avoid doing that if we add methods like TcpStream::peek_uninit(buf: &mut [MaybeUninit<u8>]) -> io::Result<u8>, or TcpStream::recv_uninit(buf: &mut [MaybeUninit<u8>]) -> io::Result<u8>, although that gets a little ugly.

We'd also have to do something to handle uninitialized IoVecs like https://docs.rs/socket2/latest/socket2/struct.MaybeUninitSlice.html. Would it be better to just directly build upon socket2, and expose socket2::MaybeUninitSlice as part of mio's public API?

Yeah, I haven't solved in Socket2 either as it has the opposite problem: rust-lang/socket2#223. Maybe ReadBuf can be used, but then the RFC really needs to get a move on because I was targetting a Mio v1 based on rustc 1.61.

@Darksonn
Copy link
Contributor

In principle Tokio could use mio's try_io methods to make the libc call via socket2.

@Thomasdezeeuw

This comment was marked as outdated.

@Thomasdezeeuw

This comment was marked as outdated.

@erickt
Copy link
Author

erickt commented May 15, 2022

@Thomasdezeeuw i think they’re still iterating on the standard ReadBuf implementation. @nrc has a new variant in rust-lang/rust#97015 that tries to fix the risk that someone could unexpectedly mem::swap a &mut [MaybeUninit<u8>] to trigger UB. Hopefully this design sticks.

@RalfJung thanks for jumping in again! Just to confirm, is the act of doing let y = &mut *(x as *mut MaybeUninit<u8> as *mut u8) itself unsound? Or is it just easy to trigger unsoundness with this pattern by writing uninitialized data into it? As far as I can tell tokio/Mio doesn’t appear to be writing uninitialized data into the vec.

@RalfJung
Copy link

RalfJung commented May 15, 2022

"unsound" means "safe code can use this to cause UB", so that is definitely unsound. But I guess you are asking if that line of code causes UB.

Under a strict reading of https://doc.rust-lang.org/reference/behavior-considered-undefined.html, it does, because it creates an &mut i32 that points to an invalid (as in, uninit) i32. However, I personally think we should relax this a bit, and say that a &mut i32 can point to uninit memory as long as it is not being loaded from. That is also what Miri implements. That is rust-lang/unsafe-code-guidelines#77.

@Thomasdezeeuw
Copy link
Collaborator

I think we should close this in favour for an issue that follows the read-buf Rust feature (rust-lang/rust#78485). We decide a while ago to follow API defined in std lib, so let's keep true to that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants