Vector is empty after cloning struct with uninitialized member

  • A+
Category:Languages

In Rust 1.29.0 one of my tests has started failing. I managed to get the strange bug down to this example:

#[derive(Clone, Debug)] struct CountDrop<'a>(&'a std::cell::RefCell<usize>);  struct MayContainValue<T> {     value: std::mem::ManuallyDrop<T>,     has_value: u32, }  impl<T: Clone> Clone for MayContainValue<T> {     fn clone(&self) -> Self {         Self {             value: if self.has_value > 0 {                 self.value.clone()             } else {                 unsafe { std::mem::uninitialized() }             },             has_value: self.has_value,         }     } }  impl<T> Drop for MayContainValue<T> {     fn drop(&mut self) {         if self.has_value > 0 {             unsafe {                 std::mem::ManuallyDrop::drop(&mut self.value);             }         }     } }  #[cfg(test)] mod tests {     use super::*;     #[test]     fn check_drops() {         let n = 2000;         let drops = std::cell::RefCell::new(0usize);          let mut slots = Vec::new();         for _ in 0..n {             slots.push(MayContainValue {                 value: std::mem::ManuallyDrop::new(CountDrop(&drops)),                 has_value: 1,             });         }          unsafe { std::mem::ManuallyDrop::drop(&mut slots[0].value); }         slots[0].has_value = 0;          assert_eq!(slots.len(), slots.clone().len());     } } 

I know the code looks strange; it is all ripped out of context. I reproduced this problem with cargo test on 64-bit Ubuntu on Rust 1.29.0. A friend could not reproduce on Windows with the same Rust version.

Other things that stop reproduction:

  • Lowering n below ~900.
  • Not running the example from within cargo test.
  • Replacing CountDrop's member with u64.
  • Using a Rust version before 1.29.0.

What's going on here? Yes, MayContainValue can have an uninitialized member, but this is never used in any way.

I also managed to reproduce this on play.rust-lang.org.


I'm not interested in 'solutions' that involve re-engineering MayContainValue in some safe way with Option or enum, I'm using manual storage and occupied/vacant discrimination for a good reason.

 


TL;DR: Yes, creating an uninitialized reference is always undefined behavior. You cannot use mem::uninitialized safely with generics. There is not currently a good workaround for your specific case.


Running your code in valgrind reports 3 errors, each with the same stack trace:

==741== Conditional jump or move depends on uninitialised value(s) ==741==    at 0x11907F: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T, I>>::spec_extend (vec.rs:1892) ==741==    by 0x11861C: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<&'a T, I>>::spec_extend (vec.rs:1942) ==741==    by 0x11895C: <alloc::vec::Vec<T>>::extend_from_slice (vec.rs:1396) ==741==    by 0x11C1A2: alloc::slice::hack::to_vec (slice.rs:168) ==741==    by 0x11C643: alloc::slice::<impl [T]>::to_vec (slice.rs:369) ==741==    by 0x118C1E: <alloc::vec::Vec<T> as core::clone::Clone>::clone (vec.rs:1676) ==741==    by 0x11AF89: md::tests::check_drops (main.rs:51) ==741==    by 0x119D39: md::__test::TESTS::{{closure}} (main.rs:36) ==741==    by 0x11935D: core::ops::function::FnOnce::call_once (function.rs:223) ==741==    by 0x11F09E: {{closure}} (lib.rs:1451) ==741==    by 0x11F09E: call_once<closure,()> (function.rs:223) ==741==    by 0x11F09E: <F as alloc::boxed::FnBox<A>>::call_box (boxed.rs:642) ==741==    by 0x17B469: __rust_maybe_catch_panic (lib.rs:105) ==741==    by 0x14044F: try<(),std::panic::AssertUnwindSafe<alloc::boxed::Box<FnBox<()>>>> (panicking.rs:289) ==741==    by 0x14044F: catch_unwind<std::panic::AssertUnwindSafe<alloc::boxed::Box<FnBox<()>>>,()> (panic.rs:392) ==741==    by 0x14044F: {{closure}} (lib.rs:1406) ==741==    by 0x14044F: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:136) 

Reducing while keeping the Valgrind error (or one extremely similar) leads to

use std::{iter, mem};  fn main() {     let a = unsafe { mem::uninitialized::<&()>() };     let mut b = iter::once(a);     let c = b.next();     let _d = match c {         Some(_) => 1,         None => 2,     }; } 

Running this smaller reproduction in Miri in the playground leads to this error:

error[E0080]: constant evaluation error: attempted to read undefined bytes  --> src/main.rs:7:20   | 7 |     let _d = match c {   |                    ^ attempted to read undefined bytes   | note: inside call to `main`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:74:34   | 74|     lang_start_internal(&move || main().report(), argc, argv)   |                                  ^^^^^^ note: inside call to `closure`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:59:75   | 59|             ::sys_common::backtrace::__rust_begin_short_backtrace(move || main())   |                                                                           ^^^^^^ note: inside call to `closure`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/sys_common/backtrace.rs:136:5   | 13|     f()   |     ^^^ note: inside call to `std::sys_common::backtrace::__rust_begin_short_backtrace::<[closure@DefId(1/1:1823 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:59:13   | 59|             ::sys_common::backtrace::__rust_begin_short_backtrace(move || main())   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside call to `closure`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:310:40   | 31|             ptr::write(&mut (*data).r, f());   |                                        ^^^ note: inside call to `std::panicking::try::do_call::<[closure@DefId(1/1:1822 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:306:5   | 30| /     fn do_call<F: FnOnce() -> R, R>(data: *mut u8) { 30| |         unsafe { 30| |             let data = data as *mut Data<F, R>; 30| |             let f = ptr::read(&mut (*data).f); 31| |             ptr::write(&mut (*data).r, f()); 31| |         } 31| |     }   | |_____^ note: inside call to `std::panicking::try::<i32, [closure@DefId(1/1:1822 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe]>`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panic.rs:392:9   | 39|         panicking::try(f)   |         ^^^^^^^^^^^^^^^^^ note: inside call to `std::panic::catch_unwind::<[closure@DefId(1/1:1822 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:58:25   | 58|           let exit_code = panic::catch_unwind(|| {   |  _________________________^ 59| |             ::sys_common::backtrace::__rust_begin_short_backtrace(move || main()) 60| |         });   | |__________^ note: inside call to `std::rt::lang_start_internal`  --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:74:5   | 74|     lang_start_internal(&move || main().report(), argc, argv)   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 

The short version is that mem::uninitialized creates a null pointer, which is being treated as a reference. That's the undefined behavior.

In your original code, the Vec::clone is implemented by iterating over an iterator. Iterator::next returns an Option<T>, so you have an option of a reference, which causes the null pointer optimization to kick in. This counts as a None, which terminates the iteration early, resulting in your empty second vector.

It turns out that having mem::uninitialized, a piece of code that gives you C-like semantics, is a giant footgun and is frequently misused (surprise!), so you aren't alone here. The main things you should follow as replacements are:

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: