Pin

Relevant for

  1. To understand Generators and Futures
  2. Knowing how to use Pin is required when implementing your own Future
  3. To understand self-referential types in Rust
  4. This is the way borrowing across await points is accomplished

Pin was suggested in RFC#2349

We already got a brief introduction of Pin in the previous chapters, so we'll start off here with some definitions and a set of rules to remember.

Definitions

Pin consists of the Pin type and the Unpin marker. Pin's purpose in life is to govern the rules that need to apply for types which implement !Unpin.

Pin is only relevant for pointers. A reference to an object is a pointer.

Yep, that's double negation for you, as in "does-not-implement-unpin". For this chapter and only this chapter we'll rename these markers to:

!Unpin = MustStay and Unpin = CanMove

It just makes it so much easier to understand them.

Rules to remember

  1. If T: CanMove (which is the default), then Pin<'a, T> is entirely equivalent to &'a mut T. in other words: CanMove means it's OK for this type to be moved even when pinned, so Pin will have no effect on such a type.

  2. Getting a &mut T to a pinned pointer requires unsafe if T: MustStay. In other words: requiring a pinned pointer to a type which is MustStay prevents the user of that API from moving that value unless it choses to write unsafe code.

  3. Pinning does nothing special with that memory like putting it into some "read only" memory or anything fancy. It only tells the compiler that some operations on this value should be forbidden.

  4. Most standard library types implement CanMove. The same goes for most "normal" types you encounter in Rust. Futures and Generators are two exceptions.

  5. The main use case for Pin is to allow self referential types, the whole justification for stabilizing them was to allow that. There are still corner cases in the API which are being explored.

  6. The implementation behind objects that are MustStay is most likely unsafe. Moving such a type can cause the universe to crash. As of the time of writing this book, creating an reading fields of a self referential struct still requires unsafe.

  7. You're not really meant to be implementing MustStay, but you can on nightly with a feature flag, or by adding std::marker::PhantomPinned to your type.

  8. When Pinning, you can either pin a value to memory either on the stack or on the heap.

  9. Pinning a MustStay pointer to the stack requires unsafe

  10. Pinning a MustStay pointer to the heap does not require unsafe. There is a shortcut for doing this using Box::pin.

Unsafe code does not mean it's literally "unsafe", it only relieves the guarantees you normally get from the compiler. An unsafe implementation can be perfectly safe to do, but you have no safety net.

Let's take a look at an example:

use std::pin::Pin;

fn main() {
    let mut test1 = Test::new("test1");
    test1.init();
    let mut test2 = Test::new("test2");
    test2.init();

    println!("a: {}, b: {}", test1.a(), test1.b());
    std::mem::swap(&mut test1, &mut test2); // try commenting out this line
    println!("a: {}, b: {}", test2.a(), test2.b());

}

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
}

impl Test {
    fn new(txt: &str) -> Self {
        let a = String::from(txt);
        Test {
            a,
            b: std::ptr::null(),
        }
    }
    
    fn init(&mut self) {
        let self_ref: *const String = &self.a;
        self.b = self_ref;
    }
    
    fn a(&self) -> &str {
        &self.a
    } 
    
    fn b(&self) -> &String {
        unsafe {&*(self.b)}
    }
}

Let's walk through this example since we'll be using it the rest of this chapter.

We have a self-referential struct Test. Test needs an init method to be created which is strange but we'll need that to keep this example as short as possible.

Test provides two methods to get a reference to the value of the fields a and b. Since b is a reference to a we store it as a pointer since the borrowing rules of Rust doesn't allow us to define this lifetime.

In our main method we first instantiate two instances of Test and print out the value of the fields on test1. We get:

a: test1, b: test1

Next we swap the data stored at the memory location which test1 is pointing to with the data stored at the memory location test2 is pointing to and vice a verca.

We should expect that printing the fields of test2 should display the same as test1 (since the object we printed before the swap has moved there now).

a: test1, b: test2

The pointer to b still points to the old location. That location is now occupied with the string "test2". This can be a bit hard to visualize so I made a figure that i hope can help.

Fig 1: Before and after swap swap_problem

As you can see this results in unwanted behavior. It's easy to get this to segfault, show UB and fail in other spectacular ways as well.

If we change the example to using Pin instead:

use std::pin::Pin;
use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}


impl Test {
    fn new(txt: &str) -> Self {
        let a = String::from(txt);
        Test {
            a,
            b: std::ptr::null(),
            // This makes our type `!Unpin`
            _marker: PhantomPinned,
        }
    }
    fn init(&mut self) {
        let self_ptr: *const String = &self.a;
        self.b = self_ptr;
    }

    fn a<'a>(self: Pin<&'a Self>) -> &'a str {
        &self.get_ref().a
    }

    fn b<'a>(self: Pin<&'a Self>) -> &'a String {
        unsafe { &*(self.b) }
    }
}

pub fn main() {
    let mut test1 = Test::new("test1");
    test1.init();
    let mut test1_pin = unsafe { Pin::new_unchecked(&mut test1) }; 
    let mut test2 = Test::new("test2");
    test2.init();
    let mut test2_pin = unsafe { Pin::new_unchecked(&mut test2) };

    println!(
        "a: {}, b: {}",
        Test::a(test1_pin.as_ref()),
        Test::b(test1_pin.as_ref())
    );

    // Try to uncomment this and see what happens
    // std::mem::swap(test1_pin.as_mut(), test2_pin.as_mut());
    println!(
        "a: {}, b: {}",
        Test::a(test2_pin.as_ref()),
        Test::b(test2_pin.as_ref())
    );
}

Now, what we've done here is pinning a stack address. That will always be unsafe if our type implements !Unpin (aka MustStay).

We use some tricks here, including requiring an init. If we want to fix that and let users avoid unsafe we need to pin our data on the heap instead.

Stack pinning will always depend on the current stack frame we're in, so we can't create a self referential object in one stack frame and return it since any pointers we take to "self" is invalidated.

The next example solves some of our friction at the cost of a heap allocation.

use std::pin::Pin;
use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

impl Test {
    fn new(txt: &str) -> Pin<Box<Self>> {
        let a = String::from(txt);
        let t = Test {
            a,
            b: std::ptr::null(),
            _marker: PhantomPinned,
        };
        let mut boxed = Box::pin(t);
        let self_ptr: *const String = &boxed.as_ref().a;
        unsafe { boxed.as_mut().get_unchecked_mut().b = self_ptr };

        boxed
    }

    fn a<'a>(self: Pin<&'a Self>) -> &'a str {
        &self.get_ref().a
    }

    fn b<'a>(self: Pin<&'a Self>) -> &'a String {
        unsafe { &*(self.b) }
    }
}

pub fn main() {
    let mut test1 = Test::new("test1");
    let mut test2 = Test::new("test2");

    println!("a: {}, b: {}",test1.as_ref().a(), test1.as_ref().b());

    // Try to uncomment this and see what happens
    // std::mem::swap(&mut test1, &mut test2);
    println!("a: {}, b: {}",test2.as_ref().a(), test2.as_ref().b());
}

The fact that boxing (heap allocating) a value that implements !Unpin is safe makes sense. Once the data is allocated on the heap it will have a stable address.

There is no need for us as users of the API to take special care and ensure that the self-referential pointer stays valid.

There are ways to safely give some guarantees on stack pinning as well, but right now you need to use a crate like pin_utils:pin_utils to do that.

Projection/structural pinning

In short, projection is using a field on your type. mystruct.field1 is a projection. Structural pinning is using Pin on struct fields. This has several caveats and is not something you'll normally see so I refer to the documentation for that.

Pin and Drop

The Pin guarantee exists from the moment the value is pinned until it's dropped. In the Drop implementation you take a mutable reference to self, which means extra care must be taken when implementing Drop for pinned types.

Putting it all together

This is exactly what we'll do when we implement our own Futures stay tuned, we're soon finished.