diff --git a/src/0_1_2_generators_pin.md b/src/0_1_2_generators_pin.md
index 0807de4..57db87c 100644
--- a/src/0_1_2_generators_pin.md
+++ b/src/0_1_2_generators_pin.md
@@ -8,7 +8,7 @@ is Generators and the `Pin` type.

 >**Relevant for:**

->- Understanding how the async/await syntax works
+>- Understanding how the async/await syntax works, and how it's implemented
 >- Why we need `Pin`
 >- Why Rusts async model is extremely efficient

@@ -175,9 +175,214 @@ impl Generator for GeneratorA {

 >The `yield` keyword was discussed first in [RFC#1823][rfc1823] and in [RFC#1832][rfc1832].

+Now that you know that the `yield` keyword in reality rewrites your code to become a state machine,
+you'll also know the basics of how `await` works. It's very similar.
+
+Now, there are some limitations in our naive state machine above. What happens when you have a
+`borrow` across a `yield` point?
+
+We could forbid that, but **one of the major design goals** for the async/await syntax has been
+to allow this. These kinds of borrows were not possible using `Futures 1.0` so we can't let this
+limitation just slip and call it a day yet.
+
+Instead of discussing it in theory, let's look at some code.
+
+> We'll use the optimized version of the state machines which is used in Rust today. For a more
+> in-depth explanation see [Tyler Mandry's excellent article: How Rust optimizes async/await][optimizing-await]
+
+```rust,noplaypen,ignore
+let a = 4;
+let b = move || {
+        let to_borrow = String::from("Hello");
+        let borrowed = &to_borrow;
+        println!("{}", borrowed);
+        yield a * 2;
+        println!("{} world!", borrowed);
+    };
+```
+
+Now what does our rewritten state machine look like with this example?
+
+```rust,compile_fail
+# // If you've ever wondered why the parameters are called Y and R the naming from
+# // the original rfc most likely holds the answer
+# enum GeneratorState<Y, R> {
+#     // originally called `CoResult`
+#     Yielded(Y), // originally called `Yield(Y)`
+#     Complete(R), // originally called `Return(R)`
+# }
+#
+# trait Generator {
+#     type Yield;
+#     type Return;
+#     fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return>;
+# }
+
+enum GeneratorA {
+    Enter,
+    Yield1 {
+        to_borrow: String,
+        borrowed: &String, // uh, what lifetime should this have?
+    },
+    Exit,
+}
+
+# impl GeneratorA {
+#     fn start() -> Self {
+#         GeneratorA::Enter
+#     }
+# }
+
+impl Generator for GeneratorA {
+    type Yield = usize;
+    type Return = ();
+    fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return> {
+        // lets us get ownership over current state
+        match std::mem::replace(&mut *self, GeneratorA::Exit) {
+            GeneratorA::Enter => {
+                let to_borrow = String::from("Hello");
+                let borrowed = &to_borrow;
+                *self = GeneratorA::Yield1 {to_borrow, borrowed};
+                GeneratorState::Yielded(borrowed.len())
+            }
+
+            GeneratorA::Yield1 {to_borrow, borrowed} => {
+                println!("{} world!", borrowed);
+                *self = GeneratorA::Exit;
+                GeneratorState::Complete(())
+            }
+            GeneratorA::Exit => panic!("Can't advance an exited generator!"),
+        }
+    }
+}
+```
+
+If you try to compile this you'll get an error (just try it yourself by pressing play).
+
+What is the lifetime of `&String`? It's not the same as the lifetime of `Self`. It's not `'static`.
+Turns out that it's not possible for us in Rust's syntax to describe this lifetime, which means that
+to make this work, we'll have to let the compiler know that _we_ control this correctly.
+
+That means turning to unsafe.
+
+Now, as you'll notice, this compiles:
+
+```rust
+pub fn test2() {
+    let mut gen = GeneratorA::start();
+
+    if let GeneratorState::Yielded(n) = gen.resume() {
+        println!("Got value {}", n);
+    }
+
+    let mut gen2 = GeneratorA::start();
+    // If you uncomment this, very bad things can happen. This is why we need `Pin`
+    // let mut gen2 = GeneratorA::start();
+    //std::mem::swap(&mut gen, &mut gen2);
+
+    if let GeneratorState::Complete(()) = gen2.resume() {
+        ()
+    };
+}
+
+use std::ptr::NonNull;
+
+// If you've ever wondered why the parameters are called Y and R the naming from
+// the original rfc most likely holds the answer
+enum GeneratorState<Y, R> {
+    // originally called `CoResult`
+    Yielded(Y), // originally called `Yield(Y)`
+    Complete(R), // originally called `Return(R)`
+}
+
+trait Generator {
+    type Yield;
+    type Return;
+    fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return>;
+}
+
+enum GeneratorA {
+    Enter,
+    Yield1 {
+        to_borrow: String,
+        borrowed: *const String, // Normally you'll see `std::ptr::NonNull` used instead of *ptr
+    },
+    Exit,
+}
+
+impl GeneratorA {
+    fn start() -> Self {
+        GeneratorA::Enter
+    }
+}
+impl Generator for GeneratorA {
+    type Yield = usize;
+    type Return = ();
+    fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return> {
+        // lets us get ownership over current state
+        match self {
+            GeneratorA::Enter => {
+                let to_borrow = String::from("Hello");
+                let borrowed = &to_borrow;
+                let res = borrowed.len();
+
+                // Tricks to actually get a self reference
+                *self = GeneratorA::Yield1 {to_borrow, borrowed: std::ptr::null()};
+                match self {
+                    GeneratorA::Yield1{to_borrow, borrowed} => *borrowed = to_borrow,
+                    _ => ()
+                };
+
+                GeneratorState::Yielded(res)
+            }
+
+            GeneratorA::Yield1 {to_borrow, borrowed} => {
+                let borrowed: &String = unsafe {&**borrowed};
+                println!("{} world", borrowed);
+                *self = GeneratorA::Exit;
+                GeneratorState::Complete(())
+            }
+            GeneratorA::Exit => panic!("Can't advance an exited generator!"),
+        }
+    }
+}
+
+```
+
+But now, let's
+
+```rust
+```
+
+However, this is also the point where we need to talk about one more concept to
+
 ## Pin

-Pin is used to allow for self referential structs. An example:
+> Why
+>
+> 1. To understand `Generators` and `Futures`
+> 2. To know how to use `Pin` when implementing your own `Future`
+> 3. To understand self-referential types in Rust
+>
+> `Pin` was suggested in [RFC#2349][rfc2349]
+
+Pin consists of the `Pin` type and the `Unpin` marker. Let's start off with some general rules:
+
+1. Pin does nothing special, it only prevents the user of an API from violating some assumptions you make when writing your (most likely) unsafe code.
+2. Most standard library types implement `Unpin`
+3. `Unpin` means it's OK for this type to be moved even when it's pinned.
+4. If you `Box` a value, that boxed value automatically implements `Unpin`.
+5. The absolute main use case for `Pin` is to allow self referential types
+6. The implementation behind objects that don't implement `Unpin` is always unsafe
+    1. `Pin` prevents users of your code from breaking the assumptions you make when writing the `unsafe` implementation
+    2. It doesn't solve the fact that you'll have to write unsafe code to actually implement it
+
+To get a
+
+> Unsafe code does not mean it's literally "unsafe", it only relaxes the guarantees you normally get from the compiler.
+> An `unsafe` implementation can be perfectly safe to do, but you have no safety net.
+
+Let's take a look at an example:

 ```rust,editable
 use std::pin::Pin;
@@ -283,6 +488,73 @@ impl Test {

 ```

+However, to get to know the normal way of implementing such an API which is what we'll see going
+forward, we can rewrite the code above into this:
+
+```rust, editable, compile_fail
+use std::pin::Pin;
+
+pub fn test1() {
+    let mut test1 = Test::new("test1");
+    test1.init();
+    let mut test1_pin = Pin::new(&mut test1);
+    let mut test2 = Test::new("test2");
+    test2.init();
+    let mut test2_pin = Pin::new(&mut test2);
+
+    println!(
+        "a: {}, b: {}",
+        Test::a(test1_pin.as_ref()),
+        Test::b(test1_pin.as_ref())
+    );
+
+    // try fixing as the compiler suggests. Is there any `swap` happening?
+    // Look closely at the printout.
+    std::mem::swap(test1_pin.as_mut(), test2_pin.as_mut());
+    println!(
+        "a: {}, b: {}",
+        Test::a(test2_pin.as_ref()),
+        Test::b(test2_pin.as_ref())
+    );
+}
+
+#[derive(Debug)]
+struct Test {
+    a: String,
+    b: *const String,
+}
+
+
+impl Test {
+    fn new(txt: &str) -> Self {
+        let a = String::from(txt);
+        Test {
+            a,
+            b: std::ptr::null(),
+        }
+    }
+    fn init(&mut self) {
+        let self_ptr: *const String = &self.a;
+        self.b = self_ptr;
+    }
+
+    fn a<'a>(self: Pin<&'a Self>) -> &'a str {
+        &self.get_ref().a
+    }
+
+    fn b<'a>(self: Pin<&'a Self>) -> &'a String {
+        unsafe { &*(self.b) }
+    }
+}
+```
+
+There is one caveat here. Our struct `Test` implements `Unpin`. Now this will be the "normal case"
+since most types implement `Unpin`. However, a type which
+
+## Putting it all together
+
+Now that we've seen how `Pin` works
+
 pinning
 ```ignore
 // If we borrow through yield points, we end up with this error
@@ -302,4 +574,6 @@ pinning
 [rfc2033]: https://github.com/rust-lang/rfcs/blob/master/text/2033-experimental-coroutines.md
 [greenthreads]: https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/
 [rfc1823]: https://github.com/rust-lang/rfcs/pull/1823
-[rfc1832]: https://github.com/rust-lang/rfcs/pull/1832
\ No newline at end of file
+[rfc1832]: https://github.com/rust-lang/rfcs/pull/1832
+[rfc2349]: https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md
+[optimizing-await]: https://tmandry.gitlab.io/blog/posts/optimizing-await-1/
\ No newline at end of file
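
The caveat about `Unpin` in the last `Pin` example can be made concrete with a small sketch. It is not part of the patch: `PinnedTest` is a hypothetical variant of `Test`, and the `PhantomPinned` marker field is an assumption about how one would opt out of `Unpin`. Since the type is then `!Unpin`, `Pin::new(&mut value)` is no longer available, so the sketch pins the value on the heap with `Box::pin` before setting up the self-reference.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical variant of `Test`; the `_marker` field is an assumption
// added for illustration. It makes the type `!Unpin`.
#[derive(Debug)]
struct PinnedTest {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

impl PinnedTest {
    // Pin on the heap first, then store the pointer, so that `b` is taken
    // at the address where the value will stay.
    fn new(txt: &str) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(PinnedTest {
            a: String::from(txt),
            b: std::ptr::null(),
            _marker: PhantomPinned,
        });
        let self_ptr: *const String = &boxed.a;
        // SAFETY: we only overwrite a field; the value is never moved out
        // of the pinned box.
        unsafe { boxed.as_mut().get_unchecked_mut().b = self_ptr };
        boxed
    }

    fn a<'a>(self: Pin<&'a Self>) -> &'a str {
        &self.get_ref().a
    }

    fn b<'a>(self: Pin<&'a Self>) -> &'a String {
        // SAFETY: `b` points at `self.a`, which cannot move while pinned.
        unsafe { &*self.b }
    }
}
```

Swapping two `Pin<Box<PinnedTest>>` handles with `std::mem::swap` only exchanges the boxes. The pinned values stay at their heap addresses, so each `b` keeps pointing at the `a` it belongs to.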