finished book!!!!!!

This commit is contained in:
Carl Fredrik Samson
2020-04-06 01:51:18 +02:00
parent 3a3ad1eeea
commit 15d7c726f8
18 changed files with 720 additions and 1172 deletions

View File

@@ -72,10 +72,14 @@ First of all. For computers to be [_efficient_](https://en.wikipedia.org/wiki/Ef
start to look under the covers (like [how an operating system works](https://os.phil-opp.com/async-await/))
you'll see concurrency everywhere. It's very fundamental in everything we do.
Secondly, we have the web.
Webservers are all about I/O and handling small tasks
(requests). When the number of small tasks is large it's not a good fit for OS
threads as of today because of the memory they require and the overhead involved
when creating new threads.
This gets even more relevant when the load is variable,
which means the current number of tasks a program has at any point in time is
unpredictable. That's why you'll see so many async web frameworks and database
drivers today.
@@ -99,8 +103,7 @@ such a system) which then continues running a different task.
Rust had green threads once, but they were removed before it hit 1.0. The state
of execution is stored in each stack so in such a solution there would be no
need for `async`, `await`, `Futures` or `Pin`.
The typical flow will be like this:
@@ -112,7 +115,7 @@ The typical flow will be like this:
task is finished
5. "jumps" back to the "main" thread, schedule a new thread to run and jump to that
These "jumps" are know as context switches. Your OS is doing it many times each
These "jumps" are know as **context switches**. Your OS is doing it many times each
second as you read this.
**Advantages:**
@@ -366,9 +369,9 @@ the same. You can always go back and read the book which explains it later.
You probably already know what we're going to talk about in the next paragraphs
from Javascript, which I assume most readers are familiar with.
>If your exposure to Javascript callbacks has given you any sort of PTSD earlier
in life, close your eyes now and scroll down for 2-3 seconds. You'll find a link
there that takes you to safety.
The whole idea behind a callback based approach is to save a pointer to a set of
instructions we want to run later. We can save that pointer on the stack before
@@ -389,8 +392,8 @@ Rust uses today which we'll soon get to.
- Each task must save the state it needs for later, and the memory usage will grow
linearly with the number of callbacks in a chain of computations.
- Can be hard to reason about, many people already know this as "callback hell".
- It's a very different way of writing a program, and will require a substantial
rewrite to go from a "normal" program flow to one that uses a "callback based" flow.
- Sharing state between tasks is a hard problem in Rust using this approach due
to its ownership model.
@@ -401,15 +404,15 @@ like is:
fn program_main() {
println!("So we start the program here!");
set_timeout(200, || {
println!("We create tasks which gets run when they're finished!");
println!("We create tasks with a callback that runs once the task finished!");
});
set_timeout(100, || {
println!("We can even chain callbacks...");
println!("We can even chain sub-tasks...");
set_timeout(50, || {
println!("...like this!");
})
});
println!("While our tasks are executing we can do other stuff here.");
println!("While our tasks are executing we can do other stuff instead of waiting.");
}
fn main() {
@@ -469,7 +472,9 @@ impl Runtime {
We're keeping this super simple, and you might wonder what's the difference
between this approach and the one using OS threads and passing in the callbacks
to the OS threads directly.
The difference is that the callbacks are run on the
same thread using this example. The OS threads we create are basically just used
as timers.
@@ -478,10 +483,11 @@ as timers.
You might start to wonder by now, when are we going to talk about Futures?
Well, we're getting there. You see `promises`, `futures` and other names for
deferred computations are often used interchangeably.
There are formal differences between them, but we'll not cover that here. It's
worth explaining `promises` a bit since they're widely known due to being used
in Javascript, and they have a lot in common with Rust's Futures.
First of all, many languages have a concept of promises, but I'll use the ones
from Javascript in the examples below.
@@ -516,11 +522,12 @@ timer(200)
The change is even more substantial under the hood. You see, promises return
a state machine which can be in one of three states: `pending`, `fulfilled` or
`rejected`.
When we call `timer(200)` in the sample above, we get back a promise in the state `pending`.
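
As a rough Rust analogy (a hypothetical type, just for illustration), such a
state machine could be modeled like this:

```rust, ignore
// The three states a JS promise can be in, sketched as a Rust enum.
enum Promise<T, E> {
    Pending,      // the operation hasn't finished yet
    Fulfilled(T), // resolved successfully with a value
    Rejected(E),  // failed with an error
}
```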
Since promises are re-written as state machines they also enable an even better
syntax which allows us to write our last example like this:
```js, ignore
async function run() {
@@ -533,9 +540,10 @@ async function run() {
You can consider the `run` function a _pausable_ task consisting of several
sub-tasks. On each "await" point it yields control to the scheduler (in this
case it's the well known Javascript event loop).
Once one of the sub-tasks changes state to either `fulfilled` or `rejected` the
task is scheduled to continue to the next step.
Syntactically, Rust's Futures 1.0 was a lot like the promises example above and
Rust's Futures 3.0 is a lot like async/await in our last example.
@@ -544,12 +552,10 @@ Now this is also where the similarities with Rusts Futures stop. The reason we
go through all this is to get an introduction and get into the right mindset for
exploring Rusts Futures.
> To avoid confusion later on: There is one difference you should know. Javascript
> promises are _eagerly_ evaluated. That means that once it's created, it starts
> running a task. Rust's Futures, on the other hand, are _lazily_ evaluated. They
> need to be polled once before they do any work.
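
A tiny demonstration of that laziness (assuming the `futures` crate's
`block_on` executor; any executor would do):

```rust, ignore
use futures::executor::block_on;

fn main() {
    // Creating the future does nothing yet: Rust futures are lazy.
    let fut = async { println!("I only run when polled!") };

    // The println fires only once an executor starts polling the future.
    block_on(fut);
}
```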
<br />
<div style="text-align: center; padding-top: 2em;">

View File

@@ -10,10 +10,17 @@
>well written and I can recommend reading through it (it talks as much about
>async/await as it does about generators).
The second difficult part is understanding Generators and the `Pin` type. Since
they're related we'll start off by exploring generators first. By doing that
we'll soon get to see why we need to be able to "pin" some data to a fixed
location in memory and get an introduction to `Pin` as well.
## Why generators?
Generators/yield and async/await are so similar that once you understand one
you should be able to understand the other.
It's much easier for me to provide short, runnable examples using Generators
instead of Futures, which would require us to introduce a lot of concepts now
that we'll cover later anyway, just to show an example.
A small bonus is that you'll have a pretty good introduction to both Generators
and Async/Await by the end of this chapter.
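
To give a taste of what's coming: the examples in this chapter are built around
a simplified `Generator` trait that looks roughly like this (the real
`std::ops::Generator` trait is nightly-only and differs slightly):

```rust, ignore
// The result of resuming a generator: it either suspended at a `yield`
// and produced a value, or it ran to completion.
enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait Generator {
    type Yield;
    type Return;
    // Drive the generator to its next suspension point (or to the end).
    fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return>;
}
```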
Basically, there were three main options discussed when designing how Rust would
handle concurrency:
@@ -84,7 +91,7 @@ async fn myfn() {
Async in Rust is implemented using Generators. So to understand how Async really
works we need to understand generators first. Generators in Rust are implemented
as state machines. The memory footprint of a chain of computations is only
defined by the footprint that the largest step requires.
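
A quick way to convince yourself of this: the size of an enum is determined by
its largest variant (plus a discriminant), not by how many variants it has. A
minimal sketch:

```rust
enum States {
    Small(u8),        // a cheap state
    Large([u8; 128]), // the most expensive state dominates the size
}

fn main() {
    // Prints 129 on most platforms: 128 bytes of payload plus 1 byte of
    // discriminant. Adding more small variants wouldn't change this.
    println!("{}", std::mem::size_of::<States>());
}
```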
That means that adding steps to a chain of computations might not require any
increased memory at all and it's one of the reasons why Futures and Async in
@@ -164,7 +171,7 @@ impl Generator for GeneratorA {
type Return = ();
fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return> {
// lets us get ownership over current state
match std::mem::replace(&mut *self, GeneratorA::Exit) {
match std::mem::replace(self, GeneratorA::Exit) {
GeneratorA::Enter(a1) => {
/*----code before yield----*/
@@ -265,7 +272,7 @@ impl Generator for GeneratorA {
type Return = ();
fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Return> {
// lets us get ownership over current state
match std::mem::replace(&mut *self, GeneratorA::Exit) {
match std::mem::replace(self, GeneratorA::Exit) {
GeneratorA::Enter => {
let to_borrow = String::from("Hello");
let borrowed = &to_borrow; // <--- NB!
@@ -536,14 +543,46 @@ while using just safe Rust. This is a big problem!
> you'll see that it runs without panic on the current stable (1.42.0) but
> panics on the current nightly (1.44.0). Scary!
## Async blocks and generators
Futures in Rust are implemented as state machines much the same way Generators
are state machines.
You might have noticed the similarities in the syntax used in async blocks and
the syntax used in generators:
```rust, ignore
let mut gen = move || {
let to_borrow = String::from("Hello");
let borrowed = &to_borrow;
yield borrowed.len();
println!("{} world!", borrowed);
};
```
Compare that with a similar example using async blocks:
```rust, ignore
let mut fut = async {
let to_borrow = String::from("Hello");
let borrowed = &to_borrow;
SomeResource::some_task().await;
println!("{} world!", borrowed);
};
```
The difference is that a `Future` has different states than a `Generator` would
have. The states of a Rust `Future` are either `Pending` or `Ready`.
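
Put side by side (both simplified from the standard library), the mapping
between the two is easy to see:

```rust, ignore
// What a generator returns when resumed (simplified from `std::ops`):
enum GeneratorState<Y, R> {
    Yielded(Y),  // suspended at a `yield`, like a future returning `Pending`
    Complete(R), // finished, like a future returning `Ready`
}

// What a future returns when polled (simplified from `std::task`):
enum Poll<T> {
    Ready(T),
    Pending,
}
```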
An async block will return a `Future` instead of a `Generator`; however, the way
a `Future` works and the way a `Generator` works internally are similar.
The same goes for the challenges of borrowing across yield/await points.
We'll explain exactly what happened using a slightly simpler example in the next
chapter and we'll fix our generator using `Pin`, so join me as we explore
the last topic before we implement our main Futures example.
## Bonus section - self referential generators in Rust today
Thanks to [PR#45337][pr45337] you can actually run code like the one in our

View File

@@ -303,9 +303,10 @@ impl Test {
_marker: PhantomPinned,
}
}
fn init(&mut self) {
fn init<'a>(self: Pin<&'a mut Self>) {
let self_ptr: *const String = &self.a;
let this = unsafe { self.get_unchecked_mut() };
this.b = self_ptr;
}
fn a<'a>(self: Pin<&'a Self>) -> &'a str {
@@ -329,15 +330,18 @@ Let's see what happens if we run our example now:
```rust
pub fn main() {
// test1 is safe to move before we initialize it
let mut test1 = Test::new("test1");
// Notice how we shadow `test1` to prevent it from being accessed again
let mut test1 = unsafe { Pin::new_unchecked(&mut test1) };
Test::init(test1.as_mut());
let mut test2 = Test::new("test2");
let mut test2 = unsafe { Pin::new_unchecked(&mut test2) };
Test::init(test2.as_mut());
println!("a: {}, b: {}", Test::a(test1_pin.as_ref()), Test::b(test1_pin.as_ref()));
println!("a: {}, b: {}", Test::a(test2_pin.as_ref()), Test::b(test2_pin.as_ref()));
println!("a: {}, b: {}", Test::a(test1.as_ref()), Test::b(test1.as_ref()));
println!("a: {}, b: {}", Test::a(test2.as_ref()), Test::b(test2.as_ref()));
}
# use std::pin::Pin;
# use std::marker::PhantomPinned;
@@ -360,9 +364,10 @@ pub fn main() {
# _marker: PhantomPinned,
# }
# }
# fn init(&mut self) {
# fn init<'a>(self: Pin<&'a mut Self>) {
# let self_ptr: *const String = &self.a;
# let this = unsafe { self.get_unchecked_mut() };
# this.b = self_ptr;
# }
#
# fn a<'a>(self: Pin<&'a Self>) -> &'a str {
@@ -376,20 +381,21 @@ pub fn main() {
```
Now, if we try to pull the same trick which got us into trouble the last time,
you'll get a compilation error.
```rust, compile_fail
pub fn main() {
let mut test1 = Test::new("test1");
let mut test1 = unsafe { Pin::new_unchecked(&mut test1) };
Test::init(test1.as_mut());
let mut test2 = Test::new("test2");
let mut test2 = unsafe { Pin::new_unchecked(&mut test2) };
Test::init(test2.as_mut());
println!("a: {}, b: {}", Test::a(test1_pin.as_ref()), Test::b(test1_pin.as_ref()));
std::mem::swap(test1_pin.as_mut(), test2_pin.as_mut());
println!("a: {}, b: {}", Test::a(test2_pin.as_ref()), Test::b(test2_pin.as_ref()));
println!("a: {}, b: {}", Test::a(test1.as_ref()), Test::b(test1.as_ref()));
std::mem::swap(test1.as_mut(), test2.as_mut());
println!("a: {}, b: {}", Test::a(test2.as_ref()), Test::b(test2.as_ref()));
}
# use std::pin::Pin;
# use std::marker::PhantomPinned;
@@ -412,9 +418,10 @@ pub fn main() {
# _marker: PhantomPinned,
# }
# }
# fn init(&mut self) {
# fn init<'a>(self: Pin<&'a mut Self>) {
# let self_ptr: *const String = &self.a;
# let this = unsafe { self.get_unchecked_mut() };
# this.b = self_ptr;
# }
#
# fn a<'a>(self: Pin<&'a Self>) -> &'a str {
@@ -427,9 +434,25 @@ pub fn main() {
# }
```
As you can see from the error you get when running this code, the type system
prevents us from swapping the pinned pointers.
> It's important to note that stack pinning will always depend on the current
> stack frame we're in, so we can't create a self referential object in one
> stack frame and return it, since any pointers we take to "self" are invalidated.
>
> It also puts a lot of responsibility in your hands if you pin a value to the
> stack. An easy mistake to make is forgetting to shadow the original variable,
> since you could then drop the pinned pointer and access the old value
> after it's initialized, like this:
>
> ```rust, ignore
> let mut test1 = Test::new("test1");
> let mut test1_pin = unsafe { Pin::new_unchecked(&mut test1) };
> Test::init(test1_pin.as_mut());
> drop(test1_pin);
> println!("{:?}", test1.b);
> ```
## Pinning to the heap
@@ -481,7 +504,7 @@ pub fn main() {
}
```
The fact that pinning a heap allocated value that implements `!Unpin` is safe
makes sense. Once the data is allocated on the heap it will have a stable address.
There is no need for us as users of the API to take special care and ensure
@@ -496,16 +519,16 @@ now you need to use a crate like [pin_project][pin_project] to do that.
equivalent to `&'a mut T`. in other words: `Unpin` means it's OK for this type
to be moved even when pinned, so `Pin` will have no effect on such a type.
2. Getting a `&mut T` to a pinned pointer requires unsafe if `T: !Unpin`. In
2. Getting a `&mut T` to a pinned T requires unsafe if `T: !Unpin`. In
other words: requiring a pinned pointer to a type which is `!Unpin` prevents
the _user_ of that API from moving that value unless it chooses to write `unsafe`
code.
3. Pinning does nothing special with memory allocation like putting it into some
"read only" memory or anything fancy. It only tells the compiler that some
operations on this value should be forbidden.
"read only" memory or anything fancy. It only uses the type system to prevent
certain operations on this value.
4. Most standard library types implement `Unpin`. The same goes for most
"normal" types you encounter in Rust. `Futures` and `Generators` are two
exceptions.
@@ -514,8 +537,9 @@ justification for stabilizing them was to allow that. There are still corner
cases in the API which are being explored.
6. The implementation behind objects that are `!Unpin` is most likely unsafe.
Moving such a type after it has been pinned can cause the universe to crash. As of the time of writing
this book, creating and reading fields of a self referential struct still requires `unsafe`
(the only way to do it is to create a struct containing raw pointers to itself).
7. You can add a `!Unpin` bound on a type on nightly with a feature flag, or
by adding `std::marker::PhantomPinned` to your type on stable.
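
As a small sketch of that last point, this is all it takes on stable to make a
type `!Unpin`:

```rust
use std::marker::PhantomPinned;

struct MustStayPut {
    data: String,
    // This zero-sized marker opts the struct out of the auto-implemented
    // `Unpin`, so `MustStayPut: !Unpin` and pinning it actually restricts it.
    _marker: PhantomPinned,
}

fn main() {
    let _x = MustStayPut { data: "pinned".to_string(), _marker: PhantomPinned };
}
```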

View File

@@ -48,26 +48,33 @@ a `Future` has resolved and should be polled again.
```rust, noplaypen, ignore
// Our executor takes any object which implements the `Future` trait
fn block_on<F: Future>(mut future: F) -> F::Output {
// the first thing we do is to construct a `Waker` which we'll pass on to
// the `reactor` so it can wake us up when an event is ready.
let mywaker = Arc::new(MyWaker{ thread: thread::current() });
let waker = waker_into_waker(Arc::into_raw(mywaker));
// The context struct is just a wrapper for a `Waker` object. Maybe in the
// future this will do more, but right now it's just a wrapper.
let mut cx = Context::from_waker(&waker);
// So, since we run this on one thread and run one future to completion
// we can pin the `Future` to the stack. This is unsafe, but saves an
// allocation. We could `Box::pin` it too if we wanted. This is however
// safe since we shadow `future` so it can't be accessed again and will
// not move until it's dropped.
let mut future = unsafe { Pin::new_unchecked(&mut future) };
// We poll in a loop, but it's not a busy loop. It will only run when
// an event occurs, or a thread has a "spurious wakeup" (an unexpected wakeup
// that can happen for no good reason).
let val = loop {
match Future::poll(future.as_mut(), &mut cx) {
// when the Future is ready we're finished
Poll::Ready(val) => break val,
// If we get a `pending` future we just go to sleep...
Poll::Pending => thread::park(),
};
@@ -141,7 +148,7 @@ fn mywaker_wake(s: &MyWaker) {
// Since we use an `Arc` cloning is just increasing the refcount on the smart
// pointer.
fn mywaker_clone(s: &MyWaker) -> RawWaker {
let arc = unsafe { Arc::from_raw(s) };
std::mem::forget(arc.clone()); // increase ref count
RawWaker::new(Arc::into_raw(arc) as *const (), &VTABLE)
}
@@ -179,24 +186,30 @@ impl Task {
// This is our `Future` implementation
impl Future for Task {
// The output for our kind of `leaf future` is just a `usize`. For other
// futures this could be something more interesting like a byte array.
type Output = usize;
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let mut r = self.reactor.lock().unwrap();
// we check with the `Reactor` if this future is in its "readylist"
// i.e. if it's `Ready`
if r.is_ready(self.id) {
// if it is, we return the data. In this case it's just the ID of
// the task since this is just a very simple example.
Poll::Ready(self.id)
} else if self.is_registered {
// If the future is registered already, we just return `Pending`
Poll::Pending
} else {
// If we get here, it must be the first time this `Future` is polled
// so we register a task with our `reactor`
r.register(self.data, cx.waker().clone(), self.id);
// oh, we have to drop the lock on our `Mutex` here because we can't
// have a shared and exclusive borrow at the same time
drop(r);
@@ -232,29 +245,26 @@ We choose to pass in a reference to the whole `Reactor` here. This isn't normal.
The reactor will often be a global resource which lets us register interests
without passing around a reference.
> ### Why using thread park/unpark is a bad idea for a library
>
> It could deadlock easily since anyone could get a handle to the `executor thread`
> and call park/unpark on it.
>
> 1. A future could call `unpark` on the executor thread from a different thread
> 2. Our `executor` thinks that data is ready and wakes up and polls the future
> 3. The future is not ready yet when polled, but at that exact same time the
> `Reactor` gets an event and calls `wake()` which also unparks our thread.
> 4. This could happen before we go to sleep again since these processes
> run in parallel.
> 5. Our reactor has called `wake` but our thread is still sleeping since it was
> awake already at that point.
> 6. We're deadlocked and our program stops working.
> There is also the case that our thread could have what's called a
> `spurious wakeup` ([which can happen unexpectedly][spurious_wakeup]), which
> could cause the same deadlock if we're unlucky.
There are several better solutions, here are some:
- Use [std::sync::Condvar][condvar]
- Use [crossbeam::sync::Parker][crossbeam_parker]
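
To sketch the first idea: with a `Condvar` the wakeup flag is private to our
executor, so unrelated code can't park or unpark us, and spurious wakeups are
handled by re-checking the flag. A minimal version (my own sketch, not the
book's code) might look like this:

```rust, ignore
use std::sync::{Condvar, Mutex};

#[derive(Default)]
struct Parker {
    notified: Mutex<bool>,
    cvar: Condvar,
}

impl Parker {
    fn park(&self) {
        let mut notified = self.notified.lock().unwrap();
        // The loop guards against spurious wakeups: we only proceed once
        // someone has actually called `unpark`.
        while !*notified {
            notified = self.cvar.wait(notified).unwrap();
        }
        *notified = false;
    }

    fn unpark(&self) {
        // Record the wakeup in a flag only we know about, so a wakeup that
        // arrives while the executor is still awake isn't lost.
        *self.notified.lock().unwrap() = true;
        self.cvar.notify_one();
    }
}
```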
@@ -279,8 +289,19 @@ is a `Future`.
>registers interest with the global `Reactor` and no reference is needed.
We can call this kind of `Future` a "leaf Future", since it's some operation
we'll actually wait on, and which we can chain operations on that are performed
once the leaf future is ready.
The reactor we create here will also create **leaf-futures**, accept a waker and
call it once the task is finished.
The task we're implementing is the simplest I could find. It's a timer that
only spawns a thread and puts it to sleep for a number of seconds we specify
when acquiring the leaf-future.
To be able to run the code here in the browser there is not much real I/O we
can do, so just pretend that this actually represents some useful I/O operation
for the sake of this example.
**Our Reactor will look like this:**
@@ -288,10 +309,12 @@ once the leaf future is ready.
// This is a "fake" reactor. It does no real I/O, but that also makes our
// code possible to run in the book and in the playground
struct Reactor {
// we need some way of registering a Task with the reactor. Normally this
// would be an "interest" in an I/O event
dispatcher: Sender<Event>,
handle: Option<JoinHandle<()>>,
// This is a list of tasks that are ready, which means they should be polled
// for data.
readylist: Arc<Mutex<Vec<usize>>>,
@@ -316,11 +339,13 @@ impl Reactor {
// This `Vec` will hold handles to all threads we spawn so we can
// join them later on and finish our program in a good manner
let mut handles = vec![];
// This will be the "Reactor thread"
let handle = thread::spawn(move || {
for event in rx {
let rl_clone = rl_clone.clone();
match event {
// If we get a close event we break out of the loop we're in
Event::Close => break,
Event::Timeout(waker, duration, id) => {
@@ -328,12 +353,15 @@ impl Reactor {
// When we get an event we simply spawn a new thread
// which will simulate some I/O resource...
let event_handle = thread::spawn(move || {
//... by sleeping for the number of seconds
// we provided when creating the `Task`.
thread::sleep(Duration::from_secs(duration));
// When it's done sleeping we put the ID of this task
// on the "readylist"
rl_clone.lock().map(|mut rl| rl.push(id)).unwrap();
// Then we call `wake` which will wake up our
// executor and start polling the futures
waker.wake();
@@ -360,6 +388,7 @@ impl Reactor {
}
fn register(&mut self, duration: u64, waker: Waker, data: usize) {
// registering an event is as simple as sending an `Event` through
// the channel.
self.dispatcher
@@ -416,6 +445,7 @@ fn main() {
// Many runtimes create a global `reactor`; we pass it as an argument
let reactor = Reactor::new();
// Since we'll share this between threads we wrap it in an
// atomically-refcounted mutex.
let reactor = Arc::new(Mutex::new(reactor));
@@ -451,6 +481,7 @@ fn main() {
// This executor will block the main thread until the future is resolved
block_on(mainfut);
// When we're done, we want to shut down our reactor thread so our program
// ends nicely.
reactor.lock().map(|mut r| r.close()).unwrap();
@@ -471,15 +502,6 @@ fn main() {
# val
# }
#
# fn spawn<F: Future>(future: F) -> Pin<Box<F>> {
# let mywaker = Arc::new(MyWaker{ thread: thread::current() });
# let waker = waker_into_waker(Arc::into_raw(mywaker));
# let mut cx = Context::from_waker(&waker);
# let mut boxed = Box::pin(future);
# let _ = Future::poll(boxed.as_mut(), &mut cx);
# boxed
# }
#
# // ====================== FUTURE IMPLEMENTATION ==============================
# #[derive(Clone)]
# struct MyWaker {
@@ -632,12 +654,6 @@ The last point is relevant when we move on the the last paragraph.
## Async/Await and concurrent Futures
This is the first time we actually see the `async/await` syntax so let's
finish this book by explaining them briefly.
Hopefully, the `await` syntax looks pretty familiar. It has a lot in common
with `yield` and indeed, it works in much the same way.
The `async` keyword can be used on functions as in `async fn(...)` or on a
block as in `async { ... }`. Both will turn your function, or block, into a
`Future`.
@@ -645,13 +661,14 @@ block as in `async { ... }`. Both will turn your function, or block, into a
These `Futures` are rather simple. Imagine our generator from a few chapters
back. Every `await` point is like a `yield` point.
Instead of `yielding` a value we pass in, it yields the `Future` we're awaiting,
so when we poll a future the first time, we run the code up until the first
`await` point, where it yields a new `Future` which we poll, and so on, until we
reach a **leaf-future**.
Now, as is the case in our code, our `mainfut` contains two non-leaf futures
which it awaits, and all that happens is that these state machines are polled
as well until some "leaf future" in the end is finally polled and either
returns `Ready` or `Pending`.
until some "leaf future" in the end either returns `Ready` or `Pending`.
The way our example is right now, it's not much better than regular synchronous
code. For us to actually await multiple futures at the same time we somehow need
@@ -672,254 +689,14 @@ Future got 1 at time: 1.00.
Future got 2 at time: 2.00.
```
Now, this is the point where I'll refer you to some better resources for
implementing just that. You should have a pretty good understanding of the
concept of Futures by now.
The next step should be getting to know how more advanced runtimes work and
how they implement different ways of running Futures to completion.
I [challenge you to create a better version](./conclusion.md#building-a-better-exectuor).
That's actually it for now. There is probably much more to learn, but I think it
will be easier once the fundamental concepts are there and that further

View File

@@ -46,9 +46,11 @@ fn block_on<F: Future>(mut future: F) -> F::Output {
let mywaker = Arc::new(MyWaker{ thread: thread::current() });
let waker = waker_into_waker(Arc::into_raw(mywaker));
let mut cx = Context::from_waker(&waker);
// SAFETY: we shadow `future` so it can't be accessed again.
let mut future = unsafe { Pin::new_unchecked(&mut future) };
let val = loop {
match Future::poll(future.as_mut(), &mut cx) {
Poll::Ready(val) => break val,
Poll::Pending => thread::park(),
};
@@ -56,15 +58,6 @@ fn block_on<F: Future>(mut future: F) -> F::Output {
val
}
fn spawn<F: Future>(future: F) -> Pin<Box<F>> {
let mywaker = Arc::new(MyWaker{ thread: thread::current() });
let waker = waker_into_waker(Arc::into_raw(mywaker));
let mut cx = Context::from_waker(&waker);
let mut boxed = Box::pin(future);
let _ = Future::poll(boxed.as_mut(), &mut cx);
boxed
}
// ====================== FUTURE IMPLEMENTATION ==============================
#[derive(Clone)]
struct MyWaker {
@@ -86,7 +79,7 @@ fn mywaker_wake(s: &MyWaker) {
}
fn mywaker_clone(s: &MyWaker) -> RawWaker {
let arc = unsafe { Arc::from_raw(s) };
std::mem::forget(arc.clone()); // increase ref count
RawWaker::new(Arc::into_raw(arc) as *const (), &VTABLE)
}

View File

@@ -1,56 +1,66 @@
# Futures Explained in 200 Lines of Rust
This book aims to explain `Futures` in Rust using an example driven approach,
exploring why they're designed the way they are, what the alternatives are, and
how they work.
The goal is to get a better understanding of "async" in Rust by creating a toy
runtime consisting of a `Reactor` and an `Executor`, and our own futures which
we can run concurrently.
Going into the level of detail I do in this book is not needed to use futures
or async/await in Rust. It's for the curious out there that want to know _how_
it all works.
We'll start off a bit differently than most other explanations. Instead of
deferring some of the details about what `Futures` are and how they're
implemented, we tackle that head on first.
## What this book covers
I learn best when I can take basic, understandable concepts and build on these
building blocks piece by piece until everything is understood. This way, most
questions will be answered and explored up front, and the conclusions later on
seem natural.
This book will try to explain everything you might wonder about up until the
topic of different types of executors and runtimes. We'll just implement a very
simple runtime in this book, introducing some concepts, but it's enough to get
started.
[Stjepan Glavina](https://github.com/stjepang) has made an excellent series of
articles about async runtimes and executors, and if the rumors are right he's
even working on a new async runtime that should be easy enough to use as
learning material.
In the end I've made some reader exercises you can do if you want to fix some
of the most glaring omissions and shortcuts we took and create a slightly better
example yourself.
The way you should go about it is to read this book first, then continue
reading the [articles from stjepang](https://stjepang.github.io/) to learn more
about runtimes and how they work, especially:
1. [Build your own block_on()](https://stjepang.github.io/2020/01/25/build-your-own-block-on.html)
2. [Build your own executor](https://stjepang.github.io/2020/01/31/build-your-own-executor.html)
I've limited myself to a 200 line main example (hence the title) to limit the
scope and introduce an example that can easily be explored further.
However, there is a lot to digest and it's not what I would call easy, but we'll
take everything step by step, so get a cup of tea and relax.
I hope you enjoy the ride.
> This book is developed in the open, and contributions are welcome. You'll find
> [the repository for the book itself here][book_repo]. The final example which
> you can clone, fork or copy [can be found here][example_repo]. Any suggestions
> or improvements can be filed as a PR or in the issue tracker for the book.
## Reader exercises and further reading
In the last [chapter](conclusion.md) I've taken the liberty to suggest some
small exercises if you want to explore a little further.
This book is also the fourth book I have written about concurrent programming
in Rust. If you like it, you might want to check out the others as well:
- [Green Threads Explained in 200 lines of rust](https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/)
- [The Node Experiment - Exploring Async Basics with Rust](https://cfsamson.github.io/book-exploring-async-basics/)
- [Epoll, Kqueue and IOCP Explained with Rust](https://cfsamsonbooks.gitbook.io/epoll-kqueue-iocp-explained/)
## Credits and thanks
I'd like to take the chance to thank the people behind `mio`, `tokio`,
`async_std`, `Futures`, `libc`, `crossbeam` and many other libraries which so
much is built upon.
A special thanks to [jonhoo](https://github.com/jonhoo) who was kind enough to
give me some feedback on an early draft of this book. He has not read the
finished product and has in no way endorsed it, but a thanks is definitely due.
[mdbook]: https://github.com/rust-lang/mdBook
[book_repo]: https://github.com/cfsamson/books-futures-explained