version3 start

This commit is contained in:
Carl Fredrik Samson
2020-04-01 22:56:43 +02:00
parent 5c10caf9c5
commit 7f7fe098f3
16 changed files with 1427 additions and 30 deletions

View File

@@ -13,18 +13,6 @@ information that will help demystify some of the concepts we encounter.
Actually, after going through these concepts, implementing futures will seem
pretty simple. I promise.
## Popular alternatives for writing concurrent programs
So let's kick this off by first taking a brief look into what the popular
options we have for writing concurrent programs
### Callback based approcah
You probably already know this from Javascript since it's extremely common:
## Futures
So what is a future?

400
src/1_why_futures.md Normal file
View File

@@ -0,0 +1,400 @@
# Why Futures
Before we go into the details about Futures in Rust, let's take a quick look
at the alternatives for handling concurrent programming in general and some
pros and cons for each of them.
## Threads provided by the operating system
Now one way of accomplishing this is letting the OS take care of everything for
us. We do this by simply spawning a new OS thread for each task we want to
accomplish and write code like we normally would.
**Pros:**
- Simple
- Easy to use
- Switching between tasks is reasonably fast
- You get parallelism for free
**Cons:**
- OS level threads come with a rather large stack. If you have many tasks
waiting simultaneously (like you would in a web-server under heavy load) you'll
run out of memory pretty soon.
- There are a lot of syscalls involved. This can be pretty costly when the number
of tasks is high.
- The OS has many things it needs to handle. It might not switch back to your
thread as fast as you'd wish.
- Might not be an option on some systems
Using OS threads in Rust looks like this:
```rust
use std::thread;
fn main() {
println!("So we start the program here!");
let t1 = thread::spawn(move || {
thread::sleep(std::time::Duration::from_millis(200));
println!("We create tasks which gets run when they're finished!");
});
let t2 = thread::spawn(move || {
thread::sleep(std::time::Duration::from_millis(100));
println!("We can even chain callbacks...");
let t3 = thread::spawn(move || {
thread::sleep(std::time::Duration::from_millis(50));
println!("...like this!");
});
t3.join().unwrap();
});
println!("While our tasks are executing we can do other stuff here.");
t1.join().unwrap();
t2.join().unwrap();
}
```
## Green threads
Green threads has been popularized by GO in the recent years. Green threads
uses the same basic technique as operating systems does to handle concurrency.
Green threads are implemented by setting up a stack for each task you want to
execute and make the CPU "jump" from one stack to another to switch between
tasks.
The typical flow will be like this:
1. Run som non-blocking code
2. Make a blocking call to some external resource
3. CPU jumps to the "main" thread which schedules a different thread to run and
"jumps" to that stack
4. Run some non-blocking code on the new thread until a new blocking call or the
task is finished
5. "jumps" back to the "main" thread and so on
These "jumps" are know as context switches. Your OS is doing it many times each
second as you read this.
The main advantages are:
1. Simple to use. The code will look like it does when using OS threads.
2. A "context switch" is reasonably fast
3. Each stack only gets a little memory to start with so you can have hundred of
thousands of green threads running.
4. It's easy to incorporate [_preemtion_](https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/green-threads#preemptive-multitasking)
which puts a lot of control in the hands of the runtime implementors.
The main cons are:
1. The stacks might need to grow. Solving this is not easy and will have a cost.
2. You need to save all the CPU state on every switch
3. It's not a _zero cost abstraction_ (which is one of the reasons Rust removed
them early on).
4. Complicated to implement correctly if you want to support many different
platforms.
If you were to implement green threads in Rust, it could look something like
this:
The example presented below is from an earlier book I wrote about green
threads called [Green Threads Explained in 200 lines of Rust.](https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/)
If you want to know what's going on everything is explained in detail
in that book.
```rust
#![feature(asm)]
#![feature(naked_functions)]
use std::ptr;
const DEFAULT_STACK_SIZE: usize = 1024 * 1024 * 2;
const MAX_THREADS: usize = 4;
static mut RUNTIME: usize = 0;
pub struct Runtime {
threads: Vec<Thread>,
current: usize,
}
#[derive(PartialEq, Eq, Debug)]
enum State {
Available,
Running,
Ready,
}
struct Thread {
id: usize,
stack: Vec<u8>,
ctx: ThreadContext,
state: State,
}
#[derive(Debug, Default)]
#[repr(C)]
struct ThreadContext {
rsp: u64,
r15: u64,
r14: u64,
r13: u64,
r12: u64,
rbx: u64,
rbp: u64,
}
impl Thread {
fn new(id: usize) -> Self {
Thread {
id,
stack: vec![0_u8; DEFAULT_STACK_SIZE],
ctx: ThreadContext::default(),
state: State::Available,
}
}
}
impl Runtime {
pub fn new() -> Self {
let base_thread = Thread {
id: 0,
stack: vec![0_u8; DEFAULT_STACK_SIZE],
ctx: ThreadContext::default(),
state: State::Running,
};
let mut threads = vec![base_thread];
let mut available_threads: Vec<Thread> = (1..MAX_THREADS).map(|i| Thread::new(i)).collect();
threads.append(&mut available_threads);
Runtime {
threads,
current: 0,
}
}
pub fn init(&self) {
unsafe {
let r_ptr: *const Runtime = self;
RUNTIME = r_ptr as usize;
}
}
pub fn run(&mut self) -> ! {
while self.t_yield() {}
std::process::exit(0);
}
fn t_return(&mut self) {
if self.current != 0 {
self.threads[self.current].state = State::Available;
self.t_yield();
}
}
fn t_yield(&mut self) -> bool {
let mut pos = self.current;
while self.threads[pos].state != State::Ready {
pos += 1;
if pos == self.threads.len() {
pos = 0;
}
if pos == self.current {
return false;
}
}
if self.threads[self.current].state != State::Available {
self.threads[self.current].state = State::Ready;
}
self.threads[pos].state = State::Running;
let old_pos = self.current;
self.current = pos;
unsafe {
switch(&mut self.threads[old_pos].ctx, &self.threads[pos].ctx);
}
self.threads.len() > 0
}
pub fn spawn(&mut self, f: fn()) {
let available = self
.threads
.iter_mut()
.find(|t| t.state == State::Available)
.expect("no available thread.");
let size = available.stack.len();
unsafe {
let s_ptr = available.stack.as_mut_ptr().offset(size as isize);
let s_ptr = (s_ptr as usize & !15) as *mut u8;
ptr::write(s_ptr.offset(-24) as *mut u64, guard as u64);
ptr::write(s_ptr.offset(-32) as *mut u64, f as u64);
available.ctx.rsp = s_ptr.offset(-32) as u64;
}
available.state = State::Ready;
}
}
fn guard() {
unsafe {
let rt_ptr = RUNTIME as *mut Runtime;
(*rt_ptr).t_return();
};
}
pub fn yield_thread() {
unsafe {
let rt_ptr = RUNTIME as *mut Runtime;
(*rt_ptr).t_yield();
};
}
#[naked]
#[inline(never)]
unsafe fn switch(old: *mut ThreadContext, new: *const ThreadContext) {
asm!("
mov %rsp, 0x00($0)
mov %r15, 0x08($0)
mov %r14, 0x10($0)
mov %r13, 0x18($0)
mov %r12, 0x20($0)
mov %rbx, 0x28($0)
mov %rbp, 0x30($0)
mov 0x00($1), %rsp
mov 0x08($1), %r15
mov 0x10($1), %r14
mov 0x18($1), %r13
mov 0x20($1), %r12
mov 0x28($1), %rbx
mov 0x30($1), %rbp
ret
"
:
:"r"(old), "r"(new)
:
: "volatile", "alignstack"
);
}
fn main() {
let mut runtime = Runtime::new();
runtime.init();
runtime.spawn(|| {
println!("THREAD 1 STARTING");
let id = 1;
for i in 0..10 {
println!("thread: {} counter: {}", id, i);
yield_thread();
}
println!("THREAD 1 FINISHED");
});
runtime.spawn(|| {
println!("THREAD 2 STARTING");
let id = 2;
for i in 0..15 {
println!("thread: {} counter: {}", id, i);
yield_thread();
}
println!("THREAD 2 FINISHED");
});
runtime.run();
}
```
### Callback based approach
You probably already know this from Javascript since it's extremely common.
The whole idea behind a callback based approach is to save a pointer to a
set of instructions we want to run later on.
The basic idea of not involving threads as a primary way to achieve concurrency
is the common denominator for the rest of the approaches. Including the one
Rust uses today which we'll soon get to.
**Advantages:**
- Easy to implement in most languages
- No context switching
- Low memory overhead (in most cases)
**Drawbacks:**
- Each task must save the state it needs for later, the memory usage will grow
linearly with the number of tasks i .
- Can be hard to reason about, also known as "callback hell".
- Sharing state between tasks is a hard problem in Rust using this approach due
to it's ownership model.
The
If we did that in Rust it could look something like this:
```rust
fn program_main() {
println!("So we start the program here!");
set_timeout(200, || {
println!("We create tasks which gets run when they're finished!");
});
set_timeout(100, || {
println!("We can even chain callbacks...");
set_timeout(50, || {
println!("...like this!");
})
});
println!("While our tasks are executing we can do other stuff here.");
}
fn main() {
RT.with(|rt| rt.run(program_main));
}
use std::sync::mpsc::{channel, Receiver, Sender};
use std::{cell::RefCell, collections::HashMap, thread};
thread_local! {
static RT: Runtime = Runtime::new();
}
struct Runtime {
callbacks: RefCell<HashMap<usize, Box<dyn FnOnce() -> ()>>>,
next_id: RefCell<usize>,
evt_sender: Sender<usize>,
evt_reciever: Receiver<usize>,
}
fn set_timeout(ms: u64, cb: impl FnOnce() + 'static) {
RT.with(|rt| {
let id = *rt.next_id.borrow();
*rt.next_id.borrow_mut() += 1;
rt.callbacks.borrow_mut().insert(id, Box::new(cb));
let evt_sender = rt.evt_sender.clone();
thread::spawn(move || {
thread::sleep(std::time::Duration::from_millis(ms));
evt_sender.send(id).unwrap();
});
});
}
impl Runtime {
fn new() -> Self {
let (evt_sender, evt_reciever) = channel();
Runtime {
callbacks: RefCell::new(HashMap::new()),
next_id: RefCell::new(1),
evt_sender,
evt_reciever,
}
}
fn run(&self, program: fn()) {
program();
for evt_id in &self.evt_reciever {
let cb = self.callbacks.borrow_mut().remove(&evt_id).unwrap();
cb();
if self.callbacks.borrow().is_empty() {
break;
}
}
}
}
```

View File

@@ -2,6 +2,7 @@
[Introduction](./introduction.md)
- [Why Futures](./1_why_futures.md)
- [Some background information](./1_background_information.md)
- [Waker and Context](./2_waker_context.md)
- [Generators](./3_generators_pin.md)