version3 start

2020-04-01 22:56:43 +02:00
parent 5c10caf9c5
commit 7f7fe098f3
16 changed files with 1427 additions and 30 deletions
--- a/src/1_background_information.md
+++ b/src/1_background_information.md
@@ -13,18 +13,6 @@ information that will help demystify some of the concepts we encounter.
 Actually, after going through these concepts, implementing futures will seem
 pretty simple. I promise.

-## Popular alternatives for writing concurrent programs
-
-So let's kick this off by first taking a brief look into what the popular
-options we have for writing concurrent programs
-
-### Callback based approcah
-
-You probably already know this from Javascript since it's extremely common:
-
-
-
-
 ## Futures

 So what is a future?
--- a/src/1_why_futures.md
+++ b/src/1_why_futures.md
@@ -0,0 +1,400 @@
+# Why Futures
+
+Before we go into the details about Futures in Rust, let's take a quick look
+at the alternatives for handling concurrent programming in general and some
+pros and cons for each of them.
+
+## Threads provided by the operating system
+
+One way of handling concurrency is to let the OS take care of everything for
+us. We do this by simply spawning a new OS thread for each task we want to
+accomplish and writing code like we normally would.
+
+**Pros:**
+
+- Simple
+- Easy to use
+- Switching between tasks is reasonably fast
+- You get parallelism for free
+
+**Cons:**
+
+- OS-level threads come with a rather large stack. If you have many tasks
+waiting simultaneously (like you would in a web server under heavy load), you'll
+run out of memory pretty soon.
+- There are a lot of syscalls involved. This can be pretty costly when the number
+of tasks is high.
+- The OS has many things it needs to handle. It might not switch back to your
+thread as fast as you'd wish.
+- Might not be an option on some systems
+
+Using OS threads in Rust looks like this:
+
+```rust
+use std::thread;
+
+fn main() {
+    println!("So we start the program here!");
+    let t1 = thread::spawn(move || {
+        thread::sleep(std::time::Duration::from_millis(200));
+        println!("We create tasks which gets run when they're finished!");
+    });
+
+    let t2 = thread::spawn(move || {
+        thread::sleep(std::time::Duration::from_millis(100));
+        println!("We can even chain callbacks...");
+        let t3 = thread::spawn(move || {
+            thread::sleep(std::time::Duration::from_millis(50));
+            println!("...like this!");
+        });
+        t3.join().unwrap();
+    });
+    println!("While our tasks are executing we can do other stuff here.");
+
+    t1.join().unwrap();
+    t2.join().unwrap();
+}
+```
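+
+The stack cost listed among the cons can be made visible with
+`std::thread::Builder`, which lets you request a stack size per thread. The
+following is only a minimal sketch: the helper name `sum_on_small_stack` and the
+32 KiB figure are assumptions made for illustration, and the platform may round
+the requested size up to its own minimum.
+
+```rust
+use std::thread;
+
+// Hypothetical helper for this sketch: runs a small computation on a thread
+// whose stack size we request explicitly instead of using the default.
+fn sum_on_small_stack(stack_bytes: usize) -> u64 {
+    let handle = thread::Builder::new()
+        .stack_size(stack_bytes)
+        .spawn(|| (1..=100u64).sum::<u64>())
+        .expect("failed to spawn thread");
+    handle.join().expect("thread panicked")
+}
+
+fn main() {
+    // Assumption: 32 KiB is enough for this tiny closure.
+    let sum = sum_on_small_stack(32 * 1024);
+    println!("sum computed on a small stack: {}", sum);
+}
+```
+
+Shrinking stacks by hand trades memory for the risk of a stack overflow, so it
+does not remove the underlying cost of one stack per task.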
+
+## Green threads
+
+Green threads have been popularized by Go in recent years. They use the same
+basic technique as operating systems do to handle concurrency.
+
+Green threads are implemented by setting up a stack for each task you want to
+execute and making the CPU "jump" from one stack to another to switch between
+tasks.
+
+The typical flow will be like this:
+
+1. Run some non-blocking code
+2. Make a blocking call to some external resource
+3. The CPU jumps to the "main" thread, which schedules a different thread to run
+and "jumps" to that stack
+4. Run some non-blocking code on the new thread until a new blocking call or the
+task is finished
+5. "Jump" back to the "main" thread, and so on
+
+These "jumps" are known as context switches. Your OS is doing it many times each
+second as you read this.
+
+
+The main advantages are:
+
+1. Simple to use. The code will look like it does when using OS threads.
+2. A "context switch" is reasonably fast
+3. Each stack only gets a little memory to start with, so you can have hundreds of
+thousands of green threads running.
+4. It's easy to incorporate [_preemption_](https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/green-threads#preemptive-multitasking),
+which puts a lot of control in the hands of the runtime implementors.
+
+The main cons are:
+
+1. The stacks might need to grow. Solving this is not easy and will have a cost.
+2. You need to save all the CPU state on every switch
+3. It's not a _zero cost abstraction_ (which is one of the reasons Rust removed
+them early on).
+4. Complicated to implement correctly if you want to support many different
+platforms.
+
+If you were to implement green threads in Rust, it could look something like
+this:
+
+The example presented below is from an earlier book I wrote about green
+threads called [Green Threads Explained in 200 lines of Rust](https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/).
+If you want to know what's going on, everything is explained in detail
+in that book.
+
+```rust
+#![feature(asm)]
+#![feature(naked_functions)]
+use std::ptr;
+
+const DEFAULT_STACK_SIZE: usize = 1024 * 1024 * 2;
+const MAX_THREADS: usize = 4;
+static mut RUNTIME: usize = 0;
+
+pub struct Runtime {
+    threads: Vec<Thread>,
+    current: usize,
+}
+
+#[derive(PartialEq, Eq, Debug)]
+enum State {
+    Available,
+    Running,
+    Ready,
+}
+
+struct Thread {
+    id: usize,
+    stack: Vec<u8>,
+    ctx: ThreadContext,
+    state: State,
+}
+
+#[derive(Debug, Default)]
+#[repr(C)]
+struct ThreadContext {
+    rsp: u64,
+    r15: u64,
+    r14: u64,
+    r13: u64,
+    r12: u64,
+    rbx: u64,
+    rbp: u64,
+}
+
+impl Thread {
+    fn new(id: usize) -> Self {
+        Thread {
+            id,
+            stack: vec![0_u8; DEFAULT_STACK_SIZE],
+            ctx: ThreadContext::default(),
+            state: State::Available,
+        }
+    }
+}
+
+impl Runtime {
+    pub fn new() -> Self {
+        let base_thread = Thread {
+            id: 0,
+            stack: vec![0_u8; DEFAULT_STACK_SIZE],
+            ctx: ThreadContext::default(),
+            state: State::Running,
+        };
+
+        let mut threads = vec![base_thread];
+        let mut available_threads: Vec<Thread> = (1..MAX_THREADS).map(|i| Thread::new(i)).collect();
+        threads.append(&mut available_threads);
+        Runtime {
+            threads,
+            current: 0,
+        }
+    }
+
+    pub fn init(&self) {
+        unsafe {
+            let r_ptr: *const Runtime = self;
+            RUNTIME = r_ptr as usize;
+        }
+    }
+
+    pub fn run(&mut self) -> ! {
+        while self.t_yield() {}
+        std::process::exit(0);
+    }
+
+    fn t_return(&mut self) {
+        if self.current != 0 {
+            self.threads[self.current].state = State::Available;
+            self.t_yield();
+        }
+    }
+
+    fn t_yield(&mut self) -> bool {
+        let mut pos = self.current;
+        while self.threads[pos].state != State::Ready {
+            pos += 1;
+            if pos == self.threads.len() {
+                pos = 0;
+            }
+            if pos == self.current {
+                return false;
+            }
+        }
+        if self.threads[self.current].state != State::Available {
+            self.threads[self.current].state = State::Ready;
+        }
+        self.threads[pos].state = State::Running;
+        let old_pos = self.current;
+        self.current = pos;
+        unsafe {
+            switch(&mut self.threads[old_pos].ctx, &self.threads[pos].ctx);
+        }
+        self.threads.len() > 0
+    }
+
+    pub fn spawn(&mut self, f: fn()) {
+        let available = self
+            .threads
+            .iter_mut()
+            .find(|t| t.state == State::Available)
+            .expect("no available thread.");
+        let size = available.stack.len();
+        unsafe {
+            let s_ptr = available.stack.as_mut_ptr().offset(size as isize);
+            let s_ptr = (s_ptr as usize & !15) as *mut u8;
+            ptr::write(s_ptr.offset(-24) as *mut u64, guard as u64);
+            ptr::write(s_ptr.offset(-32) as *mut u64, f as u64);
+            available.ctx.rsp = s_ptr.offset(-32) as u64;
+        }
+        available.state = State::Ready;
+    }
+}
+
+fn guard() {
+    unsafe {
+        let rt_ptr = RUNTIME as *mut Runtime;
+        (*rt_ptr).t_return();
+    };
+}
+
+pub fn yield_thread() {
+    unsafe {
+        let rt_ptr = RUNTIME as *mut Runtime;
+        (*rt_ptr).t_yield();
+    };
+}
+
+#[naked]
+#[inline(never)]
+unsafe fn switch(old: *mut ThreadContext, new: *const ThreadContext) {
+    asm!("
+        mov     %rsp, 0x00($0)
+        mov     %r15, 0x08($0)
+        mov     %r14, 0x10($0)
+        mov     %r13, 0x18($0)
+        mov     %r12, 0x20($0)
+        mov     %rbx, 0x28($0)
+        mov     %rbp, 0x30($0)
+   
+        mov     0x00($1), %rsp
+        mov     0x08($1), %r15
+        mov     0x10($1), %r14
+        mov     0x18($1), %r13
+        mov     0x20($1), %r12
+        mov     0x28($1), %rbx
+        mov     0x30($1), %rbp
+        ret
+        "
+    :
+    :"r"(old), "r"(new)
+    :
+    : "volatile", "alignstack"
+    );
+}
+
+fn main() {
+    let mut runtime = Runtime::new();
+    runtime.init();
+    runtime.spawn(|| {
+        println!("THREAD 1 STARTING");
+        let id = 1;
+        for i in 0..10 {
+            println!("thread: {} counter: {}", id, i);
+            yield_thread();
+        }
+        println!("THREAD 1 FINISHED");
+    });
+    runtime.spawn(|| {
+        println!("THREAD 2 STARTING");
+        let id = 2;
+        for i in 0..15 {
+            println!("thread: {} counter: {}", id, i);
+            yield_thread();
+        }
+        println!("THREAD 2 FINISHED");
+    });
+    runtime.run();
+}
+```
+
+## Callback based approach
+
+You probably already know this from JavaScript since it's extremely common.
+The whole idea behind a callback based approach is to save a pointer to a
+set of instructions we want to run later on.
+
+The basic idea of not involving threads as a primary way to achieve concurrency
+is the common denominator for the rest of the approaches, including the one
+Rust uses today, which we'll soon get to.
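+
+The idea of saving a set of instructions to run later can be sketched in a few
+lines. This is a minimal illustration, not the runtime used later in this
+chapter: `CallbackQueue`, `register` and `run_all` are hypothetical names
+invented for this sketch.
+
+```rust
+// Hypothetical type for this sketch: stores closures now, runs them later.
+struct CallbackQueue {
+    callbacks: Vec<Box<dyn FnOnce() -> String>>,
+}
+
+impl CallbackQueue {
+    fn new() -> Self {
+        CallbackQueue { callbacks: Vec::new() }
+    }
+
+    // Saves the closure (and the state it captured) without running it.
+    fn register(&mut self, cb: impl FnOnce() -> String + 'static) {
+        self.callbacks.push(Box::new(cb));
+    }
+
+    // Runs every stored closure in registration order and empties the queue.
+    fn run_all(&mut self) -> Vec<String> {
+        self.callbacks.drain(..).map(|cb| cb()).collect()
+    }
+}
+
+fn main() {
+    let mut queue = CallbackQueue::new();
+    let name = String::from("first");
+    queue.register(move || format!("{} callback ran", name));
+    queue.register(|| String::from("second callback ran"));
+    println!("Nothing has run yet, the closures are only stored.");
+    for line in queue.run_all() {
+        println!("{}", line);
+    }
+}
+```
+
+Nothing runs when a closure is registered. The queue only stores it, together
+with whatever state the closure captured, until `run_all` is called.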
+
+**Advantages:**
+
+- Easy to implement in most languages
+- No context switching
+- Low memory overhead (in most cases)
+
+**Drawbacks:**
+
+- Each task must save the state it needs for later, so memory usage will grow
+linearly with the number of tasks.
+- Can be hard to reason about, a problem also known as "callback hell".
+- Sharing state between tasks is a hard problem in Rust using this approach due
+to its ownership model.
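+
+The last drawback can be illustrated with a small sketch. Stored callbacks have
+to own what they capture, so several callbacks cannot each hold a mutable
+reference to the same counter. One common single-threaded workaround, assumed
+here purely for illustration, is to share the value through `Rc<RefCell<_>>`.
+The function name `run_shared_counter` is hypothetical.
+
+```rust
+use std::cell::RefCell;
+use std::rc::Rc;
+
+// Hypothetical helper for this sketch: three callbacks update one counter.
+fn run_shared_counter() -> u32 {
+    let counter = Rc::new(RefCell::new(0_u32));
+    let mut callbacks: Vec<Box<dyn FnOnce()>> = Vec::new();
+
+    for step in 1..=3_u32 {
+        // Each callback gets its own reference-counted handle to the counter.
+        let counter = Rc::clone(&counter);
+        callbacks.push(Box::new(move || {
+            // The borrow is checked at runtime instead of at compile time.
+            *counter.borrow_mut() += step;
+        }));
+    }
+
+    // Run the stored callbacks later, one after the other.
+    for cb in callbacks {
+        cb();
+    }
+
+    let total = *counter.borrow();
+    total
+}
+
+fn main() {
+    println!("total = {}", run_shared_counter());
+}
+```
+
+The extra wrapping and the runtime borrow checks are part of the price of
+sharing state between callbacks this way.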
+
+If we wrote callback based code in Rust, it could look something like this:
+
+```rust
+fn program_main() {
+    println!("So we start the program here!");
+    set_timeout(200, || {
+        println!("We create tasks which gets run when they're finished!");
+    });
+    set_timeout(100, || {
+        println!("We can even chain callbacks...");
+        set_timeout(50, || {
+            println!("...like this!");
+        })
+    });
+    println!("While our tasks are executing we can do other stuff here.");
+}
+
+fn main() {
+    RT.with(|rt| rt.run(program_main));
+}
+
+use std::sync::mpsc::{channel, Receiver, Sender};
+use std::{cell::RefCell, collections::HashMap, thread};
+
+thread_local! {
+    static RT: Runtime = Runtime::new();
+}
+
+struct Runtime {
+    callbacks: RefCell<HashMap<usize, Box<dyn FnOnce() -> ()>>>,
+    next_id: RefCell<usize>,
+    evt_sender: Sender<usize>,
+    evt_reciever: Receiver<usize>,
+}
+
+fn set_timeout(ms: u64, cb: impl FnOnce() + 'static) {
+    RT.with(|rt| {
+        let id = *rt.next_id.borrow();
+        *rt.next_id.borrow_mut() += 1;
+        rt.callbacks.borrow_mut().insert(id, Box::new(cb));
+        let evt_sender = rt.evt_sender.clone();
+        thread::spawn(move || {
+            thread::sleep(std::time::Duration::from_millis(ms));
+            evt_sender.send(id).unwrap();
+        });
+    });
+}
+
+impl Runtime {
+    fn new() -> Self {
+        let (evt_sender, evt_reciever) = channel();
+        Runtime {
+            callbacks: RefCell::new(HashMap::new()),
+            next_id: RefCell::new(1),
+            evt_sender,
+            evt_reciever,
+        }
+    }
+
+    fn run(&self, program: fn()) {
+        program();
+        for evt_id in &self.evt_reciever {
+            let cb = self.callbacks.borrow_mut().remove(&evt_id).unwrap();
+            cb();
+            if self.callbacks.borrow().is_empty() {
+                break;
+            }
+        }
+    }
+}
+```
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -2,6 +2,7 @@

 [Introduction](./introduction.md)

+- [Why Futures](./1_why_futures.md)
 - [Some background information](./1_background_information.md)
 - [Waker and Context](./2_waker_context.md)
 - [Generators](./3_generators_pin.md)