Fork me on GitHub
#code-reviews
<
2021-12-30
>
stopa18:12:37

Hey team, would love some multi-threading advice. So, I’m writing an interpreter for Paul Graham’s “Bel”. I want to make it multi-threaded. Right now, my single single-threaded interpreter works like this:

(defn bel-eval [expression-stack return-stack] 
  (loop [es expression-stack rs return-stack] 
    (if (empty? es) (pop rs) 
    ;; nothing more to evaluate, return top of the return-stack 
    (let [top (peek es)
          rest-es (pop es)
          [es' rs'] (bel-eval-step rest-es rs top)] 
      (recur es' rs')))))
We work through an expression stack until it’s empty. When that’s the case, we return the top of the return-stack. Now, to make this multi-threaded. We need a way to: 1. “create” new threads 2. to run multiple “bel-eval-step”s for each thread in parallel 3. to “lock” evaluation, so only one thread can work when needed Here’s pseudo-code for what I am thinking we can do
(defn bel-eval-multi-threaded [expression-stack return-stack] 
  (let 
   [lock todo-1 ;; when taken, only one bel-thread is allowed to work
    workers todo-2 ;; a pool to run evaluation steps for different bel-threads in parallel.
    schedule todo-2 ;; when called, we schedule an "evaluation step"
    evaluation-step ;; each worker runs a bel-thread's evaluation step
    (fn [thread]
      ;; we do one evaluation step. 
      (let [next-step (bel-eval-step thread)]
        (cond
          ;; if it's a command to lock, we lock workers to this thread-id, so only the worker for this thread can do it's job
          (lock? next-step)
          (do (lock! (thread-id thread))
              (schedule (pop next-step)))
          ;; if it's unlock, we let all the threads work again
          (unlock? next-step)
          (do (unlock! (thread-id thread))
              (schedule (pop next-step)))
          ;; evalaution can produce a new thread. If that's the case 
          ;; we schedule the new thread, and our current next step too
          (new-thread? next-step)
          (schedule (peek next-step) (pop next-step))
          :else 
          ;; otherwise, we give the workers the next evaluation step for this thread
          (schedule next-step))))
    
    orchestrate todo-3 
    ;; some way to keep listening to workers, until there's no more threads
    ;; when that happens, we pop the return-stack of the last thread and return it
])
  ;; kick off by adding one piece of work for the workers
  (schedule [[(make-thread-id) expression-stack return-stack]])
)
Does this make sense? Basically, I want to create a loop, where one evaluation step runs per thread. Sometimes a thread can “stop” all other thread’s evaluation steps. What would be the most idiomatic way to do this? I haven’t done too much multi-threaded programming, and am not quite sure what would be the best combination of tools to use to achieve this. Thoughts much appreciated!

emccue18:12:57

do you want concurrency or parallelism?

emccue18:12:54

I ask because you can make logical threads that are handled by only one “platform” thread pretty easily with continuations in a loom preview

stopa18:12:42

I would love to achieve parallelism! (with the support for locking)

emccue18:12:04

ah okay, so thats harder

😆 1
stopa18:12:09

Indeed! Quite a doosey to think through

emccue18:12:34

just thinking out loud here - lets model each thread like this

❤️ 1
emccue18:12:26

a thread executing code has a loop like this

- read top of stack
- execute code, get new stack
- see if it has a message from its orchestrator
  - if the message tells it to wait on a lock, it waits on a lock
- if top of stack from before indicates that it wants other threads to wait, send a message to its orchestrator

emccue18:12:13

and you have one topmost thread that controls all the others

emccue18:12:24

and the threads communicate via channels

❤️ 1
emccue18:12:34

and the messages can just be mutexes they want to be waited on - so one thread aquires a mutex and sends it up to its orchestrator to be sent down again to the others

❤️ 1
emccue18:12:13

but its still tricky and im like 90% sure i dont have a full picture - like do you need confirmation each thread is locked before continuing?

emccue18:12:06

i guess you could track the number of locks and an orchestrator would know exactly how many threads it controls

emccue18:12:33

so it can send a confirmation back to a thread allowing it to continue

emccue18:12:37

or send an “i am waiting” message back

stopa18:12:54

Indeed, this is quite like what I have in mind too! I’m not quite sure how to convert this to code, but will give it a shot. and update

stopa19:12:53

Refined the pseudo-code a bit: https://github.com/stopachka/bel-clojure/pull/18/files#diff-714d27f379dceef9ed3877d5690c96717565dc82f03fe3942cf31ab5e92da813R720-R752 In essence, following your idea @emccue. If I can figure out the right way to represent orchestration, and the lock? unlock? lock! functions, we may be somewhere!

phronmophobic20:12:06

what types of mutable references does bel have? Does Bel have any specification for what should happen in the presence of multiple parallel evaluations?

phronmophobic20:12:51

One option is to use refs for all mutable references, but that still doesn't make it easy for users of bel to write correct multi-threaded code.

phronmophobic20:12:35

will your Bel implementation provide any interop to jvm code? If so, you could take clojure's approach and just provide access to JVM primitives like locks, volatile, etc and leave it up to the Bel user to figure out how to structure their program for parallelism

phronmophobic20:12:32

It's not clear if you're trying to implement a GIL (global interpreter lock). If not, in what cases would multiple threads be allowed to run simultaneously?

stopa21:12:59

@smith.adriane, your question cleared things up quite a bit, thank you! I looked over his spec once more, and I realized…he doesn’t actually do parallelism, but concurrency. The way PG does it is this: A user can create multiple “bel” threads. The interpreter runs on one system thread, but cycles through each “bel thread”. When deciding to cycle the thread, if lock is bound to some truthy value in that thread’s environment, the scheduler will stop cycling and only work on that thread. Bam. Will implement above!

toot 1
stopa23:12:26

https://github.com/stopachka/bel-clojure/pull/19/files <- for posterity — it woorked! Really appreciate this community — it’s made me a better engineer

🙌 1