#off-topic
2022-01-01
mauricio.szabo 02:01:58

One thing that's quite hard for me when I'm purchasing a computer is comparing the speed of CPUs. I usually use https://www.passmark.com/ to look up benchmark scores, because Intel's naming is quite bizarre and I never know whether one CPU is faster than another (and vendors LOVE to use buzzwords like "9th generation" and such)

lilactown 06:01:22

I'd like to describe a specific problem I have and see if someone can tell me whether it fits a more general problem, so I can find more resources on it.
Preamble: I have a task I'm trying to execute whose process I can pause and resume between each step. I'm sharing a single thread cooperatively with other code, and I want to make sure I don't spend more than a deadline (let's say 50ms) running the task. Naively, after each step I check how much time the task has taken, and once 50ms has passed I yield and resume at a later time.
Problem: going way over the deadline is bad, but each deadline check costs CPU. I'd like to optimize how often I check the deadline. E.g. checking the deadline every step is obviously wasting cycles if each step takes 0.05ms on average, but checking only once every 60 steps means I'm definitely going to overshoot. My vague idea is that once I have a sample of how long a step takes, I can tune how often I check the deadline. Beyond that I'm not sure what a good strategy is just yet.
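For concreteness, a minimal ClojureScript sketch of the naive check-every-step approach described above; step!, done?, and schedule-resume! are invented helper names, not anything from this thread:

;; Naive approach: read the clock after every single step.
(defn run-task [task deadline-ms]
  (let [start (js/performance.now)]
    (loop []
      (when-not (done? task)
        (step! task)
        (if (< (- (js/performance.now) start) deadline-ms)
          (recur)
          ;; budget spent: yield the thread and pick the task up later
          (schedule-resume! #(run-task task deadline-ms)))))))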

hiredman 06:01:23

You might take a look at BEAM, the Erlang VM. Historically, at least, the way it does scheduling is by counting "reductions" instead of measuring the actual runtime of processes, and counting something like that as a proxy for time might be less painful
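A rough sketch of what counting work units instead of milliseconds could look like, reusing the same invented step!/done?/schedule-resume! helpers; the idea is only loosely modeled on BEAM's reduction counting, and the budget value would be arbitrary:

;; Count "reductions" (steps) instead of reading the clock at all;
;; yield once the budget is spent.
(defn run-with-reductions [task budget]
  (loop [reds budget]
    (when-not (done? task)
      (step! task)
      (if (pos? reds)
        (recur (dec reds))
        (schedule-resume! #(run-with-reductions task budget))))))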

lilactown 07:01:30

hmm interesting

Jimmy Miller 18:01:57

I'd imagine you could tune this as it goes. Have a variable n that represents the number of steps to run. Run n steps, then check the time: if you're past the deadline, decrement n; if you're under it, increment. Of course that isn't foolproof. If you can have a random task that takes the full deadline and is completely unpredictable, that might be bad. But if that's the case, you probably want to be looking at different processes getting their “fair share”, so that if one exceeds a deadline it might not get scheduled again right away. I can imagine various other approaches if you can do things like async timers. But what you're talking about is cooperative scheduling, so looking into that literature might help.
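A sketch of that tuning loop, under the same invented helpers as above; the increment/decrement policy is exactly the naive one just described:

;; Adaptively tune n, the number of steps run blind between clock checks.
(defn run-adaptive [task deadline-ms n]
  (let [start (js/performance.now)]
    (loop [n n]
      (dotimes [_ n]
        (when-not (done? task)
          (step! task)))
      (cond
        (done? task) nil
        ;; still under the deadline: check less often next round
        (< (- (js/performance.now) start) deadline-ms) (recur (inc n))
        ;; over the deadline: yield, and check more often next time
        :else (schedule-resume!
               #(run-adaptive task deadline-ms (max 1 (dec n))))))))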

lilactown 19:01:44

yeah. Right now I've implemented what is essentially round-robin scheduling (https://en.wikipedia.org/wiki/Round-robin_scheduling), and now I'm trying to see if I can optimize it by intelligently changing the number of steps between checks. Going from checking every step to checking every 10 steps speeds up calculating a Fibonacci number by about 10%

emccue 02:01:57

hmm, can you identify tasks discretely?

emccue 02:01:16

like maybe the solution is to run tasks and learn “weights” for each of them?

lilactown 03:01:32

yes, each task gets an ID. I can also use the function and args used to start the task as a potential identity
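One way the “weights” idea might look, sketched entirely with invented names: keep an exponential moving average of observed cost per step, keyed by task ID, and use it to pick how many steps to run between deadline checks:

;; task-id -> EWMA of ms per step, learned from observed runs
(defonce step-cost (atom {}))

(defn observe! [task-id elapsed-ms steps]
  (let [sample (/ elapsed-ms (max 1 steps))]
    (swap! step-cost update task-id
           (fn [avg] (if avg
                       (+ (* 0.9 avg) (* 0.1 sample))
                       sample)))))

(defn batch-size [task-id deadline-ms]
  (let [avg (get @step-cost task-id 1)]
    ;; aim to fill roughly half the budget before the next clock check
    (max 1 (js/Math.floor (/ deadline-ms (* 2 avg))))))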

emccue 04:01:52

What are the "workloads" of the tasks?

emccue 04:01:19

Oh fib, nvm

chucklehead 06:01:34

It's possible I've been doing too much network programming lately, but this strikes me as analogous to trying to size the TCP congestion window to maximize in-flight packets (consecutive tasks) while avoiding drops (missed deadlines). So, in terms of strategy, you might be able to crib from TCP congestion control/avoidance techniques like additive-increase/multiplicative-decrease to converge on a number of tasks per cycle. Or slow start/fast recovery, random early detection, etc. if you need to account for intermittent slowdowns or bursty tasks.

👀 1
chucklehead 06:01:09

e.g. something like

init:
  m = n = 1

loop forever:
  i = 0
  while i < m: 
    # run m tasks without pausing
    run task[i++]
  while i < n and rand(n) > i: 
    # run up to n-m tasks with some probability of early interruption (presumes rand cheaper than checking deadline)
    run task[i++]
  while !deadline: 
    # run single tasks until deadline
    run task[i++]
  if i < n: 
    # deadline was caught by an early interrupt, multiplicative decrease on low end, cap high end to current
    m = max(m/2,1); n = i
  else: 
    # could've done more work, additive increase on low end, multiplicative on high
    m += (n-m)/2; n *= 2 # or more conservative: m++; n = i

  yield

Cora (she/her) 06:01:04

can checking the deadline be made cheaper, somehow, so you don't need to do all of this?

lilactown 07:01:40

@U3JH98J4R the tasks are arbitrary, but I'm testing this with fib rn

lilactown 07:01:17

@U015879P2F8 that's super interesting, I'll look into TCP congestion

lilactown 07:01:54

@U02N27RK69K I'm not sure, the deadline check code is fairly simple:

(let [start (js/performance.now)]
  (fn check-time []
    ;; ms (the deadline budget) is captured from the enclosing scope;
    ;; returns the remaining time in milliseconds
    (- ms (- (js/performance.now) start))))
it's possible there's some complex magic to make this faster, but I think I'm dying from a thousand (or hundred thousand 😂) paper cuts and I need to slow the bleeding
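One thing that might slow it, sketched here as an assumption rather than a known win: amortize the js/performance.now call by only reading the real clock on every k-th call, returning a slightly stale remaining-time value in between:

;; Same interface as check-time above, but only touches the real clock
;; every k-th call; `k` trades deadline accuracy for fewer clock reads.
(defn make-cheap-check [ms k]
  (let [start (js/performance.now)
        calls (volatile! 0)
        left  (volatile! ms)]
    (fn check-time []
      (when (zero? (mod (vswap! calls inc) k))
        (vreset! left (- ms (- (js/performance.now) start))))
      @left)))

Whether this actually helps depends on whether reading the clock dominates the check, versus the surrounding bookkeeping.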