Fork me on GitHub
#specter
<
2017-12-18
>
jrheard00:12:09

i’m sure there’s a better and faster way to express this, will stare at it until something comes to mind 😄

jrheard00:12:33

hell, sometimes the performance seems noticeably slower, sometimes it seems precisely the same or even faster

jrheard00:12:02

perf is actually pretty acceptable when run through advanced compliation, nice

jrheard00:12:55

benchmarking stuff now; as you’d expect, that code runs pretty fast (35ms in 1000000 runs) when x1 = x2, but about 50x slower (1400 ms in 1000000 runs) when x1 != x2

jrheard00:12:02

so the codepath i need to focus on optimizing is

(for [x (if (< x1 x2)
          (range x1 (inc x2))
          (reverse (range x2 (inc x1))))]
  (-> data
      (nth x)
      (nth y1)))
. hm - i’m actually not sure how to do this in a more performant way - the ranges and reverses don’t seem to be the issue afaict, it’s just the bit where we have to use nth to index into each column and then call nth on that column to find our value

nathanmarz01:12:41

@jrheard the select* codepath should call next-fn on each grid cell, rather than calling next-fn on a sequence of cells

nathanmarz01:12:47

that will be a lot more performant

jrheard01:12:51

o nice, i’ll try that

jrheard01:12:25

will do, thanks!

jrheard03:12:32

hm, that may have caused a speedup but not a drastic one

jrheard03:12:47

ended up with this:

(defnav
  grid-values
  [x1 y1 x2 y2]
  (select* [this structure next-fn]
           (assert (or (= x1 x2)
                       (= y1 y2)))

           (if (not (cell-is-on-grid x1 y1))
             ; If your starting cell isn't on the grid, you get nothing.
             []

             (let [x2 (bound-between x2 0 (dec GRID-WIDTH))
                   y2 (bound-between y2 0 (dec GRID-HEIGHT))]

               (if (= x1 x2)
                 (let [column (nth structure x1)]
                   (doseq [value (if (< y1 y2)
                                 (subvec column y1 (inc y2))
                                 (reverse (subvec column y2 (inc y1))))]
                     (next-fn value)))

                 (doseq [x (if (< x1 x2)
                           (range x1 (inc x2))
                           (reverse (range x2 (inc x1))))]
                   (next-fn
                     (-> structure
                         (nth x)
                         (nth y1))))))))

jrheard03:12:58

i noticed i had to use doseq rather than for - presumably next-fn is side-effecty?

jrheard03:12:02

nbd, just interesting

jrheard04:12:44

the only other perf thing i can think of is: rather than having grid be a 2d array, have it be a 1d array that’s primarily accessed via some function like (get-cell grid x y), so most consumers don’t treat it like a 1d array, but certain places like select* can treat it like a 1d array and perhaps get a speedup

jrheard04:12:51

which, i think i just have to try and see what happens, and benchmark before and after

nathanmarz04:12:04

doseq is not very fast

nathanmarz04:12:07

neither is reverse

jrheard04:12:11

is there anything less drastic that you’d recommend i try instead, or is this

(doseq [x (if (< x1 x2)
           (range x1 (inc x2))
           (reverse (range x2 (inc x1))))]
   (next-fn
     (-> structure
         (nth x)
         (nth y1))))
unimproveable?

jrheard04:12:34

i’ve benchmarked the reverse codepath and it doesn’t seem noticeably slower, it seems to really be the two-nths situation afaict

nathanmarz04:12:49

also, you're making two nth calls per element, when you could do one nth call per row

jrheard04:12:50

reverse is always getting a list of at most five items fwiw

jrheard04:12:22

i see how i can do one nth call in the case where i’m just doing down the Y axis - eg going from x=4,y=4 to x=4,y=8, and getting all those values

jrheard04:12:38

but i don’t see how i can do it if i’m going down the x axis, eg x=4,y=4 to x=8, y=4

jrheard04:12:38

i extremely agree with you that the two nth calls seem to be the culprit - is there some obvious better way of doing things that i’m not seeing?

nathanmarz04:12:01

calling range is also materializing an additional sequence that isn't necessary

nathanmarz04:12:29

the fastest way to do this is probably with just loop

jrheard04:12:30

(fwiw, in my second-to-most-recent snippet, there’s an if with two paths - the (= x1 x2) path is pretty fast, and it still has the range and reverse)

jrheard04:12:37

loop’s a great idea! i forgot it existed 🙂

jrheard04:12:42

i’ll try loop in the morning, thanks nathan!

jrheard17:12:48

started benchmarking things (just sloppily using cljs’s built-in simple-benchmark, nothing fancy). i have a pick-move function that i’m calling 50 times, which causes this navigator codepath to be executed about a jillion times. when i call next-fn a single time on the whole sequence, i see timings of about 9300ms for my benchmark. when i call next-fn multiple times, once per each item of the sequence, i see the same timings. this is probably because the sequence is always going to be at most five items long - i imagine that if calling next-fn once per item tends to give a speedup, perhaps the speedup is more obvious if you’re doing it on a ton of items as opposed to five? (i should also note that i’m not using advanced compilation while gathering this data)

jrheard17:12:20

i switched my slow, two-`nth`-calls doseq with range and reverse to a handrolled loop, and that brought me down from 9300ms to 7200ms, which is a good start!

jrheard17:12:57

i still wish i could figure out a way to not have to do two nth calls - the only one i can think of is to switch from having a 2d vector to having a 1d vector, and measuring the before-and-after

jrheard17:12:36

at this point i’m mainly digging into this just to satisfy my curiosity, perf’s acceptable for my toy app but i’d like to see if the 1d vector approach causes a huge speed boost, none at all, or somewhere inbetween; i’ll post the results later today when i’m done, and then i will stop spamming this channel 🙂

jrheard18:12:16

huh, basically zero speedup! my base benchmark timings are different for this one because i used a different board/hand combination to test the 2d vs 1d vector implementations; the 2d vector implementation had a benchmark time of 3100ms-3300ms, and so did the 1d vector implementation (presumably because vectors are implemented under the hood as trees, and so a 1d vector doesn’t guarantee you a contiguous block of memory)

jrheard18:12:13

so i guess this is as good as it’s gonna get; my thing still spends most of its time making those two nth calls, and i don’t see how that can be avoided if you’ve got a 2d vector of values, the first dimension holds the columns, and you’re trying to select all the values from eg x=4,y=4 to x=8,y=4. it seems to me that you have to call nth two times in that situation, unless i’m missing something

jrheard18:12:49

but anyway this has been very interesting and educational for me but no longer has much to do with specter. my app’s specter-related perf seems to be as good as it’s going to get for this query, so again, thanks for all the help and for building this great tool!

nathanmarz19:12:15

yea, for up and down it's two nth calls per element, but for left and right it's less

jrheard19:12:54

right exactly

jrheard19:12:12

the one-`nth`-call-and-then-`subvec` dimension is super fast, the two-`nth`-calls-per-element one is 50x slower, and it seems like that’s just a fact of life