This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-12-05
In the end, I made a mistake by using http://repl.it
I started my solution by also updating the new location, which made the problem a lot harder and my answer incorrect 😕 Part B took some time indeed
Heading to bed, but I used a vector and update, no transients, and part 2 took about 10s to run for me.
I used iterate - easy to think about a state -> state function and an exit condition on that state.
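A minimal sketch of that iterate idea for part 1 (names like `step` and `solve` are made up, not the poster's actual code): a state -> state function plus an exit predicate on the state.

```clojure
;; State: the jump vector, the current position, and a step counter.
(defn step [{:keys [jumps pos n]}]
  (let [offset (jumps pos)]
    {:jumps (update jumps pos inc)  ; part 1 rule: always increment
     :pos   (+ pos offset)
     :n     (inc n)}))

(defn solve [jumps]
  (->> {:jumps jumps :pos 0 :n 0}
       (iterate step)
       (drop-while #(< -1 (:pos %) (count (:jumps %))))  ; exit: pos escapes
       first
       :n))

(solve [0 3 0 1 -3]) ;=> 5 (the AoC 2017 day 5 sample answer)
```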
@dpsutton I used an int-array, took about 120ms to solve 2 🙂
So yeah, Clojure is fast 😇
wooo, it’s finally finished! Probably took around 10 minutes, though I was running it through a REPL instance in Visual studio code.
Using update-in and a vector - Part 2 took 36 seconds on my little computer.
This is my slow solution: https://pastebin.com/Cj4VZVR3 I’m new to Clojure, if anyone can see something I’m doing that’s really stupid (and maybe explains the slowness) that’d be very helpful
@U892TNBRN you should inline inc-accordingly, when you define it as a function it will convert primitive ints to objects
@U892TNBRN and maybe more importantly, you should probably type-hint nrs with ^ints nrs
evaluate (set! *warn-on-reflection* true)
before running your code to see if that's the case
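A minimal sketch of what that ^ints hint buys (function names hypothetical): without it, the compiler can't resolve aget statically and falls back to reflection, which *warn-on-reflection* flags exactly like the warnings pasted below.

```clojure
(set! *warn-on-reflection* true)

;; Unhinted: aget on an untyped argument -> reflection warning at compile
(defn first-jump-unhinted [nrs]
  (aget nrs 0))

;; Hinted: compiles to a direct primitive array load, no warning
(defn first-jump-hinted [^ints nrs]
  (aget nrs 0))
```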
thanks for the suggestions, very interesting! I’m missing the understanding of why those things make a difference though, I’ll do some googling to find that out 🙂
null:12 recur arg for primitive local: current_index is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: current-index
Reflection warning, projects/clojure/advent-of-code/day5/solution.clj:8:37 - call to static method alength on clojure.lang.RT can't be resolved (argument types: unknown).
Reflection warning, projects/clojure/advent-of-code/day5/solution.clj:10:21 - call to static method aget on clojure.lang.RT can't be resolved (argument types: int, int).
@U892TNBRN you have a bug actually, you want to do (alength nrs)
well not a bug actually, but probably not what you intended either
How can I get rid of this warning:
Boxed math warning, ... - call: public static boolean (java.lang.Object,long).
Trying mutable arrays for speedup
I have one variant with a vector, a transient and an array: https://github.com/borkdude/aoc2017/blob/master/src/day5.clj I’m still wondering why the array version is so slow
@val_waeselynck Do you have your solution somewhere?
yeah, transient seems fastest for me too:
advent.2017.day5> (c/quick-bench (solve2 input))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 6.091951 sec
Execution time std-deviation : 254.233063 ms
Execution time lower quantile : 5.895512 sec ( 2.5%)
Execution time upper quantile : 6.518767 sec (97.5%)
Overhead used : 1.887221 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
advent.2017.day5> (c/quick-bench (solve2-int-array input))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 1.403268 sec
Execution time std-deviation : 18.226029 ms
Execution time lower quantile : 1.381179 sec ( 2.5%)
Execution time upper quantile : 1.419574 sec (97.5%)
Overhead used : 1.887221 ns
advent.2017.day5> (c/quick-bench (solve2-transient input))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 1.306124 sec
Execution time std-deviation : 15.056565 ms
Execution time lower quantile : 1.291127 sec ( 2.5%)
Execution time upper quantile : 1.320360 sec (97.5%)
Overhead used : 1.887221 ns
@orestis https://github.com/vvvvalvalval/advent-of-code-2017/blob/master/src/aoc2017/day05.clj
doing so gives:
advent.2017.day5> (c/quick-bench (solve2-int-array input))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 1.393645 sec
Execution time std-deviation : 10.345934 ms
Execution time lower quantile : 1.381836 sec ( 2.5%)
Execution time upper quantile : 1.404963 sec (97.5%)
Overhead used : 1.887221 ns
Some performance tips: - Vars lookups are slower and harder to optimize than locals lookups - Make sure to type hint primitives and primitive arrays when necessary - Primitive values (ints, bytes, booleans etc.) cannot travel across functions (they get boxed into objects) - Protocols are way faster than multimethods - Looking up a key in a record type is 5 to 10x faster than looking it up in a plain map
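The record-vs-map tip from that list, sketched concretely (Point is a made-up example type; both expressions use the same keyword-lookup API, only the cost differs):

```clojure
(defrecord Point [x y])

(def p-rec (->Point 1 2))   ; record: :x resolves to a plain Java field read
(def p-map {:x 1 :y 2})     ; map: :x walks the persistent-map lookup path

(:x p-rec) ;=> 1
(:x p-map) ;=> 1
```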
length doesn't change so i set that in a let before the loop.. for me it was the comparison not being (int 3) which was slowing things down
@val_waeselynck I modified your version a bit, almost halved the execution time (650ms -> 380ms) - just moved the alength call outside.
@orestis Do you mean that moving the alength call outside improved the perf of my version by 2x?
I think that might be it, yes. I’m still a noob so I’m not really sure what I’m doing there; type hinting is a bit opaque, but I don’t get any reflection warnings so…
I thought I would be able to typehint a single ^int but I get this: https://stackoverflow.com/questions/15230061/cant-type-hint-a-local-with-a-primitive-initializer
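The upshot of that SO answer, sketched (count-up is a made-up name): you can't tag a local with a primitive like ^int; the supported spelling is a coercion such as (int ...) in the loop bindings.

```clojure
(defn count-up []
  ;; (loop [^int idx 0 ...] ...) is rejected by the compiler;
  ;; coerce instead, as in the solutions above:
  (loop [idx (int 0), c (int 0)]
    (if (= idx 3)
      c
      (recur (inc idx) (inc c)))))

(count-up) ;=> 3
```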
advent.2017.day5> (c/quick-bench (bork-int-arr input))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 195.021813 ms
Execution time std-deviation : 516.193459 µs
Execution time lower quantile : 194.408773 ms ( 2.5%)
Execution time upper quantile : 195.624313 ms (97.5%)
Overhead used : 1.887221 ns
(defn solve2-int-array [input]
(let [i (into-array Integer/TYPE input)
len (count input)]
(loop [idx (int 0) c (int 0)]
(if (or (> 0 idx) (<= len idx)) c
(let [v (aget ^ints i idx)]
(if (<= (int 3) v)
(aset-int i idx (dec v))
(aset-int i idx (inc v)))
(recur (+ idx v)
(inc c)))))))
@borkdude’s version takes 1000ms on my machine… Is running this inside a CIDER REPL a bad idea?
@minikomi put this in your code:
(set! *warn-on-reflection* true)
(set! *unchecked-math* :warn-on-boxed)
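What the second flag catches, sketched (the add1 names are made up): untyped arguments make inc operate on boxed Objects, and :warn-on-boxed reports it, exactly like the "Boxed math warning" pasted earlier.

```clojure
(set! *unchecked-math* :warn-on-boxed)

(defn add1 [x]
  (inc x))          ; warns: boxed math, inc of java.lang.Object

(defn add1-prim [^long x]
  (inc x))          ; primitive long math, no warning
```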
Tiny adaptation; I create the array with int-array, and don’t have the reducible-resource stuff; but I do this for all the adaptations.
This is from 2015 btw: https://stackoverflow.com/questions/34153369/why-is-this-clojure-program-working-on-a-mutable-array-so-slow
@borkdude Have a look at https://github.com/orestis/adventofcode/blob/6537d8dda0825871fdd9acad0daf95570d82ef87/clojure/aoc/src/aoc/2017_day5.clj#L95 ; it’s the fastest in this file.
weird.. it's like, make an array of type integer (by the way, it's an integer array!)
BTW, everyone has different inputs so we need a baseline if we are going to optimize this silly thing 🙂
tfw when you get out of bed somewhat earlier for Advent of Code, but CIDER won’t connect to your nrepl anymore and you have to fall back on inf-clojure …
Is there any project that could easily take those different implementations and run proper benchmarks etc? Not sure how warm/cold JIT makes any difference here.
aset:
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 102.748740 ms
aset-int:
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 1.391871 sec
@borkdude Well, the “naive” Clojure version that uses a vector can look really similar to the one that uses int arrays. Still a 10 line function anyway 🙂
Googling aset-int slow, this book seems to concur: https://books.google.co.jp/books?id=4QacPa1vwMUC&pg=PA148&lpg=PA148&dq=%22aset-int%22+slow&source=bl&ots=2AACgvj4qk&sig=ysLjEPVkInOBPUdMj9Wgdgyqrz0&hl=en&sa=X&ved=0ahUKEwitpbCt0_LXAhUDNJQKHYiRDo8Q6AEIQzAE#v=onepage&q=%22aset-int%22%20slow&f=false Seems like something to stay away from in the future then!
Is there a usecase for aset-int if it is slower than aset? Or is it just a legacy function?
@chrisblom very good question
I usually start with reduce, but when I need multiple things as the argument of the reducing function, I give up and fall back on loop
if i macro expand the definitions and clean them up a bit there are only 2 differences: aset has :inline metadata and uses .set instead of .setInt
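A small illustration of that difference (values arbitrary, demo is a made-up name): aset on a hinted primitive array inlines to a direct store, while aset-int goes through the extra, non-inlined .setInt path measured above.

```clojure
(defn demo []
  (let [a (int-array 3)]
    (aset a 0 (int 42))   ; aset: inlined, direct array store
    (aset-int a 1 42)     ; aset-int: same effect, slower wrapper
    (vec a)))

(demo) ;=> [42 42 0]
```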
The nice trick I missed is relying on get to return nil, as opposed to pre-checking the index
Yeah, that's the story. You can write clean Clojure, and then when you want it to go faster, you delve into messiness. (I suppose that's true in any language.)
Well, Scala really is better (not the right word; friendlier/easier, I should say) at this kind of stuff, while still letting you stay functional. I think the type system helps it generate more primitive code when needed.
But I only need mutability like this during Advent of Code really, and then only for extra performance, not for getting the answer, so 🙂
Ahh, right. Does Scala essentially generate code like C++ does when instantiating templates?
(The closest I think I've seen Clojure come to this is how ClojureScript has macro variants of functions that can try to generate optimal code based on what is available at compile time.)
@mfikes Don’t know, I just didn’t have as much trouble with it as in Clojure: https://gist.github.com/borkdude/b37563939639b40013f483361a7c5d8f
An example of how (I suppose essentially anything) can be done with macros is how str in ClojureScript passes the literal strings in something like (str "a" 1 "b") untouched. (I think in Clojure it may still end up invoking (.toString "a") and (.toString "b").)
That may not be the best example, but it illustrates that anything can be done at compile time with macros if you are trying to eke out perf for the generated code.
I'm just saying that if Scala can do nice things with types, Clojure may also be capable of the same if you go down the road of using macros (and if you have access to inferred types).
With the macro approach you can get rid of some calls like (str "foo" "bar"), but will it get you very far? I don’t know.
There was also a talk about this topic at EuroClojure 2017, where the type system was used to generate fast low-level code: http://2017.euroclojure.org/nada-amin/ https://www.youtube.com/watch?v=QuJ-cEvH_oI
I haven't thought about this too much, being on the JavaScript side of things 🙂 In that world, the engine watches the values fly by and infers their types, dynamically generating optimal machine code, and de-opting if it sees a value that violates assumptions, etc.
The end effect is that where in Clojure you can manually hint, in ClojureScript, you need to let the lower layers do it for you.
Yeah, I kind of wondered why you don’t need to add type hints in ClojureScript, so that’s how it works
PureScript solution: https://github.com/krisajenkins/AdventOfCode/blob/master/src/Year2017/Day5.purs
The primary type hints in ClojureScript are ones that help avoid checked if: if the compiler knows it is dealing with a Boolean value, for example, it can just use JavaScript's notion of truthiness.
@mfikes Do you perhaps have a clojurescript version of day 5 which runs on Node, to compare it with the performance of PureScript day 5?
I can take your fastest mutable array version, and produce an optimized Node-compatible JavaScript file for you to use to compare with PureScript.
I don’t know about the absolute fastest, but this is my fastest: https://github.com/borkdude/aoc2017/blob/master/src/day5.clj#L54
Cool. I'm putting together a GitHub repo so it is possible to revise the code if needed and build and run it in Node
@mfikes I only asked if you had this laying around, not if you wanted to put time in it, but if you like to do it, still thanks 🙂
@borkdude For me, ClojureScript / Node.js is running that array-based solution 25% faster than Clojure. Here it is, set up so it is trivial for anyone to reproduce perf results or change the code: https://github.com/mfikes/advent-day-5-cljs-node
Interestingly Planck runs it at about the same speed as the Closure optimized JavaScript under Node.js. (JavaScriptCore is wicked fast.)
Maybe the JVM didn't have enough time to warm up. V8 is probably more optimized than the JVM for quick warmup
@val_waeselynck I specifically have the perf test running over and over again to avoid warm up issues. (Good point, though 🙂 )
@mfikes yeah I noticed but I'm wondering if that's enough, Criterium typically seems to do much more sophisticated things to take both warmup and GC into account :thinking_face:
@val_waeselynck I agree. But we have a huge perf diff right now to sort out 🙂
Yeah, with any perf tests you really really need to provide the code so others can try it as well 🙂
@borkdude so what are your numbers now?
@val_waeselynck same as mike’s repo, ~ 180 ms on my machine in the JVM
My numbers are: Clojure: 167 ms, ClojureScript (Closure / Node.js): 384 ms, Planck: 570 ms
Alright, thanks!
Yes, but was it the same input? 🙂
especially when you haven't compiled it with heavy optimizations
JIT is probably the best strategy for low latency (including the whole write + debug + compile + run process)
@mfikes AFAIK, Scala's generics aren't like C++ templates. Scala's generics are mostly the same as Java's generics - type erasure and boxing. There are conveniences, like implicit parameters to capture class tokens. But they're fundamentally not a code generation mechanism, as C++ templates are.
@bytecrafter Cool. I vaguely (perhaps mis-) recall something in Scala generating code covering the possible permutations of primitives, but that was perhaps 5 years ago for me.
Ahh, "specializing" is what I recall. For example http://www.scala-notes.org/2011/04/specializing-for-primitive-types/
my array implementation performs much worse, so I'm going to take this opportunity to learn some things ....
I optimized for readability, not worrying about perf: https://github.com/mfikes/advent-of-code/blob/master/src/advent_2017/day_05.cljc#L14-L19
Came up with another perf optimization, but didn’t help: https://github.com/borkdude/aoc2017/blob/master/src/day5.clj#L127 Still 232 ms
@bhauman Didn’t see your solution, but aset-int is not so good for performance, we found out today
@bhauman Also:
(set! *warn-on-reflection* true)
(set! *unchecked-math* :warn-on-boxed)
helps
@bhauman According to the docs, I think you probably want (int ...) instead of ^int ... https://clojure.org/reference/java_interop#optimization
@bhauman I'm currently writing a home-made GraphQL engine, so I'm totally in the performance business right now 🙂
it is 🙂 actually not exactly a GraphQL engine, since it's an alternative to GraphQL with some added features.
However, I guess the core server-side algorithm could be reused to implement both GraphQL engines and Om Next routers
@borkdude mine is similar, but i'm not passing the mutable array via recur, runs in 130ms on my machine: https://github.com/ChrisBlom/advent-of-code/blob/master/src/adventofcode/2017/day05.clj#L16-L33
@borkdude What kind of black magic is involved in annotating a local as a primitive ^int but treating it as a Boolean?
I'm referring to the annotation on this line https://github.com/mfikes/advent-day-5/blob/master/src/advent_day_5/core.cljc#L17
@mfikes The bound value is an int, because and returns the int. But if it’s non-true, it won’t be bound at all
(set! *warn-on-reflection* true)
(set! *unchecked-math* :warn-on-boxed)
(if-let [^int x (and true 1)] (inc x) :else) ;; good
(if-let [x (and true 1)] (inc x) :else) ;; bad, boxed
(macroexpand '(if-let [^int x (and true 1)] (inc x) :else))
(let* [temp__4655__auto__ (and true 1)] (if temp__4655__auto__ (clojure.core/let [x temp__4655__auto__] (inc x)) :else))
@mfikes Interesting, I didn't know about that in Scala. Thanks!
@mfikes I thought about that, but since I got the answer, I didn’t worry about it anymore 😉
This is just awesome. Advent of Code in pure Postgres… https://github.com/xocolatl/advent-of-code-2017/blob/master/Day05/script.sql
I was trying to eliminate a checked if in what ClojureScript produces, and got it to run faster, but it also made the Clojure faster (if everything pans out)
Here is the change https://github.com/mfikes/advent-day-5/commit/45aa111e8df606884f5af4ffd886aa2a8af393eb?w=1
It is surprising that ternary < in Clojure botches the perf by an order of magnitude, but not so in ClojureScript. Specifically, trying to do the right thing and testing (< -1 cur-pos length) instead of just (< cur-pos length) makes it run in 900 ms instead of 87 ms.
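A guess at the mechanism (an assumption on my part, not something verified in the thread): clojure.core/< only carries an :inline expansion for its 2-arity, so the 3-arity call goes through the var and boxes its primitive arguments. Splitting it by hand keeps both comparisons primitive:

```clojure
;; in-bounds? is a hypothetical helper, equivalent to (< -1 cur-pos length)
;; but built from two 2-arity comparisons that stay on the fast path.
(defn in-bounds? [^long cur-pos ^long length]
  (and (< -1 cur-pos) (< cur-pos length)))

(in-bounds? 0 5)  ;=> true
(in-bounds? 5 5)  ;=> false
```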
Anyway, I put the 87 ms-version on master https://github.com/mfikes/advent-day-5/blob/master/src/advent_day_5/core.cljc#L8-L27
Well, I could in fact shave off a couple of ms with this trick: https://github.com/borkdude/aoc2017/blob/master/src/day5.clj#L115
but maybe this is no good, I don’t see a real change when applied to your benchmark with dotimes
Run 58 of 60.
"Elapsed time: 102.893639 msecs"
24315397
Run 59 of 60.
"Elapsed time: 90.53835 msecs"
24315397
Run 60 of 60.
"Elapsed time: 100.259889 msecs"
24315397
---
Run 1 of 60.
"Elapsed time: 129.65845 msecs"
24315397
Run 2 of 60.
"Elapsed time: 104.405081 msecs"
24315397
Run 3 of 60.
"Elapsed time: 96.121724 msecs"
Posted it under the C++ solution: https://www.reddit.com/r/adventofcode/comments/7hngbn/2017_day_5_solutions/dqt0s5v/ 😎
In Node, my reduce-based "normal" solution takes 8 seconds, while the optimized version takes 80 ms.
I should post this as an 80-ms JavaScript based solution: https://gist.github.com/mfikes/4ef3d2c3efc8f72a848e9149e1229e84
@chrisblom Removed the maze from the loop arguments, another 4 ms 🙂
@mfikes Could you try this commit? It seems to be 30 ms faster on Node on my machine than the original: https://github.com/mfikes/advent-day-5/commit/3fdda4ba34d144fb795674f51ba81927c9125d1c
to unroll a loop you need to know in advance how many times you will loop, no? and that isn't known until you get there
unless you mean make the loop include several iterations with escape hatches in each?
but i thought loop unrolling was a strategy to evade the loop test which we cannot escape so we always pay the branch penalty
On my machine, branch master:
24315397
Run 57 of 60.
"Elapsed time: 122.000000 msecs"
24315397
Run 58 of 60.
"Elapsed time: 136.000000 msecs"
24315397
Run 59 of 60.
"Elapsed time: 125.000000 msecs"
24315397
Run 60 of 60.
"Elapsed time: 128.000000 msecs"
24315397
Branch faster:
"Elapsed time: 114.000000 msecs"
24315397
Run 58 of 60.
"Elapsed time: 109.000000 msecs"
24315397
Run 59 of 60.
"Elapsed time: 106.000000 msecs"
24315397
Run 60 of 60.
"Elapsed time: 109.000000 msecs"
24315397
Right, something was messed up with the squashed commit I put on master. It was causing an inexplicable regression.
I noticed when compiling some code on the background, severely impacted the performance, so these times also depend on what your machine is doing. For some real benchmarking criterium or something is needed.
@mfikes do you mean this commit is the newest now? https://github.com/mfikes/advent-day-5/commit/45aa111e8df606884f5af4ffd886aa2a8af393eb
Right... the if-some should be cleaned up. It is unnecessary, but AFAICT it doesn't hurt perf.
With let:
"Elapsed time: 132.000000 msecs"
24315397
Run 59 of 60.
"Elapsed time: 129.000000 msecs"
24315397
Run 60 of 60.
"Elapsed time: 138.000000 msecs"
24315397
With if-some:
Run 57 of 60.
"Elapsed time: 111.000000 msecs"
24315397
Run 58 of 60.
"Elapsed time: 106.000000 msecs"
24315397
Run 59 of 60.
"Elapsed time: 111.000000 msecs"
24315397
Run 60 of 60.
"Elapsed time: 106.000000 msecs"
24315397
(Node)
Ahh... indeed, converting it to a let and dropping the else branch makes it slower for me on Node than having an if-some... going to look at the generated JavaScript
@bhauman thanks 🙂 I'm amazed at all of the different ways people approach this, and I'm sure what seems obvious to one seems clever to someone else. I'm learning a lot by reading. Trying to figure out a way to make it all stick as efficiently as possible.
@spfeiffer We discovered that Node gets speedup from adding obsolete code…
@borkdude I did a bunch of tests. It seems to boil down to the fact that an extra nil-check, while not free, pays off because it lets the Node optimizer know that it is doing arithmetic on non-`null` values. (We as humans know that the values pulled out of the array are not nil, but the optimizer doesn't.) In fact, I saw one test run with Lumo where it started off slow, and then "figured it out" on its own, and started running at the faster speed, while in a loop, without me doing anything. I also found that you can get a lesser speedup by changing (aget maze cur-pos) to (int (aget maze cur-pos)). This gets compiled down to JavaScript that involves | 0, which is evidently an ostensibly useless appendage added to JavaScript arithmetic code to "hint" to the optimizer that it is doing integer arithmetic. So, when-some or even a more straightforward (when-not (nil? ...) ...), while not changing the functionality, ends up acting like a hint, or an assertion that the optimizer can use to its advantage.
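The two hints described above, side by side as a ClojureScript sketch (step is a made-up name; maze is assumed to be a JS array of numbers; in CLJS an out-of-bounds aget yields nil rather than throwing):

```clojure
(defn step [maze pos]
  ;; when-some compiles to a != null guard, which doubles as a
  ;; non-null hint for the JavaScript engine's optimizer
  (when-some [v (aget maze pos)]
    ;; (int ...) emits `v | 0` in the generated JS, hinting
    ;; to the engine that this is integer arithmetic
    (+ pos (int v))))
```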
One interesting question in my mind: In the early days, types were used for performance. Will we reach a point where types are inferred at runtime sufficiently well so that the performance of such code behaves just as well as statically typed code. Or, to put it another way, with enough money being dumped into JavaScript VMs, will ClojureScript end up running faster than Clojure one day. It can already do this at times for unhinted Clojure, but I'm wondering if it will all get to be so good where you can just write your dynamic unhinted Lispy code and it will run extremely fast.
PureScript probably won’t get much performance benefit from typing because of its target?
An analogy: I used to find it odd that you didn't pass flags like -O3 to the javac compiler. It was difficult to just "let go" of the idea of optimizing at compile time, and defer it all to runtime. One argument is that if you let the runtime do the optimization, it can leverage the call and memory access patterns in your program, and perhaps do a better job than a static analyzer could. If static type info is available, then, sure, it can be useful. I'm just wondering if we as human developers will be in the low-level business of type hinting a decade or two from now.
@mfikes I'd argue we're already there with pypy
I've seen entire transducer pipelines compile down to half a dozen assembly instructions.
if a jit finally believes you that they are ints, wouldn't it believe you if you said they were ints in the first place?
@mfikes what you describe is pretty much the premise of tracing JITs. https://en.wikipedia.org/wiki/Tracing_just-in-time_compilation
@dpsutton hints can also be used for storage optimization. In that case they're a bit stronger than assertions.
But yeah, hints are a way of papering over the inability of a JIT to deoptimize.
JS and PyPy do deoptimization. Put only ints in a list and the JIT will create a typed array of ints. Once you try to put an object in the array it will convert the list into an list of objects and then add the object to it.
So the question becomes, why should you ever have to hint at all? In Clojure it's because the JVM doesn't allow user-level deoptimization.
And the JVM is a method JIT, which doesn't help either
Well with types and something like C# or C++ deoptimization can never occur, since all the types are known at compile time.
On the JVM it's a bit strange, but it will deoptimize a callsite. So as you implement IFoo on more and more types the JVM may recompile every method that calls IFoo.foo.
That's why adding stuff like tuples to Clojure doesn't work since the perf improvement is a wash compared to the de-optimized callsites.
What tracing jits like PyPy do, is look at the context for a loop. Within a single loop they inline all methods and reduce all types to their lowest level parts. So as long as you never call a method with a map within that single loop, the code for handling that map will never exist in the loop.
Right, on the JVM transducers are always boxed. On JS the JIT is a bit more tracing-like, so you get primitive math loops.
Do a transduce over an array in CLJ and CLJS and the CLJS version will be much faster if all you're doing is math.
For the day 5 problem, it seems that it reads and writes rapidly to lots of parts of a big array. I wonder if memory bandwidth is affecting things more than any of the math / logic.
@mfikes in what language? and what access pattern?
I've seen 4x perf improvement from making sure I scanned an array from 0 to X instead of walking it randomly.
Pretty satisfied that we got it down to 75 ms, the same number the C++ and Rust guys are mentioning on Reddit 🙂
I wonder how much using unchecked math would improve that?
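For reference, the unchecked variants look like this (bump is a hypothetical helper, assuming, as in this puzzle, that offsets can never overflow a long):

```clojure
;; The day-5 part-2 update rule with overflow checks elided:
(defn bump ^long [^long v]
  (if (>= v 3)
    (unchecked-dec v)   ; plain machine decrement, no overflow check
    (unchecked-inc v)))

(bump 3) ;=> 2
(bump 2) ;=> 3
```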
ah, cool, I missed that post
@tbaldridge Mike discovered that if-let generates less performant code than just plain if and let…
The array is only about 1000 integers. We read from a spot which tells us where to jump to next, and we write to the spot we just read from with a new offset derived from the old one. Repeat until we hop out of the array.
Oh, I'd summarize it this way: By guarding a block of code with a nil check, that code runs faster under Node, even though we know (as humans) that none of the values are nil.
we went from if-let to if-some and then to let… if-some turned out to be beneficial on Node… it’s hard to optimize for both worlds
yeah, I doubt memory perf is the issue then. As 1000 integers at 8 bytes that should easily fit in the cache.
while the if-some was an obsolete check, the Node runtime still benefited from that, crazy
Don't we already have a solution that is around the same as one of the C++ ones out there?
@mfikes Yes, I posted it 5 minutes ago: (quick-bench (part-2-array-faster?)) ;; 74 ms, or did you mean something else?
hey @mfikes you had a twitter thing about emitting machine code from JS right?
got the machine code for this?
Yes, you can enable JavaScriptCore's logging of compilation. But it is very hard to actually catch the code you'd like. (It spews stuff faster than you can consume it.) I'll try and see 🙂
Here is the tweet @tbaldridge referred to https://twitter.com/mfikes/status/935527618274873344
In short, if you run
JSC_dumpDisassembly=true planck src/advent_day_5/core.cljc
in https://github.com/mfikes/advent-day-5, you will see lots of disassembly.
Perhaps this is some of the assembly JavaScriptCore is creating for our loop, but it is hard for me to tell: https://gist.github.com/mfikes/5c68510b6ab2207b85cc92697a005a3e
It's remarkable we are able to produce pretty much the same result with some carefully crafted Clojure.
FWIW I tried this one: https://github.com/vesche/adventofcode-2017/blob/master/day05.c
I can reproduce the 57 ms. I also tried inlining the input data array (instead of having the program read it from disk), but it still runs at the same speed—dominated by the loop.
an extra branch (simple boolean/int check in C) is more costly on the lower level than just doing the same loop all the time?
Talking about this one:
while (current <= 1036 && current >= 0) {
move = numbers[current];
/* if (move == 0) { */
/* numbers[current] = 2; */
/* steps += 2; */
/* current += 1; */
/* } */
/* else */
{
if (move >= 3)
numbers[current] -= 1;
else
numbers[current] += 1;
current += move;
steps += 1;
}
}
With the code uncommented, it becomes slower, not faster. I also tried moving the >= 3 branch to the first place, but that becomes even slower
branch prediction at work most likely
easier to calculate something than do a conditional