#clojure-dev
2019-10-02
andy.fingerhut20:10:59

Performance benchmarks published for the latest core.rrb-vector library, with comparisons to the same set of JVM vector/list libraries as published in the bifurcan set a couple of years ago (and run using the same benchmarking code used by bifurcan): https://github.com/clojure/core.rrb-vector/blob/master/doc/benchmarks/benchmarks.md . There is definitely room for improvement in core.rrb-vector, as is pretty evident there. I've been focusing on making it correct before making it fast.

seancorfield22:10:48

@andy.fingerhut Do you have a sense of how many people are using core.rrb-vector in production code? Are there specific use cases where it makes a dramatic difference?

seancorfield22:10:15

(if I should just study the readme and its references, LMK)

andy.fingerhut22:10:25

The README could be better in terms of suggesting use cases than it is now. The original research papers which describe the data structure suggest one: map-reduce kinds of things on a big memory multi-core machine where one step divides up the vector into chunks, operates on each chunk in parallel, and the results of each of those could vary significantly from the input chunk sizes. Now you want to do a second map-reduce step, but divide up the intermediate results into equal size pieces very quickly, even though the first step output chunks vary significantly in size.

andy.fingerhut22:10:52

In general, though, whatever a vector is good for, but with faster-than-linear time concatenation and subvec operations.
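For readers who have not used the library, here is a minimal sketch of the two operations mentioned above. It assumes `org.clojure/core.rrb-vector` is on the classpath; the vector sizes and the names `left`, `right`, `joined`, and `middle` are made up for illustration.

```clojure
(require '[clojure.core.rrb-vector :as fv])

;; Two ordinary persistent vectors of very different sizes (illustrative data).
(def left  (vec (range 0 100000)))
(def right (vec (range 100000 100010)))

;; fv/catvec concatenates vectors without copying every element,
;; unlike (into left right), which is linear in the size of `right`.
(def joined (fv/catvec left right))

;; fv/subvec returns a slice that is itself an RRB vector, so it can be
;; concatenated or sliced again just as cheaply.
(def middle (fv/subvec joined 99995 100005))

(count joined) ;; => 100010
(count middle) ;; => 10
```

The results still behave like regular Clojure vectors (`nth`, `conj`, `count`, and so on), which is why the library can stand in wherever a vector is already used.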

andy.fingerhut22:10:43

My current fascination with them is based more upon understanding the data structure than having an application ready at hand that I want it for.

andy.fingerhut22:10:47

I know that the fipp pretty printer uses core.rrb-vector, and one of the recently fixed bugs came from a fipp user. I doubt the library had heavy production use, given the number of bugs that caused it to throw exceptions or return incorrect results.

Alex Miller (Clojure team)22:10:57

I seem to pretty regularly run into people using it transitively through various things, more often than I would expect

Alex Miller (Clojure team)22:10:16

I have not tried to catalog those cases so can't really say any specifics

andy.fingerhut22:10:31

I would guess that fipp, through some pretty-printing plugins for Leiningen and/or other REPLs, might be a noticeable fraction of those.

seancorfield22:10:41

Interesting. Thank you!

andy.fingerhut22:10:38

I know that cljs-test-runner depends transitively on core.rrb-vector, I think through fipp, because I can't use cljs-test-runner for core.rrb-vector itself unless cljs-test-runner is changed to use shading of the libraries it needs.