Fork me on GitHub
#code-reviews
<
2017-01-18
>
poooogles14:01:53

Not sure if this is allowed, but any pointers on http://codereview.stackexchange.com/questions/152944/clojure-flatten-a-nested-map would be great 🙂

sveri15:01:34

@poooogles Does the input map always have the same structure?

sveri15:01:49

or can it be arbitrarily recursive?

sveri15:01:56

Running that function from SO with the provided example with criterium gives an execution mean time of 2.947716 µs

poooogles15:01:27

@sveri Arbitrary I'm afraid. I've got some criterium tests on the real structures which are much bigger that are nowhere near as fast.

poooogles15:01:45

FAIL in (test-normalise-performance) (idbevt_test.clj:51)
Check performance of our normalisation.
expected: (< (first (:sample-mean result)) target-time)
  actual: (not (< 5.225852301972124E-4 2.5E-5))

poooogles16:01:29

x86_64 Mac OS X 10.9.5 4 cpu(s)
Java HotSpot(TM) 64-Bit Server VM 25.51-b03
Runtime arguments: -Dfile.encoding=UTF-8 -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -XX:-OmitStackTraceInFastThrow -Dclojure.compile.path=/Users/spegler/Documents/Clojure/clj-utils/target/base+system+user+dev/classes -Dclj-utils.version=0.3.3 -Dclojure.debug=false
Evaluation count : 91320 in 60 samples of 1522 calls.
      Execution time sample mean : 810.484728 µs
             Execution time mean : 811.323195 µs
Execution time sample std-deviation : 132.183916 µs
    Execution time std-deviation : 134.245365 µs
   Execution time lower quantile : 628.892823 µs ( 2.5%)
   Execution time upper quantile : 1.069945 ms (97.5%)
                   Overhead used : 12.416037 ns

poooogles16:01:03

is with the "real" data, I just posted the most simple solution I could as I assumed optimisations would scale.

sveri16:01:07

My first assumption would be that its the recursion that causes this. Would be interesting to find out how deeply nested they are in the most common case and setup test cases for these

sveri16:01:19

You will need them later anyway for comparison if you improve the performance

poooogles16:01:13

I'll see if I can post a gist with some better examples.

sveri19:01:12

@poooogles Hm, this brings me to 21,695200 µs which is an order of magnitude more than before but still an order less than your ones.

sveri20:01:25

Apart from that, is there maybe a way to tackle this by redesigning the data? Maybe if you know the structure of the data in advance you could pick out the keys?

sveri20:01:56

Or when you build the map, build it in a different way? Or keep track of the map building process and cache the paths to the keys you are interested in?