#clojure-uk
2020-04-04
Gulli11:04:15

Good afternoon

folcon11:04:58

Hmm, does anyone else feel like getting a better handle on the memory performance of your application is a bit arcane?

Wes Hall12:04:06

@folcon on the JVM (can't speak too much about node or browser), arcane is something of an understatement 🙂

folcon12:04:09

For example: Just did this in datascript on the jvm…

(time (repeatedly 100 #(d/transact! conn (repeatedly 10000 create-household))))
"Elapsed time: 0.196 msecs"
That was fast for creating 6 million entities…
(time (repeatedly 100 #(do (println :transacting)
                           (d/transact! conn (repeatedly 10000 create-household)))))
"Elapsed time: 0.331 msecs"
:transacting
:transacting
:transacting
:transacting
Ah, it’s being lazy
(time (dotimes [_ 100]
        (d/transact! conn (repeatedly 10000 create-household))))
"Elapsed time: 1019076.203 msecs"
Now this feels too slow? Am I holding onto something which is preventing it from garbage collecting? And so it goes…
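
A minimal sketch of the laziness pitfall in the first two timings, using a hypothetical `fake-transact!` counter in place of `d/transact!`. `repeatedly` returns a lazy sequence, so `time` only measures building it; the REPL realizes it afterwards while printing the result, which is why `:transacting` shows up after the elapsed time. Wrapping the sequence in `dorun`, or using `dotimes`, forces the work to happen inside the timed form.

```clojure
;; `fake-transact!` is a hypothetical stand-in for d/transact!;
;; it only counts how many times it has been called.
(def calls (atom 0))

(defn fake-transact! []
  (swap! calls inc))

;; Building the lazy seq runs nothing yet.
(def pending (repeatedly 100 fake-transact!))
(def calls-before-realizing @calls)

;; dorun walks the seq and forces every call.
(dorun pending)
(def calls-after-dorun @calls)

;; dotimes is eager, so the side effects run immediately.
(dotimes [_ 100] (fake-transact!))
(def calls-after-dotimes @calls)
```

So `(time (dorun (repeatedly 100 ...)))` and `(time (dotimes [_ 100] ...))` should both report the real cost.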

folcon12:04:49

Though datascript itself works quite nicely in the jvm I must say =)…

Wes Hall12:04:29

1 "saved household" per millisecond

Wes Hall12:04:15

It's not cry worthy perhaps, depending on hardware etc. I mean, I would expect maybe a little more but...

folcon12:04:20

Hmm, I’m wondering if the garbage collector needs to be set up in some way that I’m not aware of… I thought the whole point of the current setup was that short-lived objects got collected aggressively? Instead everything works totally fine right up until the memory taken up by my application hits ~4.3GB and then I get a huge GC pause…
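
One rough way to watch what the collector is doing from the REPL is the JDK's standard management beans. This is only a sketch: `gc-stats` and `heap-used-mb` are hypothetical helper names, and the collector names reported depend on the JVM and flags in use.

```clojure
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))

;; One map per active collector: its name, how many collections it has
;; run, and the accumulated collection time in milliseconds.
(defn gc-stats []
  (mapv (fn [^GarbageCollectorMXBean gc]
          {:name        (.getName gc)
           :collections (.getCollectionCount gc)
           :total-ms    (.getCollectionTime gc)})
        (ManagementFactory/getGarbageCollectorMXBeans)))

;; Heap currently in use, in whole megabytes.
(defn heap-used-mb []
  (-> (ManagementFactory/getMemoryMXBean)
      .getHeapMemoryUsage
      .getUsed
      (quot (* 1024 1024))))
```

Calling `(gc-stats)` and `(heap-used-mb)` before and after the `dotimes` run gives a first idea of whether the time is going into collections.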

folcon12:04:35

Yea, that too…

Wes Hall12:04:36

I'm not up to date with the GC stuff anymore. There used to be server and client flags which would change some of the settings at a high level, but that might be ancient knowledge now. My general experience is that it is very hard to twiddle those controls to achieve anything too reliable. GC these days is massively complicated. It is (or was) quite generational, yes, so short-lived objects shouldn't give it too much headache. I seem to remember that it can even put them on the stack depending on all the compiler magic.... but yes, arcane for sure 😉

Wes Hall12:04:12

I need to run out, wife wants to brave the shops. Time to put my best zombie apocalypse outfit on, wish me luck, bbs.

folcon12:04:59

Good luck indeed!

folcon14:04:27

Ok, Shenandoah is doing pretty well…
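
For anyone wanting to try the same thing, a hedged sketch of a `deps.edn` alias that selects Shenandoah. The alias name `:shenandoah` is made up here, and the setup is an assumption rather than what folcon actually ran: the JDK build has to include Shenandoah, the unlock flag is only needed on builds where it is still experimental, and `-Xlog:gc` needs JDK 9 or later.

```clojure
;; deps.edn fragment; the alias name is hypothetical
{:aliases
 {:shenandoah
  {:jvm-opts ["-XX:+UnlockExperimentalVMOptions" ; only needed while Shenandoah is experimental
              "-XX:+UseShenandoahGC"             ; select the Shenandoah collector
              "-Xlog:gc"]}}}                     ; basic GC logging to stdout
```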

yogidevbear15:04:33

Morning 😄

😁 4
dharrigan16:04:43

So, a question about transducers. I have the following transducer configuration

dharrigan16:04:46

(transduce (comp (partition-all 1000)
                 (map #(p/lookup-addresses %))
                 (map #(p/process-addresses %)))
           (constantly nil)
           nil
           addresses)

dharrigan16:04:14

p/lookup-addresses returns a list containing clojure maps of address details

dharrigan16:04:13

I would have thought that in the next step, p/process-addresses %, each individual item of the previous step (each item in the list) would be passed to p/process-addresses to be processed. But what happens is that the entire list from the previous step is passed in.

dharrigan16:04:13

What have I failed to understand (since I'm basing this on my understanding of map which takes each item in a list in turn...)?

Ben Hammond21:04:58

so that initial (partition-all 1000) ensures that the subsequent p/lookup-addresses gets passed a collection of size 1000

Ben Hammond21:04:44

and then p/process-addresses gets a collection of however many items were emitted from p/lookup-addresses
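
A small sketch of that behaviour, with made-up stand-ins for the two `p/` functions (a batch of 3 instead of 1000, `inc` instead of a lookup, and an atom recording what the last step receives). Each `map` step is called once per value flowing through the pipeline, and after `partition-all` that value is a whole batch.

```clojure
;; Records every argument the final step is called with.
(def seen (atom []))

(transduce (comp (partition-all 3)
                 ;; stand-in for p/lookup-addresses: batch in, collection out
                 (map (fn [batch] (mapv inc batch)))
                 ;; stand-in for p/process-addresses: note what arrives
                 (map (fn [x] (swap! seen conj x) x)))
           (constantly nil)
           nil
           (range 7))

@seen
;; => [[1 2 3] [4 5 6] [7]]
```

The last step is called three times, once per batch, not seven times.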

Ben Hammond21:04:16

I'd be a bit suspicious of (constantly nil) as a reducing function though

dharrigan22:04:55

How might that look if you don't mind?

Ben Hammond22:04:07

(run! (comp p/process-addresses p/lookup-addresses) addresses)
?

Ben Hammond22:04:31

perhaps I don't understand why the partition was in there

dharrigan22:04:11

Reading from a db that can contain a few thousand results

dharrigan22:04:31

each chunk has to be sent off to an API that can only handle up to 1000 at a time

dharrigan22:04:52

anythooo. time for me to catch some kip, I'll pick this up again tomorrow 🙂

👍 4
Ben Hammond22:04:03

(dorun
  (sequence (comp (partition-all 1000)
                  (map #(p/lookup-addresses %))
                  (map #(p/process-addresses %)))
            addresses))
would keep you in transducer-land

Ben Hammond22:04:16

you could use

(mapcat #(p/lookup-addresses %))
if you need to flatten out the 1000-element collection before it gets passed to p/process-addresses
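
A toy sketch of the difference, again with stand-ins rather than the real `p/` functions: with `mapcat` in the lookup position, each element of the returned collection is passed on individually, so the following `map` sees single items.

```clojure
(def flattened
  (into []
        (comp (partition-all 3)
              ;; stand-in lookup: returns a collection per batch,
              ;; which mapcat splices into the pipeline item by item
              (mapcat (fn [batch] (map #(* 10 %) batch)))
              ;; stand-in processing step: now called per item
              (map inc))
        (range 7)))

flattened
;; => [1 11 21 31 41 51 61]
```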

rickmoynihan23:04:14

why use transducers, it looks like you only require eagerness for the effects. Isn’t this much clearer?

(doseq [batch (partition-all 1000 addresses)
        response (lookup-addresses batch)]
  (process-addresses response))
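
A runnable sketch of the same shape, with hypothetical stand-ins for the API call and the processing step so that the batching is visible. Here `lookup-addresses` just wraps each id in a map and records the batch size, and `process-addresses` collects ids into an atom.

```clojure
(def batch-sizes (atom []))
(def processed (atom []))

;; Stand-in for the API call: takes a batch of at most 1000 ids.
(defn lookup-addresses [batch]
  (swap! batch-sizes conj (count batch))
  (map (fn [id] {:id id}) batch))

;; Stand-in for the per-address processing step.
(defn process-addresses [response]
  (swap! processed conj (:id response)))

;; The second binding iterates over each batch's responses,
;; so the body runs once per individual address.
(doseq [batch (partition-all 1000 (range 2500))
        response (lookup-addresses batch)]
  (process-addresses response))

@batch-sizes       ;; => [1000 1000 500]
(count @processed) ;; => 2500
```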

dharrigan07:04:06

Thanks all for the input. I'm going to digest everything!

alexlynham08:04:29

I think I would do it how rick suggests, fwiw

folcon17:04:48

Do you have a generic version of this @dharrigan?

dharrigan17:04:43

Not easily, I'm reading from an API...I would have to spend some time to fake it all out

folcon17:04:40

I was more thinking of the shape? Thinking about it, it sounds worthwhile to work out how to walk a complex nested thing with transducers…

Ben Hammond21:04:11

you create lots of small functions to process each individual level of nesting

Ben Hammond21:04:21

and then you compose them
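
As a hedged illustration of that idea, one transducer per level of nesting, composed with `comp`. The data and the names `each-user` and `username` are made up for the example.

```clojure
(def accounts
  [{:users [{:username "Mike"} {:username "Ann"}]}
   {:users [{:username "Bo"}]}])

;; One small transducer per level of nesting.
(def each-user (mapcat :users))   ; account -> its users, one at a time
(def username  (map :username))   ; user -> username

(def usernames
  (into [] (comp each-user username) accounts))

usernames
;; => ["Mike" "Ann" "Bo"]
```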

Ben Hammond21:04:33

but when it is a complex nested structure, what benefit are you hoping to gain by using transducers right the way through?

Ben Hammond21:04:52

transducers are great when the data is very big

Ben Hammond21:04:13

or when you want to be able to reuse code in different situations

Ben Hammond21:04:20

but if it is a complex data structure, then once you process a few levels into the data, those two things (often) become less important

folcon22:04:21

That’s true, but usually I’m not thinking of processing multiple levels of nesting, just 1 level.

folcon22:04:08

There’s a lot of:

[{:users [{:username "Mike"}]}]
Or in my case things like:
{:entities {1 {:id 1 :allies [2 4]} 2 {:id 2 :allies [1]} ...}}
Where transducing to update :entities
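
One possible shape for that single-level case, as a sketch only: `update-entities` is a hypothetical helper that rebuilds the `:entities` map through whatever transducer it is given, and the data is a two-entity version of the example above.

```clojure
(def world
  {:entities {1 {:id 1 :allies [2 4]}
              2 {:id 2 :allies [1]}}})

;; Hypothetical helper: run the [id entity] pairs of :entities through
;; a transducer and pour the results back into a map.
(defn update-entities [world xf]
  (update world :entities #(into {} xf %)))

(def with-ally-counts
  (update-entities
    world
    (map (fn [[id entity]]
           [id (assoc entity :ally-count (count (:allies entity)))]))))

(get-in with-ally-counts [:entities 1])
;; => {:id 1, :allies [2 4], :ally-count 2}
```

Because the helper takes any transducer, the same function also covers filtering entities out or composing several steps with `comp`.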

folcon17:04:54

@dharrigan The only other thing I can suggest is to pick up either specter or xforms

folcon17:04:42

I’ve used specter a bit, but it does tend to permeate one’s thinking when using it, whereas xforms is one I’m less familiar with but feels like it would be a good fit =)…

dharrigan17:04:08

Thank you. Investigating.

folcon19:04:57

Let me know if you decide to go with xforms and figure something out =)… I’d be interested because, as I’ve mentioned, I’m less familiar with it 😃..