#clojure
2020-06-26
potetm00:06:29

Please, for the love of all that is holy, if something is runtime dynamic, pass it around.

☝️ 6
jumar04:06:31

I was profiling my clojure app with JFR and noticed that most of the allocations are for java.lang.reflect.Method. Is that something you would consider unusual? It says that a bunch of them are coming from clojure.data.csv.write-csv* - line 125:

(defn- write-csv*
  [^Writer writer records sep quote quote? ^String newline]
  (loop [records records]
    (when-first [record records]
      (write-record writer record sep quote quote?)
      (.write writer newline)
      (recur (next records))))) ; this is line 125
How can I find what's happening here and is there something I could do to "fix" it?

seancorfield04:06:38

@jumar As a first step, I'd add (set! *warn-on-reflection* true) to each and every file in your project, just after the ns form, and see what reflection warnings pop up when you try to run the code.

jumar04:06:30

I tried that in the repl and then eval data.csv ns but I guess that's too naive. I'll try to call the function

kwladyka10:06:46

https://clojure.wladyka.eu/posts/how-to-improve-algorithm-performance/#avoid-reflections

(set! *warn-on-reflection* true)
(require 'main-namespace :reload-all)
you can also try this

kwladyka10:06:23

hmm I am curious if there is some deps.edn setup which I can use to check whether I have any reflection in the system and fail the build

kwladyka10:06:29

I have to take a look at that

seancorfield04:06:23

(I don't see anything obvious in your code fragment but I suspect line numbers are not always reported entirely accurately)

hiredman04:06:33

I haven't used flight recorder, but I don't think you are interpreting what it is showing you correctly

hiredman04:06:08

The stacktrace there doesn't contain any Java reflection, therefore it is not a place where method objects are being allocated for reflection

hiredman04:06:01

Which also explains why you aren't getting a reflection warning when compiling that code

jumar04:06:23

Interestingly, I got this, BUT only when I tried to instrument the write-csv* function with Cider debugger:

Reflection warning, /Users/jumar/.m2/repository/org/clojure/data.csv/1.0.0/data.csv-1.0.0.jar:clojure/data/csv.clj:125:7 - call to method write on java.io.Writer can't be resolved (argument types: unknown).

jumar04:06:13

But yeah, a standard invocation of write-csv looks good

hiredman04:06:17

Ah, interesting

seancorfield04:06:09

.write on a Writer with a String argument seems unambiguous -- am I missing something?

hiredman04:06:34

That is why it is interesting, doesn't seem possible

hiredman04:06:43

I bet the debugger is stripping metadata

jumar04:06:46

It might be false positive caused by the debugger instrumentation?
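
One way to test the "stripped metadata" theory at the REPL is sketched below. It is only a sketch: `reflection-warnings` is a hypothetical helper, not part of data.csv or Cider. It compiles a quoted form with `*warn-on-reflection*` bound to true and captures whatever the compiler prints to `*err*`. The two forms mimic the `.write` call in `write-csv*`, once with both hints and once with the `^String` hint removed, on the assumption that this is roughly what an instrumenting rewrite might do.

```clojure
(defn reflection-warnings
  "Hypothetical helper: compiles `form` with reflection warnings enabled and
  returns whatever the compiler printed to *err* while doing so."
  [form]
  (let [buffer (java.io.StringWriter.)]
    (binding [*warn-on-reflection* true
              *err* (java.io.PrintWriter. buffer true)]
      (eval form))
    (str buffer)))

;; Both hints present, as in the data.csv source: the call resolves statically.
(def hinted
  (reflection-warnings
   '(fn [^java.io.Writer writer ^String newline]
      (.write writer newline))))

;; Same form with the ^String hint dropped: Writer has several one-argument
;; write overloads, so the compiler cannot pick one and falls back to reflection.
(def stripped
  (reflection-warnings
   '(fn [^java.io.Writer writer newline]
      (.write writer newline))))

(println "hinted:  " (pr-str hinted))
(println "stripped:" (pr-str stripped))
```

If the second string contains a warning of the same shape as the one quoted above (method write on java.io.Writer, argument types unknown), that is consistent with the hint on `newline` being lost rather than with real reflection in data.csv.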

jumar04:06:06

I'll try to manually call the real function which under the hood uses clojure.data.csv (it's a bit harder); for now I just tried to call write-csv by hand. Maybe the stacktrace from JFR is indeed misleading. I'm just curious where those 195 GB come from 🙂

hiredman04:06:49

If you set warn on reflection in all your namespaces you will get a warning printed out anytime the compiler generates a reflective call

jumar04:06:01

Isn't it enough to just set in the repl or in the "main" ns?

seancorfield04:06:13

No, it's per-file.

hiredman04:06:23

It is complicated

hiredman04:06:06

Setting it in your main namespace is usually too late (everything is already compiled by the time the set! is run)

jumar04:06:57

Ok, that will be a bit of work 🙂

hiredman04:06:08

Setting it in the repl can work, but you have to be really aware of when you are loading code and set it before you load anything

jumar04:06:33

Understood. Maybe lein check could help with that too?

hiredman04:06:39

Yes definitely, I forget about lein check

👍 3
seancorfield04:06:19

Ah, lein, a tool from the before times... 🙂

😎 9
deactivateduser05:06:08

There’s also https://github.com/athos/clj-check for those using tools.deps.

borkdude08:06:27

why is there an API to change the root value of a var, but not to only inspect it without interop: (.getRawRoot ...)?

restenb10:06:07

anybody here that has used Electron framework to bundle a Clojure/ClojureScript webapp?

walterl10:06:46

I haven't, but I hope you'll share whatever you find 🙂

phronmophobic16:06:26

what's the question behind the question?

restenb10:06:16

as a desktop application

kwladyka10:06:43

me a few years ago, so not sure if I will be able to help

dominicm17:06:15

Where can I remind myself of the details of false being true in clojure if something uses the "wrong" java.lang.Boolean methods?

😮 3
dpsutton17:06:19

public Object eval() {
    Object t = testExpr.eval();
    if(t != null && t != Boolean.FALSE)
        return thenExpr.eval();
    return elseExpr.eval();
}

dpsutton17:06:39

in Compiler.java, line 2710, the comparison to Boolean.FALSE occurs

dominicm17:06:37

Does java share this behavior? I feel like it doesn't.

noisesmith17:06:51

no, clojure uses a shortcut where if is just an identity check

noisesmith17:06:00

java actually compares values

dominicm17:06:36

So clojure is faster than java! 😃 jk

noisesmith17:06:45

user=> (boolean? (Boolean. "TRUE"))
true
user=> (= true (Boolean. "TRUE"))
true
user=> (boolean? (Boolean. "FALSE"))
true
user=> (= false (Boolean. "FALSE"))
true
user=> (if (Boolean. "FALSE") :yes :no)
:yes

hiredman17:06:57

the compiler eval method is basically never called. the eval methods in the compiler are kind of vestigial: they are called in a very limited set of cases where running code in the repl can skip generating bytecode. the majority of code, in the repl and otherwise, is compiled to bytecode, which doesn't use the eval methods in the compiler

dominicm17:06:11

I think I need to learn how Java serialization (java.io.Serializable) works, because I'm pretty sure that a round trip through that with a Boolean caused me grief

dpsutton17:06:57

@hiredman meaning the example i posted is never called? public static class IfExpr implements Expr, MaybePrimitiveExpr{?

dpsutton17:06:24

oh you mean the eval method on that class. the doEmit is used though?

hiredman17:06:53

it may be called for some limited number of trivial expressions when evaluating code in the repl, but for the most part the eval methods are vestigial

dpsutton17:06:15

ok. so the check is actually

gen.getStatic(BOOLEAN_OBJECT_TYPE, "FALSE", BOOLEAN_OBJECT_TYPE);
gen.visitJumpInsn(IF_ACMPEQ, falseLabel);
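
Since the emitted check is a reference comparison against `Boolean.FALSE`, the difference between a canonical and a freshly allocated `Boolean` can be seen directly with `identical?`. A small sketch; the var names `canonical` and `fresh` are just for illustration, and the `Boolean.` constructor is deprecated in newer JDKs:

```clojure
;; Boolean/valueOf hands back the canonical Boolean/FALSE instance.
(def canonical (Boolean/valueOf "false"))

;; The constructor allocates a distinct object that holds the same value.
(def fresh (Boolean. "false"))

(println (= canonical fresh))                   ; true, equal by value
(println (identical? Boolean/FALSE canonical))  ; true, same object
(println (identical? Boolean/FALSE fresh))      ; false, different object
(println (if canonical :truthy :falsey))        ; :falsey
(println (if fresh :truthy :falsey))            ; :truthy
```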

andy.fingerhut18:06:56

Here is one place with some details and background where this is documented: http://clojuredocs.org/clojure.core/if

andy.fingerhut18:06:28

Feel free to edit to add more details for future reference to folks interested in the implementation details, if you wish.

dominicm19:06:51

user=> (let [fos (java.io.FileOutputStream. "/tmp/t") oos (java.io.ObjectOutputStream. fos)] (.writeObject oos false) (.close oos))
nil
user=> (let [fis (java.io.FileInputStream. "/tmp/t") ois (java.io.ObjectInputStream. fis)] (def x? (.readObject ois)) (.close ois))
nil
user=> x?
false
user=> (if x? :was_true :was_false)
:was_true
This is indirectly what ruined my day :) Except I didn't know what serialization quartz was using.

andy.fingerhut19:06:29

serialization libraries seem to be a semi-common source of introducing freshly constructed Boolean objects.
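
When the deserialization happens inside a library, one possible workaround is to normalize the value at the boundary with `clojure.core/boolean`, which returns the canonical `true` or `false`. A minimal sketch, assuming the value comes back through plain Java serialization as in the session above; `round-trip` is a hypothetical helper and works in memory instead of through /tmp/t:

```clojure
(import '(java.io ByteArrayInputStream ByteArrayOutputStream
                  ObjectInputStream ObjectOutputStream))

(defn round-trip
  "Hypothetical helper: serializes `x` with Java serialization and reads it
  back, entirely in memory."
  [x]
  (let [bytes-out (ByteArrayOutputStream.)]
    (with-open [oos (ObjectOutputStream. bytes-out)]
      (.writeObject oos x))
    (with-open [ois (ObjectInputStream.
                     (ByteArrayInputStream. (.toByteArray bytes-out)))]
      (.readObject ois))))

(def restored (round-trip false))

;; The restored object is a new Boolean, so `if` treats it as truthy.
(println (if restored :was_true :was_false))           ; :was_true

;; `boolean` unwraps the value and yields the canonical false again.
(println (if (boolean restored) :was_true :was_false)) ; :was_false
```

This only helps where you control the point at which values re-enter Clojure code; booleans nested inside larger deserialized structures would need the same treatment applied to each of them.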

dominicm19:06:18

I've been spoiled by json/edn

Kazuki Yokoyama19:06:22

In "The Joy of Clojure" 2nd edition, there is an interesting discussion on the perils of using Java's Boolean class in section 3.1.2 "Don't create Boolean objects"

Kazuki Yokoyama20:06:46

As noted in the just mentioned section, the right way to parse a boolean value is this:

(if (Boolean/valueOf "false")
  :truthy
  :falsey)
;=> :falsey

dominicm20:06:24

Yup. That's easier when you're not removed from that decision by libraries.

dominicm20:06:57

In java, this doesn't matter. So it makes interop difficult.

viesti19:06:12

Hmm, this https://twitter.com/practical_li/status/1276481899045797889 got me thinking: are there bash/zsh completions that would fill in the aliases for the Clojure CLI (like clj -A:re<TAB> -> clj -A:rebel)?

dominicm20:06:19

Does anyone have a durable, distributed, job scheduler that runs in the jvm? I'm using quartz now, but I'm reevaluating. I'd prefer if it was durable to jdbc like quartz.

jacklombard22:06:48

Can someone tell me why

(defn using-pmap
  [update-fn coll]
  (dorun (pmap update-fn coll)))
is significantly faster than
(defn using-async
  [n update-fn coll]
  (let [coll-ch (a/chan n)
        _ (a/onto-chan! coll-ch coll)]
    (let [processed-ch (-> (repeatedly
                             n
                             #(a/go-loop [item (<! coll-ch)]
                                (if item
                                  (do
                                    (update-fn item)
                                    (recur (<! coll-ch)))
                                  :done)))
                           doall)]
      (doseq [ch processed-ch]
        (<!! ch))
      nil)))

rutledgepaulv23:06:34

https://eli.thegreenplace.net/2017/clojure-concurrency-and-blocking-with-coreasync/ is a good read. blocking io and go blocks don't mix, you should be doing blocking io on a dedicated thread and not a dispatch thread. async/thread will run work on an independent thread and return a channel that you can use to still participate in coordination activities

rutledgepaulv23:06:01

core.async is sort of similar to an event loop. go blocks are used for "select" loop activities and threads are used for event handling. the one exception being if your event handlers are cpu bound then it doesn't make much difference whether you run it on a dispatch thread or a dedicated thread because either way the cpu is busy. blocking io is important to differentiate and separate because it ties up a thread but doesn't tie up the cpu

rutledgepaulv23:06:57

(this is how i've come to understand it at least, though i am not an expert)
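
Applied to the using-async function above, the advice could look roughly like this: same shape, but each worker is an `a/thread` doing blocking takes instead of a go-loop, so a blocking `update-fn` ties up its own thread and not the shared dispatch pool. This is a sketch, not a benchmarked replacement. `using-threads` is a made-up name, it assumes a core.async release that provides `to-chan!`, and `coll` must not contain nil because nil cannot be put on a channel.

```clojure
(require '[clojure.core.async :as a])

(defn using-threads
  "Runs `update-fn` over every item of `coll` on `n` dedicated threads and
  blocks until all of them have finished. Returns nil."
  [n update-fn coll]
  (let [coll-ch (a/to-chan! coll)
        workers (doall
                 (repeatedly n
                             #(a/thread
                                ;; Blocking take: nil means the channel is
                                ;; closed and drained, so the worker stops.
                                (loop []
                                  (when-some [item (a/<!! coll-ch)]
                                    (update-fn item)
                                    (recur)))
                                :done)))]
    ;; Each a/thread returns a channel that yields :done when its loop ends.
    (doseq [worker workers]
      (a/<!! worker))
    nil))
```

Usage would mirror the original, e.g. `(using-threads 8 update-fn coll)`.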

jacklombard23:06:44

Thanks that analogy with event loop is very useful

jacklombard23:06:09

And thanks for the link and about blocking io on go blocks

👍 3
noisesmith22:06:48

go is rarely a performance optimization, its primary purpose is to make it harder to implement concurrency bugs

noisesmith22:06:04

if you don't need coordination between threads, core.async won't help you much

jacklombard22:06:07

The CPU for pmap goes beyond 200% while the async one stays put at 40%

noisesmith22:06:33

for starters, all go blocks share a small number of threads that doesn't expand

noisesmith22:06:04

if your work has a significant IO element, the go blocks simply starve each other without increasing CPU usage

jacklombard22:06:36

Does it expand for pmap? I thought pmap was also limited to something like number of cores + 2 threads

hiredman22:06:43

pmap is also not great

noisesmith22:06:55

which is the same number shared by all go blocks, not just the ones you created in your demo

noisesmith22:06:06

and their coordination is more expensive than that done by pmap

jacklombard22:06:15

Yes it involves significant IO

noisesmith22:06:19

you're paying for features you aren't using in the core.async case

jacklombard22:06:44

Oh so even if I don’t try to coordinate, there is an overhead

✔️ 3
hiredman22:06:49

pmap is limited, but that limit doesn't compose well with some features of lazy-seqs which can result in going past the limit

hiredman22:06:52

pmap is a bad abstraction

jacklombard22:06:56

@hiredman mind expanding a bit?

hiredman22:06:04

on which part?

jacklombard22:06:34

The going past the limit part

jumar03:06:20

This often means it will run 32 threads in parallel rather than n+2

hiredman22:06:21

lazy-seqs are really "not strict seqs" meaning sometimes you can realize more than the next element you asked for

jacklombard22:06:30

If not pmap what else, write my own parallel way to process the coll?

hiredman22:06:32

there is a feature called chunking

noisesmith22:06:42

pmap uses laziness to control thread count, rather than using a pool
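
The chunking effect is easy to see with plain `map`, no threads involved. In this sketch (`realized` and `tracked-inc` are illustrative names) only the first element is asked for, yet the whole first chunk of the range is processed:

```clojure
;; Counts how many elements the mapping function was actually applied to.
(def realized (atom 0))

(defn tracked-inc [x]
  (swap! realized inc)
  (inc x))

;; Only the first element of the lazy seq is requested here.
(def first-result (first (map tracked-inc (range 100))))

(println first-result) ; 1
(println @realized)    ; 32, the whole first chunk was realized
```

As far as I understand it, pmap is built on a lazy `map` that wraps each call in a future, so the same chunk-at-a-time realization over a chunked input would be what starts more futures than the cores + 2 window suggests.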

jacklombard22:06:54

That is fine in my case, I am using a dorun anyway

hiredman22:06:27

just use mapv and future then

noisesmith22:06:30

@frozenfire1992 I've had great luck with claypoole, and often the underlying ExecutorService which comes with the vm is enough

👍 3
jacklombard22:06:37

Yeah chunking I know, will try to chunk my coll and see if it improves perf

hiredman22:06:15

(->> (mapv #(future (f %)) x) (map deref))

hiredman22:06:36

but that is almost never going to do well

jacklombard22:06:06

That’s trying to make way too many threads

hiredman22:06:10

you almost always want access to the executor and to do things in a finer grained way

jacklombard22:06:23

@noisesmith Thanks will look into it 👍

hiredman22:06:03

the other issue with building your parallelism on top of seqs is you are limiting the concurrency by imposing order

jacklombard22:06:55

Yeah I guess the pmap threads have to wait for the other ones to finish and then move forward in the seq

hiredman22:06:14

and you are collecting all the results just to throw them away

hiredman22:06:25

use an executor
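
A minimal sketch of what "use an executor" could mean for the using-pmap case, with nothing beyond java.util.concurrent. `using-executor` is a hypothetical name, and a fixed pool of `n` threads is only one reasonable choice. Clojure functions implement `Callable`, so they can be handed to `invokeAll` directly.

```clojure
(import '(java.util.concurrent ExecutorService Executors Future))

(defn using-executor
  "Runs `update-fn` over every item of `coll` on a fixed pool of `n` threads
  and blocks until all tasks are done. Returns nil."
  [n update-fn coll]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n)]
    (try
      ;; One zero-argument task per item.
      (let [tasks (mapv (fn [item] #(update-fn item)) coll)]
        ;; invokeAll blocks until every task has finished; calling .get on
        ;; each Future rethrows any exception a task threw.
        (run! (fn [^Future f] (.get f)) (.invokeAll pool tasks)))
      (finally
        (.shutdown pool)))))
```

Unlike pmap, the thread count here is exactly `n` regardless of chunking. For long-lived applications, creating the pool once and reusing it would usually be preferable to building one per call.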

jacklombard22:06:40

cool will check it out, thanks a lot

ghadi23:06:26

it depends on what the "update-fn" is doing (you might be blocking threads without knowing) @frozenfire1992

jacklombard05:06:19

The update-fn in my actual code is for making some db writes and calling an API endpoint

jacklombard05:06:59

But I did test it with simple and heavy calculations without any IO and pmap still performed significantly better

jacklombard05:06:23

Also in my actual code I had the update-fn wrapped in (<! (a/thread (update-fn item)))

jacklombard05:06:27

I guess I shouldn't be doing such heavy IO in a go block, because of how core.async coordinates parking and resuming the processes

ghadi23:06:13

and try clojure.core.async/pipeline-async too, depending on the nature of the work being farmed out
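
For work that blocks, as the db writes and API calls described above do, the sibling function `pipeline-blocking` may be the closer fit; it is swapped in here for `pipeline-async`, which expects an asynchronous function that delivers to a result channel. A rough sketch, assuming a core.async release with `to-chan!`; `using-pipeline` is a made-up name. The transducer emits `:done` for each item because `update-fn` might return nil, which cannot be put on a channel.

```clojure
(require '[clojure.core.async :as a])

(defn using-pipeline
  "Runs `update-fn` over every item of `coll` with parallelism `n` using
  pipeline-blocking. Blocks until done and returns the number of items
  processed."
  [n update-fn coll]
  (let [results (a/chan n)]
    (a/pipeline-blocking n
                         results
                         (map (fn [item] (update-fn item) :done))
                         (a/to-chan! coll))
    ;; The pipeline closes `results` once the input is exhausted, so this
    ;; reduce terminates and yields the count.
    (a/<!! (a/reduce (fn [processed _] (inc processed)) 0 results))))
```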