clojure 2019-04-19 | Slack Archive

seancorfield00:04:30

I have all three editions of Programming Clojure 🙂 And Clojure Applied and Getting Clojure, both editions of Clojure In Action and both of The Joy Of Clojure, also Clojure Cookbook and Clojure Programming. All in PDF, sync'd to every device I own via Dropbox. And in OneDrive I have three Clojure books from Packt (not a very good publisher, IMO) and Zach Tellman's Elements Of Clojure. And I'm still missing some!

seancorfield00:04:27

Carin Meier's Living Clojure is noticeably absent. What else?

Eric Ervin21:04:59

Sotnikov's Web Development with Clojure. I think it is a great 1.5th or 2nd Clojure book to get people to "hello http"

Eric Ervin21:04:50

Quick Clojure is a good one for when I've spent some time away from the language and I need a reminder.

mg00:04:49

Professional Clojure is absent, although don't know about "notably"

seancorfield00:04:01

Ah, who publishes that?

mg00:04:33

Wrox

seancorfield00:04:07

Ah... not a publisher I look at very often... isn't it part of O'Reilly these days?

seancorfield00:04:21

(mind you, Packt is also part of O'Reilly now I think?)

mg00:04:32

I have no idea

seancorfield00:04:29

Oh. You're one of the authors! 'grats! Writing a book is a major achievement! (I've started three and never got past the outline)

mg00:04:43

Thanks! I would have never finished, except it went down like, "hey want to write about Datomic in our Clojure book?" "Great, I'm in!" "Ok you have a month"

seancorfield00:04:23

Hahaha... yeah, writing schedules are why I've never gotten further than the outline and those early discussions with publishers...

mg00:04:02

It was fun to write. Although everything else was misery

Ivan Koz09:04:31

@seancorfield how was the Elements of Clojure for you?

seancorfield16:04:12

I really like it -- because it tackles topics that a lot of books don't cover. Some of it was "old news" but it was deep and thought-provoking, for the most part.

dbernal15:04:17

I'm using

(defn foo
  [y z]
  (let [x (some-long-operation y)] (map #(+ x %) z)))

as a way to force eval the some-long-operation function. Are there any other ways of forcing the evaluation of the form inside the # macro?

dbernal15:04:54

In order to keep the structure something like this

(defn foo
  [y z]
  (map #(+ % (some-long-operation y)) z))

benoit15:04:27

(defn foo
  [y z]
  (map (partial + (some-long-operation y)) z))

noisesmith15:04:30

(doall (some-long-operation y)) will force evaluation of the lazy-seq, your original version doesn't actually force anything

noisesmith15:04:17

=> (let [foo (map println (range))] nil)
nil
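
A minimal sketch of the laziness point above: building a lazy seq runs nothing, and `doall` is what realizes it. The names `realized-count` and `noisy-inc` are made up for illustration.

```clojure
;; Counts how many times the mapped function has actually run.
(def realized-count (atom 0))

;; Hypothetical function with a visible side effect.
(defn noisy-inc [n]
  (swap! realized-count inc)
  (inc n))

;; Building the lazy seq does not call noisy-inc at all.
(def lazy-result (map noisy-inc [1 2 3]))
(def count-before-doall @realized-count)

;; doall walks the whole seq, so every element is computed here.
(def forced-result (doall lazy-result))
(def count-after-doall @realized-count)
```

Here `count-before-doall` is 0 and `count-after-doall` is 3.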

dbernal15:04:55

got it, thanks for the suggestions!

dbernal16:04:35

and by force I guess I meant executes only once 🙂

benoit16:04:47

@U2JACTBMX That was my understanding. The partial should do that for you. But your first piece of code works as well.

noisesmith16:04:44

oh, in clojure I've only heard "force" mean making sure a lazy seq is realized at a specific code boundary, I've never seen it used as a synonym for cache or reuse
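
A small sketch of the "executes only once" question, assuming a stand-in `some-long-operation` that counts its own calls. `mapv` is used instead of `map` only so the counts are deterministic; `call-count`, `calls-made` and the three `foo-*` names are hypothetical.

```clojure
(def call-count (atom 0))

;; Hypothetical stand-in for the expensive call from the thread.
(defn some-long-operation [y]
  (swap! call-count inc)
  (* 10 y))

;; let binding: the operation is evaluated once, before the mapping starts.
(defn foo-let [y z]
  (let [x (some-long-operation y)]
    (mapv #(+ x %) z)))

;; partial: the argument is evaluated once, when partial is called.
(defn foo-partial [y z]
  (mapv (partial + (some-long-operation y)) z))

;; Call inside the #() body: evaluated again for every element of z.
(defn foo-inline [y z]
  (mapv #(+ % (some-long-operation y)) z))

;; Returns how many times some-long-operation ran for a three-element input.
(defn calls-made [f]
  (reset! call-count 0)
  (f 1 [1 2 3])
  @call-count)
```

With this setup `(calls-made foo-let)` and `(calls-made foo-partial)` both give 1, while `(calls-made foo-inline)` gives 3.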

andrew17:04:43

xpost from #beginners for a broader audience: I tried a simple ETL pipeline in Clojure and Python to get a sense of pros/cons of implementing in both languages. I was surprised that Clojure wasn't much faster and only marginally shorter. Any thoughts why? https://github.com/andharris/pipeline

noisesmith17:04:25

I'm not sure, but one suspicion I have here is the usage of spec. My vague understanding is that it's not optimized for performance, and the intended usage is to use it during development / testing and turn it off for production code.

noisesmith17:04:35

I think you could write a much more efficient data validator by hand (or use a less featureful and more performance oriented validation library)

noisesmith17:04:36

Also, core.async is great for coordinating tasks, but I don't think it's the most performant option here either. If the goal is parallelization rather than coordination, you'll get much better throughput with ExecutorService or a wrapper like claypoole.

noisesmith17:04:22

https://github.com/TheClimateCorporation/claypoole
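
A rough sketch of the plain ExecutorService route mentioned above, using only `java.util.concurrent` interop rather than claypoole. The helper name `parallel-mapv` and the pool size are assumptions for illustration, not anything from the pipeline repo.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

;; Hypothetical helper: applies f to every item of coll on a fixed-size
;; thread pool and returns the results as a vector, in input order.
(defn parallel-mapv
  [n-threads f coll]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n-threads)]
    (try
      (let [;; Clojure fns implement Callable, so zero-arg fns work as tasks.
            tasks   (mapv (fn [x] (fn [] (f x))) coll)
            ;; invokeAll blocks until every task is done and keeps task order.
            futures (.invokeAll pool ^java.util.Collection tasks)]
        (mapv (fn [^Future fut] (.get fut)) futures))
      (finally
        (.shutdown pool)))))
```

For example, `(parallel-mapv 4 inc [1 2 3])` returns `[2 3 4]`. For IO-bound work the thread count probably should not be tied to the number of cores.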

andrew17:04:10

Thanks for the feedback

andrew17:04:54

I'll see if swapping out claypoole for parallelization helps

vemv17:04:27

I have used claypoole successfully in a handful of projs. Here's a self-contained example of how I tend to use it (eagerly partition the work to match cores + distribute the workload evenly) https://github.com/clojure-emacs/refactor-nrepl/pull/247/files

noisesmith17:04:15

also, if throughput is bottlenecked by IO rather than CPU (which is very possible here), limiting by cores is probably a mistake

👍 4

andrew17:04:47

Yeah I guess if IO is the bottleneck it will slow down both implementations

noisesmith17:04:32

you might try making a /dev/null variant for both - that might give a better idea of the language perf difference if IO dominates the task

noisesmith17:04:51

or perhaps the message here is "if IO dominates the task time, just pick any language you like"

andrew17:04:08

good point

eggsyntax18:04:24

The I/O possibility occurred to me too, but I notice that in the original Grammarly post they show all 4 cores maxed out, so I figured it probably wasn't that.

eggsyntax18:04:56

I hadn't noticed the use of spec (I only looked at the Grammarly version) but that seems like a real possibility too. I'd definitely be curious to see whether swapping the spec/valid calls for a simple hand-written validator would make a big difference (and that could be one place where type hints would make a difference).

eggsyntax18:04:47

Or just comment out the filter line and see how much difference that makes.

noisesmith18:04:20

order of magnitude difference for a simple case

(cmd)user=> (s/def ::foo int)
:user/foo
(ins)user=> (time (dotimes [_ 1000000] (int? 1)))
"Elapsed time: 11.172124 msecs"
nil
(ins)user=> (time (dotimes [_ 1000000] (s/valid? ::foo 1)))
"Elapsed time: 123.692794 msecs"
nil

👍 4

noisesmith18:04:31

to be fair, that int? check might be friendlier to inlining by hotspot than the actual code checking your data format

jumpnbrownweasel18:04:49

the s/def uses int instead of int?

noisesmith18:04:21

~~that's idiomatic~~ -- it's actually not, that's my mistake

noisesmith18:04:18

(ins)user=> (s/def ::foo int?)
:user/foo
(cmd)user=> (time (dotimes [_ 1000000] (s/valid? ::foo 1)))
"Elapsed time: 88.051523 msecs"
nil
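
For comparison, a sketch of the kind of hand-written validator suggested earlier in the thread, next to an equivalent spec. The record shape (`:id` and `:name`) is invented for illustration and is not the schema from the pipeline repo; this needs Clojure 1.9 or later for `clojure.spec.alpha`.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical record shape, purely for illustration.
(s/def ::id int?)
(s/def ::name string?)
(s/def ::record (s/keys :req-un [::id ::name]))

;; Hand-written check for the same two keys using plain predicates.
(defn valid-record? [m]
  (and (map? m)
       (int? (:id m))
       (string? (:name m))))

;; Sample inputs: one valid, one with a wrong type, one with a missing key.
(def good-record {:id 1 :name "a"})
(def bad-type    {:id "1" :name "a"})
(def missing-key {:id 1})
```

Both `(s/valid? ::record good-record)` and `(valid-record? good-record)` return true, and both reject the other two samples. Whether the difference matters in a real pipeline is something only profiling would show.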

andrew18:04:05

The interesting thing to me is that I use a pretty heavy handed approach to schema validation in the python implementation too. I'm surprised spec adds this much overhead, but I must have a misunderstanding of its intended use. I was assuming this was the exact use case for spec

noisesmith19:04:16

just because spec is 8 times slower doesn't mean it's a bottleneck in this case - it's a good idea to profile for stuff like this

noisesmith19:04:36

and I could be wrong about its intended use case, I'm not a spec expert, I've heard the truism "don't use it in hot loops, only use it at boundaries" but really this code is both a hot loop and a boundary

andrew19:04:32

Tested without validation and it doesn't have a material impact, so not the bottleneck. Time for me to learn how to profile a Clojure program.

👍 8

andrew19:04:33

> doesn't have a material impact

andrew19:04:12

maybe that's not fair. it has an impact but doesn't seem to be a primary bottleneck. Really just seems like it will need some profiling to really understand

noisesmith19:04:49

you can use a standard java profiler, visualvm (sometimes known as jvisualvm) is free, yourkit gives full licenses for use on open source projects

noisesmith19:04:05

there's an art to translating from the vm level stuff (designed to map more or less directly to java classes) when profiling clojure (classes made via weird generated bytecode from a handwritten compiler)

jumpnbrownweasel14:04:35

FWIW I recently did a bunch of profiling and discovered some unexpected changes in the free Java tools.
- jvisualvm was changed to jmc (Java Mission Control) as of Java 8
- jmc is also available in Java 9
- in Java 10 to 12 there is no jmc in the jdk

jmc was spun out by Oracle as an open source project as of Java 10 but has not been released yet. They're trying to get jmc 7 done but it's not there yet. You can build it from sources, and I did that and it seems to work. But the simpler thing is to use Java 8/9 for profiling until jmc 7 is released. It doesn't work to use the Java 8/9 jmc on JFR files generated by later releases of Java, the format is incompatible.

vemv17:04:50

I'm looking for a proxy [java.io.Writer] that I can bind to *out*, such that concurrent threads printlning to it won't cause jumbled output ...might be super easy to implement, but who knows, maybe an existing solution covers some edge cases, has unit tests etc ^^

hiredman17:04:42

or just use a real logging library

vemv17:04:51

I cannot mutate arbitrary libraries to use my logging library. They use println

hiredman17:04:33

life is too short for bad dependencies

hiredman17:04:05

I am not sure that is solvable by binding *out*

hiredman17:04:23

a binding of *out* can only see when .write is called on itself, which doesn't tell it when a logical unit of output is complete from one thread

vemv17:04:06

I'll give it a think. Given this stub:

(proxy [java.io.Writer] []
    (append [& _])
    (close [])
    (flush [])
    (write [& _]))

...`close` cleanly delimits the end of a message. And (Thread/currentThread) delimits the concurrent parts

vemv17:04:35

scratch that

hiredman17:04:22

yeah, no one ever calls .close on *out*

vemv17:04:08

Maybe flush then. And I'd manually bind *flush-on-newline* true to ensure flush is invoked at the end of a message
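
One possible shape for the flush-based idea, as a sketch only: each thread appends to its own buffer, and `flush` hands the buffered text to the real writer in a single locked write. The name `per-thread-buffered-writer` is hypothetical, and output stays unjumbled only if callers flush at the end of each logical message (as `println` does while `*flush-on-newline*` is true).

```clojure
;; Hypothetical helper: wraps target so that each thread's output is
;; buffered separately and written out whole on flush.
(defn per-thread-buffered-writer
  [^java.io.Writer target]
  (let [^ThreadLocal buf (proxy [ThreadLocal] []
                           (initialValue [] (StringBuilder.)))]
    (proxy [java.io.Writer] []
      ;; One proxy fn covers every Writer.write overload, so dispatch on type.
      (write
        ([x]
         (let [^StringBuilder sb (.get buf)]
           (cond
             (integer? x) (.append sb (char x))
             (string? x)  (.append sb ^String x)
             :else        (.append sb ^chars x))))
        ([x off len]
         (let [^StringBuilder sb (.get buf)]
           (if (string? x)
             ;; append(CharSequence, start, end) takes an end index.
             (.append sb ^String x (int off) (int (+ off len)))
             ;; append(char[], offset, len) takes a length.
             (.append sb ^chars x (int off) (int len))))))
      ;; flush marks the end of a message: emit this thread's text atomically.
      (flush []
        (let [^StringBuilder sb (.get buf)
              text (.toString sb)]
          (.setLength sb 0)
          (locking target
            (.write target text)
            (.flush target))))
      ;; Nobody is expected to close *out*; treat close as one more flush.
      (close []
        (.flush ^java.io.Writer this)))))
```

It would be used as `(binding [*out* (per-thread-buffered-writer *out*)] ...)`. Code that prints without ever flushing would still leave text sitting in its thread's buffer, which is the limitation raised above.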

hlolli18:04:59

I want to do aot but there's one function call that I'd like to not run during the aot compilation. Without skipping aot on the whole class, can I use some variable to know if clojure is compiling?

noisesmith18:04:37

user=> (doc *compile-files*)
-------------------------
clojure.core/*compile-files*
  Set to true when compiling files, false otherwise.
nil

noisesmith18:04:53

it's good practice to put nothing with side effects on the top level of your code

noisesmith18:04:35

(delay can be helpful for this - you can create a delay globally but only evaluate it once it's forced)
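
A minimal sketch of both suggestions, assuming a hypothetical `init-native!` standing in for the call that must not run during AOT compilation.

```clojure
(def init-count (atom 0))

;; Hypothetical stand-in for the side-effecting native setup.
(defn init-native! []
  (swap! init-count inc)
  :native-handle)

;; Option 1: a global delay. Nothing runs when the namespace is loaded or
;; compiled; the body runs once, on the first deref.
(def native-handle (delay (init-native!)))

(defn handle []
  @native-handle)

;; Option 2: guard a top-level side effect with *compile-files*, which is
;; true only while files are being compiled.
(def loaded-outside-aot? (atom false))

(when-not *compile-files*
  (reset! loaded-outside-aot? true))
```

Calling `(handle)` any number of times runs `init-native!` only once, and not before the first call.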

hlolli18:04:27

ah good point, yes, it's a graal app that I'm working on, and it needs to start making native calls, but the pointers must be created at runtime

hlolli18:04:50

works! thanks @noisesmith 🙂

hiredman18:04:47

you can always not aot compile too

hiredman18:04:49

oh, graal, meh

dpsutton20:04:19

pmap doesn't do any parallelization below 512 elements, is that right?

dpsutton21:04:49

starting to doubt that but remember something cares about 512

Alex Miller (Clojure team)21:04:33

reducers

dpsutton21:04:28

thanks so much! that was bothering me 🙂

Alex Miller (Clojure team)21:04:55

pmap is parallel over 2+# processors
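
To make the two numbers concrete, a short sketch: 512 is the default partition size of `clojure.core.reducers/fold`, and `2 + processors` is how far ahead `pmap` works. The var names are made up for illustration.

```clojure
(require '[clojure.core.reducers :as r])

(def numbers (vec (range 10000)))

;; Plain sequential reduce, for reference.
(def sequential-sum (reduce + numbers))

;; fold splits a vector into partitions of about 512 elements by default,
;; so a vector at or below that size is simply reduced sequentially.
(def folded-sum (r/fold + numbers))

;; The partition size can be passed explicitly: (r/fold n combinef reducef coll).
(def folded-sum-small (r/fold 64 + + numbers))

;; pmap stays roughly this many elements ahead of the consumer.
(def pmap-window (+ 2 (.availableProcessors (Runtime/getRuntime))))

(def pmap-result (doall (pmap inc [1 2 3])))
```

All three sums are equal; only how the work is split differs.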

dchelimsky21:04:58

[ANN] Cognitect Labs' aws-api 0.8.301 https://groups.google.com/forum/#!topic/clojure/TXxEx-2OV2s
