#clojure
2019-03-29
yuhan01:03:08

what's a simple way of testing if a given object is a clojure function? (as defined by defn/`fn`)

yuhan01:03:33

oops, how did I ever miss that 😅

yuhan01:03:58

ah, so it's testing for instance? of clojure.lang.Fn instead of clojure.lang.IFn

potetm02:03:28

Yeah the docstring explains the diff. Also available: ifn?

Alex Miller (Clojure team)02:03:20

ifn? is almost always preferable
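
A small REPL sketch of the difference discussed above, using only clojure.core. fn? tests for clojure.lang.Fn (things made with fn/defn), while ifn? tests for clojure.lang.IFn (anything invokable):

```clojure
;; fn? is true only for objects that implement clojure.lang.Fn,
;; i.e. functions created with fn / defn.
(fn? inc)           ;=> true
(fn? (fn [x] x))    ;=> true
(fn? :a)            ;=> false, keywords are callable but are not Fn

;; ifn? is true for anything that implements clojure.lang.IFn,
;; i.e. anything that can be called like a function.
(ifn? inc)          ;=> true
(ifn? :a)           ;=> true
(ifn? {:a 1})       ;=> true
(ifn? 42)           ;=> false
```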

Alex Miller (Clojure team)02:03:10

@borkdude it would be worth double-checking that all these paths are buffered. your edn code goes through io/reader and spit which are both buffered, but I'm not sure about the others. certainly the transit perf seems higher than I'd expect, if it's not buffering, then might be worth a check to make sure there's not a bug that's crept in. I doubt this has been profiled in a long while.

Alex Miller (Clojure team)02:03:42

also, for a big file like this you could probably use larger buffers than the default

todo05:03:12

What is the data structure behind Clojure persistent maps? (I need to port them to another language.)

hipster coder05:03:50

Which language?

potetm11:03:27

> A persistent rendition of Phil Bagwell’s Hash Array Mapped Trie

jaide05:03:39

https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L379 hash-map -> clojure.lang.PersistentHashMap -> clojure.lang.APersistentMap -> Map if I’m following your question and inheritance chain correctly.

hipster coder05:03:41

I just watched Rich's video... vectors are based on tries (trees). Hash maps are probably similar. They also use shared data.

hipster coder05:03:18

There are hashed and sorted maps. I am not 100% certain they use tries. But probably some kind of branching data structure similar to what underlies vectors.
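
One way to check which concrete classes sit behind the different map constructors is to ask the REPL. This is only a sketch for orientation: hash maps are backed by clojure.lang.PersistentHashMap (the HAMT quoted above), sorted maps by clojure.lang.PersistentTreeMap, and small literal maps by clojure.lang.PersistentArrayMap. The names m1 and m2 are illustrative.

```clojure
(class (hash-map :a 1))     ;=> clojure.lang.PersistentHashMap
(class (sorted-map :a 1))   ;=> clojure.lang.PersistentTreeMap
(class {:a 1})              ;=> clojure.lang.PersistentArrayMap

;; Persistence: assoc returns a new map and leaves the original untouched.
(def m1 (hash-map :a 1 :b 2))
(def m2 (assoc m1 :c 3))

m1   ;=> {:a 1, :b 2}  (key order may differ when printed)
m2   ;=> {:a 1, :b 2, :c 3}
```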

borkdude07:03:58

@alexmiller when I write to a (java.io.ByteArrayOutputStream.) it’s already much faster

borkdude08:03:55

I tried to use fressian with GraalVM native. This code works:

(fressian/create-reader (io/input-stream (io/file clj-cache-file)))
but this didn’t:
(fressian/read-object (fressian/create-reader (io/input-stream (io/file clj-cache-file))))
This is the output: https://gist.github.com/borkdude/5f19d6bb0c3f180273786ae34f63f60f

borkdude08:03:42

I see spec being mentioned in the error message, I see it often with Clojure libs compiled with GraalVM, but it’s weird how this can be triggered by only adding fressian/read-object

dharrigan08:03:07

I believe this is a problem with Clojure 1.10.0 and Spec 0.2.176+

dharrigan08:03:16

It's referenced, indirectly here: https://dev.clojure.org/jira/browse/CLJ-1472 (ignore the bit about android, the juicy bit is at the end)

dharrigan08:03:48

I hit the same issue yesterday, doing some experimentation with Graal and Clojure. I had to go back to Clojure 1.9.0 to compile a native image.

borkdude08:03:54

thank you, I’ll try 1.9.0

borkdude08:03:02

At least Transit JSON works with Graal and 1.10. I couldn’t get MessagePack working:

Mar 29, 2019 9:41:43 AM org.msgpack.template.builder.TemplateBuilderChain createForceTemplateBuilder
WARNING: Failed to create a TemplateBuilder reflectively

borkdude09:03:17

Confirmed, I can get fressian working when dropping back to Clojure 1.9.0 and adding some type hints about FressianWriter + running with --report-unsupported-elements-at-runtime. But since Fressian and Transit are pretty close in performance, I’m going with 1.10 and without the flag.

tdantas09:03:36

hey clojurians, how do you usually handle the following? I have one business-rule function (create-invoice) which could create an invoice or, based on some business rule, fail. Should my function return two different shapes (the invoice on success, and a map with an ::error key on error)? Or should my function throw an exception and let my handler dispatch on it? How do you normally do that?

borkdude09:03:47

@dharrigan did you try a fork of Clojure with that patch applied btw? With tools.deps that should be possible I think?

dharrigan09:03:26

Hi, no. Didn't get round to it sorry.

myguidingstar09:03:24

@oliv use pure data and avoid Exceptions whenever you can

tdantas09:03:12

@myguidingstar yeah, so you mean create a wrapper result ?

lloydshark09:03:26

I think it would be fine to always return a map.

lloydshark09:03:45

either

{:invoice {}}

lloydshark09:03:08

or

{:error "I am broken"}

lloydshark09:03:28

And then surrounding code can easily check for the presence of an error.

lloydshark09:03:41

But like most things it comes down to style and possibly what else is happening nearby in the code.
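
A minimal sketch of the "always return a map" style suggested above. The validation rules, the order shape, and the names create-invoice and handle-result are hypothetical, made up only to show the calling code checking for :error:

```clojure
(defn create-invoice
  "Hypothetical business function: returns {:invoice ...} on success
   or {:error ...} when a rule fails."
  [{:keys [customer amount]}]
  (cond
    (nil? customer)
    {:error "customer is required"}

    (not (and (number? amount) (pos? amount)))
    {:error "amount must be positive"}

    :else
    {:invoice {:customer customer :amount amount}}))

(defn handle-result
  "Hypothetical caller: branches on the presence of :error."
  [order]
  (let [{:keys [invoice error]} (create-invoice order)]
    (if error
      {:status 422 :body error}
      {:status 201 :body invoice})))

(handle-result {:customer "acme" :amount 10})
;=> {:status 201, :body {:customer "acme", :amount 10}}
(handle-result {:customer "acme" :amount 0})
;=> {:status 422, :body "amount must be positive"}
```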

tdantas10:03:18

Yeah, I see @U4T99SHSB . I’m wrapping inside my Result map format

orestis12:03:08

Depends on the context: if you can’t really do anything about it, you might throw an exception and let your top-level error handler deal with that (wrap it in a nice HTTP status, for example). Don’t forget ex-info, which allows you to put arbitrary data in the exception so you can do further dispatching there as well.

tdantas15:03:33

so @U7PBP4UVA, yeah, I tend not to control my flow using exceptions. For business rules I’m returning a Result wrapper map, and the caller should check whether the ::error key exists or not … can you enumerate some cases of "if you can't really do anything about it"?

orestis19:03:07

If the caller can recover from that error, and is expected to be able to, then perhaps an exception is not a good fit. If the caller can’t really handle the error, but has to deal with it anyway to avoid things like nils or error maps propagating throughout the system, then an exception is perhaps better suited?
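
For comparison, a sketch of the exception-based variant using ex-info and ex-data. The :type key, the :invoice/invalid value, and the names create-invoice! and handle-throwing are assumptions chosen for illustration, not an established convention:

```clojure
(defn create-invoice!
  "Hypothetical variant that throws instead of returning an error map.
   The data map passed to ex-info travels with the exception."
  [{:keys [customer amount] :as order}]
  (when-not customer
    (throw (ex-info "customer is required"
                    {:type :invoice/invalid :order order})))
  {:customer customer :amount amount})

(defn handle-throwing
  "Hypothetical top-level handler: dispatches on the data attached to the
   exception and rethrows anything it does not recognise."
  [order]
  (try
    {:status 201 :body (create-invoice! order)}
    (catch clojure.lang.ExceptionInfo e
      (case (:type (ex-data e))
        :invoice/invalid {:status 422 :body (.getMessage e)}
        (throw e)))))

(handle-throwing {:amount 10})
;=> {:status 422, :body "customer is required"}
```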

borkdude09:03:43

re: https://clojurians.slack.com/archives/C03S1KBA2/p1553827570192000 what should I do to buffer the writing to a file? I currently have something like:

(let [writer (transit/writer (java.io.BufferedOutputStream. (io/output-stream (io/file "/Users/Borkdude/temp/transit.json")) (* 1024 1024)) :json)]
  (time (transit/write writer edn)))
But I still get very slow writes

borkdude09:03:19

I’m not seeing any difference with older transit versions, so it must be something on my side

borkdude09:03:10

If I need a BufferedWriter I’m not sure how to do this with transit

borkdude09:03:25

As a workaround I can first write to a bytearrayoutputstream.

dartov10:03:35

Hello clojurians, I’m looking for a tutorial project implementing simple CRUD API powered by Datomic (ideally with front end in cljs) to use for training purposes with some newcomers. I went through some awesome-clojure lists, with no luck. My next best option would be to write something on top of mbrainz.

greywolve11:03:06

This is the only one that I can think of , off hand: https://github.com/robert-stuttaford/bridge cc @U0509NKGK

dartov11:03:19

thanks, that looks promising

quadron11:03:28

what is the equivalent of the following in core.async? foobar :: (a -> b) -> (Chan a) -> (Chan b)

Ben Hammond11:03:16

http://clojure.github.io/core.async/#clojure.core.async/chan lets you pass a transform transducer into the channel itself, if you are in control of constructing the channel

Ben Hammond11:03:56

if you have been given a pre-existing channel then I think you would have to use http://clojure.github.io/core.async/#clojure.core.async/pipe

Ben Hammond12:03:39

that sort of thing
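
A sketch of both suggestions, assuming core.async is on the classpath. The helper names mapped-chan and map-chan are made up here; chan, pipe, to-chan, into and <!! are core.async functions. The buffer size of 1 is arbitrary, but a channel with a transducer must have a buffer.

```clojure
(require '[clojure.core.async :as a])

;; Case 1: you construct the channel yourself, so the transducer can be
;; attached directly. Values put on it come out transformed by f.
(defn mapped-chan [f]
  (a/chan 1 (map f)))

;; Case 2: the input channel already exists. Pipe it into a new channel
;; that carries the transducer; pipe returns the target channel.
;; Roughly (a -> b) -> (Chan a) -> (Chan b).
(defn map-chan [f in]
  (a/pipe in (a/chan 1 (map f))))

;; Usage: collect everything from the output channel into a vector.
(a/<!! (a/into [] (map-chan inc (a/to-chan [1 2 3]))))
;=> [2 3 4]
```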

greglook16:03:50

@borkdude I’ll throw CBOR’s hat into the ring if you’re looking for binary serialization formats: https://github.com/greglook/clj-cbor

greglook16:03:37

haven’t specifically tried it with Graal yet, but it has no dependencies outside Clojure itself, so it ought to work.

greglook16:03:55

Also it’s an open standard, so there are other tools out there for working with the data. 🙂

lilactown16:03:12

how much of an anti-pattern is it to run clojure.test/is in a seq? like:

(doseq [ex examples]
  (t/is (f ex)))

greglook16:03:57

is is side-effecting, so that is not unusual from what I’ve seen

dpsutton16:03:42

maybe use are?

lilactown16:03:21

hm yeah that might be better
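
The two styles side by side, as a sketch. The test names and the even? predicate are placeholders standing in for f and examples from the question:

```clojure
(require '[clojure.test :as t])

;; doseq + is: works because is reports its result as a side effect.
;; Wrapping each case in testing makes failures easier to locate.
(t/deftest doseq-style
  (doseq [ex [2 4 6]]
    (t/testing (str "example " ex)
      (t/is (even? ex)))))

;; are: the same checks written as a template plus a table of values.
(t/deftest are-style
  (t/are [x] (even? x)
    2
    4
    6))
```

Running (t/run-tests) in the namespace that defines these should report two tests and six assertions.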

borkdude17:03:07

@greg316 Cool, thank you.

borkdude17:03:35

I’m still a bit unsatisfied with this: https://clojurians.slack.com/archives/C03S1KBA2/p1553851903211300 Anyone got a better alternative?

greglook17:03:18

Don’t the functions already return buffered streams?

greglook17:03:24

you’re going to have a small buffer in front of your big buffer

borkdude17:03:22

@greg316 I’m now taking a workaround by first writing to a ByteArrayOutputStream and then copying that to a file, which is a bit insane. But that only took 45ms whereas directly writing to a file took 1s (EDN read from a 2MB .edn file)

greglook17:03:29

have you tried making a FileOutputStream directly?

greglook17:03:48

(then wrapping in a large buffer as above)

greglook17:03:36

FWIW:

% ls -lh bench/reddit.edn
-rw-r--r--  1 greg  staff   123K Jan  9 09:31 bench/reddit.edn
=> (def reddit-data (clojure.edn/read-string (slurp "bench/reddit.edn")))
=> (time (cbor/spit-all "reddit.cbor" (repeat 17 reddit-data)))
"Elapsed time: 158.625145 msecs"
1683493

borkdude18:03:58

I was preparing a full repro. Here it is:

$ wget  -O test.edn 
$ clj -Sdeps '{:deps {com.cognitect/transit-clj {:mvn/version "0.8.313"}}}'
Clojure 1.10.0
user=>  (require '[clojure.edn :as edn])
nil
user=> (def edn (edn/read-string (slurp "test.edn")))
#'user/edn
user=>  (count (keys edn))
799
user=> (require '[cognitect.transit :as transit])
nil
user=> (require '[ :as io])
nil
user=> (def writer (transit/writer (io/output-stream (io/file "transit.json")) :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 1151.116438 msecs"
nil

borkdude18:03:07

Now I can look at your suggestion…

greglook18:03:30

now I’m pretty curious how cbor would do - I tried to copy enough of the sample dataset I had to make ~2 MB, but maybe there’s something gnarly in your data :man-shrugging:

borkdude18:03:58

2MB is right 🙂

borkdude18:03:51

@greg316 worse:

user=> (def writer (transit/writer (java.io.FileOutputStream. "transit.json") :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 2039.522802 msecs"

greglook18:03:09

I meant to also wrap your own buffered stream around that

borkdude18:03:18

ok, I’ll try now

greglook18:03:41

because io/output-stream is going to return you a FileOutputStream wrapped in a BufferedOutputStream with the default buffer size, so wrapping a larger buffer around that isn’t going to help you
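
To make the point concrete, a sketch that inspects what io/output-stream returns and builds the single-large-buffer alternative by hand. The helper name big-buffered-file-stream and the temp file are assumptions for illustration:

```clojure
(require '[clojure.java.io :as io])

;; Scratch file so the sketch does not touch real data.
(def tmp (java.io.File/createTempFile "buffer-demo" ".bin"))

;; io/output-stream already hands back a buffered stream.
(with-open [out (io/output-stream tmp)]
  (class out))
;=> java.io.BufferedOutputStream

(defn big-buffered-file-stream
  "Hypothetical helper: a raw FileOutputStream behind one buffer of
   buf-size bytes, with no default-sized buffer in between."
  ^java.io.BufferedOutputStream [path buf-size]
  (java.io.BufferedOutputStream.
   (java.io.FileOutputStream. (io/file path))
   (int buf-size)))

;; Usage: with-open closes (and therefore flushes) the stream.
(with-open [out (big-buffered-file-stream tmp (* 1024 1024))]
  (.write out (.getBytes "hello")))
```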

borkdude18:03:06

something like this?

user=> (def writer (transit/writer (java.io.BufferedOutputStream. (java.io.FileOutputStream. "transit.json") 1024) :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 1078.45894 msecs"

greglook18:03:21

yeah, but with a larger buffer size

borkdude18:03:45

user=> (def writer (transit/writer (java.io.BufferedOutputStream. (java.io.FileOutputStream. "transit.json") (* 1024 1024)) :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 1016.702847 msecs"

greglook18:03:54

well that’s surprising

borkdude18:03:03

Note:

user=> (def bos (java.io.ByteArrayOutputStream. (* 1024 1024)))
#'user/bos
user=> (def writer (transit/writer bos :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 31.144918 msecs"
nil
user=> (time (io/copy (.toByteArray bos) (io/file "transit.json")))
"Elapsed time: 16.471751 msecs"
nil

borkdude18:03:08

btw, the size for the bos buffer didn’t matter very much here

borkdude18:03:39

If desired, I can make an issue for this, but first I’d like to know if I’m not overlooking something

ghadi18:03:40

I don't understand what the problem is @borkdude

ghadi18:03:28

the file writing being slow?

borkdude18:03:36

when I write directly to a file it’s slow (1s). when I write via a bytearrayoutputstream and then to a file it’s 47ms.

borkdude18:03:51

so yeah, the file writing
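
The workaround described above could be wrapped up roughly like this. write-via-memory is a hypothetical helper, not part of transit or clojure.java.io; it takes any function that writes to an OutputStream, so it does not depend on transit itself:

```clojure
(require '[clojure.java.io :as io])

(defn write-via-memory
  "Hypothetical helper: calls (write-fn out) with an in-memory stream,
   then copies the collected bytes to path in a single pass."
  [path write-fn]
  (let [bos (java.io.ByteArrayOutputStream.)]
    (write-fn bos)
    (with-open [out (java.io.FileOutputStream. (io/file path))]
      ;; writeTo hands the internal buffer to the file stream without
      ;; first copying it into a new byte array.
      (.writeTo bos out))))

;; Usage with a plain byte write:
(def mem-demo-file (java.io.File/createTempFile "mem-demo" ".txt"))
(write-via-memory mem-demo-file
                  (fn [^java.io.OutputStream out]
                    (.write out (.getBytes "hello"))))

;; With transit, following the calls used in the log, this would be roughly:
;; (write-via-memory "transit.json"
;;                   #(transit/write (transit/writer % :json) edn))
```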

ghadi19:03:49

@borkdude try to add a call to .flush() on the File outputstream on your other (non-Transit) benchmarks and see if it makes them 1sec

borkdude19:03:49

@ghadi Like this?

user=> (time (let [fos (java.io.FileOutputStream. "/tmp/foo.edn") w (io/writer fos)] (.write w (str edn)) (.flush fos)))
"Elapsed time: 77.273308 msecs"

ghadi19:03:31

yeah, you need to close the stream too

ghadi19:03:34

with-open

ghadi19:03:52

There's a flush() in transit I'm trying to understand

borkdude19:03:13

user=> (time (with-open [fos (java.io.FileOutputStream. "/tmp/foo.edn")] (let [w (io/writer fos)] (.write w (str edn)) (.flush fos))))
"Elapsed time: 73.316135 msecs"

ghadi19:03:41

ok... I can't explain this yet.

Alex Miller (Clojure team)19:03:44

sorry, no time to look atm

vlaaad19:03:48

@borkdude have you tried profiling it with clj-async-profiler?

borkdude19:03:32

I just used that tool an hour ago, but not for this problem. I could try it now 🙂

ghadi19:03:51

neat tool, how does it contrast with the other code?

Lennart Buit19:03:54

(what are you profiling, cpu samples?)

Lennart Buit19:03:11

(you can change what is profiled ^^)

borkdude19:03:22

yes, this is CPU

mping19:03:20

do you have the file graph?

borkdude19:03:30

it seems indeed that transit spends most of its time flushing…

ghadi19:03:33

clearly FileOutputStream.flush() is slow (the red is a native method). What I don't understand is why we can't make the EDN writing incur the same cost

ghadi19:03:09

by explicitly flushing. (Mind opening a ticket to cognitect/transit-java @borkdude?)

borkdude19:03:22

is that in JIRA?

ghadi19:03:59

no, github

ghadi19:03:20

maybe transit-clj might be better

ghadi19:03:33

until we know more about what the source of the problem is

ghadi19:03:38

I suspect flush
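
A sketch of why frequent flushing can defeat a buffer, independent of transit. It does not claim to reproduce what transit does internally; counting-stream and underlying-writes are made-up names, and the sink only counts how often the buffered stream pushes bytes through to it:

```clojure
(defn counting-stream
  "Hypothetical sink: discards the bytes and counts write calls."
  [counter]
  (proxy [java.io.OutputStream] []
    (write
      ([b] (swap! counter inc))
      ([b off len] (swap! counter inc)))))

(defn underlying-writes
  "Writes 1000 single bytes through an 8 KB BufferedOutputStream and
   returns how many writes reached the sink."
  [flush-each?]
  (let [counter (atom 0)]
    (with-open [out (java.io.BufferedOutputStream.
                     (counting-stream counter) 8192)]
      (dotimes [_ 1000]
        (.write out (int 65))
        (when flush-each?
          (.flush out))))
    @counter))

(underlying-writes false) ;=> 1     (one write when the stream is closed)
(underlying-writes true)  ;=> 1000  (every flush pushes one byte through)
```

With a real FileOutputStream each of those pushed-through writes is a call into native code, which is where the cost would show up in a profile.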

dchelimsky19:03:26

[ANN] Cognitect Labs' aws-api 0.8.283 is now available! https://groups.google.com/forum/#!topic/clojure/cOIF5cjAecI

greglook19:03:48

flame graphs are pretty nifty - I was using the same tool while profiling clj-cbor :male-scientist: 📊

Lennart Buit19:03:21

I am so amazed at what you can put in an svg

Lennart Buit19:03:35

also, terrified, why does it support scripts :’)

borkdude20:03:02

and why does my computer open .svg files as text in Atom

borkdude20:03:05

this is absolutely great for deciding what to optimize. fussing about details that only contribute to a small gain is a risk

borkdude20:03:36

does this also exist for cljs?

Lennart Buit20:03:25

it's built on a java library iirc

lilactown20:03:05

I would imagine in CLJS it would be more useful to use e.g. the Chrome profiler

lilactown20:03:26

or the React DevTools profiler if you're doing React

noisesmith20:03:48

there are definitely graphs in that style available with chrome/react