This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-03-29
what's a simple way of testing if a given object is a clojure function? (as defined by `defn`/`fn`)
`ifn?` is almost always preferable
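A small REPL sketch of the difference, using only clojure.core. `fn?` is true only for objects created with `fn`/`defn`, while `ifn?` is true for anything that can be invoked. The `square` function is just an illustrative name.

```clojure
(defn square [x] (* x x))

;; fn? checks for objects implementing clojure.lang.Fn (created by fn / defn)
(fn? square)        ; true
(fn? (fn [x] x))    ; true
(fn? :a-keyword)    ; false

;; ifn? checks for anything invokable (clojure.lang.IFn)
(ifn? square)       ; true
(ifn? :a-keyword)   ; true, keywords look themselves up in maps
(ifn? {:a 1})       ; true, maps are functions of their keys
(ifn? 42)           ; false
```

So `fn?` answers the question as literally asked, while `ifn?` answers "can I call this?".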
@borkdude it would be worth double-checking that all these paths are buffered. your edn code goes through io/reader and spit which are both buffered, but I'm not sure about the others. certainly the transit perf seems higher than I'd expect, if it's not buffering, then might be worth a check to make sure there's not a bug that's crept in. I doubt this has been profiled in a long while.
also, for a big file like this you could probably use larger buffers than the default
What is the data structure behind Clojure persistent maps? (I need to port them to another language.)
Which language?
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L379 hash-map
-> clojure.lang.PersistentHashMap
-> clojure.lang.APersistentMap
-> Map
if I’m following your question and inheritance chain correctly.
I just watched Rich's video... vectors are based on tries (trees). Hash maps are probably similar. They also use shared data.
There are hashed and sorted maps. I am not 100% certain they use tries. But probably some kind of branching data structure similar to what underlies vectors.
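For reference, the concrete classes can be inspected at the REPL. In the Clojure JVM implementation, `PersistentHashMap` is a hash array mapped trie and `PersistentTreeMap` (behind `sorted-map`) is a red-black tree. The sketch below only shows the classes and the fact that old values stay untouched after an update; `m1` and `m2` are illustrative names.

```clojure
;; hash-map with entries builds a PersistentHashMap (a hash array mapped trie)
(def m1 (hash-map :a 1 :b 2))
(class m1)                       ; clojure.lang.PersistentHashMap

;; sorted-map builds a PersistentTreeMap (a red-black tree)
(class (sorted-map :a 1 :b 2))   ; clojure.lang.PersistentTreeMap

;; small literal maps start out as array maps
(class {:a 1})                   ; clojure.lang.PersistentArrayMap

;; "changing" a map returns a new map; the old value is untouched
(def m2 (assoc m1 :c 3))
m1                               ; still has only :a and :b
m2                               ; has :a, :b and :c
```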
@alexmiller when I write to a `(java.io.ByteArrayOutputStream.)` it’s already much faster
I tried to use fressian with GraalVM native. This code works:
(fressian/create-reader (io/input-stream (io/file clj-cache-file)))
but this didn’t:
(fressian/read-object (fressian/create-reader (io/input-stream (io/file clj-cache-file))))
This is the output: https://gist.github.com/borkdude/5f19d6bb0c3f180273786ae34f63f60f
I see spec being mentioned in the error message, I see it often with Clojure libs compiled with GraalVM, but it’s weird how this can be triggered by only adding fressian/read-object
It's referenced, indirectly here: https://dev.clojure.org/jira/browse/CLJ-1472 (ignore the bit about android, the juicy bit is at the end)
I hit the same issue yesterday, doing some experimentation with Graal and Clojure. I had to go back to Clojure 1.9.0 to compile a native image.
At least Transit JSON works with Graal and 1.10. I couldn’t get MessagePack working:
Mar 29, 2019 9:41:43 AM org.msgpack.template.builder.TemplateBuilderChain createForceTemplateBuilder
WARNING: Failed to create a TemplateBuilder reflectively
Confirmed, I can get fressian working when dropping back to Clojure 1.9.0 and adding some type hints about FressianWriter + running with `--report-unsupported-elements-at-runtime`. But since Fressian and Transit are pretty close in performance, I’m going with 1.10 and without the flag.
hey clojurians, how do you usually handle the following? I have one business rule function (`create-invoice`) which could create an invoice or, based on some business rule, fail. Should my function return two different shapes (on success the invoice, and on error a map with an `::error` key)? Or should my function throw an exception and let my handler dispatch properly to handle that? How do you normally do that?
@dharrigan did you try a fork of Clojure with that patch applied btw? With tools.deps that should be possible I think?
@oliv use pure data and avoid Exceptions whenever you can
@myguidingstar yeah, so you mean create a wrapper result ?
I think it would be fine to always return a map.
either
{:invoice {}}
or
{:error "I am broken"}
And then surrounding code can easily check for the presence of an error.
But like most things it comes down to style and possibly what else is happening nearby in the code.
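A minimal sketch of the "always return a map" style. The validation rule, the `handle-create` helper, and the HTTP-like status maps are all hypothetical and only there for illustration.

```clojure
;; Hypothetical business rule: an invoice needs a positive amount.
(defn create-invoice [{:keys [customer amount]}]
  (if (and (number? amount) (pos? amount))
    {:invoice {:customer customer :amount amount}}
    {:error "Amount must be a positive number"}))

;; The caller branches on the presence of :error.
(defn handle-create [request]
  (let [{:keys [invoice error]} (create-invoice request)]
    (if error
      {:status 422 :body {:message error}}
      {:status 201 :body invoice})))

(handle-create {:customer "acme" :amount 100})
(handle-create {:customer "acme" :amount -5})
```

Both outcomes are plain data, so the caller can inspect, log, or test them without any try/catch.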
Yeah, I see @U4T99SHSB . I’m wrapping inside my Result map format
Depends on the context: if you can’t really do anything about it, you might throw an exception and let your top-level error handler deal with that (wrap it in a nice HTTP status, for example). Don’t forget `ex-info`, which allows you to put arbitrary data in the exception so you can do further dispatching there as well.
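A sketch of the `ex-info` approach, assuming a hypothetical `:type` key in the exception data and a hypothetical top-level `handle` function that maps failures to HTTP-like statuses.

```clojure
;; Throws with structured data attached instead of returning an error map.
(defn create-invoice! [{:keys [customer amount]}]
  (when-not (and (number? amount) (pos? amount))
    (throw (ex-info "Invalid invoice amount"
                    {:type :invoice/invalid-amount :amount amount})))
  {:customer customer :amount amount})

;; A top-level handler dispatches on the data carried by the exception.
(defn handle [request]
  (try
    {:status 201 :body (create-invoice! request)}
    (catch clojure.lang.ExceptionInfo e
      (case (:type (ex-data e))
        :invoice/invalid-amount {:status 422 :body {:message (.getMessage e)}}
        {:status 500 :body {:message "Unexpected error"}}))))

(handle {:customer "acme" :amount 100})
(handle {:customer "acme" :amount -5})
```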
so @U7PBP4UVA , Yeah, I tend to not control my flow using exceptions. For business case rules I’m returning a Result Wrapper map, and the caller should check whether the ::error key exists or not …
can you enumerate some cases of "if you can't really do anything about it"?
If the caller can recover from that error, and is expected to be able to, then perhaps an exception is not a good fit. If the caller can’t really handle the error, but has to deal with it anyway to avoid things like nils or error maps propagating throughout the system, then an exception is perhaps better suited?
re: https://clojurians.slack.com/archives/C03S1KBA2/p1553827570192000 what should I do to buffer the writing to a file? I currently have something like:
(let [writer (transit/writer (java.io.BufferedOutputStream. (io/output-stream (io/file "/Users/Borkdude/temp/transit.json")) (* 1024 1024)) :json)]
  (time (transit/write writer edn)))
But I still get very slow writes. I’m not seeing any difference with older transit versions, so it must be something on my side
Hello clojurians, I’m looking for a tutorial project implementing simple CRUD API powered by Datomic (ideally with front end in cljs) to use for training purposes with some newcomers. I went through some awesome-clojure lists, with no luck. My next best option would be to write something on top of mbrainz.
This is the only one that I can think of offhand: https://github.com/robert-stuttaford/bridge cc @U0509NKGK
what is the equivalent of the following in core.async? foobar :: (a -> b) -> (Chan a) -> (Chan b)
http://clojure.github.io/core.async/#clojure.core.async/chan lets you pass a transform transducer into the channel itself, if you are in control of constructing the channel
if you have been given a pre-existing channel then I think you would have to use http://clojure.github.io/core.async/#clojure.core.async/pipe
that sort of thing
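A sketch of both options, assuming core.async is on the classpath. `map-chan` is a hypothetical helper name playing the role of `foobar`; it is not part of core.async.

```clojure
(require '[clojure.core.async :as a])

;; Case 1: you construct the channel, so attach the transducer directly.
;; A channel with a transducer must have a buffer.
(def out1 (a/chan 1 (map inc)))

;; Case 2: the input channel already exists, so pipe it into a
;; transforming channel. pipe returns the destination channel and
;; closes it when the source closes.
(defn map-chan [f in]
  (a/pipe in (a/chan 1 (map f))))

(def in (a/chan 3))
(doseq [x [1 2 3]] (a/>!! in x))
(a/close! in)

(a/<!! (a/into [] (map-chan inc in)))   ; [2 3 4]
```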
@borkdude I’ll throw CBOR’s hat into the ring if you’re looking for binary serialization formats: https://github.com/greglook/clj-cbor
haven’t specifically tried it with Graal yet, but it has no dependencies outside Clojure itself, so it ought to work.
Also it’s an open standard, so there are other tools out there for working with the data. 🙂
how much of an anti-pattern is it to run `clojure.test/is` in a seq? like:
(doseq [ex examples]
  (t/is (f ex)))
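For what it's worth, each `is` call reports on its own, so the loop does run every example. One possible refinement is to wrap each assertion in `testing` so that a failure names the offending example. The predicate `f` and the `examples` vector below are hypothetical stand-ins.

```clojure
(require '[clojure.test :as t])

;; Hypothetical predicate and examples, for illustration only.
(defn f [x] (even? x))
(def examples [2 4 6])

(t/deftest f-holds-for-all-examples
  (doseq [ex examples]
    ;; testing adds the current example to the failure report
    (t/testing (str "example: " (pr-str ex))
      (t/is (f ex)))))

(t/run-tests)
```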
I’m still a bit unsatisfied with this: https://clojurians.slack.com/archives/C03S1KBA2/p1553851903211300 Anyone got a better alternative?
@greg316 I’m now using a workaround: first writing to a ByteArrayOutputStream and then copying that to a file, which is a bit insane. But that only took 45ms whereas directly writing to a file took 1s (EDN read from a 2MB .edn file)
FWIW:
% ls -lh bench/reddit.edn
-rw-r--r-- 1 greg staff 123K Jan 9 09:31 bench/reddit.edn
=> (def reddit-data (clojure.edn/read-string (slurp "bench/reddit.edn")))
=> (time (cbor/spit-all "reddit.cbor" (repeat 17 reddit-data)))
"Elapsed time: 158.625145 msecs"
1683493
I was preparing a full repro. Here it is:
$ wget -O test.edn
$ clj -Sdeps '{:deps {com.cognitect/transit-clj {:mvn/version "0.8.313"}}}'
Clojure 1.10.0
user=> (require '[clojure.edn :as edn])
nil
user=> (def edn (edn/read-string (slurp "test.edn")))
#'user/edn
user=> (count (keys edn))
799
user=> (require '[cognitect.transit :as transit])
nil
user=> (require '[clojure.java.io :as io])
nil
user=> (def writer (transit/writer (io/output-stream (io/file "transit.json")) :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 1151.116438 msecs"
nil
now I’m pretty curious how cbor would do - I tried to copy enough of the sample dataset I had to make ~2 MB, but maybe there’s something gnarly in your data :man-shrugging:
@greg316 worse:
user=> (def writer (transit/writer (java.io.FileOutputStream. "transit.json") :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 2039.522802 msecs"
because `io/output-stream` is going to return you a FileOutputStream wrapped in a BufferedOutputStream with the default buffer size, so wrapping a larger buffer around that isn’t going to help you
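This is easy to confirm at the REPL. The sketch below uses a temp file (the `tmp` name and file prefix are arbitrary) and shows one way to pick the buffer size yourself by wrapping the raw `FileOutputStream`.

```clojure
(require '[clojure.java.io :as io])

(def tmp (java.io.File/createTempFile "buffer-demo" ".bin"))

;; io/output-stream on a File already yields a BufferedOutputStream
(with-open [os (io/output-stream tmp)]
  (class os))   ; java.io.BufferedOutputStream

;; To choose the buffer size, wrap the raw FileOutputStream directly
(with-open [os (java.io.BufferedOutputStream.
                 (java.io.FileOutputStream. tmp)
                 (* 1024 1024))]
  (.write os (.getBytes "hello" "UTF-8")))

(.length tmp)   ; 5
```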
something like this?
user=> (def writer (transit/writer (java.io.BufferedOutputStream. (java.io.FileOutputStream. "transit.json") 1024) :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 1078.45894 msecs"
user=> (def writer (transit/writer (java.io.BufferedOutputStream. (java.io.FileOutputStream. "transit.json") (* 1024 1024)) :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 1016.702847 msecs"
Note:
user=> (def bos (java.io.ByteArrayOutputStream. (* 1024 1024)))
#'user/bos
user=> (def writer (transit/writer bos :json))
#'user/writer
user=> (time (transit/write writer edn))
"Elapsed time: 31.144918 msecs"
nil
user=> (time (io/copy (.toByteArray bos) (io/file "transit.json")))
"Elapsed time: 16.471751 msecs"
nil
If desired, I can make an issue for this, but first I’d like to know if I’m not overlooking something
when I write directly to a file it’s slow (1s). when I write via a bytearrayoutputstream and then to a file it’s 47ms.
@borkdude try to add a call to `.flush()` on the File outputstream on your other (non-Transit) benchmarks and see if it makes them 1sec
@ghadi Like this?
user=> (time (let [fos (java.io.FileOutputStream. "/tmp/foo.edn") w (io/writer fos)] (.write w (str edn)) (.flush fos)))
"Elapsed time: 77.273308 msecs"
user=> (time (with-open [fos (java.io.FileOutputStream. "/tmp/foo.edn")] (let [w (io/writer fos)] (.write w (str edn)) (.flush fos))))
"Elapsed time: 73.316135 msecs"
sorry, no time to look atm
Here’s the flamegraph: https://www.dropbox.com/s/4wv7bnvdrnhd5u9/Screenshot%202019-03-29%2020.23.16.png?dl=0
(what are you profiling, cpu samples?)
(you can change what is profiled ^^)
This is the EDN code from above: https://www.dropbox.com/s/e0viq6lnquavi0i/Screenshot%202019-03-29%2020.28.25.png?dl=0
The purple here is `flush`: https://www.dropbox.com/s/2x8ovkcbjfszmgl/Screenshot%202019-03-29%2020.30.34.png?dl=0
Here’s the EDN flamegraph (as data): https://www.dropbox.com/s/2kd67lbhycw2gj9/edn-flamegraph.svg?dl=0 Here transit: https://www.dropbox.com/s/3dfh77wnvbqkkgv/transit-flamegraph.svg?dl=0
clearly FileOutputStream.flush() is slow (the red is a native method). What I don't understand is why we can't make the EDN writing incur the same cost
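One way to investigate a difference like this might be to count how often each code path actually calls `flush`. The sketch below is only a diagnostic idea: `counting-stream` is a hypothetical helper, not part of any library, and it makes no claim about what Transit or the EDN path really does.

```clojure
;; counting-stream is a hypothetical helper for illustration.
(defn counting-stream
  "Wraps target in a stream that counts flush calls in the atom counter."
  [^java.io.OutputStream target counter]
  (proxy [java.io.FilterOutputStream] [target]
    (flush []
      (swap! counter inc)
      (.flush target))))

(def flushes (atom 0))
(def ^java.io.OutputStream os
  (counting-stream (java.io.ByteArrayOutputStream.) flushes))

(.write os (.getBytes "abc" "UTF-8"))
(.flush os)
(.flush os)

@flushes   ; 2
```

Passing such a wrapper around a `FileOutputStream` to each writer under test would show whether the two paths flush the file a different number of times.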
[ANN] Cognitect Labs' aws-api 0.8.283 is now available! https://groups.google.com/forum/#!topic/clojure/cOIF5cjAecI
flame graphs are pretty nifty - I was using the same tool while profiling clj-cbor :male-scientist: 📊
I am so amazed at what you can put in an SVG
also, terrified, why does it support scripts :’)
this is absolutely great for deciding what to optimize. fussing over details that only contribute a small gain is a risk
it's built on a java library iirc
there are definitely graphs in that style available with chrome/react
just opened youtube and I saw this: https://www.youtube.com/watch?v=yqNLDpooFjw