This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-05-28
Channels
- # announcements (11)
- # aws (30)
- # beginners (98)
- # calva (11)
- # cider (42)
- # clj-kondo (4)
- # cljdoc (1)
- # cljsrn (5)
- # clojure (132)
- # clojure-europe (4)
- # clojure-ireland (1)
- # clojure-italy (35)
- # clojure-japan (2)
- # clojure-nl (5)
- # clojure-spec (5)
- # clojure-uk (24)
- # clojurescript (71)
- # clojutre (1)
- # core-async (6)
- # cursive (9)
- # data-science (4)
- # datascript (3)
- # datomic (78)
- # duct (16)
- # emacs (14)
- # events (2)
- # fulcro (141)
- # graalvm (5)
- # hoplon (14)
- # hyperfiddle (2)
- # jobs-discuss (14)
- # joker (8)
- # luminus (2)
- # off-topic (7)
- # om (1)
- # pathom (4)
- # pedestal (7)
- # planck (2)
- # quil (1)
- # re-frame (14)
- # reagent (2)
- # reitit (14)
- # robots (1)
- # shadow-cljs (20)
- # spacemacs (25)
- # specter (1)
- # sql (122)
- # tools-deps (63)
- # unrepl (2)
- # yada (34)
http://clojure-doc.org/articles/language/macros.html#macro-hygiene-and-gensym is a pretty good intro to how Clojure handles hygiene
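a tiny REPL illustration of the auto-gensym hygiene that article describes (my own sketch, not from the article):

```clojure
;; Auto-gensym (v#) expands to a unique symbol inside syntax-quote,
;; so the macro's local can't capture a user variable named v.
(defmacro twice [expr]
  `(let [v# ~expr]
     [v# v#]))

(let [v 10] (twice (+ v 1))) ;=> [11 11], no accidental capture
```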
I'm hitting a very strange bug atm. I'm using node-jre to run an (uber)jar, but somehow the resource paths are weird: I can only io/resource files from the jar if they are files (not folders) and not nested. It's not the newest JRE running the jar; could that be the reason, or something else? (think I solved it, had some pom.xml from a totally unrelated Java project lurking around)
anybody good at compojure routing? I want to have different wrappers for my login routes vs. other routes, then combine it all into one ring handler
@restenb it's a bit of a hassle if you're using cemerick's friend. I had a terrible time getting the auth sent the right way. But solvable, just combine the protected routes with public routes in the handler.
@restenb You can use (routes ..) to combine multiple sets of routes (since they're just handlers).
(routes
(wrap-login (routes ... your login routes ...))
(wrap-other (routes ... your other routes ...)))
something like that I think... routes is just a function that takes handlers as arguments and combines them. Compojure goes through the list of routes/handlers until one of them returns non-`nil`; a route like (GET "/foo" [] something) is itself a handler, and it returns nil if it doesn't match...
user=> ((routes (routes (GET "/foo" [] "Get Foo") (GET "/bar" [] "Get Bar"))
(routes (GET "/quux" [] "Get Quux") (POST "/quux" [] "Post Quux"))) {:uri "/quux" :request-method :get})
{:status 200, :headers {"Content-Type" "text/html; charset=utf-8"}, :body "Get Quux"}
user=> ((routes (routes (GET "/foo" [] "Get Foo") (GET "/bar" [] "Get Bar")) (routes (GET "/quux" [] "Get Quux") (POST "/quux" [] "Post Quux"))) {:uri "/bar" :request-method :get})
{:status 200, :headers {"Content-Type" "text/html; charset=utf-8"}, :body "Get Bar"}
user=> ((routes (routes (GET "/foo" [] "Get Foo") (GET "/bar" [] "Get Bar")) (routes (GET "/quux" [] "Get Quux") (POST "/quux" [] "Post Quux"))) {:uri "/bar" :request-method :post})
nil
I didn't add any middleware into that, but hope it gives the idea? @restenb
yeah I got the idea from your first post @seancorfield, thanks. somehow I was not seeing that (compojure/routes) just takes handlers & assembles a handler
How do I set Content-Security-Policy headers in Pedestal? My application is giving the following errors.
Resolved it using this issue: https://github.com/pedestal/pedestal/issues/499
How do I profile the overhead of using lazy seq? Assuming my app is slow, how do I find out if it’s because I’m using too many lazy operations?
You can also use VisualVM
be careful though… laziness is hard to benchmark because it also includes the cost of doing the work.
Yeah I assume it’s hard, because it’s impossible to compare it to the non-lazy version, unless we have it implemented, which is hard again, and is the very reason why we implement the lazy version in the first place.
Yeah you really need two implementations… but a profiler can guide you to the right decision… lazy sequences are usually fast enough though; but it depends on how much data you have going through them, how many transformations you’ve stacked on top — and of course how fast you need it to be 🙂 A sign laziness is a problem is lots of GC activity; but that can also be caused by other things. Also, accidentally holding onto the head of a seq can cause a lot of memory pressure/OOMs etc.
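if you do end up with two implementations, a rough comparison sketch (the pipelines below are illustrative, not anyone's real code; Criterium calls are commented out so the snippet runs without the dependency):

```clojure
;; Sketch: the same computation written as a lazy pipeline and as an
;; eager single-pass transducer pipeline. With criterium on the
;; classpath you could compare them with (crit/quick-bench ...).
;; (require '[criterium.core :as crit])

(def data (vec (range 100000)))

;; lazy pipeline: allocates an intermediate lazy seq per step
(def lazy-sum (reduce + (map inc (filter even? data))))
;; (crit/quick-bench (reduce + (map inc (filter even? data))))

;; eager pipeline: one pass, no intermediate seqs
(def xf-sum (transduce (comp (filter even?) (map inc)) + data))
;; (crit/quick-bench (transduce (comp (filter even?) (map inc)) + data))
```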
how do I eval a fn declaration and keep type hints? I can't get it to work. the expected result is that (eval generated-fn2) won't give me reflection warnings.
here's the problem on http://repl.it, if someone wants to give it a go: https://repl.it/repls/ExcitedDetailedRedundancy
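in case it helps anyone later, a minimal sketch of one way this can work: the hint has to live as :tag metadata on the argument symbol in the generated form (names here are made up, not from the repl.it snippet):

```clojure
;; A symbol built with (symbol "s") carries no ^String hint; attach
;; the hint as :tag metadata with with-meta so it survives into the
;; form that eval compiles.
(set! *warn-on-reflection* true)

(def generated-fn
  (let [arg (with-meta 's {:tag 'String})]
    (list 'fn [arg] (list '.length arg))))

(def f (eval generated-fn))
(f "hello") ;=> 5, with no reflection warning
```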
is this an implementation detail, or can one rely on it?
(map f coll) returns a "chunked lazy seq", which gets realized 32 elements at a time
(map f coll1 coll2 ...) returns a "proper lazy seq", which gets realized 1 element at a time
(defn unchunk [s]
(when (seq s)
(lazy-seq
(cons (first s)
(unchunk (next s))))))
is usually how it's done (question came from reading lots of duplicated code here https://github.com/dakrone/clojure-opennlp/blob/master/src/opennlp/tools/lazy.clj )
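a quick REPL check that unchunk actually defeats chunking (the counter fn is just for illustration):

```clojure
;; Same unchunk as above, repeated so this snippet is self-contained.
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

;; Count how many source elements get realized when we only ask for
;; the first mapped element.
(def realized (atom 0))
(defn note! [x] (swap! realized inc) x)

(first (map note! (range 100)))
@realized ;=> 32, a whole chunk was realized

(reset! realized 0)
(first (map note! (unchunk (range 100))))
@realized ;=> 1
```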
it's map that decides to iterate over chunks rather than individual elements, based on the type of the input coll
(defn map
([f coll]
(lazy-seq
(when-let [s (seq coll)]
(if (chunked-seq? s)
(let [c (chunk-first s)
size (int (count c))
b (chunk-buffer size)]
(dotimes [i size]
(chunk-append b (f (.nth c i))))
(chunk-cons (chunk b) (map f (chunk-rest s))))
(cons (f (first s)) (map f (rest s)))))))
there's really no good documentation on this, because you "shouldn't be thinking about it, it's an implementation detail" is the core position I think
I can tell you that the current impl produces chunked seqs off of:
- vectors or their seqs
- bounded integer ranges
- Iterables that are not otherwise Seqable
- the return value of sequence used with a transducer
- the return value of iterator-seq
and most of the seq functions (`map`, `filter`, `keep`, `for`, etc.) are "chunkiness preserving", while `reduce`/`transduce` "understand" the chunkiness of their input coll
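you can poke at this in a REPL with chunked-seq? (results below reflect the current implementation, so treat them as details that may change):

```clojure
;; chunked-seq? reports whether a concrete seq implements IChunkedSeq.
(chunked-seq? (seq [1 2 3]))            ;=> true  (vector seqs are chunked)
(chunked-seq? (seq '(1 2 3)))           ;=> false (list seqs are not)
(chunked-seq? (seq (range 10)))         ;=> true  (bounded integer range)
(chunked-seq? (seq (map inc [1 2 3])))  ;=> true  (map preserved chunkiness)
(chunked-seq? (seq (map inc '(1 2 3)))) ;=> false (nothing to preserve)

;; Note: a LazySeq wrapper is never itself an IChunkedSeq; you have
;; to call seq first to get at the realized (possibly chunked) cell.
(chunked-seq? (map inc [1 2 3]))        ;=> false
```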
I've used Criterium to measure performance of running some data imports on my MacBook vs. my PC (PowerShell vs. WSL), and got these results:
WSL: 9.075001 seconds
PowerShell: 5.753443 seconds
MacBook: 12.337244 seconds
Things are better on the PC, but not as good as I expected (16GB 2017 MacBook vs. 32GB AMD Ryzen 2700X PC). Upon inspecting the PC, I notice that only four cores out of eight in total are utilised.
The function I'm testing is using a simple pmap
to process things in parallel. Is there a limitation to pmap
in only using four cores, or are there things I might need to tweak with the JVM?
(Oh, and BTW, the tax paid by using WSL seems very high indeed)
it uses the # of available processors + 2 https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L6941
Right. Well, that is 3 threads for the one processor then. I guess I have to look into other means of parallelization.
pmap should then use up to 10 parallel threads (or 18 if you're talking about physical cores and have hyperthreading)
Ah, gotcha. Yes, 16 logical cores. Alright, so for some reason it only reached 4 physical cores. Something else is going on then.
it may depend a lot on the type of operations you're doing. you might not get the benefits you hope for just by running things in parallel. In extreme cases, it may even be slower
In this case it's reading fairly large XML documents, extracting data, and doing transformations. pmap-ing should be a pretty straightforward way of optimizing in my mind. Each document is fairly long-running.
Maybe, but you may get a lot of IO or memory contention. As always, it's best to measure 🙂
Btw. for pmap this is the interesting line related to "semi-lazy" -> it tries to stay ahead "enough" (factor n): https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L6948
in general, Claypoole is an interesting alternative to consider: https://github.com/TheClimateCorporation/claypoole
that may be it. also I never end up using pmap because it doesn't work with transducers, and transducers are typically faster for me
The entire thing is lazy from start to end, with no transducers mixed in. It should be realized only on the accumulation of results outside of pmap.
Yep, def-ing the pmap expression returns immediately.
If you look at the source code for pmap you can see the expression it uses to determine the maximum parallelism: (+ 2 (.. Runtime getRuntime availableProcessors))
You can run that expression in a REPL on your system to see what the JVM considers that number to be.
Even if it is (+ 2 8)
or 10, pmap can only give good performance increases if a couple of things are true: (a) the work done for each element of the sequence is large, compared to the work required for the JVM to create a new thread, which is what pmap does for each element. (b) Each element should take about the same amount of time to process. If they take wildly different amounts of time to complete, then pmap limits the parallelism because it does not work more than a certain amount ahead of the last unfinished element.
There are other libraries, e.g. using Java's ExecutorService, that try to keep N threads busy at all times, which avoids issue (b) that pmap's implementation has.
I have not used it myself, but I believe this library offers some bit of Clojure API around Java's capabilities, but I have heard that several people go straight to Java interop for this, too: https://github.com/TheClimateCorporation/claypoole
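for the straight-interop route, a hedged sketch (pool-map is a made-up helper for illustration, not Claypoole's API):

```clojure
;; Keep a fixed pool of n threads busy via java.util.concurrent,
;; as an alternative to pmap when per-element work is uneven.
(import '(java.util.concurrent Executors ExecutorService Callable Future))

(defn pool-map
  "Illustrative helper, not a library function: apply f to each
  element of coll on a fixed pool of n threads; results in order."
  [n f coll]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n)]
    (try
      ;; submit everything up front, then block on each Future in order
      (let [futs (mapv (fn [x] (.submit pool ^Callable (fn [] (f x)))) coll)]
        (mapv (fn [^Future fut] (.get fut)) futs))
      (finally (.shutdown pool)))))

(pool-map 4 inc (range 10)) ;=> [1 2 3 4 5 6 7 8 9 10]
```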
Cool, thanks @andy.fingerhut (and @U06BE1L6T, @U0C7D2H1U). That might be the bottleneck in that case. The documents I process are wildly different in size. I think the minimum time I've spotted for a single document is ~40ms for an outlier. Is this small compared to the cost of setting up a thread? If so, I guess I might need to chunk them into linear bits. Even then, it wouldn't load every core evenly. I guess I could try to organise them into chunks of roughly the same size.
I suspect one could do a lot of fiddling to try to make pmap use its max parallelism as often as possible, whereas the thread pool solution would be likely to get you there with less fiddling.
If max parallelism was a relatively important goal for you.
Yeah, max parallelism is by far the easiest optimization I can do, as fiddly as it is.
claypoole and some minor refactoring sees a good improvement: 3.416247 seconds in PowerShell.
We were just discussing keyword parameters in #beginners , and I think I remember a conversation somewhere back a ways where people were saying there was a reason to prefer plain maps over keyword params. Can’t remember what the reason was, though. Sound familiar to anyone?
the reason I don't use keyword params is because you can't pass them around as data
Yeah, as soon as you want to compose a fn with keyword params, its caller is responsible for splatting out the options instead of just passing in some map it might have gotten from its own caller
(defn bar [& {:keys [a b c]}]
  (println a b c))

(defn foo [big-ball-of-params]
  ;; I have to peel each arg one by one off of the map
  (bar :a (:a big-ball-of-params)
       :b (:b big-ball-of-params)
       :c (:c big-ball-of-params)))
That’s what it was, composition. Tks folks.
To be fair you can still get there with apply, but as soon as your caller is drawing the keyword args from n>1 maps your caller has a lot more to juggle.
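a sketch of the apply route (bar here is a hypothetical kwargs fn, not from the code above):

```clojure
;; mapcat identity flattens an options map back into the :k v "splat"
;; form that a keyword-params fn expects.
(defn bar [& {:keys [a b c]}]
  [a b c])

(def opts {:a 1 :b 2 :c 3})

(apply bar (mapcat identity opts)) ;=> [1 2 3]
```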
isn’t destructuring a little bit too forgiving in the :or here?
(let [{:keys [:a] :or {x 1}} {}] #_[a x]) ;; this is fine
(let [{:keys [:a] :or {x 1}} {}] [a b x]) ;;=> error about x
(let [{:keys [:a :x] :or {x 1}} {}] [a x]) ;; works
Macroexpand it -- see what it expands to...
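e.g. (a sketch of what to look at):

```clojure
;; Expand the destructuring form: because x is not among the
;; destructured keys, its :or default is silently dropped from the
;; generated bindings.
(require '[clojure.pprint :refer [pprint]])

(pprint (macroexpand '(let [{:keys [a] :or {x 1}} {}] a)))
;; the expansion binds only a (via a get on the map); nothing in the
;; output mentions x, so the {x 1} default had no effect
```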
It is the kind of thing a linter could warn about. Eastwood has not implemented it, although I created an issue to remind developers about it a while back: https://github.com/jonase/eastwood/issues/225
Maybe this one is related, too: https://github.com/jonase/eastwood/issues/157
@andy.fingerhut that’s what I figured: https://github.com/borkdude/clj-kondo/issues/214
(@ghadi I saw you typing for a moment, if you have feedback on this, I would be happy to hear)
yes, the idea of clj-kondo is that it catches errors before your REPL sees them. there is this interval in time where thoughts are transformed into sexprs in your buffer, but have not been evaluated yet
works: https://www.dropbox.com/s/zjbbgpvhx7fc70e/Screenshot%202019-05-28%2023.30.11.png?dl=0
the message could maybe be clearer, I’m all ears. this is the general message you get when you don’t use a binding, e.g. in (let [x 1])
Maybe "duplicate binding id", if you can have a different detection and/or message for this case vs. an unused binding?
I don't know if you maybe already have this for clj-kondo, but a short description or message, and then a link to a place you can get more details / examples / workarounds / etc., can be useful in explaining to users what is going on.
the detection mechanism is the same for all “unused bindings” but I could make a few tweaks here and there. that link idea is nice, it’s the same that shellcheck does, I might implement something like that
I use such a small subset of bash's capabilities, that it is self-linting 🙂
if i reduce over a lazy seq, will the list get realized one at a time? e.g. (reduce ... (for [x (range 2)] x))
See: http://clojure-doc.org/articles/language/laziness.html#lazy-sequences-chunking
I think that document answers a slightly different question
the document answers the question, I misread
But not one at a time, they're realized in chunks. I thought this was the question.
right, chunking means there's no guarantee of one at a time processing
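a quick way to see it (the counting fn is just for illustration):

```clojure
;; Stop a reduce at the very first element; the chunked source has
;; still realized a whole chunk of 32 elements.
(def seen (atom 0))
(def xs (map (fn [x] (swap! seen inc) x) (range 100)))

(reduce (fn [acc x] (reduced [acc x])) 0 xs)
@seen ;=> 32
```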
Thanks to some helpful people here, I started using Claypoole yesterday (https://github.com/TheClimateCorporation/claypoole). I was delighted to notice that when I reduce over the results of upmap, reduce is handed a result as soon as upmap is done with it. It doesn't wait for the entire upmap expression to be done. Feels like getting streaming for free.