This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-06-03
also `empty?` is implemented as `(not (seq coll))`, which is why complementing it again with `(not (empty?))` is also not idiomatic
assuming you're just checking for non-emptiness, `not-empty` is still useful for `conj` purposes for example
This is just to short-circuit a function, so I'm looking for the thing that will do the least amount of work possible. Sounds like `seq`.
I believe that many Clojure programmers consider (seq coll) as idiomatic for this purpose.
for counted collections, `(zero? (count coll))` should be faster (`seq` will allocate an object if non-empty)
`seq` basically just returns an iterator though, doesn't it? it doesn't do anything besides that, like attempting to convert the sequence
`seq` wraps the collection in a lazy sequence if not already one. If you call it on e.g. a vector, it will coerce it and create a sequence. It's cheap, but not totally free.
@U06B8J0AJ eager collections are counted
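For reference, here is a small sketch of the emptiness checks mentioned so far, side by side. The names `empty-vec` and `full-vec` are made up for illustration; everything else is plain `clojure.core`.

```clojure
;; Hypothetical sample data, only for illustration.
(def empty-vec [])
(def full-vec [1 2 3])

;; seq returns nil for an empty collection and a seq over the
;; elements otherwise, so it works directly as a logical test.
(seq empty-vec)                 ;; => nil
(seq full-vec)                  ;; => (1 2 3)

;; empty? answers the opposite question with a boolean.
(empty? empty-vec)              ;; => true

;; not-empty returns the collection itself (not a seq) or nil,
;; so the result can still be conj'ed onto as the original type.
(not-empty empty-vec)           ;; => nil
(conj (not-empty full-vec) 4)   ;; => [1 2 3 4]

;; For counted collections, count does not need to build a seq.
(zero? (count empty-vec))       ;; => true
```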
Some measurements @andy.fingerhut @leonoel
;; list
(seq a-list) ;; 18,208247 ns
(zero? (count a-list)) ;; 11,566028 ms
;; Vector
(seq a-vector) ;; 14,460552 ns
(zero? (count a-vector)) ;; 5,665201 ns
(defn yo? [a-something]
  (if (vector? a-something)
    (zero? (count a-something))
    (seq a-something)))
;; Conditional
(yo? a-list) ;; 29,218636 ns
(yo? a-vector) ;; 22,279887 ns
`(zero? (count …))` appears to be faster for counted collections indeed.
Edit: Actually, rereading the numbers, using `seq` is better than a conditional check. Therefore, use `(zero? (count …))` only when it's certain to be a counted collection.
@U06BE1L6T (criterium/bench …)
https://github.com/hugoduncan/criterium/
Hi, I'm trying to slowly convert some of my company on to Clojure. Do you have any good data on language adoption over the years?
this may be relevant (including survey data) https://clojure.org/news/2019/02/04/state-of-clojure-2019
@leonoel Remeasuring with `counted?`, it seems to make a surprisingly large difference for the conditional evaluation:
(defn yo? [a-collection]
  (if (counted? a-collection)
    (zero? (count a-collection))
    (seq a-collection)))
;; Conditional
(yo? a-list) ;; 25,774480 ns
(yo? a-vector) ;; 9,714775 ns
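As a rough illustration of what `counted?` distinguishes, the sketch below checks a few common values. It only shows which values report a constant-time count; it says nothing about the timings above.

```clojure
;; counted? is true when the collection can report its size
;; in constant time.
(counted? [1 2 3])              ;; => true  (vector)
(counted? '(1 2 3))             ;; => true  (persistent list)
(counted? {:a 1})               ;; => true  (map)

;; A lazy seq has to be walked to be counted.
(counted? (map inc [1 2 3]))    ;; => false

;; Strings work with count, but are not counted? collections.
(counted? "abc")                ;; => false
```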
@mpenet Interesting, didn't know about that one. I'm getting these numbers:
(zero? (bounded-count 1 a-list)) ;; 54,374905 ns
(zero? (bounded-count 1 a-vector)) ;; 14,014822 ns
These things do make a difference. Using `(zero? (count …))` instead of `seq` (I know I'm working with vectors), I dropped the running time of a function from 3.41 seconds to 2.37 seconds (for a computation where I know the evaluation sits in a very hot path).
well it will just call seq on it so might not be so good with vectors, so yeah prolly useless
why are you even checking for a non-empty collection in the first place?
Parsing XML. I use it to halt evaluation early when a previous step in the parser produces no data of interest. In practice it does make a difference in performance, and choosing the right way to evaluate whether data is present or not seems to make an additional difference.
But also, it's just fun to learn the performance implications of the different alternatives. I didn't initially expect this line of investigation to yield much of interest.
what is the type of the thing being returned that you are checking?
The XML documents are in JATS format, which is a right mess of data-meant-to-be-read-as-metadata, data-meant-for-display, and data-meant-to-simultaneously-work-for-display-and-metadata. My solution (which is probably suboptimal to begin with, but nevertheless) essentially invokes different parsers for different branches, which might return more data to be processed by a different parser and so on (because a "display" section, which I turn into hiccup, might return some forms that also need to be read into proper named fields and so on). As a side effect of this solution, parsers return a collection that might or might not be empty. To avoid looking for another parser for nothing, I kill the branch at that point.
well, in general, it's better to avoid making the empty collection in the first place if you can avoid it
that's why clojure collection functions are polymorphic on nil
conj, assoc, etc
you don't need nil?, you can just use the value itself as a logical value
which is either nil or not
(if (parse ...) branch-when-parsed branch-when-nothing)
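A minimal sketch of the nil-based style described here, assuming a made-up `parse-evens` function that stands in for a parser: it returns `nil` when it finds nothing and never builds an empty collection.

```clojure
;; Collection functions accept nil in place of a collection.
(conj nil 1)         ;; => (1)
(assoc nil :a 1)     ;; => {:a 1}
(count nil)          ;; => 0

;; Hypothetical parser: accumulates matches starting from nil.
;; (fnil conj []) starts a vector on the first match, so the
;; result is either nil or a non-empty vector.
(defn parse-evens [xs]
  (reduce (fn [acc x]
            (if (even? x)
              ((fnil conj []) acc x)
              acc))
          nil
          xs))

;; The result itself is the logical value; no emptiness check.
(defn handle [xs]
  (if-let [found (parse-evens xs)]
    [:parsed found]
    :nothing))

(handle [1 2 3 4])   ;; => [:parsed [2 4]]
(handle [1 3 5])     ;; => :nothing
```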
Before `(parse …)` there's a lookup to find the correct parser; the parsers currently reside in a map. I'm guessing that's the bottleneck that I'm circumventing.
Invoking `((get {…} nil) data)`, which resolves to `(nil data)`, isn't great, so at some point I have to shut things down before it gets to that.
And it seems to be more performant to shut it down before `(get …)` than on receiving `nil` from the index.
why would you not find a parser?
could you return a default "not-found" parser that did the right thing?
((get parser-table nil not-found-parser) data)
make the special case not special
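Spelled out a little more, the default-parser idea might look like the sketch below. The contents of `parser-table` and the `run-parser` helper are invented for illustration; only the `get` call with a not-found default comes from the suggestion above.

```clojure
;; Hypothetical parser table; the real one is not shown in the chat.
(def parser-table
  {:title (fn [data] {:title data})})

;; The default parser accepts anything and yields nil.
(defn not-found-parser [_data] nil)

;; The third argument to get is returned when the key is missing,
;; so there is always a function to call.
(defn run-parser [k data]
  ((get parser-table k not-found-parser) data))

(run-parser :title "JATS")    ;; => {:title "JATS"}
(run-parser nil "JATS")       ;; => nil
(run-parser :unknown "JATS")  ;; => nil
```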
Yeah, true, I could do that. `not-found-parser` would return `nil` no matter what. Do you reckon that would be better than checking for emptiness?
this is all general advice applicable in many places... don't make empty collections, lean on polymorphic collection append functions, use identity/nil as a predicate, etc
seq is a good predicate when doing seq things. you're not doing seq things.
the code above will be smaller, more readable, and more performant
I'll try it out and measure. Thank you. I haven't internalised the nil case into my collection of idioms like this. It's as good a time as any.
To report back, I'm not measuring any significant performance difference, but it did get rid of a conditional, which is nice. In practice, I inject a mapping of `nil` to `(constantly nil)` into whatever handlers are passed to the function. No API change visible outwards, and overall it seems like a cleaner solution.
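One possible reading of that injection, as a sketch: `with-nil-handler` and `dispatch` are hypothetical names, and the actual code was not shared in the chat. Note that this only covers a `nil` key; an unknown non-nil key would still need a default.

```clojure
;; Add a handler under the nil key that ignores its input.
(defn with-nil-handler [handlers]
  (assoc handlers nil (constantly nil)))

;; Hypothetical entry point: callers pass their own handlers map,
;; and the nil mapping is added internally.
(defn dispatch [handlers k data]
  (let [handlers (with-nil-handler handlers)]
    ((get handlers k) data)))

(dispatch {:inc inc} :inc 1)   ;; => 2
(dispatch {:inc inc} nil 1)    ;; => nil
```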
pretty much the only place I ever do this is when loop/recuring through a sequence (in which case you are forcing the seq anyways with first/rest)
that yo? function above returns either true/false/nil/seq which seems like a mess
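If a predicate is wanted anyway, one way to avoid the mixed return types is to always return a boolean and to keep the meaning the same on both branches. This is only a sketch with a made-up name, `has-items?`, not something proposed in the chat.

```clojure
;; Returns true when coll has at least one element, false otherwise.
;; Counted collections are checked via count, everything else via seq.
(defn has-items? [coll]
  (if (counted? coll)
    (pos? (count coll))
    (boolean (seq coll))))

(has-items? [])                ;; => false
(has-items? [1])               ;; => true
(has-items? (map inc []))      ;; => false
(has-items? (map inc [1]))     ;; => true
(has-items? nil)               ;; => false
```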
I seem to remember that the `-` in `-main` has a special meaning, but I’m not able to google it ATM. Anyone have a link to more info about this?
it's the default prefix for static methods to genclass
there’s no special meaning as such it’s just the default option for genclass
but that can be changed with an option to genclass
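For context, a typical entry-point namespace looks roughly like the sketch below. The namespace name `example.core` is made up. When the namespace is AOT-compiled, `gen-class` generates a class whose `main` method calls the var named by the prefix plus `main`, which with the default prefix `"-"` is `-main`.

```clojure
(ns example.core        ; hypothetical namespace name
  (:gen-class))         ; default :prefix is "-"

;; Called by the generated class's static main method when compiled;
;; at the REPL it is just an ordinary function.
(defn -main [& args]
  (println "Hello," (first args)))

;; With a different prefix, e.g. (:gen-class :prefix "impl-"),
;; the entry point would instead be a function named impl-main.
```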
Hello all, is it still worth using https://github.com/daveray/seesaw?
If you don’t have JavaFX or are more familiar with Swing, seesaw remains quite useful. It’s the basis of my most-used project: https://github.com/Deep-Symmetry/beat-link-trigger#beat-link-trigger
This small project was announced just recently, using Seesaw and Clojure: https://www.reddit.com/r/Clojure/comments/bx698b/i_made_a_small_epub_reader_in_clojure
Use #cljfx https://github.com/cljfx/cljfx
Hi everyone! What scheduler do you use in your Clojure projects? I am looking for something robust and reliable, like Celery for Python. I found several libraries but from the repos not one specific one was preferred more than others.
do you need persistent and distributed scheduling?
if no, use a ScheduledThreadPoolExecutor via interop
if yes, I think the only really mature and reliable option is quartz, there's some pain points in using that java lib via interop, so it's worth considering the quartzite wrapper
(if you use a specific infrastructure tool like mesos, the built in scheduling there is an even better bet, but that's going to be highly dependent on your other infrastructure decisions)
I haven't yet used mesos but seems cool. I am just making a small ETL tool that's probably not mission critical. I saw this as an opportunity to test out Clojure for this 😄
if you don't need distribution (if all your threads can run in one process, which is likely true with a small ETL tool), the ScheduledThreadPoolExecutor is much simpler and easier to use and comes with the JVM. You can persist state between restarts via a normal DB.
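A minimal sketch of the `ScheduledThreadPoolExecutor` approach via interop. The helper name `start-every`, the 50 ms period, and the counter task are placeholders for illustration; a real ETL job would replace the task body.

```clojure
(import '(java.util.concurrent ScheduledThreadPoolExecutor TimeUnit))

(defn start-every
  "Runs f every period-ms milliseconds on a one-thread scheduler.
  Returns the executor so the caller can shut it down."
  [period-ms f]
  (let [executor (ScheduledThreadPoolExecutor. 1)
        ;; An uncaught exception would suppress all later runs,
        ;; so failures are caught and reported instead.
        safe-f   (fn []
                   (try
                     (f)
                     (catch Exception e
                       (println "task failed:" (.getMessage e)))))]
    (.scheduleAtFixedRate executor ^Runnable safe-f
                          0 (long period-ms) TimeUnit/MILLISECONDS)
    executor))

;; Placeholder task: count how many times it has run.
(def runs (atom 0))
(def executor (start-every 50 #(swap! runs inc)))

(Thread/sleep 200)      ; let the task fire a few times
(.shutdown executor)    ; stop scheduling further runs
```

State that must survive a restart (for example, the time of the last successful run) would need to be stored somewhere external, as suggested above.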
there's that funny thing with python where the limitation of the GIL means that any tool that needs more than one thread probably ends up being fully distributed, but with clojure threads are easy to use and you can save a lot of complexity if you don't need to be distributed
True. Being a back end engineer for like, forever, I never felt the need to go around the GIL. Maybe now it would actually be a problem for the data engineering tasks I would likely need to do.
what I mean is that N threads inside one process is much simpler, and better performing, than a distributed multi process solution; though maybe you don't even need threads here
Got you! Thanks for the suggestion!
I’ve used this for simple task scheduling cases https://github.com/aphyr/tea-time
It provides a nice interface. But tradeoff is that it’s an extra dependency. And ScheduledThreadPoolExecutor is probably way more battle-tested.
Thanks @valtteri! tea-time seems pretty intuitive! I will try it 😄
@U07FND2KH @U051SS2EU @valtteri @UHL84CDTP The Clojure ecosystem now has a reliable & versatile background processing & scheduling library: https://github.com/nilenso/goose
If you have a need for it, do give it a spin and ping in #goose for any issues or feature requests.
Another Quartz wrapper can be found in Immutant: http://immutant.org/documentation/current/apidoc/guide-scheduling.html
you can find some others on https://www.clojure-toolbox.com/ under Scheduling