This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
Has anyone else had performance issues using Clojure's built-in set operations? I added some intersections and unions, and now my program seems to run a lot slower, even when all the sets are empty. Is that normal? If so, are there faster set libraries?
Can you provide a bit more context? "a lot slower" seems strange but it depends what you changed the code from/to...?
Well, you "added some intersections and unions" sounds a bit... random... so I wondered if you could be a bit more specific?
I have a list of keys, like this:
'(:a :b :c)
I want to perform an operation only on those keys which have been newly added. If the previous iteration through the event loop had `'(:a :b)`, then in this iteration I only want to perform that operation on `:c`. So I did this:
(require '[clojure.set :as set])
(def old-set (set '(:a :b)))
(def new-set (set '(:a :b :c)))
(def keys-to-actually-operate-on (set/difference new-set old-set))
(do-stuff keys-to-actually-operate-on)
In case it's relevant, the event loop seemed to process > 1000 iterations/sec before adding these set operations. Speed dropped to approx. 600 iterations/sec after, which seems like a lot considering I do < 10 set operations each iteration, and the sets are really small (< 10 elements).
What's one of the better profilers for beginners? Right now all I have is a stop watch measurement. It would be nice to have a proper profiler measurement
Criterium is really good for benchmarks. I have an alias for that in my dot-clojure file if you're familiar with that repo?
You don't say how you were looping prior to adding sets so it's really hard to tell why "what you were doing" seemed faster than "what you are doing now".
I would suspect the creation of a set more than the union or difference. How large are your lists/set?
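One rough way to check that suspicion without a profiler is to time the two steps separately. This is only a minimal sketch, assuming small keyword lists like the ones in the example above; `nanos-per-call` is a hypothetical helper written for this illustration, and a stopwatch loop like this is much less reliable than a proper benchmark with Criterium.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helper: average wall-clock nanoseconds per call of f over n runs.
(defn nanos-per-call [n f]
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    (/ (double (- (System/nanoTime) start)) n)))

(def old-keys '(:a :b))
(def new-keys '(:a :b :c))
(def old-set (set old-keys))
(def new-set (set new-keys))

;; Cost of building both sets from lists and then taking the difference.
(def with-conversion
  (nanos-per-call 100000 #(set/difference (set new-keys) (set old-keys))))

;; Cost of the difference alone, on sets that already exist.
(def difference-only
  (nanos-per-call 100000 #(set/difference new-set old-set)))

(println "convert + difference (ns/call):" with-conversion)
(println "difference only (ns/call):     " difference-only)
```

The absolute numbers will vary with JIT warm-up and hardware, so only the relative gap between the two lines is worth reading.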
I don't think they are "notoriously slow" but there is an alternative implementation that claims to be faster https://github.com/droitfintech/fset ... I've not used it in a scenario where I have cared about the performance very much, but it may be worth a look. I would suspect though that turning lists into sets is a contributor to the work that's getting done.
> Really all I'm asking is, "Are Clojure's set operations notoriously slow?"
That's far too vague a question to get a useful answer. Performance depends on so many things and that's why I specifically asked exactly what changes you made when you "added some intersections and unions" -- sharing code helps everyone help you and it may well be that there would be small changes to your code that would speed it up dramatically. I'd say "No, Clojure's set operations are not notoriously slow" but that's also a vague answer and not very useful for your specific situation I suspect.
Creating a clojure set of N elements from a collection (e.g. a list as in your example) takes at least O(N) time, so if those sets do not change on every one of your iterations, you could perhaps find ways to not create them as often, maintaining them somewhere from one iteration to the next, but I do not know if your application makes that a straightforward thing to do.
Yeah, I'd want to get everything converted to sets upfront and then operate only on sets instead of converting sequences to sets on the fly, as much as possible. But without seeing code, it's hard to know what to really suggest 🙂
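As a sketch of what "operate only on sets" could look like, the loop below carries the previous iteration's set forward instead of rebuilding it from a list each time. The names `do-stuff`, `iterations`, and `newly-added-per-iteration` are hypothetical stand-ins, since the real event loop was not shared.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-in for the per-key operation from the question.
(defn do-stuff [ks]
  (println "operating on" ks))

;; Each element stands for the key set seen in one event-loop iteration.
(def iterations [#{:a :b} #{:a :b :c} #{:a :b :c}])

;; Carry the previous iteration's set forward instead of rebuilding it.
(defn newly-added-per-iteration [key-sets]
  (loop [previous  #{}
         remaining key-sets
         results   []]
    (if-let [current (first remaining)]
      (let [added (set/difference current previous)]
        (when (seq added) (do-stuff added))
        (recur current (rest remaining) (conj results added)))
      results)))

;; Returns a vector equal to [#{:a :b} #{:c} #{}].
(newly-added-per-iteration iterations)
```

Only the difference is computed per iteration here; no list-to-set conversion happens inside the loop.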
I'm thinking the best thing to do is maintain sets alongside lists. I don't like the fact that I'll have to do some things twice but I don't see a way around it.
hello, today i have just discovered `(*')`. i also found out that `(= (class (* 1 2)) (class (*' 1 2)))` are the same, both are `java.lang.Long`. but is there a difference in terms of performance? is it okay if i only use `(*')` to avoid any integer overflow errors?
I would definitely expect a small performance hit, as `*'` does more than `*`. It is essentially `*` with an additional overflow check and a second `*` on BigIntegers. As so often, though, this probably does not matter in practice.
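To make that description concrete, here is a rough model of the behaviour. It is only an illustration: `mul-promoting` is a hypothetical function, and catching the exception is not how `clojure.core` implements `*'` internally.

```clojure
;; Illustrative only: NOT how clojure.core implements *'.
;; Try the checked long multiplication first; on overflow, redo it on BigInts.
(defn mul-promoting [a b]
  (try
    (* a b)
    (catch ArithmeticException _
      (* (bigint a) (bigint b)))))

(class (mul-promoting 1 2))                    ;; java.lang.Long
(class (mul-promoting 4611686018427387904 4))  ;; clojure.lang.BigInt
(= (mul-promoting 4611686018427387904 4)
   (*' 4611686018427387904 4))                 ;; true
```

For results that fit in a long, the extra work is just the overflow check, which is why the cost is usually negligible.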
Avoiding integer overflow errors might actually not be what you want. It will promote big results to arbitrary-precision integers:
user=> (*' 4611686018427387904 4)
18446744073709551616N
If you are expecting large numbers and can deal with `BigInteger` later on, great, use `*'`! Most of the time, you probably don't, and might just postpone the exception until you pass the result to some function that cannot handle `BigInteger`. In many domains, gigantic integers are actually not reasonable. E.g. if you want to represent a person's weight, distances on earth in miles, request milliseconds, etc., getting a number like `18446744073709551616N` will probably not help you. `*` will throw an exception early, which might be better in most cases:
user=> (* 4611686018427387904 4)
java.lang.ArithmeticException: integer overflow [at <repl>:2:1]
The error here would be to accept a large number like `4611686018427387904` in the first place.
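A minimal sketch of rejecting such input up front, using the request-milliseconds example. The bound `max-request-millis` and the function `checked-millis` are hypothetical, chosen only for illustration; a real application would pick limits that fit its domain.

```clojure
;; Hypothetical bound: no request should take longer than one day.
(def max-request-millis (* 24 60 60 1000))

;; Hypothetical validator: reject out-of-range input before any arithmetic.
(defn checked-millis [n]
  (if (and (integer? n) (<= 0 n max-request-millis))
    n
    (throw (ex-info "request milliseconds out of range" {:value n}))))

(* (checked-millis 1500) 4) ;; 6000

;; (checked-millis 4611686018427387904) throws clojure.lang.ExceptionInfo
;; before the value ever reaches a multiplication.
```

With the range checked at the boundary, plain `*` is enough and an overflow would point to a genuine bug.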