This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-02-11
Channels
- # architecture (1)
- # babashka (61)
- # babashka-sci-dev (1)
- # beginners (85)
- # calva (112)
- # clj-kondo (279)
- # cljdoc (16)
- # cljs-dev (15)
- # cljsrn (7)
- # clojure (168)
- # clojure-europe (36)
- # clojure-nl (10)
- # clojure-spec (6)
- # clojure-uk (5)
- # clojured (1)
- # clojurescript (20)
- # core-async (16)
- # crypto (2)
- # cursive (13)
- # datomic (25)
- # events (7)
- # fulcro (21)
- # google-cloud (3)
- # graalvm (2)
- # graalvm-mobile (2)
- # gratitude (3)
- # helix (20)
- # honeysql (4)
- # hugsql (15)
- # introduce-yourself (15)
- # leiningen (2)
- # lsp (24)
- # luminus (22)
- # malli (21)
- # meander (11)
- # midje (1)
- # other-languages (1)
- # pathom (8)
- # re-frame (5)
- # reagent (5)
- # releases (2)
- # reveal (1)
- # shadow-cljs (18)
- # spacemacs (17)
- # sql (9)
- # tools-build (12)
- # tools-deps (4)
- # vim (12)
If I'm using core.memoize, am I correct that there is no way to get available? from RetryingDelay? The reason is that I would like to check whether the cached function has completed. @seancorfield I think this might be up your alley?
I'm not in a place where I can look that up right now but will try to remember to dig into the question tomorrow, once I get to work @U01ENMKTW0J
Of course - much appreciated - thanks @seancorfield!
I might be breaking some :volatile-mutable rules, but extending

(deftype RetryingDelay [fun ^:volatile-mutable available? ^:volatile-mutable value]

by adding

clojure.lang.IPending
(isRealized [this]
  available?))

means I can call

(let [done? (some->
              (clojure.core.cache/lookup
                @(::memo/cache (meta my-memoized-fn))
                cache-key)
              realized?)]
  ...)
The context is I have a function listing items in a table. For large tables, it may take 10s to count all the items, and so I have a cache handler that checks if the full table has been cached by the memoization function, but if it hasn't completed then I have an alternative that doesn't include a count.
@U01ENMKTW0J Finally got around to looking at this and it seems reasonable to implement IPending for the RetryingDelay, so I'll go ahead and do that...
Version 1.0.257 should be up on Maven Central "soon" with that addition.
@seancorfield thank you - that's a very useful addition!
Hi, I am working on a Clojure app that is using Java 8. Is it worth it to upgrade to Java 11?
Might as well go to 17 directly. No point in lagging behind several years. And yes, generally it is worth upgrading. Performance gets better due to improved GC and stuff.
It's a testament to how well the JVM is engineered, ensuring backwards compatibility, that things just work!
We finally have Java 11 approved at work (Java, not Clojure). Now it's a question of getting multiple teams to update all the apps they support so that the one app we all work on together can use something other than Java 8. So in other words, it probably won't happen 😉
Has anyone used orchestra with cursive? How did you make cursive recognize the defn-spec macro?
Not sure about that macro specifically, but you can alt+enter a symbol and change how it's resolved.
Often, if a macro does something like e.g. defn, you can then resolve it as defn and all symbols that it introduces will be treated as functions.
FWIW clj-kondo has a :lint-as {orchestra/defn-spec clj-kondo.lint-as/def-catch-all} setting which registers the var and possibly other stuff, but doesn't emit any warnings.
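Spelled out as a config file, that setting would look something like this (a sketch; the exact namespace of defn-spec depends on how you require orchestra, so adjust the symbol to match):

```clojure
;; Hypothetical .clj-kondo/config.edn based on the setting above.
;; def-catch-all registers the defined var without deeper analysis,
;; so clj-kondo stops warning about the unrecognized macro.
{:lint-as {orchestra/defn-spec clj-kondo.lint-as/def-catch-all}}
```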
kind of, this is an experiment towards that: https://github.com/clj-kondo/inspector (a similar, better version exists for #malli)
I might need a refresher: what performs better than (not-any? pred s) for huge strings while still having a HOF interface? Can I use transducers here?
Not sure how transducers would help here given that it's a 1-stage operation.
But regular reduce with reduced when pred is true should be faster. Just tried it on a 200MB string - it was around 3 times faster.
Nice! Yes, makes sense. Do you know what happens exactly when I reduce a string? It's the huge seq of chars that I'm afraid of. Not sure if by using reduce that is avoided.
Note that it's my code-based understanding - I haven't actually done any measurements.
not-any? does more work, calling seq on every single step. reduce ends up calling seq once, creating StringSeq. That seq, just like subvec, doesn't actually unroll your string into a sequence of characters - it simply provides a seq interface that operates on the original string.
Oh, and reducing over StringSeq actually does use loop with the actual string inside, so you won't achieve much by using loop yourself, it seems.
Yeah TIL about https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/StringSeq.java - for some reason I thought the sequence of chars one sees with (seq "foo") was more straightforward.
Yeah, there are a lot of special cases for most frequent concretions of the Clojure abstractions.
This is where reducing over stringseq ends up https://github.com/clojure/clojure/blob/master/src/clj/clojure/core/protocols.clj#L145
Why would using reduced be faster than not-any?, which is just (comp not some), and some has early return?
Calling seq every step ends up being a noop because on StringSeq the seq method should just return itself, and I'd expect the jit to handle that well
You're correct - now I see that it's not seq but next that's the problem. reduce ends up being a plain loop whereas some ends up creating a new StringSeq on each step.
Use of not-any? and reduced shows a negligible difference in performance, along with using loop to iterate a seq over the string. Using the .charAt method on the string repeatedly produces code about 2x slower, regardless of whether reduce or loop is used.
Even in this case where the loop can be optimized to use primitives, it's not any faster than using reduce with an anonymous function.
@U45T93RA6 there's some performance numbers for you.
@U5NCUG8NR Can you try a 10-100MB string instead? That's where I've seen a noticeable difference.
@U5NCUG8NR did your loop implementation in the end use unchecked and unboxed math? You're also boxing the char from charAt
it's not unchecked, but it should be unboxed because I'm using a constant in the loop, so it should be a long.
The point I was making, though, isn't "charAt is slow" but rather "the effort required to make something faster than not-any? is most likely not worth it"
I can try to make it a bit better with some extra hints, but also yes, the char is boxed, and that's a requirement of the original question because Clojure only has unboxed fns for long and double, and the original request is to retain the functional interface.
Hold on. How did your code even work? I get this:
No matching field found: isDigit for class java.lang.Character
oh I changed that in my repl session, forgot to update the gist. I'll update the gist without reflection and with that changed in a second.
(def digit-str "blah1blah")
(def non-digit-str "blah'blah")

(defn char-at0 [s]
  (let [cnt (count s)]
    (not
      (loop [idx 0]
        (if (>= idx cnt)
          false
          (if (Character/isDigit (.charAt s idx))
            true
            (recur (inc idx))))))))

(defn char-at1 [^String s]
  (let [cnt (count s)]
    (not
      (loop [idx 0]
        (if (>= idx cnt)
          false
          (if (Character/isDigit (.charAt s idx))
            true
            (recur (inc idx))))))))

(defn char-at2 [^String s]
  (let [cnt (.length s)]
    (not
      (loop [idx 0]
        (if (>= idx cnt)
          false
          (if (Character/isDigit (.charAt s (unchecked-int idx)))
            true
            (recur (unchecked-inc idx))))))))

(do
  (cc/quick-bench (char-at0 digit-str))      ;; 33us
  (cc/quick-bench (char-at1 digit-str))      ;; 27ns
  (cc/quick-bench (char-at2 digit-str))      ;; 4ns
  (cc/quick-bench (char-at0 non-digit-str))  ;; 60us
  (cc/quick-bench (char-at1 non-digit-str))  ;; 30ns
  (cc/quick-bench (char-at2 non-digit-str))) ;; 5.4ns
if you inline is-digit, yes
that's explicitly not allowed in the original constraints though
Alright, I've updated the gist, there's no reflection at this point, and the charAt version is still slower.
sigh yes, you could put a hint there, but that's not the point. The point is to provide the same interface as not-any?, and requiring your user to type hint that in order to get reasonable performance is a bad requirement.
If you're going to be reflecting in the hot spot you're calling every iteration and dealing with the host platform, there should not be any reflection there
Comparing performance for anything when there's reflection up in the air is meaningless
Yes, removing reflection there increases performance, but this question isn't about "how can I increase performance", it's "is there a better performance pattern for not-any? on strings", and the answer is "no, if you don't want to put in a lot of work", which is a perfectly fine conclusion imo.
Yes you can squeeze more ns out of it
no I don't think it matters
This is basically one of those things of "is java faster than Clojure?" and the answer is "not technically because you can just write java in clojure" when the more helpful answer is "idiomatic clojure is within 1-4x of java's performance"
@U5NCUG8NR The cost of a step in not-any? is x; not constructing StringSeq on every step removes a from it, and using reflection on every step adds b to it, where b >> a.
To corroborate that, here are my results, along with the code:
(use 'criterium.core)
(defn the-pred
  [c]
  (= c \b))

(def s (clojure.string/join (repeat 10000 \a)))

(defn not-any?* [pred s]
  (not-any? pred s))
(bench (not-any?* the-pred s) :verbose)
;; Evaluation count : 134700 in 60 samples of 2245 calls.
;; Execution time sample mean : 449.648953 µs
;; Execution time mean : 449.690157 µs
;; Execution time sample std-deviation : 5.030324 µs
;; Execution time std-deviation : 5.140604 µs
;; Execution time lower quantile : 444.140272 µs ( 2.5%)
;; Execution time upper quantile : 462.200461 µs (97.5%)
;; Overhead used : 5.602629 ns
;;
;; Found 4 outliers in 60 samples (6.6667 %)
;; low-severe 2 (3.3333 %)
;; low-mild 2 (3.3333 %)
;; Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
(defn reduced?* [pred s]
  (not
    (reduce #(when (pred %2) (reduced %2))
            nil
            s)))
(bench (reduced?* the-pred s) :verbose)
;; Evaluation count : 353640 in 60 samples of 5894 calls.
;; Execution time sample mean : 170.258429 µs
;; Execution time mean : 170.258658 µs
;; Execution time sample std-deviation : 432.600961 ns
;; Execution time std-deviation : 439.733013 ns
;; Execution time lower quantile : 169.535596 µs ( 2.5%)
;; Execution time upper quantile : 171.182857 µs (97.5%)
;; Overhead used : 5.602629 ns
;;
;; Found 1 outliers in 60 samples (1.6667 %)
;; low-severe 1 (1.6667 %)
;; Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
sure, I'm redoing the benchmarks without any reflection involved, but what I'm observing is exactly what I expected, that performance relative to each other is still within an order of magnitude, they're just both faster.
Again, I wasn't claiming "char at is slow" but rather "`not-any?` is probably fast enough"
Then at that point we're getting into metrics that are based on benchmarking the whole program, not just this one function.
Seriously, you both seem to be missing my point entirely, that the amount of work required to get it faster is high relative to the actual performance you gain (unless this code is in a critical hot path and you know this will speed stuff up, preferably using a causal profiler)
But that is still important, and not-any? here is objectively worse due to the creation of N objects.
> the amount of work required to get it faster is high relative to the actual performance you gain
That would be up for the OP to decide, no? IMO the reduce solution is around 20% more complex.
Yes, it's up to the OP to decide, but at least as it reads to me, it's asking "is there a clojure core pattern I can use that's faster and similar effort", not "this is a hot code path, I need to squeeze out every ns". Maybe the OP can comment on that, but honestly I think this thread's usefulness has decreased with its length.
To answer the OP's question specifically, not-any? is the inverse of some, so I would reach for xforms' some-rf and then benefit from the speedups of using reduce and the ability to plug in transducers, then wrap the result in not. The speedup is about 2-6x, increasing with the size of input.
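A hedged sketch of that suggestion, assuming net.cgrand.xforms is on the classpath and that x/some accepts an xform plus a collection (check the xforms README for the exact API; none? is a made-up name):

```clojure
;; Sketch: run the predicate as a transducer and short-circuit on the
;; first match; wrapping the result in not restores the not-any? contract.
(require '[net.cgrand.xforms :as x])

(defn none? [pred s]
  (not (x/some (filter pred) s)))
```

Used as (none? #(= % \b) s), it should agree with (not-any? #(= % \b) s) while reducing over the string instead of seq-walking it.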
That's a good answer, xforms needs more love
My question was originally somewhat misguided (since it assumed a seq on a string was just a huge list / lazy seq) so it doesn't matter that much.
I'm just happy to complete knowledge here and there.
I think it's pretty well-known that one cannot automatically get better performance while keeping the program high-level, so obviously the more you want to squeeze, the more you'll have to contort (type hints and reduce aren't much of a contortion by my standards... reaching for xforms would likely be)
What's the gist of why reduce is faster than not-any? anyway? Is it because some uses next?
On every step, yes, thus creating StringSeq on every step. Whereas reduce uses loop over plain indices within the same StringSeq.
you mean using reduce over a seq of indices and calling charAt? Or did I miss some way that StringSeq implements IReduce or something? I didn't spot that anywhere.
Oh, does reduce have a special case for IndexedSeq? Again, I'm having a hard time finding where you're seeing this.
reduce -> clojure.core.protocols/coll-reduce (Object) -> clojure.core.protocols/seq-reduce -> clojure.core.protocols/internal-reduce (StringSeq)
clojure.lang.StringSeq
(internal-reduce
  [str-seq f val]
  (let [s (.s str-seq)
        len (.length s)]
    (loop [i (.i str-seq)
           val val]
      (if (< i len)
        (let [ret (f val (.charAt s i))]
          (if (reduced? ret)
            @ret
            (recur (inc i) ret)))
        val))))
Literally the only reason I'm still using IDEA + Cursive instead of VS Code + Calva is the ability to navigate such things effortlessly.
Maybe not for much longer! https://github.com/BetterThanTomorrow/calva/issues/1486
(and with that post this thread goes full circle, since the question popped up while hacking on enrich-classpath)
Also, it's not just about classpath and files - it's about being able to navigate Java code. And it sounds like something way out of the scope of Calva. :(
It's kinda in scope - calva would likely inform java-lsp (or whatever it's called) of the classpath to be used, namely a classpath enriched with enrich-classpath
The last time I tried was about a year ago. Either I did something wrong or it was barely working.
@U45T93RA6 to circle back to your original point, did you profile it? Do you know where the expensive bits are?
It depends. Could be that you call it on small strings, in which case you may not see any improvement
IDK, for technical reasons it would take me 7d or so to give you an accurate answer. Let's leave it at "I got some TILs"; I'm not excessively interested in squeezing perf for what happens to be a one-off assert
@hewrin the one thing you'll have to be aware of with the jump to java 17 is that codox needs you to pass some extra command line arguments in java 16+, so if you're using codox for autogenerating docs, that might catch you. The solution is in https://github.com/weavejester/codox/issues/197
Can I extend a protocol on some (non-IMeta implementing) value? I am looking to create something that behaves like a j.u.c.CompletableFuture but implements another protocol, but I don't want to implement that protocol for all CompletableFutures
So I guess I'm kinda looking for extending via metadata, but I can't because it's not IMeta, right?
you could use a factory function that returns a reify - that implements IMeta and can also implement CompletableFuture and encapsulate your object
that is, you'd delegate the CompletableFuture methods to the existing object, and reify would handle the having-metadata-for-protocol-extension part
the thing you are extending to will have a concrete class that is not CompletableFuture, so you can extend the protocol to that
Right, I was afraid so; the backing interfaces for completable future are huge tho…
(No worries, appreciate the thinking along)
there's a metaprogramming approach of looping over and delegating methods at compile time, which can save you a lot of boilerplate and lead to really weird bug fixing experiences
oh - since CompletableFuture is concrete, I think you'd need a proxy rather than a reify (but don't use proxy-super as a shortcut, it's not thread safe)
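A rough sketch of the factory-function idea (hypothetical names throughout: TaskDescribed and described-future aren't from any library, and only deref is delegated here rather than the full CompletionStage surface):

```clojure
(import 'java.util.concurrent.CompletableFuture)

;; Our own protocol carrying the side-channel data.
(defprotocol TaskDescribed
  (task-desc [this]))

;; Factory returning a wrapper that implements the protocol and
;; delegates dereferencing to the wrapped CompletableFuture.
(defn described-future [desc ^CompletableFuture cf]
  (reify
    TaskDescribed
    (task-desc [_] desc)
    clojure.lang.IDeref
    (deref [_] (.get cf))))
```

The wrapper's concrete class is the reify's own class, so the protocol lives on the wrapper while consumers that only need the result can still deref it.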
Maybe I'm just doing it wrong. I have a queue of tasks, and I want to offer a function that lets you enqueue and gives you a completable future. The somewhat unrelated process will take those queue items, get their task description, execute it some way, and deliver that completable future. So at this point I'm trying to side channel my task desc in that completable future
I kinda like that my queue is just a list of 'things that can be completed' but I don't like that I have to side channel the task desc into this execute fn
if I was doing that from scratch I'd use a hash-map with a data structure describing the task, and a promise
that you deliver from the worker on completion
Haha that was what I had initially, I made it too complicated ^^
one thing I like about promise is that if you provide it as a result handler, calling it does the right thing
(ins)user=> (def p (promise))
#'user/p
(ins)user=> (p "foo")
#object[clojure.core$promise$reify__8501 0x1e9804b9 {:status :ready, :val "foo"}]
(ins)user=> @p
"foo"
Is there a good way to simulate getting killed by out-of-memory errors in a dev environment? I'm setting -J-Xmx512m... I'm hoping for some unix trick, like a memory version of a browser's network conditioner.
any reason that actually inducing an OOME wouldn't work?
I guess I want to test and re-test using parts of the application and see when I go over a threshold. I could watch VisualVM or such, I guess.
% clj
Clojure 1.10.2
user=> (throw (java.lang.OutOfMemoryError.))
Execution error (OutOfMemoryError) at user/eval1 (REPL:1).
null
user=>
I am just pointing out that the assumption you are starting with, that your process is getting killed by OOMs, is incorrect
I don't know of anything that will force an oom other than writing code that throws one
maybe a restatement of the objective: I want, under no circumstances, for this JVM to get bigger than, say, 512mb, in RAM
that is really hard to do, in part because of the way applications use virtual memory for things like file pages
lemme repeat that back ... set up the OS to kill the JVM, then tell the JVM to go crazy take as much RAM as you want?
(there are things like systemd-oomd which will kill processes when a cgroup is low on memory)
what would the limit do then, without the systemd add-on? Cause the java.lang.OutOfMemoryError exception and leave the JVM in whatever state that implies?
ok, maybe a better-tuned version of my objective: set some limits in dev that mean we stop assuming there's as much RAM available as in prd, and suffer when we forget. I guess -J-Xmx512m is one... are there other good ones?
it all really depends on why, and what you are doing, but if you just want to limit the garbage collected heap size of the jvm, that is what -Xmx does
if you also care about things like native memory then you have to look outside of the knobs the JVM provides
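As a quick sanity check from the REPL (plain java.lang.Runtime, no extra deps), you can confirm what ceiling -Xmx actually gave you; note that maxMemory reflects only the GC heap, not native allocations:

```clojure
;; Report the JVM's heap figures in MB. maxMemory is the -Xmx ceiling;
;; totalMemory/freeMemory describe the currently committed heap.
(defn heap-stats []
  (let [rt (Runtime/getRuntime)
        mb #(quot % (* 1024 1024))]
    {:max-heap-mb  (mb (.maxMemory rt))
     :committed-mb (mb (.totalMemory rt))
     :free-mb      (mb (.freeMemory rt))}))
```

With -J-Xmx512m, :max-heap-mb should come out near 512 (the exact figure can be slightly lower depending on the GC).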