#clojure
2022-02-11
xlfe04:02:50

If I'm using core.memoize, am I correct that there is no way to get available? from RetryingDelay? The reason is, I would like to check if the cached function has completed or not. @seancorfield I think this might be up your alley?

seancorfield04:02:19

I'm not in a place where I can look that up right now, but I will try to remember to dig into the question tomorrow, once I get to work @U01ENMKTW0J

xlfe04:02:39

Of course - much appreciated - thanks @seancorfield!

xlfe05:02:18

I might be breaking some :volatile-mutable rules, but extending (deftype RetryingDelay [fun ^:volatile-mutable available? ^:volatile-mutable value] by adding clojure.lang.IPending (isRealized [this] available?))

xlfe05:02:39

Means I can check whether it's done: (some-> (clojure.core.cache/lookup @(::memo/cache (meta my-memoized-fn)) cache-key) realized?)

xlfe05:02:57

The context is I have a function listing items in a table. For large tables, it may take 10s to count all the items, and so I have a cache handler that checks if the full table has been cached by the memoization function, but if it hasn't completed then I have an alternative that doesn't include a count.
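
For reference, a minimal sketch of that pattern once RetryingDelay implements IPending. list-table and its ten-second count are made-up stand-ins, and the sketch assumes core.memoize keys its cache by the argument list (as memo-clear! implies):

(require '[clojure.core.memoize :as memo]
         '[clojure.core.cache :as cache])

;; Slow listing function and its memoized wrapper (hypothetical names).
(defn list-table [table-name]
  (Thread/sleep 10000)
  {:items [] :count 0})

(def list-table* (memo/memo list-table))

;; Truthy once the memoized call for these args has finished;
;; nil/false while it is still running or was never started.
(defn fully-cached? [memoized-fn & args]
  (some-> (cache/lookup @(::memo/cache (meta memoized-fn)) args)
          realized?))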

seancorfield22:02:32

@U01ENMKTW0J Finally got around to looking at this and it seems reasonable to implement IPending for the RetryingDelay so I'll go ahead and do that...

🎉 1
seancorfield23:02:09

Version 1.0.257 should be up on Maven Central "soon" with that addition.

xlfe23:02:23

@seancorfield thank you - that's a very useful addition!

Faris07:02:38

Hi, I am working on a Clojure app that is using Java 8. Is it worth it to upgrade to Java 11?

thheller07:02:30

Might as well go to 17 directly. No point in lagging behind several years. And yes, generally it is worth upgrading. Performance gets better due to improved GC and stuff.

👍 1
Faris07:02:55

I see, thanks!

dharrigan08:02:42

We've been using Java 17 in production for a while now. Zero issues.

1
dharrigan08:02:04

We rapidly went from 8 (which we've used for many a year) to 11 then 17.

Faris08:02:25

So the upgrade didn’t involve big changes?

dharrigan08:02:33

Zero changes.

🤯 1
dharrigan08:02:11

It's a testament to how well the JVM is engineered, and how seriously it takes backwards compatibility, that things just work!

emccue13:02:54

We had to bump a dependency on jackson

emccue13:02:57

it's always Jackson

😆 1
😭 1
emccue13:02:16

point in favor of data.json

mjw01:02:38

We finally have Java 11 approved at work (Java, not Clojure). Now it's a question of getting multiple teams to update all the apps they support so that the one app we all work on together can use something other than Java 8. So in other words, it probably won't happen 😉

GGfpc11:02:15

Has anyone used orchestra with cursive? How did you make cursive recognize the defn-spec macro?

p-himik11:02:04

Not sure about that macro specifically, but you can alt+enter a symbol and change how it's resolved. Often, if a macro does something like e.g. defn, you can then resolve it as defn and all symbols that it introduces will be treated as functions.

borkdude11:02:41

FWIW clj-kondo has a :lint-as {orchestra/defn-spec clj-kondo.lint-as/def-catch-all} setting which registers the var and possibly other stuff, but doesn't emit any warnings.
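
In config-file form that would look something like the following sketch - note the macro normally lives in orchestra.core, so adjust the namespace to whatever your project actually requires:

;; .clj-kondo/config.edn
{:lint-as {orchestra.core/defn-spec clj-kondo.lint-as/def-catch-all}}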

borkdude11:02:56

It also has a mechanism to expand custom macros to known constructs.

emccue13:02:55

and thats through the new cursive plugin?

borkdude13:02:38

#clj-extras-plugin

GGfpc15:02:15

Can clj-extras and/or clj-kondo highlight invalid typing based on fdefs?

borkdude15:02:00

kind of, this is an experiment towards that: https://github.com/clj-kondo/inspector - a similar, better version exists for #malli

vemv13:02:58

I might need a refresher, what performs better than (not-any? pred s) for huge strings while still having a HOF interface? Can I use transducers here?

p-himik13:02:01

Not sure how transducers would help here given that it's a 1-stage operation. But regular reduce with reduced when pred is true should be faster. Just tried it on a 200MB string - it was around 3 times faster.
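
i.e. something along these lines (none-match? is a made-up name; the same shape shows up in the benchmark further down the thread):

(defn none-match? [pred s]
  ;; reduce over the string, short-circuiting with reduced on the first hit
  (not (reduce (fn [_ c] (when (pred c) (reduced true))) nil s)))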

vemv13:02:43

Nice! Yes, makes sense. Do you know what happens exactly when I reduce a string? It's the huge seq of chars that I'm afraid of. Not sure if by using reduce that is avoided.

Ben Sless13:02:29

You can always loop over the string and call charAt

vemv13:02:58

I know, it's mostly a f(u)n question :)

p-himik13:02:41

Note that it's my code-based understanding - I haven't actually done any measurements. not-any? does more work, calling seq on every single step. reduce ends up calling seq once, creating a StringSeq. That seq, just like subvec, doesn't actually unroll your string into a sequence of characters - it simply provides a seq interface that operates on the original string.

p-himik14:02:25

Oh, and reducing over StringSeq actually does use loop with the actual string inside, so you won't achieve much by using loop yourself, it seems.

💯 1
vemv14:02:24

Yeah TIL about https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/StringSeq.java - for some reason I thought the sequence of chars one sees with (seq "foo") was more straightforward.

p-himik14:02:15

Yeah, there are a lot of special cases for most frequent concretions of the Clojure abstractions.

🙂 1
Joshua Suskalo15:02:31

Why would using reduced be faster than not-any?, which is just (comp not some), and some has early return?

Joshua Suskalo15:02:36

Calling seq on every step ends up being a no-op because on StringSeq the seq method just returns itself, and I'd expect the JIT to handle that well

p-himik15:02:50

You're correct - now I see that it's not seq but next that's the problem. reduce ends up being a plain loop whereas some ends up creating a new StringSeq on each step.

☝️ 1
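
For reference, clojure.core/some is roughly the following (docstring elided), and on a StringSeq every next call allocates a fresh StringSeq:

(defn some [pred coll]
  (when (seq coll)
    (or (pred (first coll)) (recur pred (next coll)))))
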
Ben Sless15:02:26

Iteration via next / rest is costly

Ben Sless15:02:49

See varargs arities of all comparison functions

Joshua Suskalo16:02:38

use of not-any? and reduced is within a negligible performance difference, along with using loop to iterate a seq over the string. Using the .charAt function on the string repeatedly produces code about 2x slower, regardless of whether reduce or loop is used.

Joshua Suskalo16:02:01

Even in this case where the loop can be optimized to use primitives, it's not any faster than using reduce with an anonymous function.

Joshua Suskalo16:02:06

@U45T93RA6 there's some performance numbers for you.

p-himik16:02:29

@U5NCUG8NR Can you try a 10-100MB string instead? That's where I've seen a noticeable difference.

Ben Sless17:02:47

@U5NCUG8NR did your loop implementation in the end use unchecked and unboxed math? You're also boxing the char from charAt

Joshua Suskalo17:02:20

it's not unchecked, but it should be unboxed because I'm using a constant in the loop, so it should be a long. More to the point, the claim I was making isn't "charAt is slow" but rather "the effort required to make something faster than not-any? is most likely not worth it"

Joshua Suskalo17:02:07

I can try to make it a bit better with some extra hints, but also yes, the char is boxed, and that's a requirement of the original question because Clojure only has unboxed fns for long and double, and the original request is to retain the functional interface.

Ben Sless17:02:09

but your solution has reflection

p-himik17:02:25

Hold on. How did your code even work? I get this:

No matching field found: isDigit for class java.lang.Character

Joshua Suskalo17:02:05

oh I changed that in my repl session, forgot to update the gist. I'll update the gist without reflection and with that changed in a second.

Ben Sless17:02:54

(def digit-str "blah1blah")
(def non-digit-str "blah'blah")

(defn char-at0 [s]
  (let [cnt (count s)]
    (not
     (loop [idx 0]
       (if (>= idx cnt)
         false
         (if (Character/isDigit (.charAt s idx))
           true
           (recur (inc idx))))))))

(defn char-at1 [^String s]
  (let [cnt (count s)]
    (not
     (loop [idx 0]
       (if (>= idx cnt)
         false
         (if (Character/isDigit (.charAt s idx))
           true
           (recur (inc idx))))))))

(defn char-at2 [^String s]
  (let [cnt (.length s)]
    (not
     (loop [idx 0]
       (if (>= idx cnt)
         false
         (if (Character/isDigit (.charAt s (unchecked-int idx)))
           true
           (recur (unchecked-inc idx))))))))

(do

  (cc/quick-bench (char-at0 digit-str)) ;; 33us
  (cc/quick-bench (char-at1 digit-str)) ;; 27ns
  (cc/quick-bench (char-at2 digit-str)) ;; 4ns

  (cc/quick-bench (char-at0 non-digit-str)) ;; 60us
  (cc/quick-bench (char-at1 non-digit-str)) ;; 30ns
  (cc/quick-bench (char-at2 non-digit-str))) ;; 5.4ns

Ben Sless17:02:08

We're talking 3-4 orders of magnitude difference

Joshua Suskalo17:02:01

if you inline is-digit, yes

Joshua Suskalo17:02:19

that's explicitly not allowed in the original constraints though

Joshua Suskalo17:02:34

Alright, I've updated the gist, there's no reflection at this point, and the charAt version is still slower.

Ben Sless17:02:33

Yes, there is, at the-pred

Ben Sless17:02:43

(defn digit? [^Character c] (Character/isDigit c))

Joshua Suskalo17:02:42

sigh yes, you could put a hint there, but that's not the point. The point is to provide the same interface as not-any? and requiring your user to type hint that in order to get reasonable performance is a bad requirement.

Ben Sless17:02:16

If the hot spot you're calling on every iteration is dealing with the host platform, there should not be any reflection there

Ben Sless17:02:50

reduce is 2x faster than not-any? with a hinted digit?

Ben Sless17:02:12

Comparing performance for anything when there's reflection up in the air is meaningless

Ben Sless17:02:32

the differences are literally swallowed in the time wasted on reflection

p-himik17:02:52

Heh, my original test used #(= c \b) as a predicate.

Joshua Suskalo17:02:09

Yes, removing reflection there increases performance, but this question isn't about "how can I increase performance" it's "is there a better performance pattern for not-any? on strings" and the answer is "no, if you don't want to put in a lot of work" which is a perfectly fine conclusion imo.

Joshua Suskalo17:02:17

Yes you can squeeze more ns out of it

Joshua Suskalo17:02:24

no I don't think it matters

Joshua Suskalo17:02:53

This is basically one of those things of "is java faster than Clojure?" and the answer is "not technically because you can just write java in clojure" when the more helpful answer is "idiomatic clojure is within 1-4x of java's performance"

p-himik17:02:13

@U5NCUG8NR The cost of a step in not-any? is x; not constructing a StringSeq on every step removes a from it; using reflection on every step adds b to it - where b >> a.

Ben Sless17:02:13

Even so, for a large string using reduce is 5x faster

Ben Sless17:02:33

The overhead of using reduce over not-any? is not too large

p-himik17:02:28

To corroborate the above, here are my results, along with the code:

(use 'criterium.core)

(defn the-pred
  [c]
  (= c \b))

(def s (clojure.string/join (repeat 10000 \a)))

(defn not-any?* [pred s]
  (not-any? pred s))

(bench (not-any?* the-pred s) :verbose)
;; Evaluation count : 134700 in 60 samples of 2245 calls.
;;       Execution time sample mean : 449.648953 µs
;;              Execution time mean : 449.690157 µs
;; Execution time sample std-deviation : 5.030324 µs
;;     Execution time std-deviation : 5.140604 µs
;;    Execution time lower quantile : 444.140272 µs ( 2.5%)
;;    Execution time upper quantile : 462.200461 µs (97.5%)
;;                    Overhead used : 5.602629 ns
;; 
;; Found 4 outliers in 60 samples (6.6667 %)
;; 	low-severe	 2 (3.3333 %)
;; 	low-mild	 2 (3.3333 %)
;;  Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

(defn reduced?* [pred s]
  (not
   (reduce #(when (pred %2) (reduced %2))
           nil
           s)))

(bench (reduced?* the-pred s) :verbose)
;; Evaluation count : 353640 in 60 samples of 5894 calls.
;;       Execution time sample mean : 170.258429 µs
;;              Execution time mean : 170.258658 µs
;; Execution time sample std-deviation : 432.600961 ns
;;     Execution time std-deviation : 439.733013 ns
;;    Execution time lower quantile : 169.535596 µs ( 2.5%)
;;    Execution time upper quantile : 171.182857 µs (97.5%)
;;                    Overhead used : 5.602629 ns
;; 
;; Found 1 outliers in 60 samples (1.6667 %)
;; 	low-severe	 1 (1.6667 %)
;;  Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

Joshua Suskalo17:02:41

sure, I'm redoing the benchmarks without any reflection involved, but what I'm observing is exactly what I expected: performance relative to each other is still within an order of magnitude, they're just both faster.

Joshua Suskalo17:02:15

Again, I wasn't claiming "char at is slow" but rather "`not-any?` is probably fast enough"

Ben Sless17:02:20

~5x is not negligible

p-himik17:02:25

Using your mental model, how could you interpret my result then?

Ben Sless17:02:27

also, you have to account for gc pressure

Ben Sless17:02:41

i.e. if the system is also doing other things, it'll be globally throttled

Joshua Suskalo17:02:49

Then at that point we're getting into metrics that are based on benchmarking the whole program, not just this one function.

Joshua Suskalo17:02:53

Seriously, you both seem to be missing my point entirely, that the amount of work required to get it faster is high relative to the actual performance you gain (unless this code is in a critical hot path and you know this will speed stuff up, preferably using a causal profiler)

p-himik17:02:01

But that is still important, and not-any? here is objectively worse due to the creation of N objects.

Ben Sless17:02:54

how is it a high amount of work to translate not-any? to use reduce instead?

p-himik17:02:59

> the amount of work required to get it faster is high relative to the actual performance you gain
That would be up to the OP to decide, no? IMO the reduce solution is around 20% more complex.

Joshua Suskalo17:02:25

Yes, it's up to the OP to decide, but at least as it reads to me it's asking "is there a clojure core pattern I can use that's faster for similar effort", not "this is a hot code path I need to squeeze every ns out of". Maybe OP can comment on that, but honestly I think this thread's usefulness has decreased with its length.

Ben Sless17:02:57

To answer OP's question specifically: not-any? is the inverse of some, so I would reach for xforms' some-rf and benefit from the speedups of using reduce and the ability to plug in transducers, then wrap the result in not. The speedup is about 2-6x, increasing with the size of the input

☝️ 1
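
A plain-Clojure sketch of that shape for anyone who doesn't want to pull in xforms - some-rf here is a hand-rolled stand-in rather than the xforms version, and digit? is the hinted predicate from above:

(defn digit? [^Character c] (Character/isDigit c))

;; Short-circuiting reducing function: stops at the first truthy value.
(defn some-rf
  ([acc] acc)
  ([_ x] (if x (reduced x) nil)))

(defn none? [pred s]
  (not (transduce (map pred) some-rf nil s)))

;; (none? digit? "blah'blah") ;=> true
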
Joshua Suskalo17:02:51

That's a good answer, xforms needs more love

p-himik17:02:43

Huh, didn't realize xforms could be useful without transducers, thanks!

vemv17:02:51

My question was originally somewhat misguided (since it assumed a seq on a string was just a huge list / lazy seq) so it doesn't matter that much. I'm just happy to complete clj knowledge here and there. I think it's pretty well-known that one cannot automatically get better performance while keeping the program high-level, so obviously the more you want to squeeze, the more you'll have to contort (type hints and reduce aren't much of a contortion by my standards... reaching for xforms would likely be)

vemv17:02:12

What's the gist of why reduce is faster than not-any? anyway? Is it because some uses next?

p-himik17:02:47

On every step, yes, thus creating StringSeq on every step. Whereas reduce uses loop over plain indices within the same StringSeq.

🙏 1
Joshua Suskalo17:02:03

you mean using the reduce over a seq of indices and calling charAt ? Or did I miss some way that StringSeq implements IReduce or something? I didn't spot that anywhere.

p-himik18:02:22

I meant reduce over a string in general.

Joshua Suskalo18:02:03

Oh does reduce have a special case for IndexedSeq? Again, I'm having a hard time finding where you're seeing this.

p-himik18:02:21

reduce -> clojure.core.protocols/coll-reduce (Object) -> clojure.core.protocols/seq-reduce -> clojure.core.protocols/internal-reduce (StringSeq)

p-himik18:02:34

clojure.lang.StringSeq
  (internal-reduce
   [str-seq f val]
   (let [s (.s str-seq)
         len (.length s)]
     (loop [i (.i str-seq)
            val val]
       (if (< i len)
         (let [ret (f val (.charAt s i))]
           (if (reduced? ret)
             @ret
             (recur (inc i) ret)))
         val))))

Joshua Suskalo18:02:02

oh, thanks! I hadn't dived into reduce that far before

👍 1
p-himik18:02:54

Literally the only reason I'm still using IDEA + Cursive instead of VS Code + Calva is the ability to navigate such things effortlessly.

vemv18:02:59

(and with that post this thread goes full circle, since the question popped up while hacking on enrich-classpath)

😄 1
p-himik18:02:05

That would be fantastic.

p-himik18:02:44

Also, it's not just about classpath and files - it's about being able to navigate Java code. And it sounds like something way out of the scope of Calva. :(

vemv18:02:18

shouldn't vscode have a decent lsp mode for java?

vemv18:02:26

It's kinda in scope - calva would likely inform java-lsp (or whatever it's called) of the classpath to be used, namely a classpath enriched with enrich-classpath

p-himik18:02:00

The last time I tried was about a year ago. Either I did something wrong or it was barely working.

👀 1
Ben Sless19:02:12

@U45T93RA6 to circle back to your original point, did you profile it? Do you know where the expensive bits are?

vemv19:02:20

I just upgraded not-any? to reduce following these insights, seemed enough

Ben Sless19:02:28

It depends. Could be that you call it on small strings, in which case you may not see any improvement

vemv19:02:09

The strings are guaranteed to be from large to huge :)

Ben Sless19:02:00

How many chars approximately?

vemv19:02:53

IDK, for technical reasons it would actually take me 7d or so to give you an accurate answer. Let's leave it at "I got some TILs" - I'm not excessively interested in squeezing perf for what happens to be a one-off assert

👍 1
Joshua Suskalo15:02:02

@hewrin the one thing you'll have to be aware of with the jump to java 17 is that codox needs you to pass some extra command line arguments in java 16+, so if you're using codox for autogenerating docs, that might catch you. The solution is in https://github.com/weavejester/codox/issues/197

👍 1
Faris03:02:38

Thanks for the tip!

Lennart Buit17:02:55

Can I extend a protocol on some (non-IMeta implementing) value? I am looking to create something that behaves like a j.u.c.CompletableFuture but implements another protocol, but I don't want to implement that protocol for all CompletableFutures

Lennart Buit17:02:36

So I guess I'm kinda looking for extending via metadata, but I can't because it's not IMeta, right?
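
For anyone following along, metadata-based protocol extension (Clojure 1.10+) needs both a protocol that opts in and a value that supports with-meta, which a raw CompletableFuture doesn't - a sketch with made-up names:

(defprotocol Describable
  :extend-via-metadata true
  (describe [this]))

;; Works because the wrapper map is an IObj; the bare CompletableFuture isn't.
(def task
  (with-meta {:future (java.util.concurrent.CompletableFuture.)}
    {`describe (fn [_] "a pending task")}))

(describe task) ;=> "a pending task"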

noisesmith17:02:56

you could use a factory function that returns a reify - that implements IMeta and can also implement CompletableFuture and encapsulate your object

noisesmith17:02:53

that is, you'd delegate the CompletableFuture methods to the existing object, and reify would handle the having-metadata-for-protocol-extension part

hiredman17:02:49

the thing you are extending to will have a concrete class that is not CompletableFuture, you can extend the protocol to that

Lennart Buit17:02:50

Right, I was afraid so; the backing interfaces for completable future are huge tho…

hiredman17:02:21

(ah, misread the question)

Lennart Buit17:02:39

(No worries, appreciate the thinking along)

noisesmith18:02:48

there's a metaprogramming approach of looping over and delegating methods at compile time, which can save you a lot of boilerplate and lead to really weird bug fixing experiences

noisesmith18:02:35

oh - since CompletableFuture is concrete, I think you'd need a proxy rather than a reify (but don't use proxy-super as a shortcut, it's not thread safe)

1
Lennart Buit18:02:22

Maybe I'm just doing it wrong. I have a queue of tasks, and I want to offer a function that lets you enqueue and gives you a completable future. A somewhat unrelated process will take those queue items, get their task descriptions, execute them in some way, and deliver that completable future. So at this point I'm trying to side-channel my task desc in that completable future

Lennart Buit18:02:06

I kinda like that my queue is just a list of 'things that can be completed', but I don't like that I have to side-channel the task desc into this execute fn

noisesmith18:02:39

if I was doing that from scratch I'd use a hash-map with a data structure describing the task, and a promise that you deliver from the worker on completion

hiredman18:02:30

it is extremely common to pass pairs of [thing-to-do, result-handler]

Lennart Buit18:02:56

Haha that was what I had initially, I made it too complicated ^^
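
Pulling the two suggestions together, a rough sketch of the queue-of-maps-plus-promise shape - enqueue!, worker-step, and do-the-work are all made-up names:

(def ^java.util.concurrent.LinkedBlockingQueue task-queue
  (java.util.concurrent.LinkedBlockingQueue.))

(defn do-the-work [task]            ; stand-in for the real executor
  {:done task})

(defn enqueue!
  "Puts a task description on the queue; returns a promise the worker will deliver."
  [task-desc]
  (let [result (promise)]
    (.put task-queue {:task task-desc :result result})
    result))

(defn worker-step []
  (let [{:keys [task result]} (.take task-queue)]
    (deliver result (do-the-work task))))

;; e.g. (future (worker-step)) and then @(enqueue! {:op :count-rows})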

noisesmith18:02:48

one thing I like about promise is that if you provide it as a result handler, calling it does the right thing

(ins)user=> (def p (promise))
#'user/p
(ins)user=> (p "foo")
#object[clojure.core$promise$reify__8501 0x1e9804b9 {:status :ready, :val "foo"}]
(ins)user=> @p
"foo"

rgm23:02:22

Is there a good way to simulate getting killed by out-of-memory errors in a dev environment? I'm setting -J-Xmx512m ... I'm hoping for some unix trick like a memory version of a browser's network conditioner.

Colin P. Hill23:02:26

any reason that actually inducing an OOME wouldn't work?

rgm23:02:44

I guess I want to test and re-test using parts of the application and see when I go over a threshold. I could watch VisualVM or such, I guess.

hiredman23:02:55

% clj
Clojure 1.10.2
user=> (throw (java.lang.OutOfMemoryError.))
Execution error (OutOfMemoryError) at user/eval1 (REPL:1).
null
user=>

hiredman23:02:17

an oom doesn't actually kill the jvm

rgm23:02:56

is there a way to set a really hard limit and cause that to throw automatically?

rgm23:02:32

I'm not great at java but the Xmx has seemed more like a suggestion than a real limit

rgm23:02:59

I guess being heap it still leaves stack to get fairly big?

hiredman23:02:06

I am just pointing out that the assumption you are starting with - that your process is getting killed by OOMs - is incorrect

rgm23:02:19

I appreciate that

rgm23:02:29

I didn't actually know that

hiredman23:02:37

I don't know of anything that will force an oom other than writing code that throws one

rgm23:02:46

maybe a restatement of the objective: I want, under no circumstances, for this JVM to get bigger than, say, 512mb, in RAM

hiredman23:02:52

but I bet you could attach a debugger and do some stuff

rgm23:02:22

yeah might come to that... at least I can test that behaviour

hiredman23:02:30

that is really hard to do, in part because of the way applications use virtual memory for things like file pages

rgm23:02:47

yeah, kinda expected that, sadly

rgm23:02:13

can a JVM know its own memory usage?
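
For what it's worth, the GC-managed heap is visible from inside the JVM via java.lang.Runtime (heap only - this says nothing about total RSS):

(let [rt (Runtime/getRuntime)]
  {:max-heap-bytes  (.maxMemory rt)
   :used-heap-bytes (- (.totalMemory rt) (.freeMemory rt))})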

ghadi23:02:13

Set cgroup limits in Linux then ignore them with -Xmx

hiredman23:02:02

modern jvms should respect cgroup limits?

rgm23:02:27

lemme repeat that back ... set up the OS to kill the JVM, then tell the JVM to go crazy and take as much RAM as it wants?

hiredman23:02:54

a cgroup limit won't cause the os to kill the jvm

hiredman23:02:06

(there are things like systemd-oomd which will kill processes when a cgroup is low on memory)

rgm23:02:02

what would the limit do then, without the systemd add-on? Cause a java.lang.OutOfMemoryError and leave the JVM in whatever state that implies?

rgm23:02:39

ok, maybe a better-tuned version of my objective: set some limits in dev so that we stop assuming there's as much RAM available in prd as there is in dev, and suffer when we forget. I guess -J-Xmx512m is one ... are there other good ones?

hiredman23:02:44

it all really depends on why, and what you are doing, but if you just want to limit the garbage collected heap size of the jvm, that is what -Xmx does
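
One way to bake that into dev, assuming a tools.deps project - the flags are standard HotSpot options and the alias name is arbitrary:

;; deps.edn
{:aliases
 {:low-mem {:jvm-opts ["-Xmx512m"                        ; cap the GC-managed heap
                       "-XX:+ExitOnOutOfMemoryError"]}}} ; exit loudly instead of limping along

;; then e.g.: clj -A:low-mem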

hiredman23:02:16

if you also care about things like native memory then you have to look outside of the knobs the jvm provides

rgm23:02:14

yeah, I don't have all that well-formed an answer to those questions but thinking about them helps. Long story short some code is behaving badly in prd and I'm thinking it would be convenient to throw a resource limit jail around it in dev to avoid having poor performance sneak up on us

rgm23:02:15

I guess working from a resource-constrained VM would work

rgm23:02:27

anyway, this all helps a ton, thanks

rgm23:02:40

(apologies for any questions that made no sense)

jumar07:02:24

Examine your production app's memory footprint and see if it's heap or off-heap. If it's a heap problem then you can try to set a reasonable -Xmx in both prod and dev. If you want to restrict total RSS then consider running it in Docker

👍 1