2025-10-21 clojure-dev | Clojure Slack Archive

clojure-dev 2025-10-21

quoll 2025-10-21T13:13:15.745339Z

The other day a colleague noted that (when-not (empty? coll) ...) was evaluating faster than (when (seq coll) ...). I thought this was odd, given the implementation of empty?, and then saw that it was updated in clojure-1.12.0 when it was extended to include counted non-seq collections. Cool 🙂 However, the docstring still says the following:

> To check the emptiness of a seq, please use the idiom (seq x) rather than (not (empty? x))

Should this text be updated?

👀 1
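
For readers following along, here is a minimal sketch of the counted-first check that the message above describes. The name counted-first-empty? is hypothetical, and this is an illustration of the idea rather than a copy of the clojure.core source.

```clojure
;; Hypothetical helper: check the count when the collection is counted,
;; otherwise fall back to the seq-based test.
(defn counted-first-empty?
  [coll]
  (if (counted? coll)
    (zero? (count coll))   ; counted collections: no seq is created
    (not (seq coll))))     ; non-counted inputs (lazy seqs, nil): use seq

(counted-first-empty? [])            ; a vector takes the counted branch
(counted-first-empty? (map inc []))  ; a lazy seq takes the seq branch
```

Both calls above return true; the difference is only in which branch does the work.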

seancorfield 2025-10-21T13:14:12.636529Z

https://ask.clojure.org/index.php/14724/docstring-empty-should-perhaps-recommend-instead-anymore

quoll 2025-10-21T13:15:00.597079Z

ha. I wasn't looking at updating code, so it didn't occur to me to check with ask clojure 🤦‍♀️

borkdude 2025-10-21T13:15:19.954479Z

maybe I can try to make the warning in clj-kondo more conditional or remove it altogether...?

seancorfield 2025-10-21T13:15:30.943079Z

I have it open in a tab all the time and refresh it every morning 🙂

😂 2

quoll 2025-10-21T13:15:56.596479Z

It makes me wonder about the utility of "advice" in docstrings. This isn't the first time I've seen inefficient code become much faster in later versions.

dpsutton 2025-10-21T13:16:42.081979Z

it’s interesting that now empty? has performance differences. reminds me that last could be better for indexed collections but deliberately does not do this so the performance is always known, even if suboptimal, rather than surprising

➕ 1
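
A small illustration of the point about last, assuming a plain vector: last always walks the collection as a seq, while peek and nth use the vector's indexed structure.

```clojure
(def v (vec (range 1000)))

(last v)                 ; walks the seq of v one element at a time
(peek v)                 ; for a vector, reads the final element directly
(nth v (dec (count v)))  ; indexed lookup of the same element
```

All three expressions return 999; only the cost differs.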

borkdude 2025-10-21T13:16:42.123269Z

something like this:

(not (empty? [1 2 3])) ;; not warn on vector, set, ...
(not (empty? '(1 2 3))) ;; do warn

quoll 2025-10-21T13:16:58.681869Z

You're much more focused on such things than I have the bandwidth for @seancorfield 🙇

quoll 2025-10-21T13:18:21.121159Z

@borkdude at face value, yes, but how good is clj-kondo at type-inference?

borkdude 2025-10-21T13:18:37.249149Z

if it can be statically inferred, decent

borkdude 2025-10-21T13:18:52.876769Z

maybe kondo should only warn when it can infer the thing is already a seq

💯 1

dpsutton 2025-10-21T13:18:58.032849Z

hmm, but this isn’t actually O(1) vs linear time. It’s just the time of the seq vs checking a count field if present. Seems not the same as last i guess

👍 1

borkdude 2025-10-21T13:18:58.107009Z

that's more in the spirit of kondo

borkdude 2025-10-21T13:19:59.953089Z

(https://github.com/clj-kondo/clj-kondo/issues/1743)

quoll 2025-10-21T13:21:14.142729Z

It's only a tiny speed difference. But the docstring stands out by making the request to use the seq idiom. It's also the only use of the word "please" in clojure.core

😂 9

borkdude 2025-10-21T13:31:29.882409Z

btw when a seq is fully realized, clojure could also (in the future) store the length in a field which makes it cheaper to calculate the length twice...

🤯 1

borkdude 2025-10-21T13:35:07.435509Z

with seq inference:

borkdude 2025-10-21T13:35:21.340379Z

(note that it doesn't warn on x)

quoll 2025-10-21T13:39:34.639799Z

Saving the count is probably not expensive, given that any extra time will always be lost in the noise of realizing a lazy seq. Then again, I don't like the idea of:

=> (counted? lazy-coll)
false
=> (count lazy-coll)
5
=> (counted? lazy-coll)
true

It only seems referentially transparent if there could be some kind of private count that counted? doesn't know about
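
For contrast with the hypothetical REPL session above, this sketch shows how counted? and realized? behave on a lazy seq today. The var name lazy-coll is borrowed from the message and the data is made up for illustration.

```clojure
(def lazy-coll (map inc (range 5)))

(counted? lazy-coll)   ; false: a lazy seq is not a counted type
(count lazy-coll)      ; 5, and this walk realizes the seq
(counted? lazy-coll)   ; still false after counting
(realized? lazy-coll)  ; true: realization is tracked separately
(counted? [1 2 3])     ; true: vectors know their count
```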

dpsutton 2025-10-21T13:40:07.298199Z

yes that would be a drastic change in operations at runtime, even on the same type right?

borkdude 2025-10-21T13:40:16.270029Z

I'd say counted? is about the type of a thing, not about an internal perf optimization

👆 1

borkdude 2025-10-21T13:41:13.128549Z

realized? would probably be a better fit for this

borkdude 2025-10-21T13:42:19.698689Z

anyway, tangent

quoll 2025-10-21T13:42:34.430749Z

maybe #off-topic? 🙂

borkdude 2025-10-21T13:42:49.244199Z

sure

2025-10-21T13:48:08.238649Z

if it were easier to upstream patches to clojure, i'd say a good change here would be updating not-empty to use (not (empty? x)), and then speed-focused code can use not-empty, and seq can continue to be for conversion to seqs

2025-10-21T13:49:08.597349Z

(defn not-empty
  "If coll is empty, returns nil, else coll"
  {:added "1.0"
   :static true}
  [coll] (when-not (empty? coll) coll))
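
To try the proposed body without shadowing clojure.core/not-empty, one option is to define it under a different name. The name proposed-not-empty is hypothetical and exists only for this sketch.

```clojure
;; Same body as the proposal above, under a hypothetical name.
(defn proposed-not-empty
  [coll]
  (when-not (empty? coll) coll))

(proposed-not-empty [])     ; nil
(proposed-not-empty [1 2])  ; [1 2]
(proposed-not-empty nil)    ; nil

;; Return values agree with the existing function on these inputs.
(= (not-empty [1 2]) (proposed-not-empty [1 2]))
(= (not-empty "") (proposed-not-empty ""))
```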

borkdude 2025-10-21T13:49:27.867889Z

a seq can also be a coll

Alex Miller (Clojure team) 2025-10-21T13:52:59.138339Z

counted? is absolutely about perf, per the docstring

Alex Miller (Clojure team) 2025-10-21T13:53:28.280529Z

It knows this via a type marker, but that is an implementation detail

borkdude 2025-10-21T13:55:42.430229Z

I guess one could have a DynamicCounted protocol that returns true for things that know the count after being realized. But if the protocol call dominates the counting of the thing then that would be a waste as well

Alex Miller (Clojure team) 2025-10-21T13:56:29.645079Z

Cached count seq does not make sense. You’d burn a field for every cons cell

👍 1

Alex Miller (Clojure team) 2025-10-21T13:57:12.080209Z

Changing not-empty could make sense if someone wants to create an ask

👍 3

2025-10-21T13:57:55.368249Z

gimme a min, i'll write something up

borkdude 2025-10-21T14:08:26.437429Z

(here's the clj-kondo PR to reduce warnings about (not (empty? ...)) to only cases where the argument can be inferred to be a seq: https://github.com/clj-kondo/clj-kondo/pull/2644)

yuhan 2025-10-21T15:22:16.540989Z

I did a quick benchmark out of curiosity, surprised it came out to such a significant difference:

(let [v (vec (range 10000))]
  (println "\n======= seq =========")
  (c/quick-bench (reduce (fn [acc x]
                           (conj acc
                             (if (seq acc)
                               (+ (peek acc) x)
                               x)))
                   [] v))
  (println "\n====== empty? =======")
  (c/quick-bench (reduce (fn [acc x]
                           (conj acc
                             (if (not (empty? acc))
                               (+ (peek acc) x)
                               x)))
                   [] v)))

;=> 
======= seq =========
Evaluation count : 1890 in 6 samples of 315 calls.
             Execution time mean : 370.025570 µs
    Execution time std-deviation : 46.246173 µs
   Execution time lower quantile : 322.871537 µs ( 2.5%)
   Execution time upper quantile : 421.533935 µs (97.5%)
                   Overhead used : 2.010225 ns

====== empty? =======
Evaluation count : 2382 in 6 samples of 397 calls.
             Execution time mean : 292.466504 µs
    Execution time std-deviation : 40.310215 µs
   Execution time lower quantile : 260.595806 µs ( 2.5%)
   Execution time upper quantile : 338.465559 µs (97.5%)
                   Overhead used : 2.010225 ns

I guess allocating all those seq wrappers does add up after all if you're in a hot loop

👍 2

dpsutton 2025-10-21T15:27:05.760019Z

ah it’s small. i misread that as 370 vs 40 and was amazed at first

borkdude 2025-10-21T15:28:44.389589Z

25% is still nice
