Fork me on GitHub
#clojure
<
2021-08-09
>
Jim Newton09:08:15

question about performance. I've starting a bit of a performance analysis of a program and the first thing I see is that the following function seems to be a bottleneck.

(letfn [(cr [td operands]
          (cond (empty? operands)
                (unit td)

                (empty? (rest operands))
                (first operands)

                :else
                (cons (first td) operands)))]
  (defmethod create 'or
    [td operands]
    (cr td operands))
  (defmethod create 'and
    [td operands]
    (cr td operands)))
the cond is unfortunate because the sequence is rarely empty, yet the empty? check appears first. It is basically asking which of the three cases are occurring, 2-or-more elements (most likely), singleton, or empty. Is there any clever way to do this check more efficiently. I thought about a case and some soft of bounded count. but bounded count is not guaranteed to return the <= bound, as I understand. Perhaps something like the following,
(case (min 2 (bounded-count 2 operands))
            0 (unit td)
            1 (first operands)
            2 (cons (first td) operands))

Jim Newton09:08:23

does case compile to a goto-like structure, or does it compile to a sequence of if-then-else like structures?

Alex Miller (Clojure team)12:08:23

case ultimately complies to a single JVM tableswitch bytecode which is like an optimized goto

4
Jim Newton09:08:07

I could define a vector of three elements, and do something very cryptic like ((my-fun-arr (min 2 (bounded-count 2 operands)))) , note the double open paren to both get the nth element and then immediately funcall it.

Jim Newton09:08:18

I'm not sure how to interpret that. I think it means that sometimes case compiles to a table lookup and sometimes it doesn't. right? How can I force it to be a table lookup?

delaguardo09:08:28

I’m not a java expert. The link is here only because I remember looking at it when I read through the source code

p-himik09:08:48

(case (bounded-count 2 operands)
  0 (unit td)
  1 (first operands)
  (cons (first td) operands))
This avoids an extra call for min and should generate a table lookup since all the keys are integers.

☝️ 4
p-himik09:08:51

And if you know for sure that operands is counted?, just use (count operands). Although given that cons, I suspect that's not the case.

Jim Newton10:08:29

hmm. interesting because that was the opposite of my intuition. if I give a default as the final operand of case, my intuition was that it would have to do the check twice, once for the table out-of-bounds, and a second time for the table lookup.

Ben Sless10:08:31

This gets down to the fine details of how case is compiled

Ben Sless10:08:14

Roughly

21: iload           G__15905
              23: tableswitch {
                          2: 52
                          3: 73
                          4: 94
                          5: 115
                    default: 136
                  }
This loads the case value, the table switches on value and goes to line

🎉 4
Ben Sless10:08:43

At the end of each case you have

70: goto            137
Which is the "end" of the switch statement

Jim Newton10:08:58

in all normal situations (excluding machine generated test cases) operands is going to be a list which a human being has typed in, thus is 2,3,4,5 but rarely more. So perhaps simply count is better than (min 2 (bounded-count ...))

Ben Sless10:08:37

The example I pasted is from my own decompiled code

4
Ben Sless10:08:04

More to your case:

(defn foo
  [x]
  (case (count x)
    0 100
    1 200
    300))
decompiles to
public static Object invokeStatic(final Object x) {
        final int G__15943 = RT.count(x);
        Object o = null;
        switch (G__15943) {
            case 0: {
                o = fast$foo.const__2;
                break;
            }
            case 1: {
                o = fast$foo.const__4;
                break;
            }
            default: {
                o = fast$foo.const__5;
                break;
            }
        }
        return o;
    }
And in bytecode:
public static java.lang.Object invokeStatic(java.lang.Object x);
        Flags: PUBLIC, STATIC
        Code:
                  linenumber      1
               0: aload_0         /* x */
                  linenumber      3
               1: invokestatic    clojure/lang/RT.count:(Ljava/lang/Object;)I
               4: istore_1        /* G__15951 */
                  linenumber      3
               5: iload_1         /* G__15951 */
               6: tableswitch {
                          0: 28
                          1: 34
                    default: 40
                  }
              28: getstatic       clj_fast/fast$foo.const__2:Ljava/lang/Object;
              31: goto            43
              34: getstatic       clj_fast/fast$foo.const__4:Ljava/lang/Object;
              37: goto            43
              40: getstatic       clj_fast/fast$foo.const__5:Ljava/lang/Object;
              43: areturn

4
Jim Newton10:08:48

BTW, wrt clojure syntax, I always forget that the default of case takes no :else but they default of cond takes an :else .

Ben Sless10:08:57

cond doesn't actually have a default, it's just a convention

Ben Sless10:08:03

you can put anything there

Ben Sless10:08:22

It's just an expression which is always true

Jim Newton10:08:23

well anything that is not false 🙂

Jim Newton10:08:11

In the end, I don't think the optimized code is too cryptic.

(letfn [(cr [td operands]
          (case (count operands)
            0 (unit td)
            1 (first operands)
            (cons (first td) operands)))]
  (defn create-or [operands]
    (cr '(or) operands))
  (defn create-and [operands]
    (cr '(and) operands)))

Jim Newton10:08:43

since this is bottle-neck code, it might even be better to inline the case twice, and just put a comment that the code is repeated because it is critical path

Ben Sless10:08:12

Just make sure it's actually the bottleneck first

Ben Sless10:08:37

It seems like a small, well behaved function, it shouldn't be a bottle neck. It can lie, though, because of laziness. If operands are lazy, count will realize them and you'll see the call stack of the lazy sequence realization in the stack of cr and not the function which returned the lazy sequence

Jim Newton10:08:15

you might be right that it's not really a bottleneck. I was supposing this from my first unexperienced reading of the flamegraph

Jim Newton10:08:03

I don't feel bad because the optimized version isn't very ugly.

Jim Newton10:08:15

hmmm. good point about laziness. counting just one more element might have unexpected side effects. indeed. there may be cases in my code where counting one more element means realizing a lazy value by doing a tree traversal to compute it 😞

Jim Newton10:08:00

maybe I should change it back to bounded count???

Ben Sless10:08:21

No need, you can work with laziness

Ben Sless10:08:37

Instead of counting, just take first

Ben Sless10:08:53

if you get nil, the count is 0

Jim Newton10:08:14

no, I know that either the first element is nil, or it has 0 length. I'd still need to distinguish those cases

Ben Sless10:08:46

okay, then use seq, hold on, cooking up an example

Jim Newton10:08:30

aren't we going in circles. calling seq is back to my idea of calling empty?

p-himik10:08:39

IIRC seq will still realize the first chunk. so there's no O(...) difference between seq and bounded-count.

Jim Newton10:08:46

empty? is just a small wrapper around seq

Ben Sless10:08:49

(defn foo
  [xs]
  (if (seq xs)
    (if (next xs)
      :more-than-one
      :just-one)
    :empty))

Ben Sless10:08:00

seq realizes a chunk, count realizes all

Jim Newton10:08:16

bounded-count realizes a chunk. right?

p-himik10:08:48

Yes, because it calls seq internally.

Jim Newton10:08:57

who decides the chunk size?

Ben Sless10:08:07

(defn bounded-count
  "If coll is counted? returns its count, else will count at most the first n
  elements of coll using its seq"
  {:added "1.9"}
  [n coll]
  (if (counted? coll)
    (count coll)
    (loop [i 0 s (seq coll)]
      (if (and s (< i n))
        (recur (inc i) (next s))
        i))))
If it's counted it will count all

Jim Newton10:08:35

well most of my sequences (rule-of-thumb) 5 or less.

Jim Newton10:08:47

only in machine generated test cases are the sequences really long.

Ben Sless10:08:59

oh, lazy-seq isn't counted

p-himik10:08:18

> If it's counted it will count all Just to be clear - that will be O(1) because that's what "counted" means.

p-himik10:08:39

/* A class that implements Counted promises that it is a collection
 * that implement a constant-time count() */

👍 4
Jim Newton10:08:19

O(1) is a bit of a deception. It is O(1) times the complexity of generating an element, which in some cases might be exponential if it involves a tree traversal. right?

p-himik10:08:23

Unless you need to check for more than 2 elements, the code using seq and next above is probably the simplest one.

p-himik10:08:10

@U010VP3UY9X No. counted? means that there will be no generation of any sort. Basically, the number is already baked into the structure.

p-himik10:08:34

But yeah, as Ben says - get the actual data, measure, and only then start caring.

Jim Newton10:08:39

isn't seq and next equivalent to empty? and (empty? (rest ...)) ? which is what my original code was.

p-himik10:08:23

No, seq is the exact opposite of empty? Check the docstring of the functions, especially empty?.

Jim Newton10:08:55

(defn empty?
  "Returns true if coll has no items - same as (not (seq coll)).
  Please use the idiom (seq x) rather than (not (empty? x))"
  {:added "1.0"
   :static true}
  [coll] (not (seq coll)))

Jim Newton10:08:05

same in the sense of complexity.

Ben Sless10:08:18

I cheat a little. (rest []) = (), (next []) = nil, which allows me to avoid one more check

Jim Newton10:08:24

of course opposite logically/semantically

p-himik10:08:30

All solutions so far are the same in the sense of complexity, except for the count one.

Jim Newton10:08:14

hmmm. so my original code was perhaps better. To repeat. my original complaint was that I am always checking emptiness first even though I know that the sequence is rarely empty.

Jim Newton10:08:38

perhaps an emptiness check must always be done, even internally by count ?

Ben Sless10:08:57

This is slightly a case of looking for evidence under the flashlight. As long as we don't know where the real performance impact is, just go with what you find most readable. I can show you gloriously mangled code which makes the JVM happy

practicalli-johnny10:08:46

Anyone used Clojure and Kafka to spin up additional services on AWS (or similar cloud service) based on new partitions in Kafka? Elaborated here https://clojurians.slack.com/archives/C09N0H1RB/p1628505238008500

nooga12:08:21

I was fooling around with protocols and this puzzles me:

user=> (defprotocol Quux)
Quux
user=> Quux
{:on user.Quux, :on-interface user.Quux, :sigs nil, :var #'user/Quux, :method-map {}, :method-builders {}}
Unable to resolve symbol: typeof in this context
user=> (type Quux)
clojure.lang.PersistentArrayMap
user=> user.Quux
user.Quux
user=> (type user.Quux)
java.lang.Class
user=>
why is Quux a map?

p-himik12:08:20

A protocol is a named set of named methods and their signatures
That's what you see when you evaluate Quux.
defprotocol will automatically generate a corresponding interface,
  with the same name as the protocol, i.e. given a protocol:
  my.ns/Protocol, an interface: my.ns.Protocol
That's what you see when you evaluate user.Quux.

nooga12:08:22

is the protocol itself used for anything then?

p-himik12:08:47

Of course - you pass it to deftype / defrecrod / extend-protocol / ...

ghadi13:08:48

#’Quux is a var, containing a map with some information about the protocol

nooga16:08:47

is there anything else in the language that does this? AFAIK this is different from how deftype and definterface operate

Jim Newton17:08:43

Can someone help me with laziness. Does this code guarantee that the predicate, pred is NOT called after it returns true the first time? Do I need to control the call more precisely if that is important to me?

(defn conversion-C10
  "(and A B C) --> (and A C) if A is subtype of B
   (or A B C) -->  (or B C) if A is subtype of B"
  [self]
  (letfn [(pred [u]
            (exists [v (operands self)]
                    (and (not= u v)
                         (= (annihilator self u v) true))))]
    (let [subs (filter pred (operands self))]
      (if (empty? subs)
        self
        (let [sub (first subs)
              keep (setof [sup (operands self)]
                          (or (= sup sub)
                              (not= (annihilator self sub sup) true)))]
          (create self keep))))))
I think it is not important for the question, but here is the code for exists
(defmacro exists
  "Test whether there exists an element of a sequence which matches a condition."
  [[var seq] & body]
  `(some (fn [~var]
           ~@body) ~seq))

dpsutton17:08:22

Filter makes no such guarantees and will quite possibly do more “work” here than you intend. You do not have control over the amount of laziness like you are hoping

walterl17:08:43

Would (first (split-with pred ...)) be better? Or (take-while (complement pred) ...)?

emccue17:08:53

Lazy sequences usually realize 32 at a time

JohnJ17:08:52

he could use filterv

dpsutton17:08:19

lazy sequences are not a viable execution control strategy in clojure

Alex Miller (Clojure team)18:08:32

there are no guarantees about how many elements of a lazy sequence are evaluated

Jim Newton08:08:16

yes, that is what I was hoping to avoid because the filter option is so elegant. 😞

Jim Newton08:08:11

unfortunately with loop/reduce I have to make an assumption about the type of sequence. i.e., whether it is list-like [`first`/`rest`] or whether it is index based. In these types of cases I try to use reduce/`reduced` if possible.

Alex Miller (Clojure team)18:08:02

you can use loop/recur with exit on a condition or reduce with reduced to early exit as other options

walterl18:08:42

Does that hold if we only care about how many times pred is called, and not how many elements of the lazy seq are realized?

noisesmith18:08:28

you can use loop or reduce to call pred, but if the function calling pred is lazy you no longer control how many times it is called

🙌 2
noisesmith18:08:08

@UJY23QLS1 to be more specific, in the example above pred is passed to filter, and called by filter, since filter is lazy you can't have true control of when and how many times it uses its arg

noisesmith18:08:43

(also editorial comment using a side effecting function as a filter seems very odd...)

walterl18:08:33

Makes sense. Great answer, thanks 🙌

potetm19:08:16

Ok. After using Reveal for like 5 min, I recant all previous reservations re: tap> . It’s super nice to have real values from the middle of a process show up in the repl on demand. cc: @seancorfield @vlaaad

🎉 10
seancorfield19:08:58

Yeah, I love Reveal + tap> as development aid/tool.

reveal 5
✔️ 9
potetm19:08:03

I wasn’t too excited about the idea of making tap> a middleman for an atom when def works almost as well. But having dev tool integration is a real improvement.

seancorfield19:08:45

The nice thing about tap> is that you can leave it in your code and let it go to production if you want and it's "harmless". Which means you can also attach a viewer to in via a Socket REPL in production if you need to debug something complex there.

🙌 2
potetm19:08:29

Right. Or, even if you don’t jack in to prod, you have it in your dev env all the time, just in case.

2