#clojure
2021-08-11
tessocog00:08:47

no i meant in a more general form, where match would walk the data and either keep or drop that point based on conformity to spec in a greedy manner

tessocog00:08:03

i'm not sure if it makes sense

tessocog00:08:57

it should be possible with the help of specter selected? fn

(let [apath (spec-form->apath (s/form myspec))]
    (specter/select apath
                    mydata))

Jim Newton08:08:15

question about try/catch/finally. The docstring at https://clojuredocs.org/clojure.core/try seems to be missing a word in the first sentence: what does "the value of the last is returned" mean?

Jim Newton08:08:47

What I'd really like to do is return the value of finally, but that doesn't appear to be a feature of try

p-himik10:08:26

And you're correct, you can return something only from try or catch.
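To make the point about return values concrete, here is a minimal sketch (the names `events` and `demo` are made up for this illustration). The value of a try form comes from the last expression of its body, or from the catch clause that handled an exception; finally runs only for its side effects and its value is discarded.

```clojure
;; Hypothetical demo; `events` and `demo` are illustration-only names.
(def events (atom []))

(defn demo [throw?]
  (try
    (swap! events conj :body)
    (when throw? (throw (ex-info "boom" {})))
    :from-try
    (catch clojure.lang.ExceptionInfo _
      (swap! events conj :catch)
      :from-catch)
    (finally
      ;; Runs in both cases, but :ignored is never returned.
      (swap! events conj :finally)
      :ignored)))

(demo false) ;; => :from-try
(demo true)  ;; => :from-catch
```

If a value computed in finally is needed afterwards, one option is to store it in an atom, as `events` does here.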

Jim Newton09:08:30

@delaguardo I used your trick of naming methods like the following so that the profiler will include the name rather than just fn-some-random-number

(defmethod -canonicalize-type 'and method-canonicalize-type-and
  [type-designator nf]
  (find-simplifier type-designator
                   (combination-simplifiers nf)))
It works well. Thanks for the good advice. I wonder whether you might have a trick for the following case. I have some globally defined functions, and I have an array of them.
(defn combination-simplifiers [nf]
  [conversion-C1
   conversion-C2
   conversion-C3
   conversion-C4
   conversion-C5
   conversion-C6
   (fn conversion-C7-nf [td] (conversion-C7 td nf))
   conversion-C9
   conversion-C11
   conversion-C12
   conversion-C13
   conversion-C14
   conversion-C15
   conversion-C16
   conversion-D1
   conversion-D3
   conversion-C8
   conversion-C10
   conversion-C98
   (fn conversion-C99-nf [td] (conversion-C99 td nf))])
find-simplifier (https://gitlab.lrde.epita.fr/jnewton/clojure-rte/-/blob/1126a85b16dcff55824f50840e8c43c80af93a7a/src/clojure_rte/util.clj#L201) is just a function which iterates over the array given as its second argument and calls the functions until some application-specific condition is met. Unfortunately, these functions seem to appear in the profiling as fn- followed by some random number. That's bizarre, as the functions have real names.
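For readers wondering why the naming trick helps: a profiler reports the JVM class name of each function, and Clojure derives that class name from the optional name given to fn. The following sketch uses made-up names (`shape-area`, `area-square`, `fn-class-name`) and is not code from clojure-rte.

```clojure
;; Illustration only: `shape-area` and `area-square` are made-up names.
(defmulti shape-area :kind)

;; The symbol after the dispatch value becomes the name of the underlying fn.
(defmethod shape-area :square area-square
  [{:keys [side]}]
  (* side side))

(defn fn-class-name [f]
  (.getName (class f)))

(fn-class-name (fn [x] x))                      ;; contains "fn__" plus a number
(fn-class-name (fn my-named-fn [x] x))          ;; contains "my_named_fn"
(fn-class-name (get-method shape-area :square)) ;; contains "area_square"
```

A method defined without such a name gets the generic fn__ prefix, which would explain anonymous-looking frames for conversion functions that are themselves multimethods.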

p-himik10:08:29

Your conversion-* functions seem to also be multimethods. Is the same naming approach used on them, like (defmethod conversion-D1 'and method-conversion-D1-and ...)?

Jim Newton12:08:47

good catch. indeed some but not all my conversion-* functions are methods. most are defined with defn.

Jim Newton12:08:03

anyway, I'll apply the naming to those that are methods and see what happens.

Jim Newton12:08:53

yup, that's an improvement

👍 2
Jim Newton08:08:49

BTW I discovered that I can indeed profile allocation. Just use (prof/start {:event :alloc}) instead of (prof/start {}). Some other tricks http://clojure-goes-fast.com/blog/clj-async-profiler-tips/.

wei10:08:58

algorithm question for you guys, let's say i have 15 of each type of playing card (15 ace of spades, 15 two of hearts, ... etc) and I want to make packs of 3 without any pack having repeating cards. in general, i want to take a collection of collections of varying length and return a collection of collection of length N where the inner collections are non-repeating combinations of the cards. normally I'd just (->> card-types (map (partial repeat 15)) shuffle (partition 3)) and call it a day but the non-repeating requirement makes it more complicated.

pyry10:08:13

Perhaps take a look at clojure.math.combinatorics/combinations or perhaps other functions from that library.

pyry10:08:28

Not sure if you can just generate all the combinations and then shuffle those - depends on the parameters to your problem.

pyry10:08:23

Perhaps something like the following could work:

pyry10:08:05

1. Pick a random combination of cards (that you still have in reserve).
2. Return the hand and do whatever bookkeeping is necessary: decrease the counts of the cards in the chosen hand, etc.
3. Loop if more hands need to be chosen.

wei10:08:18

thanks! I have an iterative solution that's similar to what you suggested but it doesn't always result in a solution if you end up with one bucket full and the others empty. my hack is to retry until it runs. wondering if there's a more elegant way to solve it, perhaps framing it as a graph traversal problem?

Billy Moon11:08:08

Perhaps create empty packs first, then deal all your aces to random packs that still need cards, then all the twos etc... when you are dealing, you ensure you don't deal two aces to the same pack.

wei13:08:13

oh, that's a great idea! reversal of what I've been doing

wei21:08:59

(defn deal-hands [n decks]
  (loop [left (mapv shuffle decks)
         results []]
    (if (empty? left) results
        (recur (->> (map pop left) (remove empty?))
               (concat results (->> (map peek left) shuffle (partition n)))))))
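One caveat about the code above: partition drops a trailing group shorter than n, so cards can be lost in any round where the number of remaining types is not a multiple of n. As a hedged alternative, here is a deterministic variation on the "deal into packs" idea. It is a sketch under stated assumptions, not wei's solution: `deal-round-robin` is a hypothetical name, the total card count must be divisible by n, and no card type may have more copies than there are packs. Copies of one type are kept adjacent and dealt to consecutive packs, so no pack can receive the same type twice.

```clojure
;; A sketch, not the solution from the thread; `deal-round-robin` is hypothetical.
(defn deal-round-robin [n decks]
  (let [cards (apply concat (shuffle decks)) ; copies of one type stay adjacent
        total (count cards)
        packs (quot total n)]
    (assert (zero? (rem total n)) "total card count must be divisible by n")
    (assert (every? #(<= (count %) packs) decks)
            "no type may have more copies than there are packs")
    ;; Card i goes to pack (i mod packs); adjacent copies land in distinct packs.
    (reduce (fn [acc [i card]]
              (update acc (rem i packs) conj card))
            (vec (repeat packs []))
            (map-indexed vector cards))))

;; 6 card types with 5 copies each, dealt into 10 packs of 3.
(deal-round-robin 3 (for [t (range 6)] (repeat 5 t)))
```

The result is not uniformly random, since only the order of the types is shuffled; it trades randomness for a guarantee that dealing never gets stuck.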

simongray12:08:06

I’ve been using #(some some? %) to check that a sequence contains some content (and not just nils), but is there a less awkward way of doing this?

p-himik12:08:35

Depends on what "awkward" means. Maybe #(not (every? nil? %)) is better.

simongray12:08:23

I guess to me the double “some some” is a bit awkward and doesn’t communicate its meaning strongly.

simongray12:08:04

I guess #(not-every? nil? %) is more clear in its intent.

p-himik12:08:09

To communicate the meaning strongly, just use a name:

(defn has-some-content? [coll]
  (some some? coll))

p-himik12:08:26

Oh, right, there's also not-every?.

simongray12:08:42

yeah, I keep forgetting about the not- versions of most of the core functions too, but your reply reminded me of it 🙂
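A small comparison of the two spellings discussed above (the `samples` vector is made up for illustration). Both treat false as content and an empty collection as having none; the visible difference is that some returns nil where not-every? returns false.

```clojure
;; Made-up sample inputs: all nils, one value, empty, and a lone false.
(def samples [[nil nil] [nil 1] [] [false]])

(map #(some some? %) samples)      ;; => (nil true nil true)
(map #(not-every? nil? %) samples) ;; => (false true false true)
```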

Billy Moon12:08:41

I am working on a web app, which has services organised into a system using stuart sierra component. Throughout the app, we import functions from the namespace, and pass the corresponding service as the first argument. This allows the function to operate on that service, and also to gain access to other dependent services in a reasonable way. I am wondering though, whether it would be better during the application bootup to create partials of the functions we generally import through the namespace, binding them to the services they need. My thinking is that as soon as we have the service initialised, we can bind it once, then all the consumers of the newly made partial don't have to worry about the services required etc... they can just call with arguments relevant to their own business logic, and also could be passed more specific functions rather than passing the whole service. I would appreciate any feedback on what the tradeoffs might be, and if there is a convention of some kind where could I find what that is?

jkrasnay14:08:16

I don’t think that would be a great approach. Typically, services managed by Component are stateful things you need to start and stop like DB connection pools, and any function that uses one of them is not pure. That pain you feel passing them around is good. It should encourage you to separate your business logic out into pure functions and to keep the ones that touch stateful components simple and shallow.

Jim Newton12:08:15

does anyone have a clever algorithm to generate a (possibly infinite) lazy sequence of pairs (a b) or [a b] given another possibly infinite lazy sequence which has the property that if (a b) is in the generated sequence then a strictly precedes b in the input sequence? For example given (1 2 3) I want to generate ((1 2) (1 3) (2 3)) .

delaguardo12:08:01

(let [xs '(1 2 3)]
  (for [x xs
        y xs
        :when (< x y)]
    [x y]))

Jim Newton12:08:06

doesn't work for

(let [xs '(3 2 1)]
  (for [x xs
        y xs
        :when (< x y)]
    [x y]))

Jim Newton12:08:51

also doesn't work if xs is infinite.

Jim Newton12:08:13

because the first loop will never finish, and y will never take on the 2nd value.

Jim Newton12:08:21

my feeling is it is not possible without consuming a huge amount of memory. maybe I'm wrong

Alex Miller (Clojure team)13:08:22

math.combinatorics lib may have stuff like this

delaguardo13:08:26

it does work in both cases.

(let [xs '(3 2 1)]
  (for [x xs
        y xs
        :when (< x y)]
    [x y])) ;; => ([2 3] [1 3] [1 2])
and an example with infinite input
(first
  (let [xs (range)]
    (for [x xs
          y xs
          :when (< x y)]
      [x y]))) ;; => [0 1]

Jim Newton13:08:15

if the input is '(a b c d e f g) I want to generate (for example) all the pairs not containing d before generating any which contain d.

Jim Newton13:08:58

but won't all the elements of your second suggestion be [n 1] for some n? when does y advance to the 2nd value?

Jim Newton13:08:25

OK, I should try it before commenting.

delaguardo13:08:33

they all will be [0 n]

Jim Newton13:08:55

clojure-rte.rte-core> (take 10 (let [xs (range)] (for [x xs y xs :when (< x y)] [x y])))
([0 1] [0 2] [0 3] [0 4] [0 5] [0 6] [0 7] [0 8] [0 9] [0 10])
clojure-rte.rte-core>

Jim Newton13:08:24

indeed so the sequence doesn't contain all the pairs 😞

delaguardo13:08:34

right, but because input is infinite it can’t output finite result without extra conditions

Jim Newton13:08:51

perhaps it is obvious what my goal is, but I'm trying to find the first pair (a b) which matches a given predicate, but realising as little as possible from the input sequence.

Jim Newton13:08:42

yes but it can generate an infinite sequence of ALL the pairs. your suggestion fails to do that. for example it could generate [0 1] [0 2] [0 3] [1 2] [0 4] [1 3] [0 5] [1 4] [2 3] [0 6] [1 5] [2 4] .... i.e., first all the pairs which sum to 1, then all the pairs which sum to two, then all the pairs which sum to 3 ....

Jim Newton13:08:53

that would contain all the pairs, not just the ones starting with 0

Jim Newton13:08:08

maybe that's my solution......

Jim Newton13:08:29

however, it would be great if [1 2] were generated before [0 3] as that would mean in some cases we would never have to realize the 3rd element of the input sequence.

delaguardo13:08:51

strictly speaking it will generate ALL the pairs. But to materialize ALL of them you will need to wait infinite time

Jim Newton13:08:12

no, your algorithm will never generate [1 2]

Jim Newton13:08:53

I would like an algorithm that will generate any pair after some finite time.

Jim Newton13:08:58

@U064X3EF3 with regard to math.combinatorics and similar libraries: I don't understand how to find the documentation. Is it expected that prospective users read the source code to find all the functions, or am I missing something obvious? I see the examples, but not the list of functions and their explanations.

p-himik13:08:18

math.combinatorics has combinations, but it doesn't work with infinite inputs.

lassemaatta13:08:09

if I'm not mistaken, the algorithm you're looking for sounds a lot like "zigzag ordering" (not sure what's the official name), which traverses an array like this: https://upload.wikimedia.org/wikipedia/commons/thumb/4/43/JPEG_ZigZag.svg/200px-JPEG_ZigZag.svg.png. the example is a finite array, but i'd guess you could generate an infinite sequence of indices

p-himik13:08:20

Similar, but not exactly like this. Zigzag won't match all pairs.

p-himik13:08:37

Something like this? @U010VP3UY9X

(defn lazy-combinations
  ([coll]
   (lazy-combinations [(first coll)] (next coll)))
  ([seen coll]
   (when (seq coll)
     (lazy-seq
       (let [x (first coll)]
         (concat (map (fn [s]
                        [s x])
                      seen)
                 (lazy-combinations (conj seen x) (next coll))))))))
(take 10 (lazy-combinations (range)))
=> ([0 1] [0 2] [1 2] [0 3] [1 3] [2 3] [0 4] [1 4] [2 4] [3 4])
(some #(= [100 200] %) (lazy-combinations (range)))
=> true
(vec (lazy-combinations (range 5)))
=> [[0 1] [0 2] [1 2] [0 3] [1 3] [2 3] [0 4] [1 4] [2 4] [3 4]]

🎉 2
p-himik13:08:06

First time ever I had to use lazy-seq, heh.

Jim Newton14:08:42

Here is my attempt which won't work because (into [] seq) will never finish.

(defn lazy-pairs-2
  [seq]
  (cond (empty? seq)
        ()

        :else
        (let [vec (into [] seq)]
          (for [i (range (count vec))
                j (range i)]
            [(vec j) (vec i)]))))

p-himik14:08:12

Is there something wrong with my version or are you simply trying to come up with your own algorithm?

Jim Newton14:08:23

no, I think your algorithm produces the same as mine. I think I'll use it. I was wondering whether it could be done without recursion.

Jim Newton14:08:41

it can't be done with loop/recur because the recursion is not tail-recursion.

p-himik14:08:47

Mine doesn't use recursion, at least not in a way that consumes the stack.
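On the question of whether this can be written without explicit recursion: one possible sketch uses for, pairing each element with everything that precedes it in the input. The name `pairs-via-for` is hypothetical, and this is an alternative to lazy-combinations, not a replacement endorsed in the thread. It yields the same order, so [1 2] still appears before [0 3].

```clojure
;; Hypothetical helper name; same pair order as lazy-combinations above.
(defn pairs-via-for [coll]
  (for [[i b] (map-indexed vector coll) ; b is the element at index i
        a     (take i coll)]            ; every element strictly before b
    [a b]))

(take 6 (pairs-via-for (range)))
;; => ([0 1] [0 2] [1 2] [0 3] [1 3] [2 3])
```

Like the seen vector, this keeps the realized prefix of the input in memory. With chunked inputs (vectors, bounded ranges), for and map-indexed may realize elements a chunk at a time, so "as few as possible" holds only for unchunked lazy sequences.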

Jim Newton14:08:53

@U0178V2SLAY yes that's the idea except that it too greedily realizes the input sequence. For example, it realizes [0 3] before [1 2].

Jim Newton14:08:23

@U2FRKM4TW is that true? I didn't realize that subtlety.

p-himik14:08:27

As a demonstration:

(first (drop 100000000 (lazy-combinations (range))))
=> [8989 14142]
A recursive version that consumes the stack would've definitely exploded there.

p-himik14:08:55

lazy-seq creates a closure. See the top example here: https://clojuredocs.org/clojure.core/lazy-seq

Jim Newton14:08:07

I guess concat , lazy-seq, and lazy-combinations really do return before the otherwise recursive call to lazy-combinations happens, right?

Jim Newton14:08:48

Great. I think I'll use your code if you don't mind. and I'll put a comment with you as the author.

p-himik14:08:49

Right. Well, except for the only truly recursive call due to the additional arity. Sure, go ahead! :) Happy to help.

Jim Newton14:08:06

BTW you use a vector as seen. why? why not a list?

Jim Newton14:08:30

cons onto list is faster than conj onto vector, right?

Jim Newton14:08:52

sorry if that's a stupid question.

p-himik14:08:17

And since it's my first usage of lazy-seq, perhaps it could be improved - may be worth asking a separate question in the main channel.
Not stupid at all! Perhaps a list with cons is better. But you'd have to measure really carefully. And it's O(1) either way. Besides, using a list with cons will change the order of the pairs. No idea if that matters in your case, but I find the vector-based result more intuitive.

✔️ 2
Alex Miller (Clojure team)14:08:26

cons onto list is faster than conj onto vector

p-himik14:08:55

But we also iterate over it after each cons/conj. A list would still be faster?

Alex Miller (Clojure team)14:08:24

I believe so, yes (but iterating is going to be the same time complexity either way)

👍 2
Alex Miller (Clojure team)14:08:20

with vectors, conj is not really O(1). if there's room in the tail (best/average case) then you add it to the tail. but in the worst case, the tail is full and you're (potentially) making 1 or more new tree nodes

p-himik14:08:40

Thanks! Makes sense.

Jim Newton12:08:40

what I'm currently doing is generating all the pairs that start with the first element, and concating that with the pairs generated from the tail. However, this will never finish if the input is infinite. Honestly, I really don't care about infiniteness, I just want to realize as few as possible from the input sequence.

Jim Newton13:08:39

Can someone help me understand what's wrong with this macro?

(defmacro foo
  "Test whether there exists an element of a sequence which matches a condition."
  [[var seq] & body]
  `(some (fn exists-some [~var]
           ~@body) ~seq))
When I call the macro
(foo [x '(1 2 3)] (even? x))
I get an error from spec
Syntax error macroexpanding clojure.core/fn at (clojure-rte:localhost:54898(clj)*:1311:23).
clojure-rte.rte-core/exists-some - failed: vector? at: [:fn-tail :arity-1 :params] spec: :clojure.core.specs.alpha/param-list
clojure-rte.rte-core/exists-some - failed: (or (nil? %) (sequential? %)) at: [:fn-tail :arity-n] spec: :clojure.core.specs.alpha/params+body

ghadi13:08:06

use macroexpand on your form and see what pops out

Jim Newton14:08:08

I think macroexpand fails.

ghadi14:08:54

why guess when you can check?

ghadi14:08:09

Clojure 1.10.3
user=> (defmacro foo
  "Test whether there exists an element of a sequence which matches a condition."
  [[var seq] & body]
  `(some (fn exists-some [~var]
           ~@body) ~seq))
#'user/foo
user=> (macroexpand '(foo [x '(1 2 3)] (even? x)))
(clojure.core/some (clojure.core/fn user/exists-some [x] (even? x)) (quote (1 2 3)))
user=>

ghadi14:08:07

so that fn form needs to be fixed

Jim Newton08:08:24

yes I really don't understand clojure symbol resolution well enough to understand why I need ~' before the exists-some symbol in the macro definition. But that fixed it. Several people suggested the same thing.

Jim Newton13:08:57

if I remove the exists-some symbol it works fine.

p-himik13:08:43

Replace it with ~'exists-some.

✔️ 2
Jim Newton13:08:44

I suspect it has something to do with the namespace of exists-some

Darin Douglass13:08:05

yep, the syntax quote is fully namespacing exists-some

✔️ 2
delaguardo13:08:07

~'exists-some

✔️ 2
Jim Newton13:08:40

sometimes clojure macros can be uglier than they need to be. grumble
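For reference, a sketch of the fixed macro as suggested above, using the name exists that appears later in the thread. The parameter names `sym` and `coll` are renamed here for clarity and are not from the original.

```clojure
;; Sketch of the suggested fix; `sym` and `coll` are renamed parameters.
(defmacro exists
  "Test whether there exists an element of a sequence which matches a condition."
  [[sym coll] & body]
  ;; ~'exists-some emits the bare symbol. A plain exists-some inside the
  ;; syntax quote expands to a namespace-qualified symbol, which fn rejects.
  `(some (fn ~'exists-some [~sym]
           ~@body)
         ~coll))

(macroexpand-1 '(exists [x '(1 2 3)] (even? x)))
;; => (clojure.core/some (clojure.core/fn exists-some [x] (even? x)) (quote (1 2 3)))

(exists [x '(1 2 3)] (even? x)) ;; => true
```

An auto-gensym such as exists-some# would also avoid the namespace qualification, at the cost of a generated suffix in the name the profiler shows.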

dpsutton13:08:36

does [~var] seem weird to anyone else?

p-himik13:08:25

Not weird at all. Check out e.g. the implementation of for.

dpsutton13:08:53

is this an anaphoric version?

p-himik13:08:38

foo above is not anaphoric. clojure.core/for is anaphoric, but doesn't use [~x] in that context, as far as I can tell.

p-himik13:08:08

Oh, wait, foo actually is anaphoric, my bad.

Jim Newton13:08:28

how is it anaphoric. the caller provides a variable name, and the macro expects that the caller references that variable in the given body.

Jim Newton13:08:01

maybe I misunderstand anaphoric

p-himik13:08:04

It captures the name and allows you to use that name later in the body. At least, that's my understanding of the concept.

p-himik13:08:24

From Wiki: > An anaphoric macro is a type of programming macro (https://en.wikipedia.org/wiki/Macro_(computer_science)) that deliberately captures some form supplied to the macro which may be referred to by an anaphor (an expression referring to another)

Jim Newton13:08:51

ahhh. hmmm. indeed. that's not the intent. the intent is simply to allow the profiler to use the name fn-exists-some rather than fn- followed by some cryptic number.

Jim Newton13:08:16

but yes you are right, if the user knows the name he can use it. 😞

p-himik13:08:39

Nah, it's about the name of the var, not the name of the function.

p-himik13:08:58

That [~var] makes it possible to both define and use some particular name in that ~@body.

p-himik13:08:19

Similar to how you can use :let in for.

Jim Newton13:08:21

the example: (foo [x '(1 2 3)] (even? x))

Jim Newton13:08:44

the user provides x and he provides a body which references x ... is that anaphoric?

Jim Newton13:08:05

I thought anaphoric meant that the macro decided var

p-himik13:08:06

Looking at the actual examples - seems like you're indeed correct and my initial understanding was wrong.

Jim Newton13:08:44

I find that sometimes a form like

(exists [x some-input] (or (even? x) (prime? x)))
reads easily. Does there exist an x in some-input such that either x is even or prime?

Jim Newton13:08:31

but obviously it is syntax-mappable to a call to some with a (fn ..) which uses the given variable name.

dpsutton13:08:35

yeah you are right i think. you control the binding, rather than the macro introducing some symbol it thinks is appropriate

dgb2313:08:39

What I don’t get is why you are collecting the body as a variadic parameter. What would happen if I put in more predicates?

Jim Newton14:08:32

collecting the body?

p-himik14:08:05

You can replace & body with body and ~@body with ~body - this will allow only one expression. With the & and @ there, you allow multiple expressions, and the results of all but the last one will be ignored - just like in a regular fn body.

p-himik14:08:38

Or you can combine such multiple expressions within e.g. (and ...). A matter of preference, I suppose. HoneySQL does that, at least v1.

Jim Newton14:08:55

what about (exists [x some-seq] (prn [:x x]) (even? x)) ?

p-himik14:08:04

I'm not sure if leaving room for such statements is a good thing. You can always wrap something in a do to help with debugging. But making an explicit place for it is just asking for potential errors.

Jim Newton14:08:05

and of course (exists [x some-seq] (and (even? x) (prime? (inc x))))

p-himik14:08:23

^ that will work with either approach.

Jim Newton14:08:40

what kind of potential errors are you referring to?

p-himik14:08:56

(exists [x some-seq] (even? x) (prime? (inc x))) - where you forget the and. Or not you, but someone else, could think that there's an implicit and. There will be no error message, just a silent ignore of (even? x).
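The silent ignore can be reproduced with a plain fn, since the macro body is spliced into one. In this sketch `(> x 10)` stands in for the prime? check, which is not defined in the thread.

```clojure
;; Only the value of the last body expression counts.
(some (fn [x] (even? x) (> x 10)) [2 4])        ;; => nil
(some (fn [x] (and (even? x) (> x 10))) [2 12]) ;; => true
(some (fn [x] (even? x) (> x 10)) [11])         ;; => true, although 11 is odd
```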

Jim Newton14:08:05

ahh. but I would never do that, so I don't consider it as potential. interesting that someone might think that is a reasonable thing to do.

p-himik14:08:13

> I would never do that
Famous last words. ;) As I mentioned, where in HoneySQL does just that: (hh/where [:= :a 1] [:= :b 2]) will end up as ("a" = 1) AND ("b" = 2).

😜 2
Jim Newton14:08:59

well these are macros I've used for 20 years, so it is hard for me to think about seeing them the first time.

timo13:08:41

Does anyone know a good way to check whether the code is currently running compiled from a jar or not? I need to do some debug printlns...

Alex Miller (Clojure team)13:08:57

if you convert the namespace to a class resource, and that resource can be loaded, then you're using compiled code

Alex Miller (Clojure team)14:08:24

there is not a good public function to give you that resource name unfortunately. but it's:
• replace - with _
• replace . with /
• append "__init.class"

timo15:08:30

I tried https://clojuredocs.org/clojure.core/load but had no success with it. It tells me

Could not locate clojure/test_tool/ui__init.class, clojure/test_tool/ui.clj or clojure/test_tool/ui.cljc on classpath.
but it is not under clojure
(ns test-tool.ui
  (:gen-class))
(load "test_tool/ui")
How would you 'load' the resource?

Alex Miller (Clojure team)16:08:09

Use clojure.java.io/resource to check whether the test_tool/ui__init.class resource exists

timo07:08:51

Thank you! Will try!
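Putting the pieces of this thread together, a minimal sketch might look like the following. The helper names `ns->init-class-resource` and `aot-compiled?` are hypothetical; the munging follows the three steps listed above. Note that it detects whether an AOT-compiled class for the namespace is on the classpath, which is not strictly the same as running from a jar.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Hypothetical helper names; the munging rules are the three steps listed above.
(defn ns->init-class-resource [ns-sym]
  (-> (name ns-sym)
      (str/replace "-" "_")
      (str/replace "." "/")
      (str "__init.class")))

(defn aot-compiled? [ns-sym]
  ;; io/resource returns a URL when the resource exists, nil otherwise.
  (some? (io/resource (ns->init-class-resource ns-sym))))

(ns->init-class-resource 'test-tool.ui) ;; => "test_tool/ui__init.class"
(aot-compiled? 'test-tool.ui)           ;; true only if that class file is on the classpath
```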

Nom Nom Mousse13:08:32

I've tried reading naming guides, but I cannot find help in the guides for the cases where a variable can have one out of two values:

(def err-or-out :out) ;; can also be :err
Better suggestions welcome.

Nom Nom Mousse14:08:23

I should have explained the context more: the variable describes whether output should be sent to stdout or stderr. The valid values it can have in the outputting library is :err or :out.

Nom Nom Mousse14:08:14

Perhaps process-stream or something is better. Having the possible values (out/err) in the name isn't good practice.

dgb2314:08:40

I agree

👍 2
dgb2314:08:05

I don’t like process-stream though, because a stream would be something else typically

dgb2314:08:32

but it’s definitely better than err-or-out

dgb2314:08:03

I would call it write-to or write-target or something along those lines maybe.

👍 4
dgb2314:08:31

But I’m not a naming expert, just what I personally like.

RH19:08:32

Hello all. I'm new here but old to Clojure. I'm guessing this question has been answered many times but I can't find good answers. Is there a discussion somewhere on the web about the effects that JVM optimization over the years to improve Java performance has had on Clojure performance? In particular, I'm looking for recent information about when Java interop should be preferred over Clojure alternatives when performance really matters. Thanks.

Ben Sless19:08:25

can you be a bit more specific regarding your use case and performance budget? Also, this is a good starting point https://github.com/joinr/performancepaper

Alex Miller (Clojure team)19:08:30

in general, most things the JVM is trying to optimize are also things that benefit Clojure pretty directly, and Clojure has seen benefits over time from ongoing JVM optimizations

Alex Miller (Clojure team)19:08:31

(one notable place where Clojure can have different GC characteristics than a typical Java program is if you keep around a very long-lived stateful persistent collection - from an object perspective, you can have an object chain of "mutations" that is long back to a root object that is very old whereas in Java, you'd probably just reuse the same existing collection object. that said, newer collectors like Shenandoah work great with Clojure so not sure it matters.)

RH19:08:02

Thanks for the link. That has some useful info. No specific use case, we are developing increasingly large streaming ML applications in Clojure. The metaphor is natural, functional transformation of data streams, but performance ups and downs can be a mystery when those transformations are complex. So I'm trying to pull together some guidelines when to resort to Java interop. Math operations are one place that Java interop seems to be a win and not too ugly. This week a question arose whether performance is better if one accesses Mongo we use as an intermediate store sometimes via Java interop or Monger. The same has come up with Kafka/jackdaw. Clearly profiling is the best way to determine for a particular case. But generally it would be nice to have some best practices to start from when performance rather than Clojure purity matters most.

Alex Miller (Clojure team)19:08:32

in general, I am hesitant to give up on Clojure abstractions and make direct interop calls unless it is in the very hot path and really makes a difference. In which case, heck yeah. :) Math ops seem like a reasonable place to focus.

Ben Sless19:08:02

You have some options in Clojure land, as well:
fast primitive math: https://github.com/generateme/fastmath
ML is pretty matrix heavy, you might want to look at https://github.com/uncomplicate/neanderthal and https://github.com/cnuernber/dtype-next

Ben Sless19:08:12

But profiling first is good

Ben Sless19:08:35

I managed to reach throughputs deserializing JSON (in production environment, not just an experiment) where turning keys to keywords had a performance impact, but it's important to know if you're even there

Ben Sless19:08:55

I can write a lot about it (and have already) but knowing where the problem is is the first step

Ben Sless19:08:42

use criterium and clj-async-profiler to get an initial picture
JMH to get accurate microbenchmarks if you need to
VisualVM and JFR for a holistic picture
And JITWatch if you must, but I doubt you will unless you're trying to do HFT, which you aren't

RH19:08:11

@U064X3EF3 You've hit on a key issue. In our cases we do have some very long-lived data objects that originate from clojure, but not kept as large collections of objects in clojure. In those cases we don't use any type of clojure collection but effectively just flatten them and store them in a Mongo collection. In other situations, we have very large unbounded sequences of data, which commonly originate from as many as thousands of Kafka streams. We consume, transform, and produce 10s, 100s, or 1000s of Kafka streams. So from this perspective, and this is where the JSON ser-des that @UK0810AQ2 mentions is relevant, Kafka streams are really the large immutable collections we are manipulating. This makes most of the Clojure hot in a sense because every single operation is repeated large numbers of times. The profiling tools @UK0810AQ2 mentions could be quite helpful for us going forward.

Ben Sless20:08:44

I sometimes embed clj-async-profiler in an application and trigger it externally, gives you a very good profile of CPU activity. There's a lot here depending on your level of abstraction. If it's all streams and KTables then you may not be holding large objects for long. Not enough information to give a more useful response. Profile first

Ben Sless20:08:13

but always remember to solve the performance problems related to the platform, too, not just the language. Kafka can have performance pitfalls, so can kafka streams, make sure those are solved, too

Ben Sless20:08:27

for example, if you're using compression, use zstd and not gzip

RH22:08:44

@UK0810AQ2 @U064X3EF3 Thanks for all the suggestions. I'm going to pass all of this on to the team.

noisesmith16:08:48

> This week a question arose whether performance is better if one accesses Mongo we use as an intermediate store sometimes via Java interop or Monger. The same has come up with Kafka/jackdaw.
with "wrappers" there will never be a generic answer (about quality, performance, etc.) - some are simply better implemented than others, and all of them fail miserably when you leave the use cases that the wrapper author was considering.
for example, I worked on jackdaw when I was employed at Funding Circle, and I would be suspicious of how well it handles your use case of very large numbers of streams - it might just happen to work, but that also wasn't a usage we were designing around. Of course it might be the case that our abstractions are minimal enough that the issue never comes up (jackdaw is lower level than most wrappers), but on the other hand the kinds of abstractions we chose might make your use case more complicated than the interop version would be.

vemv20:08:59

I remember reading this comment at the time https://news.ycombinator.com/item?id=18346043 , some months later that system was OSSed https://github.com/Netflix/mantis-mql/ looks like it went a bit under the radar? Someone might find it an interesting code read.

borkdude20:08:45

Interesting behavior:

user=> (defmacro foo [] `(foo/bar 1))
#'user/foo
user=> (macroexpand '(foo))
(foo/bar 1)
user=> (defmacro foo [] `(foo.bar/baz 1))
#'user/foo
user=> (macroexpand '(foo))
Execution error (ClassNotFoundException) at java.net.URLClassLoader/findClass (URLClassLoader.java:382).
foo.bar

Alex Miller (Clojure team)21:08:45

(pst) should give you trace

Alex Miller (Clojure team)21:08:15

in general, dotted things are assumed to likely be classes in several places

Alex Miller (Clojure team)21:08:50

the case being used there is when it's a namespaced symbol and the namespace does not exist, then the assumption is that it's a class? might be a case that could use a try/catch/do nothing, dunno.

borkdude21:08:30

yeah, sometimes it's useful to macroexpand code without actually having that namespace around, but the use case might be niche

borkdude21:08:42

actually for CLJS macros this might be an issue for namespaces that do exist in CLJS but not in CLJ, unless they are doing a different macro-expansion

👍 2
borkdude21:08:18

but usually the ns is called similarly, e.g. foo.cljs and foo.clj for the macros, so you won't run into that in that case

sansarip22:08:39

Let's say you're using a zipper and one of the paths to a desired node looks like this 😲

(-> zipper
        z/down
        z/right
        z/right
        z/right
        z/right
        z/right
        z/right
        z/right
        z/right
        z/right
        z/down
        z/right
        z/right
        z/node)
What would be a good strategy for reducing the redundant z/right calls?

hiredman22:08:36

(apply comp (repeat 9 z/right))

👍 2
🙇 2
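A runnable sketch of that idea on a small made-up tree follows. `right-n` is a hypothetical helper built on iterate, shown next to the apply comp form from the reply above.

```clojure
(require '[clojure.zip :as z])

;; Hypothetical helper: move right n times.
(defn right-n [loc n]
  (nth (iterate z/right loc) n))

;; Made-up data: nine leaves followed by a nested vector.
(def zipper (z/vector-zip [0 1 2 3 4 5 6 7 8 [:a :b :c]]))

(-> zipper z/down (right-n 9) z/down (right-n 2) z/node)   ;; => :c
(z/node ((apply comp (repeat 9 z/right)) (z/down zipper))) ;; => [:a :b :c]
```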
Michael Gardner22:08:14

not worth the cognitive overhead IMO. If you just want to reduce verbosity:

(require '[clojure.zip :refer [left right up down] :rename {left L, right R, up U, down D}])
...which almost lets you do the Konami code too

Alex Miller (Clojure team)22:08:49

this kind of code can't possibly be easy to maintain can it? I don't wanna like open a whole can of worms, but if I had this I'd be reconsidering how I got there

Alex Miller (Clojure team)22:08:56

does the data need to be that nested? is there some "path" that could be compiled to the zipper instructions? etc
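One way to read the "path compiled to zipper instructions" suggestion, as a hedged sketch: describe the target as a vector of child indices and let a small function translate it into z/down and z/right moves. The name `follow-path` and the sample tree are assumptions for illustration.

```clojure
(require '[clojure.zip :as z])

;; Hypothetical helper: a path is a vector of child indices, one per level.
(defn follow-path [loc path]
  (reduce (fn [loc idx]
            ;; Go down one level, then right idx times.
            (nth (iterate z/right (z/down loc)) idx))
          loc
          path))

;; Made-up data with the same shape as the path in the question.
(def tree (z/vector-zip [0 1 2 3 4 5 6 7 8 [:a :b :c]]))

(z/node (follow-path tree [9 2])) ;; => :c
```

The path [9 2] then carries the same information as the down/right chain, in a form that is easier to read and to change.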