Fork me on GitHub
#clojure
<
2023-02-17
>
Lennart Buit13:02:26

Is it correct that http://clojuredocs.org doesn’t have the 1.11 introduced parse-x family of functions?

Lennart Buit13:02:12

I think this is community maintained, no?

Lennart Buit13:02:00

Ah okay, I’m answering my own question: https://github.com/zk/clojuredocs/pull/246

Noah Bogart14:02:43

The comments are community maintained but the site is still run by a single person who has sadly been absent

Alex Miller (Clojure team)15:02:30

The official Clojure API docs are at http://clojure.github.io/clojure/

👍 6
2
tomd14:02:03

Can anyone 1) reproduce this error:

(dorun (clojure.math.combinatorics/partitions [1 1 2 3 4 5] :min 5 :max 5))
Execution error (IndexOutOfBoundsException) at clojure.math.combinatorics/m5$fn (combinatorics.cljc:859).
2) explain it? I assume it's a bug but the code is so inscrutable it's hard to be sure what's going on. I can't seem to reproduce it without duplicate items, or with :min and :max not equal, but I haven't tested too much here. Using dorun to avoid spamming my repl, but it does occur without. I've reproduced it with lots of other groups of numbers, but haven't noticed a pattern yet.

dgb2315:02:23

Seems like you're trying to realize a sequence that is too large. You can also reproduce it like this: (count (combo/partitions [1 1 2 3 4 5] :min 5 :max 5))

dgb2315:02:05

or maybe not?

tomd15:02:34

How do you mean a sequence that is too large?

dgb2315:02:39

no its not too large

👍 2
dgb2315:02:53

it's something else, i jumped to conclusions

🙂 2
dgb2315:02:07

it only seems to happen when there are duplicate 1's for me so far

dgb2315:02:12

other numbers seem fine?

dgb2315:02:27

it's not about being 1's. this also fails:

dgb2315:02:40

(dorun (combo/partitions [2 3 4 5 1 2] :min 5 :max 5))

dgb2315:02:48

where the first item is duplicated

tomd15:02:12

ah. good find!

dgb2315:02:13

this doesnt:

dgb2315:02:21

(dorun (combo/partitions [3 2 4 5 1 2] :min 5 :max 5))

dgb2315:02:45

i tried to read the source code... but its way above my paygrade so I can't help you there 😄

tomd15:02:26

I think a function with an args list like this:

[n m f c u v a b l r s]
is paygrade-agnostic. Looks a bit write-only to me 😄

😆 4
tomd16:02:10

As this has been replicated, and I think I understand the conditions for throwing an error a little better, I've made a post on ask clojure: https://ask.clojure.org/index.php/12673/clojure-combinatorics-partitions-throw-error-duplicate-items

👍 6
Sam Ritchie03:02:21

@UE1N3HAJH you definitely found a bug! it’s broken right at the start, before the error, when it generates

([1 4] [1] [2 3] [2 3] [2 3])
as its fourth option

Sam Ritchie03:02:28

since that’s no longer a partition

Sam Ritchie04:02:33

woah, I think I’ve got it

Sam Ritchie04:02:47

@UE1N3HAJH phew, okay, I’d love to know if this fixes the issues that you’ve seen with other groups of numbers!

tomd19:02:15

Thanks @U017QJZ9M7W - great work! All looks good to me, as commented on the ticket. I hope it gets merged soon 🎉

tomd12:02:47

Maybe worth pinging @U0518RMLD instead as I think he's actually the author.

Sam Ritchie19:03:59

Proper fix released, my old patch wasn’t great, it generated all small partitions then filtered them so performance was poor

tomd19:03:59

Great job! Thanks for working on this!

❤️ 2
Noah Bogart15:02:27

I have sorted pairs of index and predicate, and a seq of unknown size. I want to call the predicate on the value of the seq at the paired index and early exit on the first failure. Is nth my best bet or is there a faster/better way of stepping through a seq to arbitrary points?

p-himik16:02:43

If it's performance that you're worrying about, it should be possible to write your own "caching nth" that uses a transient vector as an indexed cache. If you want readability, nth along with every? are probably the best bet here. Although, every? helps in the above case as well.

p-himik16:02:17

Oh, but since the predicate collection is sorted by index, I'd probably go with loop.

Noah Bogart16:02:12

My worry was that without vectors Indexed, nth would start from 0 every time. So as the indexes get bigger, the O(n) traversal would grow too. Would it be worth using nthnext or something and calculating the distance from index to index?

p-himik16:02:10

nth would indeed start from the very start every time. nthnext is no different - it just returns the tail and not the item at that position. You can use multiple calls to drop in e.g. a reduce but you'd have to keep track of the current index.

👍 2
p-himik16:02:51

Something like this, although there are probably more clever solutions:

(def index+pred [[1 int?]
                 [5 pos?]
                 [18 #(> % 3)]
                 [200 zero?]])

(def coll (range))

(def result (boolean (reduce (fn [[coll curr-idx] [idx pred]]
                               (let [n (- idx curr-idx)
                                     coll (drop n coll)
                                     item (first coll)]
                                 (if (pred item)
                                   [coll idx]
                                   (reduced false))))
                             [coll 0]
                             index+pred)))

😍 2
🎉 2
Noah Bogart16:02:50

That’s great, thank you. I’ll give that a whirl

👍 2
Ben Sless18:02:03

What would be really neat is the ability to project a sequence over a sequence of indices (sort of like in MATLAB)

p-himik18:02:18

Ah yeah, indeed. Something like this:

(defn project [indices coll]
  (letfn [(step [curr-idx indices coll]
            (lazy-seq
              (when-let [indices (seq indices)]
                (when-let [coll (seq coll)]
                  (let [idx (first indices)
                        n (- idx curr-idx)
                        coll (drop n coll)
                        item (first coll)]
                    (cons item (step idx (next indices) coll)))))))]
    (step 0 indices coll)))

Ben Sless19:02:36

Yes, although this assumes the input is sorted and unique, but it's a good fit for sequences

p-himik19:02:12

Sorted - yes. Unique - no. drop accepts 0 as n.

Ben Sless19:02:04

Oh, that's kind of it 😄

Timofey Sitnikov16:02:51

I am looking for an algorithm where I can use a lambda to do rolling computation, for example, rolling average would look like this:

(rolling #(/ (reduce + %) (count %)) array roll_length)
to calculate the rolling average and of course, I would be able to change the function.

tomd16:02:51

I wrote some code recently which takes an offset rather than a roll-length, so it supplies each item, along with the items n in front of, or behind it, depending on if offset is positive or negative:

(defn partition-offset
  ([offset coll]
   (if (neg? offset)
     (let [abs-offset (abs offset)]
       (concat (map #(take % coll) (range 1 abs-offset))
               (partition abs-offset 1 coll)))
     (partition-all offset 1 coll))))
Note that if you just care about a forward-facing window, that last line is sufficient. I also have the transducer code kicking around if that's helpful. This may be in a lib somewhere too. We can't be the first two people to need this 🙂

tomd16:02:34

For clarity, your code would then look like this:

(->> coll
     (partition-offset -3)
     (map #(/ (reduce + %) (count %))))

p-himik16:02:30

Just in case - averages in particular could be made much more performant if you don't use something generic for it because (count %) is always the same and (reduce + %) can be replaced with a single - followed by a +.

tomd16:02:47

For this function, the count isn't always the same, because partitions "off the edge" of the collection are shorter, rather than being padded with nils or 0s. Probably a good argument for allowing padding :thinking_face:

p-himik16:02:47

Ah, right. But that can also be taken into consideration, whether with or without a padding - depending on how you roll [the windows]. :)

👍 2
Ed18:02:55

partition allows for creating windows like this

(let [average (fn [nums]
                  (/ (apply + nums) (count nums)))]
    (->> (range 30)
         (partition 3 1)
         (map average)))
which will miss off the extra unpadded elements. You could also make a transducer that creates windows from a sequence, maybe something like
(defn window [n]
  (fn [rf]
    (let [win (volatile! [])]
      (fn
        ([] (rf))
        ([r]
         (rf (reduce rf r (take-while not-empty (iterate (partial into [] (drop 1)) @win)))))
        ([r x]
         (rf r (vswap! win (fn [w]
                             (let [w' (conj w x)]
                               (if (-> w' count (> n))
                                 (subvec w' 1)
                                 w'))))))))))

(comment

  (let [average (fn [nums]
                  (/ (apply + nums) (count nums)))]
    (sequence (comp (window 3) (map average)) (range 10))) ;; => (0 1/2 1 2 3 4 5 6 7 8 8 17/2 9)

  )
or you could reach for Christophe Grand's xforms lib (https://github.com/cgrand/xforms) which has a more flexible implementation of partition and a window that might do what you want (and I think can be made to do what @U2FRKM4TW was talking about with performance)

Sam Ritchie19:02:45

The most efficient thing here would be an exponential moving average

p-himik19:02:53

What does weighing have to do with efficiency?

Sam Ritchie19:02:09

That particular algorithm processes 1 item at a time and doesn’t need to access windows of the data

Sam Ritchie19:02:37

another nice one is an “exponential histogram”, which can give you an approximate rolling-window average with a guaranteed error bound with much lower memory than keeping the full window in memory

Sam Ritchie19:02:47

exponential histogram code in Scala, from back in the day: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/ExpHist.scala#L7 “decayed value”: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/DecayedValue.scala#L17 definitely nice to skip the window, and you can use reductions to track the averaged value as you add in items, of course

Ed20:02:04

Sorry, I probably missed the point of the actual question. I thought he was more asking about how to do rolling window type things than actually calculating an average.

Sam Ritchie20:02:43

That’s probably true, I was just dropping additional breadcrumbs

👍 2
Timofey Sitnikov20:02:24

@U0P0TMEFJ, actually, yes, you are absolutely right, I am about the rolling window type. I am using the code you recomended.

Timofey Sitnikov20:02:35

(let [average (fn [nums]
                  (/ (apply + nums) (count nums)))]
    (->> (range 30)
         (partition 3 1)
         (map average)))

Ed21:02:13

Cool ... Glad that's working for you ;)

Sam Ritchie21:02:52

In this case @U012GN57FJ9 (count nums) is always 3!