Fork me on GitHub
#clojure-spec
<
2020-08-21
>
johanatan03:08:29

btw i think i found the problem

(apply concatv rest)
can blow up the stack / is not in tail position.

seancorfield03:08:23

Oh, interesting... Does that introduce laziness somehow?

johanatan03:08:27

(at least that is "one" problem)

johanatan03:08:52

well, it's a recursive call

seancorfield03:08:20

OH! Yeah, that sounds so obvious once you say it out loud! 🙂

johanatan03:08:06

let me explain. sorry, i noticed that even if I remove the :fn I still get the crash with the original impl of concatv. but with this loop/recur one (inspired by the Stuart Sierra blog on concat) that crash won't occur:

(defn concatv
  "Strict version of concat."
  [head & tail]
  (loop [seqs (cons head tail) acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

seancorfield03:08:28

Seems weird to have head & tail when you only cons them together and never use them again.

johanatan03:08:01

oh, could just be [& args] ?

seancorfield03:08:40

You might want to consider whether (concatv) should work and what it should produce (I'd say yes and [] respectively).

johanatan03:08:18

i'm just trying to match behavior of the original concat (which seems to agree w/ you) 🙂

cljs.user=> (concat)
()

johanatan03:08:45

and yea that concatv does similarly:

cljs.user=> (concatv)
[]

seancorfield03:08:39

I just tried it in Clojure and got an error

seancorfield03:08:03

Oh, it works after changing to & args and (loop [seqs args ...

johanatan03:08:05

hmm, mine was from lumo

johanatan03:08:14

didn't paste in here

johanatan03:08:24

(defn concatv
  "Strict version of concat."
  [& args]
  (loop [seqs args acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

petterik16:08:58

Did you consider (reduce into [] args)?

johanatan20:08:08

Nope, does it work?

johanatan20:08:56

Also, how much memory will it use? the loop/recur one is tail optimized so should use a constant amount of memory

seancorfield03:08:32

The version with [head & tail] requires 1+ arguments 🙂

johanatan03:08:14

haha, yea 2+ or 0+ makes more sense than 1+

johanatan04:08:06

here is a "final" version that works (tested up to 75 iterations at a time):

(s/def ::any-coll
  (s/with-gen
    (s/coll-of any?)
    #(s/gen (s/coll-of any? :max-count 125))))

(s/fdef concatv
  :args (s/cat :args (s/* ::any-coll))
  :fn #(let [r (:ret %1)
             args (-> %1 :args :args)
             sumhash (fn [c] (apply + (mapv hash c)))]
         (and
          (= (count r) (apply + (mapv count args)))
          (= (sumhash r) (apply + (mapv sumhash args)))))
  :ret (s/coll-of any? :kind vector?))

(defn concatv
  "Strict version of concat."
  [& args]
  (loop [seqs args acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

3
johanatan04:08:33

so yea i think i was kinda confounded earlier by the fact that there was a potential stack overflow in the code under test. so even when i got the test code "right" the code under test could screw me

johanatan04:08:29

but it was somewhat non-deterministic on both sides. bit of an entangled heisenbug

johanatan04:08:35

two entangled heisenbugs rather

seancorfield04:08:44

Complected, even 🙂

johanatan04:08:51

haha exactly

johanatan19:08:30

small update: so ... there turned out to be 3 problems. my hunch about s/* was also correct. some of the crashes were emanating from it as well. here's a "more final" version (at least until I get another random, intermittent failure some time in the future):

(defn arbitrarily-partition [coll freq]
  (let [signals (take (count coll) (cycle (cons true (repeat freq false))))
        shuffled (shuffle signals)
        zipped (map vector shuffled coll)
        partitioned (partition-by first zipped)]
    (for [c partitioned]
      (map second c))))

(s/def ::coll-of-colls
  (s/coll-of (s/coll-of any?)))

(s/def ::distinct-coll-of-colls
  (s/with-gen
    ::coll-of-colls
    #(gen/let [elems (gen/set (s/gen any?) {:max-elements 150})
               freq (s/gen (s/int-in 1 10))]
       (arbitrarily-partition elems freq))))

(s/fdef concatenate
        :args (s/cat :args ::distinct-coll-of-colls)
        :fn #(let [r (:ret %1)
                   args (-> %1 :args :args)
                   sumhash (fn [c] (apply + (mapv hash c)))]
               (and
                (= (count r) (apply + (mapv count args)))
                (= (sumhash r) (apply + (mapv sumhash args)))))
        :ret (s/coll-of any? :kind vector?))

(defn- concatenate
  "Strict version of concat."
  [args]
  (loop [seqs args acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

(defn concatv
  "Strict version of concat."
  [& args]
  (concatenate args))

misha08:08:39

I did not read entire discussion, so it might be irrelevant, but: there is a :distinct option in s/coll-of

(s/coll-of int? :distinct true) 

misha08:08:19

or is it supposed to be "distinct for all 1st and 2nd level items"?

johanatan02:02:38

it's supposed to be distinct across the insides of the inner collections if I recall correctly (i had to read the code to refresh my own memory but that's what it appears to me at this point)

johanatan02:02:08

based on the fact that the generator for ::distinct-coll-of-colls first generates a set of elements and then merely chooses random places to "divide" those into buckets

johanatan02:02:56

but yea coll-of? int? :distinct true is in effect equivalent to gen/set (s/gen int?)

misha12:02:20

welcome back! opieop

johanatan18:02:32

been a while for sure lol