#clojure-spec
2016-08-23
amacdougall03:08:46

I've been following http://clojure.org/guides/spec, and trying to apply it to some maze-building code I'm toying with. The core data structure is a grid (a two-dimensional vector) of cells (x, y, and a set of open exit directions). I have some specs which seem entirely reasonable to me:

(spec/def ::x integer?)
(spec/def ::y integer?)
(spec/def ::direction #{::n ::ne ::e ::se ::s ::sw ::w ::nw})
(spec/def ::exits (spec/coll-of ::direction :kind set?))
(spec/def ::cell (spec/keys :req [::x ::y ::exits]))
(spec/def ::row (spec/coll-of ::cell))
(spec/def ::grid (spec/coll-of ::row))
And then I have this pretty basic function, with a pretty basic fspec:
(defn grid-contains?
  "True if the supplied x and y coordinates fall within the grid boundaries."
  [grid x y]
  (and (<= 0 y) (<= 0 x)
       (< y (count grid))
       (< x (count (nth grid y)))))

(spec/fdef grid-contains?
  :args (spec/cat :grid ::grid :x ::x :y ::y)
  :ret boolean?)
But when I take the guide's advice and do this:
(require '[clojure.spec.test :as stest])

(stest/check `grid-contains?)
...I use 350% CPU for fifteen minutes and counting. What gives? Is there some unbounded complexity in my specs? Or is test.check just being very thorough?

amacdougall03:08:24

I did try this:

(stest/check `grid-contains? {:clojure.spec.test.check/opts {:num-tests 10}})
... which seems to be the right way to specify the options, according to http://clojure.github.io/clojure/branch-master/clojure.spec-api.html#clojure.spec.test/check, but it didn't make a difference.

amacdougall03:08:24

(spec/exercise ::grid) does fine, so I want to think that Clojure is able to generate correct test data without any issues. ¯\_(ツ)_/¯

amacdougall03:08:41

Final note: when I first ran stest/check, it hit null errors which revealed bugs (yay?). Once I fixed those bugs, I ended up in this situation.

lvh03:08:33

Can I use spec to spec out a protocol? Just fdef the protocol’s method separately??

seancorfield04:08:48

@lvh I seem to recall a JIRA issue around protocol functions, but it may be only for instrument...?

brabster06:08:08

@amacdougall my first thought is maybe your spec is generating enormous grids under test which is chewing up cpu walking around it

brabster07:08:13

@amacdougall just added a print to your fn and ran (stest/check `grid-contains? {:num-tests 1}), getting values like 32174406 809896496 -13491 -1 -3980 -2856 193086473 -53862035

brabster07:08:57

Would guess it's the (nth grid y) that's trying to walk along really long linked lists or something - out of time now but hope that helps!

magnetophon11:08:33

what's wrong with the above?

magnetophon11:08:08

the generator works, the s/def works if I comment out the with-gen parts, but the whole fails with: "Couldn't satisfy such-that predicate after 100 tries."

minimal11:08:42

sort turns maps into lists of map entries

magnetophon11:08:54

what would be the proper way?

minimal11:08:34

are you trying to gen a sorted-map? @magnetophon

minimal11:08:55

i think this works (gen/sample (gen/fmap (fn [a] (apply sorted-map (flatten (sort a)))) (gen/map (s/gen int?) (s/gen int?))))

minimal11:08:17

({} {0 -1} {0 -1} {-1 -1, 0 -4, 7 -3} {-1 -4, 0 -5, 1 -1} {-2 -2, -1 0, 0 3, 3 7} {-1 6, 0 -4, 24 -12} {-16 -21, -7 0, -4 16, -1 0, 0 -1, 1 -10} {-32 -7, 0 -23, 92 -13} {-13 1, -3 -1, -2 0, -1 1, 6 51, 14 11})

magnetophon11:08:03

@minimal It does! thanks! now I just need to decrypt that, but I'll manage.

minimal11:08:50

Just turning it into what sorted map expects (sorted-map k1 v1 k2 v2 …)

magnetophon11:08:32

@minimal I'm very new to clojure and lisp, so I'll have to study a bit to really get it, but I'll be fine.

magnetophon12:08:15

@minimal Hmm, I sort of get what you're doing, and can use that with most types as the val of the map, but when I use the type I need (https://github.com/magnetophon/openloop/blob/master/src/openloop/spec.clj#L79) I get:

No value supplied for key: 0

magnetophon12:08:46

the weird thing is: when I comment out the :src-index, it works. when I leave that in, but uncomment :length, it also works.

amacdougall14:08:51

@brabster: Thanks for the pointer—maybe if I refine my spec to declare that the grid should never be more than, say, 10000x10000, and also specify that it must be made of vectors (for faster indexed access), test.check will be a bit less ambitious. On the other hand, should we expect such a system to generate arbitrarily large collections up to the literal limit of the integer type? I certainly see the benefit, but if a single test takes such a long time to run, how could you run a whole suite without a Bitcoin-mining level of resources?

amacdougall14:08:09

Last night before giving up, I did change the grid spec to include :kind vector?:

(spec/def ::row (spec/coll-of ::cell :kind vector?))
(spec/def ::grid (spec/coll-of ::row :kind vector?))
But it didn't seem to make a difference. My understanding was that nth on a vector is O(1), because it is effectively (v n), and although I don't understand all the voodoo at a glance, https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java seems to confirm that.

amacdougall14:08:03

To put it more generally: should developers expect to hand-tune specs to keep generative testing from trying to simulate the entire possible range of inputs? On reflection, the answer is yes: that's what generative testing is for. But in practice, generative testing on strings generates a bunch of garbage strings, it doesn't generate million-character strings. So why is this test trying to generate a 193,086,473-column grid? ...end of musing.

magnetophon14:08:39

@minimal ah, found the problem: flatten flattens recursively, so when I have an uneven number of elements I get an error.

minimal14:08:18

cool, sorry was afk for a bit

magnetophon14:08:51

np, you've been very helpful.

amacdougall15:08:21

I'm afraid that setting explicit bounds on my specs doesn't make a difference. I updated the relevant specs to limit both coordinates and grid dimensions:

(spec/def ::x (spec/and integer? (partial > 1000)))
(spec/def ::y (spec/and integer? (partial > 1000)))
(spec/def ::row (spec/coll-of ::cell :kind vector? :max-count 1000))
(spec/def ::grid (spec/coll-of ::row :kind vector? :max-count 1000))
...and added a prn to grid-contains? to see what test.check is trying to do... and I get this:
#'user/grid-contains?
user=> (stest/check `grid-contains? {:num-tests 1})
"grid-contains? grid 0 0"

OutOfMemoryError GC overhead limit exceeded  clojure.lang.PersistentVector$TransientVector.persistent (PersistentVector.java:568)
Perhaps there's something I could do to work around this, but I think for now I'll skip the generative testing.
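
A possible workaround for the blow-up above, sketched under assumptions: it uses the later `clojure.spec.alpha` namespaces (this log predates the rename) and needs test.check on the classpath. `:gen-max` is a documented `coll-of` option that caps the size of generated collections (default 20). Dropping `:max-count 1000` is a guess at the cause, since the generator may size vectors anywhere up to `:max-count` when it is supplied. The cap of 5 is arbitrary.

```clojure
(require '[clojure.spec.alpha :as spec]
         '[clojure.spec.gen.alpha :as gen])

;; Bounded coordinates: int-in takes an inclusive start and an exclusive end.
(spec/def ::x (spec/int-in 0 1000))
(spec/def ::y (spec/int-in 0 1000))
(spec/def ::direction #{::n ::ne ::e ::se ::s ::sw ::w ::nw})
(spec/def ::exits (spec/coll-of ::direction :kind set?))
(spec/def ::cell (spec/keys :req [::x ::y ::exits]))

;; :gen-max only limits what the generator produces; larger grids still validate.
(spec/def ::row (spec/coll-of ::cell :kind vector? :gen-max 5))
(spec/def ::grid (spec/coll-of ::row :kind vector? :gen-max 5))

;; Sample a handful of grids to inspect their dimensions.
(def sample-grids (gen/sample (spec/gen ::grid) 20))
```

With these specs each generated grid should hold at most 5 rows of at most 5 cells, which keeps a single check run small.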

amacdougall15:08:06

The other main use of fdef is to instrument a namespace during development/test, which forces spec checking of all function calls, right? That seems like it would still be quite beneficial even without full-spectrum generative testing. 🌈

raymcdermott16:08:06

Hi guys, I want to do some checking of JSON data with spec, seems like it will be OK - any gotchas?

magnetophon17:08:47

@minimal I found a solution ^^

magnetophon17:08:15

when I s/conform some ok values it works as expected. when I try to conform values with a duplicate key, I don't get an error saying that, but instead I get:

Unmatched delimiter: )

magnetophon17:08:00

should I report that as a bug in clojure, or am I doing it wrong?

magnetophon17:08:43

Here is the code, and some sample data, if someone wants to try it out: https://github.com/magnetophon/openloop/blob/master/src/openloop/spec.clj#L79

minimal17:08:54

@magnetophon cool! Yep (into (sorted-map)) is better. Not sure about the error
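
A minimal sketch of the `(into (sorted-map))` approach, assuming the later `clojure.spec.alpha` namespaces and test.check on the classpath. The vector-of-ints value type is a hypothetical stand-in for the real value spec, chosen because nested sequential values are what trip up `flatten`.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; flatten recurses into nested sequential values, so a map entry whose value
;; is a vector loses its key/value pairing.
(def flattened (flatten (seq {0 [1 2]})))

;; into adds whole map entries, so nested values are left alone.
(def sorted-map-gen
  (gen/fmap #(into (sorted-map) %)
            (gen/map (s/gen int?) (gen/vector (s/gen int?)))))

(def samples (gen/sample sorted-map-gen 10))
```

Because `flattened` ends up with an odd number of elements, passing it to `(apply sorted-map ...)` would fail with a "No value supplied for key" error, while the `into` version never unpacks the values.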

magnetophon17:08:26

@minimal a clearer example of the error is:

(s/def ::test-type
  (s/map-of int? int?))
(s/conform ::test-type {41 1, 42 2}) ; ok
(s/conform ::test-type {42 1, 42 2}) ; unmatched delimiter: )

bbrinck17:08:52

@amacdougall Keep in mind that instrument will only check :args, not :ret or :fn specs
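
A small sketch of that difference, assuming the later `clojure.spec.alpha` and `clojure.spec.test.alpha` namespaces. `half` is a hypothetical function whose `:ret` spec is deliberately wrong for odd inputs.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical function: dividing an odd int by 2 yields a ratio, not an int.
(defn half [n] (/ n 2))

(s/fdef half
  :args (s/cat :n int?)
  :ret int?)

(stest/instrument `half)

;; The :ret spec is violated here, but instrument does not check it.
(def ret-result (half 3))

;; A string fails the :args spec, so the instrumented call throws.
(def args-result
  (try (half "x")
       (catch clojure.lang.ExceptionInfo e :args-rejected)))
```

Checking `:ret` and `:fn` is left to `stest/check`, which is why the two tools complement each other.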

bfabry17:08:37

@magnetophon you should be getting an exception from the reader creating a literal map with duplicate keys

minimal17:08:45

@magnetophon you can’t have map literals with duplicate keys anyway. The unmatched delimiter must be a symptom of that

bfabry18:08:01

I mean.. you also can't have maps with duplicate keys either. if you try to construct one, one of the key-value pairs will disappear

magnetophon18:08:29

@bfabry @minimal thanks. It's just that I recently read the clojure spec page, and wanted to try out the promise of useful error messages.

bfabry18:08:07

try (s/conform ::test-type {:a 2, 31 3})

bfabry18:08:15

the thing is spec is never coming into play when you're trying to detect duplicate keys in a map. the map data structure just doesn't support that, and the clojure reader also doesn't support that. so it gets chucked out way before spec ever checks it

magnetophon18:08:15

I was hoping to get a nice "sorry, the keys are not unique" message from spec. oh well... 🙂

bfabry18:08:40

I'm kind of surprised the reader didn't give you that error

bfabry18:08:43

because that's what I get

bfabry18:08:10

user=> {1 1 1 1}

user=> clojure.lang.LispReader$ReaderException: java.lang.IllegalArgumentException: Duplicate key: 1
     java.lang.IllegalArgumentException: Duplicate key: 1

magnetophon18:08:05

@bfabry yeah I get that too, but not when I do it in spec

bfabry18:08:29

ok but you're not doing it "in spec"

bfabry18:08:54

the error happens pre spec

bfabry18:08:41

you literally cannot create a map with duplicate keys, so there's no way for spec to look at a map that has duplicate keys and give you an error about it
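
Both halves of that point can be sketched with core functions only. The `:reader-error` keyword is just a marker for this illustration.

```clojure
;; Building a map from duplicate keys does not fail; the later pair wins.
(def built (hash-map 42 1 42 2))

;; Reading a literal with duplicate keys fails inside the reader, before any
;; spec function could ever see the value.
(def read-result
  (try (read-string "{42 1, 42 2}")
       (catch Exception e :reader-error)))
```

Either way, a map holding two entries for 42 never exists, so there is nothing for conform or explain to report on.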

magnetophon18:08:43

@bfabry I mean I don't get a useful error from the reader when the faulty map is inside a conform or explain function.

minimal18:08:26

I get

(s/conform ::test-type {42 1 42 2}) 
     java.lang.IllegalArgumentException: Duplicate key: 42
clojure.lang.LispReader$ReaderException: java.lang.IllegalArgumentException: Duplicate key: 42
             java.lang.RuntimeException: Unmatched delimiter: )
clojure.lang.LispReader$ReaderException: java.lang.RuntimeException: Unmatched delimiter: )

bfabry18:08:26

yeah that's a bit odd, but I don't think it's related to spec

bfabry18:08:55

user=> (identity {1 1 1 1})

clojure.lang.LispReader$ReaderException: java.lang.IllegalArgumentException: Duplicate key: 1
     java.lang.IllegalArgumentException: Duplicate key: 1
clojure.lang.LispReader$ReaderException: java.lang.RuntimeException: Unmatched delimiter: )
             java.lang.RuntimeException: Unmatched delimiter: )
user=>

minimal18:08:04

No it’s not spec

bfabry18:08:10

it's just because you've blown up inside the reader, so there's unconsumed input

minimal18:08:28

Once the reader fails, all bets are off and it gets more errors

magnetophon18:08:09

@minimal when I run that line, I get the ) error.

bfabry18:08:37

something about your environment is eating the first error, I guess

minimal18:08:39

It’s a distracting error but the duplicate key error is the useful one. So it's confusing if it’s missing from your output

magnetophon18:08:16

@bfabry @minimal ah, I was looking in the wrong place. both errors are there. sorry for the noise.

minimal18:08:37

no worries.

magnetophon18:08:46

What is meant by

By default map-of will validate but not conform keys because conformed keys might create key duplicates that would cause entries in the map to be overridden. If conformed keys are desired, pass the option `:conform-keys true`.
in the spec guide?

magnetophon18:08:57

afaik conform means to add metadata like the keys of a map. What do they mean by conforming keys? And why would that create duplicates?

bfabry18:08:13

conforming means modifying the data to fit the spec

bfabry18:08:24

boot.user=> (s/conform (s/map-of int? (s/or :k keyword? :i int?)) {1 :foo 2 3})
{1 [:k :foo], 2 [:i 3]}

bfabry18:08:30

so you see there the values were conformed

bfabry18:08:48

to tell you whether the value conformed to the :k branch or the :i branch of the spec

bfabry18:08:14

but you can use custom conformers, so hence the statement above

Alex Miller (Clojure team)18:08:32

by default the keys in the conformed value of the map will be identical to the keys of the original map

Alex Miller (Clojure team)18:08:03

but you can use :conform-keys true in the call to map-of to allow that to happen

Alex Miller (Clojure team)18:08:16

user=> (s/conform (s/map-of (s/or :i int? :s string?) int?) {5 5, "a" 10})
{5 5, "a" 10}
user=> (s/conform (s/map-of (s/or :i int? :s string?) int? :conform-keys true) {5 5, "a" 10})
{[:i 5] 5, [:s "a"] 10}

Alex Miller (Clojure team)18:08:15

and the reason not to do that automatically is that when you are altering the keys of the map, you might end up producing the same key and effectively erasing some of your transformed input, so we err on the side of safety
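
A hedged illustration of how conformed keys can collide, assuming the later `clojure.spec.alpha` namespace. `::lower-key` is a hypothetical key spec that lower-cases strings, so two distinct input keys conform to the same value.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Hypothetical key spec whose conformed value can differ from the input.
(s/def ::lower-key (s/and string? (s/conformer str/lower-case)))

(def input {"A" 1, "a" 2})

;; Default: keys are validated but returned unchanged.
(def kept (s/conform (s/map-of ::lower-key int?) input))

;; With :conform-keys true both keys conform to "a", so one entry is lost.
(def collapsed
  (s/conform (s/map-of ::lower-key int? :conform-keys true) input))
```

Which of the two values survives in `collapsed` depends on map iteration order, which is one more reason the option is off by default.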

magnetophon18:08:08

thanks. so it only makes sense if you have a branch in your key spec, and you want to know which branch was taken?

magnetophon18:08:27

that is not clear (to me) from the guide.

Alex Miller (Clojure team)21:08:19

It only matters if your key spec conforms to something other than the original value

Alex Miller (Clojure team)21:08:04

Given the common map keys - strings, longs, keywords, symbols - those are all likely to conform the same and it's a nonissue

Alex Miller (Clojure team)21:08:40

So in most cases you can ignore this entirely