#test-check
2017-05-23
Alex Miller (Clojure team)17:05:41

@gfredericks hey, I have a probably too broad question re test.check generators. There are a number of circumstances where random inputs can get “too big” (usually via either collection sizing or nesting) if you do enough tests. Surely this is an issue that quickcheck has already encountered - do they have generic ways to do thinning or something else? (I will do some of my own research but figured you might already know about it)

gfredericks17:05:43

@alexmiller you're asking about something more sophisticated than the normal size control techniques?

Alex Miller (Clojure team)17:05:47

well… kind of. this is really about controlling size at a higher level, which as I’ve been thinking about it more is really connected to controlling generation over a greater composite structure (which is the thing we’re producing out of a composite spec)

Alex Miller (Clojure team)17:05:14

I went and looked at some of the quickcheck stuff more and it seems like there are all those knobs for size, scale, etc

Alex Miller (Clojure team)17:05:21

but they assume pretty localized and intentional use

Alex Miller (Clojure team)18:05:45

my real goal here is: how can we assure that an automatically created composite generator does not produce a value so big that it kills testing. and maybe the answer is really being smarter in spec about how we use test.check

Alex Miller (Clojure team)18:05:56

some more specific problem areas are: collection size, recursive/nesting size, and individual value size (for things that use variable memory like strings, keywords, symbols)

Alex Miller (Clojure team)18:05:32

really these are mostly easy to manage in isolation but are in some cases harder in combination

Alex Miller (Clojure team)18:05:03

sorry, stepping away for a few

gfredericks18:05:07

Yes definitely; sizing is one of the biggest headaches I've encountered in maintaining test.check -- I've kept wanting to redo the size parameter so that it has some kind of linear relationship to the (comp count pr-str) of the value generated

gfredericks18:05:49

I thought about maybe having each generator have knowledge about the size of things it generates, but that doesn't work so well for gen/one-of or gen/frequency, and not at all for gen/bind

gfredericks18:05:53

even gen/fmap would be hard

gfredericks18:05:37

so I think the only other approach to a general solution would be to have some sort of logical post-processing that prunes things when the whole structure is too big

gfredericks18:05:41

I haven't thought about that one too much

gfredericks18:05:51

My guess is that this is the biggest sizing pitfall for spec: https://dev.clojure.org/jira/browse/TCHECK-106

gfredericks18:05:55

it's possible that solving that would be enough from spec's perspective

Alex Miller (Clojure team)19:05:20

yes, I think that is the highest priority issue

Alex Miller (Clojure team)19:05:58

Rich suggested that maybe there was some notion of “thinning” that could be applied as you generate - reducing the scale as you get deeper in a structure

gfredericks19:05:12

@alexmiller yeah, that's something that the recursive-gen already does (else it'd likely generate infinite structures)
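
The "thinning" idea discussed above can be sketched outside of test.check. The following is a minimal, language-neutral illustration in Python, not test.check's or spec's actual implementation: `gen_nested` and `depth` are hypothetical names, and halving the size at each level is just one assumed thinning rule among many possible ones.

```python
import random

def gen_nested(size, rng):
    """Generate nested lists of ints, shrinking the size budget with depth.

    Assumption for illustration: each nesting level gets half the size of
    its parent, so the structure cannot nest without bound.
    """
    if size < 1:
        # Budget exhausted: emit a leaf value instead of another collection.
        return rng.randint(0, 9)
    n = rng.randint(0, size)   # the collection length is bounded by the current size
    child_size = size // 2     # "thin" the size before descending
    return [gen_nested(child_size, rng) for _ in range(n)]

def depth(value):
    """Nesting depth: 0 for a leaf, 1 + the deepest child for a list."""
    if not isinstance(value, list):
        return 0
    return 1 + max((depth(v) for v in value), default=0)

if __name__ == "__main__":
    rng = random.Random(0)
    tree = gen_nested(8, rng)
    print(depth(tree))  # never exceeds (8).bit_length() == 4
```

Because the size is halved on every descent, the nesting depth is bounded by the bit length of the starting size, regardless of what the random choices turn out to be.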

gfredericks19:05:17

I certainly considered it -- it had the feel of a breaking change, so I was probably hesitant on that count, but there might have been some more subtleties; I'll give that some more thought and try it out

gfredericks19:05:02

one approach I tried with recursive-gen was to take the current size and partition it up randomly among the elements of the collection to be generated
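
A rough sketch of that budget-partitioning idea, written in Python purely for illustration (this is not the code that was tried in recursive-gen, and `split_budget` and `gen_strings` are hypothetical names): the current size is treated as a total budget and divided randomly among the elements, so the whole collection stays within the budget instead of each element receiving the full size.

```python
import random

def split_budget(total, parts, rng):
    """Randomly split a non-negative integer budget into `parts` shares.

    The shares are non-negative and always sum to `total`.
    """
    if parts == 0:
        return []
    # Pick parts-1 random cut points in [0, total]; the gaps are the shares.
    cuts = sorted(rng.randint(0, total) for _ in range(parts - 1))
    bounds = [0] + cuts + [total]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]

def gen_strings(size, rng):
    """Generate a list of strings whose combined length is at most `size`."""
    n = rng.randint(0, size)
    shares = split_budget(size, n, rng)
    return [
        "".join(rng.choice("abc") for _ in range(rng.randint(0, share)))
        for share in shares
    ]

if __name__ == "__main__":
    rng = random.Random(1)
    print(split_budget(20, 5, rng))  # five non-negative shares summing to 20
    print(gen_strings(10, rng))      # total characters never exceed 10
```

Under this scheme the total output grows roughly linearly with the size, whereas giving every element the full size lets it grow with the product of collection length and element size.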

gfredericks19:05:14

I think that's a lot harder with recursive-gen than it would be with the regular collection generators

gfredericks19:05:20

so maybe there is hope

Alex Miller (Clojure team)19:05:43

yeah, I saw some interesting stuff in the quickcheck papers about the recursive gen aspects

Alex Miller (Clojure team)19:05:07

and spec is doing some of its own things there as well

gfredericks19:05:33

yeah I noticed that and wasn't sure what I thought about it

gfredericks19:05:44

I'm not sure that there are any serious backwards-compatibility concerns about changing the sizing behavior, so I'll consider this a plausible solution

gfredericks19:05:59

do you have a guess about whether there will be other serious sizing issues remaining?

Alex Miller (Clojure team)19:05:23

collection size and recursion seem to be at the root of the majority of issues I see

gfredericks19:05:44

does spec use recursive-gen at all?

gfredericks19:05:03

I think the biggest thing I noticed was that it didn't even with naturally recursive specs

Alex Miller (Clojure team)19:05:43

yeah, the gen namespace doesn’t pull in recursive-gen so I’d guess nothing is using it

gfredericks19:05:56

it might not be a good fit anyhow

Alex Miller (Clojure team)19:05:14

I don’t know if that was a conscious choice or not (although with Rich I typically bet on the side of conscious choice)

Alex Miller (Clojure team)19:05:43

stepping away again, maybe for a while, sorry :)