
@gfredericks recalling our discussion about generators, monads vs applicatives, annoyances in generative testing, etc - i’ve finally gotten around to reading “Constraint Propagation” by Guido Tack


some snippets:


> The generate-and-test solver is too inefficient to solve real-world problems, as it requires enumeration of all assignments. We now improve on this naive method by pruning the set of enumerated assignments using propagation.


> A propagation-based solver interleaves propagation and enumeration (search).


seems like the way generators are defined now is the output of a single manual propagator pass - ie the hand-written generators are just domains

Yehonathan Sharvit 05:03:48

@alexmiller what do you mean by “bind a custom explain printer”?


@viebel Maybe like how explain-str does it with with-out-str, but it doesn't sound right.


Ok, got it (from spec source):

(def ^:dynamic *explain-out* explain-printer)

(defn explain-out
  "Prints explanation data (per 'explain-data') to *out* using the printer in *explain-out*,
   by default explain-printer."
  [ed]
  (*explain-out* ed))

(defn explain
  "Given a spec and a value that fails to conform, prints an explanation to *out*."
  [spec x]
  (explain-out (explain-data spec x)))
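So binding a custom explain printer amounts to rebinding *explain-out*. A minimal sketch, using the current clojure.spec.alpha namespace (my-printer and ::n are hypothetical names for illustration):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::n int?)

;; Hypothetical custom printer: receives the explain-data map and
;; prints one line per problem.
(defn my-printer [ed]
  (doseq [{:keys [path pred val]} (::s/problems ed)]
    (println "failed:" pred "at" path "with value" (pr-str val))))

;; s/explain goes through *explain-out*, so rebinding it swaps printers.
(binding [s/*explain-out* my-printer]
  (s/explain ::n "oops"))
```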


@bbloom how much of this 200 pg paper do I need to read to know what you're talking about


would it make sense for (s/conformer f) to return :clojure.spec/invalid on exception?
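spec doesn’t do that itself, but a small wrapper can; a sketch against the current clojure.spec.alpha namespace (safe-conformer and ::int-str are hypothetical names):

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical helper: treat any exception thrown by the conforming
;; function as a conform failure instead of letting it propagate.
(defn safe-conformer [f]
  (s/conformer
    (fn [x]
      (try (f x)
           (catch Exception _ ::s/invalid)))))

(s/def ::int-str (safe-conformer #(Long/parseLong %)))

(s/conform ::int-str "42")   ;; => 42
(s/conform ::int-str "nope") ;; => :clojure.spec.alpha/invalid
```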


seems like I wrote this a few times already


when defining :args in fdef is there a good way to enforce some relationship between the arguments? for example, for (fn [a b] (+ a b)), require (> a b) (contrived example)


I have a hard time expressing this as a spec


@gfredericks just like any paper like this, you can skip all the math 😛 i didn’t read all of it, just the intro and some prose in the major chapters


@gfredericks or you can rewatch sussman’s talk from strange loop a few years back about “we really don’t know how to compute” which may be more accessible to the propagator idea. the only real extension beyond that is that if you combine propagation and search, you get a sound/complete constraint solver. think sudoku: if you have a badly made puzzle (for humans) that doesn’t have enough info to solve by inferences, you can take guesses, and then make more inferences. if your guesses don’t pan out, you can backtrack and make different guesses.


right now spec makes some inferences from stuff like s/and etc and composes together some generators, and those things are making guesses


that paper mentions that the propagators that provide inferences also act as the predicates in a generate-and-test system, but in spec, no new inferences occur after the initial construction of the generators


not sure if there’s any way to capitalize on this insight in a spec context tho - but the goal would be to shrink the search space to reduce how many examples need to be generated only to be thrown out by predicates


@stathissideris You can use s/& to apply additional predicates to the conformed values of a regex spec.

(s/& (s/cat :a int? :b int?)
     (fn [{:keys [a b]}] (> a b)))
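and for the fdef case asked about above, the same pattern goes straight into :args. A sketch (subtract-larger and args-spec are hypothetical names):

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical function whose first argument must exceed the second.
(defn subtract-larger [a b] (- a b))

;; s/& applies extra predicates to the conformed regex match, so the
;; relationship between arguments can be expressed directly.
(def args-spec
  (s/& (s/cat :a int? :b int?)
       (fn [{:keys [a b]}] (> a b))))

(s/fdef subtract-larger
  :args args-spec
  :ret nat-int?)

(s/valid? args-spec [3 2]) ;; => true
(s/valid? args-spec [2 3]) ;; => false
```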


@bbloom thanks, I'll noodle it a bit and get back to you


@viebel Slackbot just reminded me about sharing details of our explain mapping code — nice 🙂


There are two parts to it: one is a function that reduces the explain data to a vector of relevant pieces to match on, from most specific to least specific, and then for each spec, we have a map from symbols/forms to the specific error code or message.


The key is that we have two paths through the code, one for when :path is empty and one for when :path is non-empty. The latter provides more context (a top-level key, for example) so we get better error messages that way. With the former, we can sometimes only get generic errors, unless the predicates are sufficiently specific.


We look at :path and :pred (only), and only the first element of :path (at present). We turn the predicate into a canonical form by taking the first three subforms and removing % and the path element (e.g., given a :path of [:email] and a :pred of #(re-find email-regex %) we’d get ('re-find 'email-regex) and a path of :email) and then build a sequence of possible matches [[:email 're-find 'email-regex] [:email 're-find] [:email] ['re-find]] — and we’d look in the map of forms to errors for each of those in turn.


@madstap didn’t know about s/& - thanks, I’ll check it out!


So a given map might be

(def ^:private update-photo-fail-codes
  {:photoid           11301
   [:photoid '->long] 11302
   :caption           11401
   [:caption 'seq]    11402
   [:caption '<=]     11403})
— so 11302 is for photoid failing the ->long conforming predicate and 11301 is for “other” photoid failures (which is just required field missing in this case). 11402 is for caption being empty, 11403 is for caption being too long, and 11401 is for caption in general (in this case just “missing”).
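The most-specific-first lookup described above might be sketched roughly like this (match-error-code is a hypothetical name, not their actual code; the example map is repeated so the sketch is self-contained):

```clojure
(def ^:private update-photo-fail-codes
  {:photoid           11301
   [:photoid '->long] 11302
   :caption           11401
   [:caption 'seq]    11402
   [:caption '<=]     11403})

;; Hypothetical sketch: try the most specific candidate key first,
;; falling back to less specific ones. `some` with the map as the
;; predicate returns the first non-nil lookup.
(defn match-error-code [code-map path-elem pred-syms]
  (some code-map
        [(into [path-elem] pred-syms)   ; e.g. [:photoid '->long]
         [path-elem (first pred-syms)]  ; first pred subform only
         path-elem                      ; e.g. :photoid
         [(first pred-syms)]]))         ; pred alone, no path

(match-error-code update-photo-fail-codes :photoid ['->long]) ;; => 11302
(match-error-code update-photo-fail-codes :photoid [])        ;; => 11301
```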

Yehonathan Sharvit 18:03:39

I was hoping that clojure.spec would give something out of the box


I don’t think that’s really possible since the spec forms can be both arbitrarily complex and application-specific.


We deliberately structure our specs so that our “decoding” function works — i.e., if we find a particular spec doesn’t decode well, we refactor it (usually just a matter of creating named predicates).


It also takes a bit of trial and error sometimes to get the right set of structural keys in the error code map for certain errors we want to break out specifically.


All told, our decoding logic runs about 50 lines of somewhat dense code. Then we have a map for each spec that needs specific error messages.


And it’s taken two or three iterations on the decoding logic to be both general enough for all our specs and also specific enough to get the level of detail we need in error messages.


The nice thing about this approach is that specs are still clean and readable — and independent of the error messages we need to produce — and we can change the level of detail in how we report errors just by changing the map we pass into the decoding function (e.g., in the map I showed above, if we remove the [:photoid '->long] key, then :photoid will still be matched and we can treat it as “photo ID is required and must be numeric” instead of two separate errors — without changing the spec itself!).


wasn’t there something in spec that allowed you to define a modification of the conformed input?




maybe it used to be there - I remember a bit of documentation saying that you have the option to do it, but you shouldn’t in case a consumer of your spec output needs the original info


@stathissideris i think you’re thinking of conformer - why doesn’t it work for you / what do you need?


so I have this s/or that produces [:match …] vectors, and I’d like to get rid of the “tag” around the actual data


you mean that calling conform produces the tagged value?


you can’t change the result of conform - only the input in to conform


So you want to add (s/conformer second)?



(s/def ::boolean (s/and (s/or :b boolean?
                              :s (s/and #{"true" "false"}
                                        (s/conformer #(Boolean. %))))
                        ;; we don't care about the path, just the value
                        (s/conformer second)))


(not necessarily good practice — because downstream consumers of your spec are forced to take the conformed value and lose the original)


For some situations it’s “right” however.
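Repeating that spec for a self-contained check (against clojure.spec.alpha), the outer conformer strips the s/or tag:

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::boolean (s/and (s/or :b boolean?
                              :s (s/and #{"true" "false"}
                                        (s/conformer #(Boolean. %))))
                        ;; we don't care about the path, just the value
                        (s/conformer second)))

(s/conform ::boolean true)    ;; => true
(s/conform ::boolean "true")  ;; => true  (not [:s true])
(s/conform ::boolean "maybe") ;; => :clojure.spec.alpha/invalid
```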


@seancorfield exactly, but I was reading the documentation wrong and ended up using it incorrectly


many thanks for the example


that’s a neat trick @seancorfield 🙂


I found the description of conformer very confusing on the first few reads… but I’m not sure how to describe it any better so I haven’t submitted a PR.


It’s just an odd concept at first.


the reason I wanted that is that I’m using s/& in my function spec to enforce constraints between args, but the conformed args don’t play very well with my predicates


Yeah, we use it to hide the “implementation” of certain specs.


Wrong channel? 🙂


ah yes 🙂


@seancorfield I made a lib of these tricks, but I'm looking for something more generic:


Some thoughts: Instead of comparing against ::s/invalid, why not use the predicate s/invalid?; your conform-or-throw seems to be the built-in s/assert; I’d consider a macro to reduce the boilerplate of that last set of very similar predicates.
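For example, the s/invalid? predicate instead of an equality check (::n is a hypothetical spec for illustration):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::n int?)

;; s/invalid? is the supported way to test a conform result,
;; rather than comparing against the ::s/invalid keyword directly.
(let [c (s/conform ::n "oops")]
  (if (s/invalid? c) :failed c))
;; => :failed
```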


We have a macro that takes a predicate, a coercion, and an optional reverse coercion that lets us define specs of that pattern: satisfies “pred” or is a string that coerces to something that satisfies “pred” — and can be generated (by generating for the “pred” and then reverse coercing back to a string).
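A rough sketch of that pattern (api-spec and ::age are hypothetical names, not their actual macro; generation via s/gen additionally needs test.check on the classpath, so only validation is shown here):

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; Hypothetical sketch: the value satisfies pred, or is a string that
;; coerces (via coerce) to something satisfying pred. Generation works
;; by generating for pred, then reverse-coercing (uncoerce, default
;; str) back to a string.
(defmacro api-spec
  ([pred coerce] `(api-spec ~pred ~coerce str))
  ([pred coerce uncoerce]
   `(s/with-gen
      (s/or :value ~pred
            :string (s/and string?
                           (s/conformer
                             (fn [s#]
                               (try (~coerce s#)
                                    (catch Exception _# ::s/invalid))))
                           ~pred))
      #(gen/fmap ~uncoerce (s/gen ~pred)))))

(s/def ::age (api-spec int? #(Long/parseLong %)))

(s/valid? ::age 42)     ;; => true
(s/valid? ::age "42")   ;; => true
(s/valid? ::age "nope") ;; => false
```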


When do you reverse-coerce?


As part of generation. The default for most of our “API specs” is just str so we produce strings.


@seancorfield regarding s/assert I wrote my own that's independent of *compile-asserts* or check-asserts because it was part of my application logic and not just to find bugs. Anyway I hardly throw anymore and instead put the explain data in a stream. A macro combining pred and coercion is nice. I'll think about it.


But we have a few specs where we accept or coerce to a keyword so the reverse-coerce is name instead of str.


Ah yes, good point about *compile-asserts* etc.


I'm not into gen testing yet so this whole use case escapes me.


I used to allow strings padded by whitespace and trim them before coercing to keyword. A reverse coercion would miss that, wouldn't it?


But it sounds like a regex spec would know how to generate these things instead of reverse coercion. Could you eliminate reverse-coerce in your case?


With regex as the “spec”, I use test.chuck’s regex generator.


And, yes, if my spec trims whitespace before validation, I would likely not bother randomly generating strings with additional whitespace (I would instead have a single test-by-example that had whitespace to verify that validation accepted it).


It really depends on what aspects of the spec I consider important enough to be part of the generated test cases. For example, we have a spec for validating member-supplied passwords, but we use a regex-generator for it that produces only a subset of possible valid input values.


(it’s not worth the effort to accurately generate all possible edge cases in that situation since the random-but-subset-conforming password values are “good enough” as test data)