#clojure-spec
2016-12-13
j-po04:12:50

I'm trying to spec something like s-expressions, and my current flailing in that direction involves a recursive s/tuple (something like

(s/def ::sexp
  (s/alt :arg integer?
         :expression (s/tuple fn?
                              ::sexp
                              (s/* ::sexp))))
), but I'd like for the input not to have to be a vector of vectors. Am I better off just turning the input into such a vector as part of the checking process, or is there another way?

Alex Miller (Clojure team)05:12:49

then you can just replace ::sexp (s/* ::sexp) with (s/+ ::sexp) too

Alex Miller (Clojure team)05:12:47

(s/def ::sexp (s/alt :arg integer? :expression (s/cat :f fn? :args (s/+ ::sexp))))

j-po09:12:11

@alexmiller Thanks! When I gen/generate from an s/cat approach, though, I don't get a sequence of sequences, just one flat one.
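The flat result follows from how regex ops compose: s/alt, s/cat, and s/+ all describe one sequence, so a reference to a regex spec inside another regex op is spliced in rather than nested. A minimal sketch of one way to get nesting, assuming the `clojure.spec.alpha` namespace (plain `clojure.spec` at the time of this chat): s/or is not a regex op, so `::sexp` inside `s/+` then matches a single element, and the `s/spec` wrapper marks the s/cat explicitly as its own nested sequence. This is a variation on the spec above, not something proposed in the conversation.

```clojure
(require '[clojure.spec.alpha :as s])

;; s/or (not s/alt) at the top, so a bare integer is a valid ::sexp and
;; each ::sexp inside s/+ matches exactly one element of the sequence.
;; s/spec makes explicit that the s/cat describes a nested sequence.
(s/def ::sexp
  (s/or :arg integer?
        :expression (s/spec (s/cat :f fn? :args (s/+ ::sexp)))))

(s/valid? ::sexp [+ 1 [- 2 3]]) ; nested call
(s/valid? ::sexp [+ 1 - 2 3])   ; flat splice: - is neither an integer nor a sequence
```

Only validation is shown here; fn? has no built-in generator, so generating from this spec would need a custom generator for the :f position.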

vandr0iy09:12:27

I was asking stuff in the beginners channel, but probably this is a better place for doing that. I posted this snippet of code, claiming that it doesn't work:

(let [arst {:a [1 2 3]
            :b {:c #{::z ::x ::c}}}]
  (s/keys :req-un (get-in arst [:b :c])))
And I was told that :req-un wants a literal vector of namespaced keywords, not something that evaluates to one, because of it being a macro (obviously) - which makes sense. So I went a step further, and macro'd my stuff like this:
(def my-bunch-of-stuff {:a {:b [:x :c :d]}
                      :z {:b vector?}}

(defmacro arst [type]
  (let [sp# (get-in my-bunch-of-stuff [type :b])]
    (if (fn? sp#)
      `(s/spec ~sp#)
      `(s/spec (s/keys :req-un ~(vec (map #(->> % str keyword) sp#)))))))
and it still doesn't work OOB; while, if I copy-paste the result of macroexpand '(arst :a) in the REPL - it works flawlessly. What's going on here? I'm afraid that there's some rookie mistake here somewhere...

mpenet09:12:03

you need to use either a macro or eval:

mpenet09:12:08

(let [arst {:a [1 2 3]
            :b {:c #{::z ::x ::c}}}]
  (eval `(s/keys :req-un ~(get-in arst [:b :c]))))

mpenet09:12:45

(s/keys is a macro)

mpenet09:12:07

and you might actually need to call seq or vec on the result of the get-in, not sure s/keys will happily take a set as :req-un value
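A minimal sketch of the eval approach with the set converted to a vector first, assuming the `clojure.spec.alpha` namespace. The names `config` and `keys-spec` are made up for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical data holding the keys we want to require.
(def config {:b {:c #{::z ::x ::c}}})

;; s/keys is a macro, so the key vector has to be spliced into the form
;; before it is expanded. vec turns the set into a literal vector.
(def keys-spec
  (eval `(s/keys :req-un ~(vec (get-in config [:b :c])))))

(s/valid? keys-spec {:z 1 :x 2 :c 3}) ; all three unqualified keys present
(s/valid? keys-spec {:z 1})           ; :x and :c are missing
```

With :req-un the qualified keywords in the vector are matched against their unqualified names (:z, :x, :c) in the map.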

mpenet09:12:56

your macro should work (kinda, you need namespaced keys in your bunch-of-stuff and there's a paren missing there too)

vandr0iy09:12:35

I'm sorry for adding the macro after the request, I accidentally pressed enter before completing my post. My keys aren't namespaced - I "namespace" them with the (map #(->> % str keyword) sp#) thing. What puzzles me is: why does this macro work if I try to do (arst :z) - which takes a function from the bunch-of-stuff and spits a spec out of it - and doesn't if I get a vector of un-namespaced keywords and "namespace" them with the dirty hack described above??? I (macroexpand '(arst :a)), and copy-paste what I get into the REPL - and it works, but it doesn't if I just try to do (arst :a). I'm pretty sure that ~(vec (map #(->> % str keyword) sp#)) yields a vector of namespaced keywords...

mpenet09:12:13

it doesn't, I think: it creates keys with ":foo" content, with a leading :, so yeah, ::foo, but not namespaced per se

mpenet09:12:47

try running your spec, it will complain about this. Try changing your keys in bunch-of-stuff to ns keys and you'll see

vandr0iy09:12:31

Oh... I see. Was able to make it work using this other horrible hack:

(defmacro arst [type]
  (let [sp# (get-in settings [type :b])]
    (if (fn? sp#)
      `(s/spec ~sp#)
      `(s/spec (s/keys :req-un ~(vec (map #(keyword (str *ns* "/" (name %))) sp#)))))))
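The difference between the two hacks can be checked directly at the REPL. The sketch below uses only clojure.core; "my.app" is a placeholder namespace name chosen for illustration.

```clojure
;; (str :foo) keeps the colon, so the new keyword's *name* is ":foo"
;; and it has no namespace part at all.
(def unqualified (keyword (str :foo)))
(namespace unqualified) ; nil
(name unqualified)      ; ":foo"
(pr-str unqualified)    ; "::foo", which only looks namespaced

;; Passing namespace and name separately gives a truly qualified keyword.
(def qualified (keyword "my.app" (name :foo)))
(namespace qualified)   ; "my.app"
```

This also accounts for the copy-paste puzzle: the expansion prints as ::foo, and when that text is pasted back in, the reader resolves ::foo to a keyword qualified with the current namespace, which s/keys accepts.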

triss09:12:51

how do I write a spec for an instance of Comparable?

vandr0iy10:12:49

(s/valid? (s/spec #(instance? Comparable %)) "arst")

triss10:12:16

thanks @vandr0iy . I’ll give that a try

triss10:12:26

that worked. Brilliant thanks.

triss10:12:02

If I want a generator that produces a wider range of values than the ones provided for double-in and int-in how would I go about getting one of those?

triss10:12:34

all the random numbers I’m getting are pretty small and I need to test against some big ones.

mpenet10:12:02

test.check has generators that allow you to specify a range

mpenet10:12:14

large-integer* and double* I think

triss10:12:33

cheers @mpenet I’ll have a look

triss10:12:05

(gen/sample (gen/double* {:min 1 :max 200})) still produces really small values

triss10:12:23

I’d like to see some values closer to 200. Is this possible?

mpenet11:12:40

Try sampling more values

triss11:12:19

ah ok.. looks like generators probably aren’t going to be efficient enough for the way I was abusing them.

triss11:12:46

If I wanted a more evenly distributed generator would it be possible to write one?

luxbock12:12:58

@triss yes but you need to use the generators in test.check

gfredericks12:12:08

@triss what's a uniform distribution over doubles?

gfredericks12:12:57

@triss oh I think this is a sizing issue, sorry I didn't read back far enough

triss12:12:09

@gfredericks ah I can see why that might be awkward. rand seems more even than this though

gfredericks12:12:16

@triss you should get lots of values close to 200, but gen/sample is deliberately only showing you small examples

gfredericks12:12:39

@triss try (gen/sample ... 200)

luxbock12:12:45

is there any reason why s/spec couldn't accept a third optional argument for defining a custom generator? I think it would make it easier to statically analyze specs and might end up with some cool tooling use cases

gfredericks12:12:56

@triss test.check has some subtleties with sizing that need to be documented better, and I have 45 minutes free right now so I think I will start on that

triss12:12:17

@gfredericks I still see a similar distribution. biased to low numbers

triss12:12:31

ah many thanks @gfredericks will be much appreciated

gfredericks12:12:40

@triss it's not going to be uniform, but you should definitely get some larger numbers

gfredericks12:12:15

@triss one reason I asked my original question about what a uniform distribution is, is because doubles themselves aren't uniform -- there are a lot more numbers between 1 and 50 than 50 and 200

gfredericks12:12:43

so uniform might mean different things in different contexts

triss12:12:45

oh that’s interesting. because of the way they are represented in the machine?

gfredericks12:12:55

you could say that

gfredericks12:12:10

it's kind of inherent in the idea of floating point

triss12:12:11

I guess I’d like a distribution of Real numbers? does that make more sense?

triss12:12:30

^a more even distribution of real numbers

gfredericks12:12:37

not really, but I think I know what you're trying to get at

gfredericks12:12:59

@triss you could use large-integer to get this pretty easily

triss12:12:04

I’d love to see a flatter histogram!

gfredericks12:12:05

or at least something close

gfredericks12:12:11

yes that's a good way of putting it :)

triss12:12:41

ah brilliant will look at large-integer

gfredericks12:12:58

(gen/let [x (gen/large-integer* {:min 0 :max 200000000})] (/ x 1000000.0))

gfredericks12:12:03

@triss ⇑ something like that

gfredericks12:12:21

it won't get you every double in that range, but it might be okay for what you're doing

gfredericks12:12:42

and it wouldn't be hard to make it fancier so that it hits most things

triss12:12:23

thanks. this will probably do nicely.

gfredericks12:12:32

oh ha yes I forgot that

triss12:12:01

looks like I won’t be generating my population of genotypes with spec or generators.

triss12:12:24

or hang on I can write a custom one?

gfredericks12:12:24

@triss try (gen/let [x (gen/choose 0 200000000)] (/ x 1000000.0))

gfredericks12:12:02

@triss ⇑ these essentially are custom generators

gfredericks12:12:11

it's all about composing lower-level generators to get what you want

gfredericks12:12:27

gen/choose is one of the lowest-level ones, and gives you a uniform distribution over a range of integers
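A small sketch of the gen/choose composition as a named generator, assuming test.check is on the classpath and `gen` aliases `clojure.test.check.generators`. The name `uniform-double` is made up for illustration, and gen/fmap is used here in place of gen/let; the two are interchangeable for this case.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Uniform over the integers 0..200000000, then scaled to doubles
;; between 0.0 and 200.0 in steps of one millionth.
(def uniform-double
  (gen/fmap #(/ % 1000000.0) (gen/choose 0 200000000)))

;; gen/choose ignores the size parameter, so even the first few
;; samples are spread over the whole range.
(gen/sample uniform-double 5)
```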

triss12:12:47

gen/choose is perfect for what I’m doing I think....

gfredericks12:12:50

@triss now that I think about it, I might recommend the large-integer approach anyhow, but it's tricky to explain why

mpenet12:12:00

worst case you can cheat the thing with (gen/fmap #(do whatever you want) (gen/return nil))

triss12:12:28

ok - I need to scratch my head about composing these things for a while....

triss12:12:20

wonderful thanks.

gfredericks12:12:56

@triss w.r.t. choose vs large-integer, the short explanation is that test.check has a strategy where it tries small things first and slowly tries larger and larger things; by using choose you're overriding that strategy, and that strategy is the reason you saw an uneven distribution with large-integer

gfredericks12:12:43

which is best depends on your goals, but I'd say large-integer would be a good default

triss12:12:53

would you consider using generators outside of testing code foolish?

triss12:12:25

it seems very much tailored for testing.... (as the namespace indicates I suppose)

gfredericks12:12:08

@triss clojure.spec uses them for a bit more than test.check does, but still for testing purposes; if you're using them for not-testing-at-all, it will be a little weird but not terrible; understanding the sizing subtleties could be more important though

gfredericks12:12:19

I suppose based on that jungle music talk at the conj I should expect people will be using generators for all sorts of things

gfredericks12:12:02

@triss the gen/generate function will probably be useful for you, since it accepts a size parameter

gfredericks12:12:43

e.g., (repeatedly 200 #(gen/generate (gen/let ...) 200)) should give you the distribution you wanted even using large-integer
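Spelled out, that suggestion might look like the sketch below. It assumes test.check is on the classpath with `gen` aliasing `clojure.test.check.generators`, and uses `large-integer*`, the variant that takes a :min/:max map. The names `sized-double` and `population` are made up for illustration.

```clojure
(require '[clojure.test.check.generators :as gen])

(def sized-double
  (gen/let [x (gen/large-integer* {:min 0 :max 200000000})]
    (/ x 1000000.0)))

;; gen/sample starts at size 0 and grows, so its first values are small.
(gen/sample sized-double 10)

;; gen/generate takes an explicit size as its second argument;
;; asking for size 200 every time lets large values show up.
(def population
  (repeatedly 200 #(gen/generate sized-double 200)))
```

The result is still not a uniform distribution, but values near the top of the range should appear regularly.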

ag22:12:13

Guys, can someone help me with this: I have a long string of lorem ipsums, something like this "pellentesq dapib suscip liguldon posue augquaeti v tort ..." and now I need a spec with a generator that would generate a name of 3 to 5 words long by randomly pulling those words from that lorem-ipsum string. How do I do that?

adambrosio23:12:43

(def s "pellentesq dapib suscip liguldon posue augquaeti v tort")
(clojure.string/join " " (take 3 (shuffle (clojure.string/split s #" "))))
;=> "augquaeti posue v"
@ag does this look right?

adambrosio23:12:57

oh you said generator… hmm

j-po23:12:46

You could make a generator for sequences fitting that format. Composing in string/join is the unknown bit, but it seems doable.

j-po23:12:31

You could do that in a custom generator, for instance, but then you'd have a custom generator 😕

adambrosio23:12:57

(gen/sample (gen/bind (gen/shuffle (str/split "foo bar baz qux asd wet ab" #" "))
                      #(gen/frequency [[5 (gen/return (take 3 %))]
                                       [5 (gen/return (take 5 %))]])))
;=> (("foo" "baz" "bar" "ab" "wet") ("foo" "ab" "baz") ("wet" "bar" "baz") ("baz" "bar" "ab") ("bar" "ab" "baz") ("foo" "ab" "baz") ("foo" "wet" "asd" "ab" "baz") ("asd" "wet" "ab") ("qux" "wet" "foo" "bar" "baz") ("ab" "bar" "asd" "foo" "baz"))

ag23:12:59

yeah but I can’t make this work, I got this far:

(gen/generate
 (s/with-gen string? #(gen/generate (gen/shuffle (clojure.string/split lorem-ipsum #"\s")))))

adambrosio23:12:14

got it to alternate between take 3 and take 5

adambrosio23:12:18

i think you see the pattern

ag23:12:26

oh, let me try your snippet

adambrosio23:12:33

just add an entry for (take 4 %) with w/e freq you want

adambrosio23:12:05

i think the only reason i came up with that so fast was the time i spent in haskell 😅

ag23:12:34

@adambros oh this is awesome, now I wonder how the gen/frequency part could be generalized

gfredericks23:12:25

(gen/vector (gen/elements the-words) 3 5)

gfredericks23:12:02

should be more efficient than the shuffler too since it doesn't do lots of shuffling work just to throw it away

gfredericks23:12:24

only difference is it won't give you distinct elements

ag23:12:10

@gfredericks oh, yeah, makes sense

adambrosio23:12:48

that's interesting, so i guess it depends on if you want/need distinct elements

gfredericks23:12:30

if you need distinct, wrapping it in such-that might be efficient, as long as the collection is large enough

ag23:12:09

errrh… now how do I actually emit strings? str/joined? I am trying to wrap it in gen/fmap - doesn’t work ;(

ag23:12:01

nevermind got it!

ag23:12:23

(gen/fmap
    #(clojure.string/join " " %)
    (gen/bind (gen/shuffle (clojure.string/split lorem-ipsum #" "))
              #(gen/vector (gen/elements %) 1 5)))
Thanks everyone!
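For the original request (3 to 5 words, attached to a spec), one possible sketch follows. It assumes `clojure.spec.alpha` plus test.check on the classpath, and swaps in gen/vector-distinct for the such-that wrapper mentioned above, so the words in a name do not repeat. The spec name `::name` and the vars `the-words` and `name-gen` are made up for illustration.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str]
         '[clojure.test.check.generators :as gen])

(def lorem-ipsum "pellentesq dapib suscip liguldon posue augquaeti v tort")
(def the-words (str/split lorem-ipsum #" "))

;; 3 to 5 distinct words picked from the list, joined with spaces.
(def name-gen
  (gen/fmap #(str/join " " %)
            (gen/vector-distinct (gen/elements the-words)
                                 {:min-elements 3 :max-elements 5})))

;; s/with-gen expects a no-arg fn that returns a generator,
;; not a call that already produces a value.
(s/def ::name (s/with-gen string? (fn [] name-gen)))

(gen/generate (s/gen ::name))
```

If repeated words are acceptable, (gen/vector (gen/elements the-words) 3 5) can replace the vector-distinct call.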