
I'm getting some very robust results from adding constraints to schemas. In particular, because the constraints live inside the schema, generators can be much more specific. E.g., the generator for [:and int? [:> 739] [:< 741]] almost always fails, but this always succeeds:

(is (= 740 (mg/generate [:int {:> 739 :< 741}])))

👀 2

I haven't implemented the generators yet for this, but another interesting example:

(is (= ["should be distinct: 3 provided 2 times"
          "should be sorted: index 1 has 3 but expected 2"]
         (me/humanize (m/explain
                        [:sequential {:sorted true
                                      :distinct true} :any]
                        [1 3 3 2]))))
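For readers following along, here is a minimal plain-Clojure sketch of how checks like these could produce the two messages above. The function names are hypothetical, and comparing against the sorted input is an assumption, not necessarily how the constraint branch does it.

```clojure
(defn distinct-errors
  "Hypothetical helper: one message per value that occurs more than once."
  [xs]
  (for [[x n] (frequencies xs)
        :when (> n 1)]
    (str "should be distinct: " x " provided " n " times")))

(defn sorted-errors
  "Hypothetical helper: reports the first index that differs from (sort xs)."
  [xs]
  (->> (map vector (range) xs (sort xs))
       (remove (fn [[_ actual expected]] (= actual expected)))
       (take 1)
       (map (fn [[i actual expected]]
              (str "should be sorted: index " i " has " actual
                   " but expected " expected)))))

(defn constraint-errors
  "Collects the messages from both checks."
  [xs]
  (vec (concat (distinct-errors xs) (sorted-errors xs))))

(constraint-errors [1 3 3 2])
;=> ["should be distinct: 3 provided 2 times"
;    "should be sorted: index 1 has 3 but expected 2"]
```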


To give some idea of the improved generator expressivity, this is another way of writing the schema in the OP:

(is (= 740 (mg/generate [:int {:and [[:not [:<= 739]]
                                       [:not [:>= 741]]]}])))

Ben Sless13:04:44

How's that different from min and max properties?

Ben Sless13:04:25

For other types that's pretty neat


superficially, min/max is inclusive (<=/>=). But on a fundamental level, this is no different than properties. This is a propositional logic where the atomic propositions are things like min/max/sorted/distinct/</>.

👍 1

It would be trivial to add </> support in the same way as min/max, but I think this abstracts over that entire idea.
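To make the "propositional logic" framing concrete, here is a rough sketch of an evaluator over constraint propositions in plain Clojure. `holds?` is a hypothetical name, not part of malli; the only behavior taken from the thread is that :min/:max are inclusive while </> are strict.

```clojure
(defn holds?
  "Hypothetical sketch: does proposition `prop` hold for the number `x`?"
  [prop x]
  (let [[op & args] prop]
    (case op
      :and (every? #(holds? % x) args)
      :or  (boolean (some #(holds? % x) args))
      :not (not (holds? (first args) x))
      :<   (< x (first args))
      :<=  (<= x (first args))
      :>   (> x (first args))
      :>=  (>= x (first args))
      :min (>= x (first args))    ; inclusive lower bound
      :max (<= x (first args))))) ; inclusive upper bound

;; The two spellings from earlier in the thread.
(def strict  [:and [:> 739] [:< 741]])
(def negated [:and [:not [:<= 739]] [:not [:>= 741]]])

(filter #(holds? strict %) (range 735 745))  ;=> (740)
(filter #(holds? negated %) (range 735 745)) ;=> (740)
```

Both spellings accept exactly 740, which is the kind of equivalence a generator could exploit.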

Ben Sless17:04:30

Which means it has algebra, neat


I thought you would enjoy 🙂

catjam 1
Ben Sless17:04:10

btw, if you're doing strings, it's worth it to add some category predicates, such as alpha, alnum, numeric, etc


oh yeah, and there's already generators for that. that would work nicely.

Ben Sless17:04:32

Exactly, and it can be implemented efficiently using predicates

👏 1
Ben Sless17:04:20

Then just for fun you could add :re and we could get rid of regex schemas which aren't comparable because java regexes don't have value semantics (boo)


oh, and test.chuck has a re string generator.


is there a generator for :re?

Ben Sless17:04:16

It uses test.chuck if it's on the class path

👍 1
Ben Sless17:04:33

And sorry to make more work for you but don't forget time schemas


elaborate your ideas


I want to really test this abstraction.

Ben Sless17:04:15

I mean these, I worked really hard on that sh*t would be a shame if they don't get to enjoy the new constraints features


ah, and which constraints are you thinking?


like before/after?

Ben Sless17:04:55

Yes. They already support min and max, so mostly make sure they aren't left out of the party


Nice. are min/max inclusive here?

Ben Sless17:04:44

Check line 30


and there's no generators currently hooked up?

Ben Sless17:04:57

There are. Getting them right was a pain


nice. yes I've been there.


This propositional logic might get us out of :fn guard hell. pretending we don't have nat-int? for a second, this could be the schema for nth:

[:=> {:refine [v i]
       [:<= 0 i]
       [:< i [:count v]]]}
 [:cat :vector :int]


I have no idea what implications that has but just an idea that popped up.

Ben Sless17:04:11

This is starting to look like a where clause in datomic 🙂

Ben Sless18:04:46

Though it loses some information, the return type isn't any, it's a function of the vector index


right, but I'm still working on polymorphic schemas 🙂


If I'm unabashedly wishfully thinking:

(m/all [x]
       [:=> {:and (m/refine [v i]
                             [:<= 0 i]
                             [:< i [:count v]]])}
        [:cat [:vector x] :int]

Ben Sless18:04:12

That doesn't handle a heterogeneous vector, right? Still, I'd buy it


might be more like:

(m/all [x :*, n :- nat-int?]
       [:=> [:cat [:tuple [:.. x]] [:= n]]
        [:..nth x n]])


where [:..] is regex that expands at "instantiation time" of the all.


[:inst ::nth [:cat :a :b :c] 2] =>

[:=> [:cat [:tuple :a :b :c] [:= 2]] :c]


and the [:..nth schema implicitly relates x and n?


there's a bunch of details I'm working out to get something like this, but I think we can look to type theory for inspiration on how to create an expressive specification language.


the main caveat is that you can't really instrument a polymorphic fn, so you'd need to compile a monomorphic one for instrumentation, and just use the polymorphic one for generation. Still working on how to do that. like, one detail here is that if you instantiate n to nat-int to generate the most general version of the schema for instrumentation purposes, you get [:= nat-int] as the second arg. Obviously not what you want. Perhaps n is really a spec here instead of a nat. Stuff like that.


Would be curious to know what this would look like starting from your observation that everything resembles datalog, but I don't know how to create a verification system from that foundation, so that's for someone else to explore 🙂


my formal training in type theory is leaking through exposing my biases/limitations. it's my one trick.


fleshed it out a bit, I think this should check length bounds for instrumentation and additionally generators of this schema will know how to generate good returns.

(m/all [x :*, n :< nat-int?]
       [:=> {:and (m/refine {{:keys [v i]} :args}
                            [:< i [:count v]])}
        [:catn
         [:v [:schema x]]
         [:i n]]
        [:..nth x n]])

[:inst ::nth [:cat :a :b :c] [:= 2]]
;=> [:=> {:and ..} [:cat [:schema [:cat :a :b :c]] [:= 2]] :c]
[:inst ::nth [:* :int] nat-int?]
;=> [:=> {:and ..} [:cat [:schema [:* :int]] nat-int?] :int]
;; for instrumentation:
[:inst ::nth [:* :any] nat-int?]
;=> [:=> {:and ..} [:cat [:schema [:* :any]] nat-int?] :any]


The trick is that [:..nth [:* :int] nat-int?] => :int and [:..nth [:cat :a :b] [:= 0]] => :a. And probably [:..nth [:cat :a :b] nat-int?]=>[:or :a :b].
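As a rough illustration of those three rewrites, here is a sketch that treats [:..nth s n] as an ordinary function over schema data. `nth-schema` is a hypothetical name and it only covers the :* and :cat cases mentioned above.

```clojure
(defn nth-schema
  "Hypothetical sketch: the element schema selected from sequence schema `s`
  by index schema `n`. Only :* and :cat are handled."
  [s n]
  (let [[tag & children] s
        ;; [:= k] pins the index to a literal; anything else leaves it unknown
        literal (when (and (vector? n) (= := (first n)))
                  (second n))]
    (case tag
      :*   (first children)            ; homogeneous: every index has the same schema
      :cat (if literal
             (nth children literal)    ; known index selects one child
             (into [:or] children))))) ; unknown index: any child is possible

(nth-schema [:* :int] nat-int?)     ;=> :int
(nth-schema [:cat :a :b] [:= 0])    ;=> :a
(nth-schema [:cat :a :b] nat-int?)  ;=> [:or :a :b]
```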

Ben Sless19:04:36

Why aren't you speccing it like a map? I guess heterogeneous and homogeneous vectors could be specced differently


this was my only idea, what were you thinking?

Ben Sless19:04:56

Too tired to be coherent, but just however you'll describe a map if the vector is heterogeneous. A homogeneous vector could be defined like you have now. The definitions should overlap for homogeneous vector, too


I like where this is heading. It is much cleaner to have a :map with constraints in the properties versus having to nest it inside :and or :fn etc. The symmetry with the actual shape is maintained.

❤️ 2

For instance, I have been working on some visualizations of schemas and it requires walking the schema tree and removing all the nodes that are really constraints, since the `:maps` are the bits I am interested in.

Ben Sless03:04:06

Suggestion: use the format property like json schema does. Thinking forward to things like email, IP, etc


@UK0810AQ2 are you referring to the :format field in malli.json-schema? I don't follow.

Ben Sless04:04:38

For strings, instead of :alpha true, :format :alpha

Ben Sless04:04:56

A single key to look at


It's just sugar for an atomic proposition: {prop true} => {:and [[prop]]}. {:alpha true :numeric true :max 10} => {:and [[:alpha] [:numeric] [:max 10]]}
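A minimal sketch of that desugaring rule in plain Clojure, for illustration only. `desugar` is a hypothetical name; sorting by key is just to make the output deterministic, and an :and key already present in the map is not handled here.

```clojure
(defn desugar
  "Hypothetical sketch: rewrites a property map as a single conjunction.
  {prop true} becomes the bare atom [prop]; any other value is kept as
  an argument, e.g. {:max 10} becomes [:max 10]."
  [props]
  {:and (into []
              (map (fn [[k v]] (if (true? v) [k] [k v])))
              (sort-by key props))})

(desugar {:alpha true :numeric true :max 10})
;=> {:and [[:alpha] [:max 10] [:numeric]]}
```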


You can even do {:gen/alpha true, :alphanumeric true} to generate alpha in generators and alphanumeric in validators.


though I haven't implemented it, that's how I've set up the syntax.


I've only done the validator side atm. there [:gen/alpha] expands to the top proposition [:any] so it usually has no effect on the proposition during validation.


to give you an idea, these are the propositions I've supported for :string alone, along with their :gen version. they all use the {prop true} syntax:

:max :min :alphanumeric :non-alphanumeric :letters :non-letters :numeric :non-numeric :alpha :non-alpha :sorted :distinct :palindrome :trim :triml :trimr :trim-newline :blank :non-blank :escapes :includes


it's easily extensible


well, {:includes "foo"} and

{:escapes {\- "_MINUS_"}}
actually make use of their val.


and min/max of course.

Ben Sless05:04:33

What threw me for a loop is that (and alpha numeric) is an empty set

Ben Sless05:04:04

The sugar is confusing imo


{:max 10 :or [[:alpha] [:numeric]]} rather


{:distinct true :sorted true :max 10} is a better example of a conjunction.


The rule is the same as always: {:min 10 :max 12} is a conjunction. we just have more atomic propositions now.

Ben Sless05:04:29

Yes, but the code for handling them would be annoying

Ben Sless05:04:07

{:format [:and :alpha :distinct]} is clearer for the reader and simpler to implement imo

Ben Sless05:04:57

It's uniform and simple to compile to a bunch of predicates


It's a dozen lines to desugar in a fully extensible way. Plus I want to save the keyword syntax for [:contains K] sugar like s/keys.

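(defn -constraint-from-properties [properties constraint-opts options]
  (let [{:keys [flat-property-keys nested-property-keys]} (->constraint-opts constraint-opts)]
    (when-some [cs (-> []
                       ;; assumed: nested keys (e.g. :and) hold a vector of constraints to splice in
                       (into (keep #(when-some [[_ v] (find properties %)]
                                      (into [%] v)))
                             nested-property-keys)
                       ;; assumed: flat keys (e.g. :max) carry a single value
                       (into (keep #(when-some [[_ v] (find properties %)]
                                      (conj [%] v)))
                             flat-property-keys)
                       ;; assumed: no constraints at all yields nil
                       not-empty)]
      (if (= 1 (count cs))
        (first cs)
        (into [:and] cs)))))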


[:and :x :y] instead of [:and [:contains :x] [:contains :y]].


At least for schemas that support contains?. Haven't decided whether to allow unwrapped keywords in other schemas yet.


Let's say I do allow unwrapped keyword in say :string . I think {:and [:alpha :distinct]} is clearer than {:format [:and :alpha :distinct]}.


mainly because "format" isn't a proposition.


It's unclear how far this concept can go, but I want the user to think property == proposition. And property map == conjunction.

Ben Sless06:04:26

yeah, format isn't a proposition, it's more like a predicate


earlier I renamed :and to things like :keyset :keys . I think once I saw the abstraction of everything is a proposition, the current philosophy started to make sense to me.


every single schema can support :and , and it will make sense.


I think :format for :string is similar to my thinking of :keyset constraints for :map. We can abstract over the concept of "thing is true for schema".

Ben Sless06:04:10

my thinking about it is tainted by how json schema does it


yeah and my thinking about properties was tainted by min/max. It was tailored for each schema in different ways and didn't seem like a coherent abstraction beyond syntax.


but that was my thinking too: a property is something tied to the schema type.


until like yesterday. so it's a new idea.


here's a good existing example of using true.

  [:double {:gen/infinite? true, :gen/NaN? true}]
  {:seed 1})
They are both atomic propositions. You could use them like {:gen/xor [:gen/infinite? :gen/NaN?]} to never mix ##Inf and ##NaN in the same run (in theory, let's see when the rubber meets the road).

Oliver Marks15:04:16

I am using the malli reitit ring middleware and everything is working, but there is one thing I am curious about: can you adjust the humanized response to include keys?

"humanized":["invalid type","invalid type"]
In the above example it tells me there are 2 errors, but it does not include the key in the response, so I don't know which field had the error. Is there a way to add this information, or to map it to the submitted data, so that I can put an error marker on the form fields on the frontend? I wanted to check before I roll some kind of workaround. This sounds like it may do the job, but the example is for manual validation, so I'm not sure whether I can modify the ring reitit middleware in the same way.