#malli
2024-04-02
ambrosebs04:04:11

Having some very robust results adding constraints to schemas. https://github.com/metosin/malli/pull/1025 In particular, because the constraints are inside the schemas, generators are much more specific. e.g., the generator for [:and int? [:> 739] [:< 741]] almost always fails but this always succeeds.

(is (= 740 (mg/generate [:int {:> 739 :< 741}])))

👀 2
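A rough illustration of why in-schema bounds help generation, in plain Clojure with no malli dependency. The helper names and the sampling range are hypothetical and not taken from the PR: filtering random candidates rarely hits a one-value window, while building the value from the bounds cannot miss.

```clojure
;; Hypothetical helpers for illustration only, not malli's implementation.

(defn gen-by-filtering
  "Draws up to `tries` random ints in [0, 1000000) and returns the first one
  satisfying `pred`, or nil if none does."
  [pred tries]
  (first (filter pred (repeatedly tries #(rand-int 1000000)))))

(defn gen-from-bounds
  "Picks a random int strictly between `lo` and `hi` by construction.
  Assumes at least one integer lies between them."
  [lo hi]
  (+ lo 1 (rand-int (- hi lo 1))))

(gen-from-bounds 739 741)              ; always 740
(gen-by-filtering #(< 739 % 741) 100)  ; nil in the vast majority of runs
```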
ambrosebs05:04:50

I haven't implemented the generators yet for this, but another interesting example:

(is (= ["should be distinct: 3 provided 2 times"
          "should be sorted: index 1 has 3 but expected 2"]
         (me/humanize (m/explain
                        [:sequential {:sorted true
                                      :distinct true} :any]
                        [1 3 3 2]))))
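For readers wondering how those two messages relate to the input, here is a minimal plain-Clojure sketch. The function names are made up, and the message wording is simply copied from the example above; this is not the PR's implementation.

```clojure
;; Hypothetical checkers; only the message text comes from the chat example.

(defn distinct-errors
  "One message per value that occurs more than once in `coll`."
  [coll]
  (for [[v n] (frequencies coll)
        :when (> n 1)]
    (str "should be distinct: " v " provided " n " times")))

(defn sorted-error
  "Message for the first index where `coll` differs from its sorted form,
  or nil when `coll` is already sorted."
  [coll]
  (first
    (keep-indexed
      (fn [i [actual expected]]
        (when (not= actual expected)
          (str "should be sorted: index " i " has " actual
               " but expected " expected)))
      (map vector coll (sort coll)))))

(distinct-errors [1 3 3 2]) ;=> ("should be distinct: 3 provided 2 times")
(sorted-error [1 3 3 2])    ;=> "should be sorted: index 1 has 3 but expected 2"
```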

ambrosebs05:04:59

To give some idea of the improved generator expressivity, this is another way of writing the schema in the OP:

(is (= 740 (mg/generate [:int {:and [[:not [:<= 739]]
                                       [:not [:>= 741]]]}])))

Ben Sless13:04:44

How's that different from min and max properties?

Ben Sless13:04:25

For other types that's pretty neat

ambrosebs17:04:06

superficially, min/max is inclusive (<=/>=). But on a fundamental level, this is no different than properties. This is a propositional logic where the atomic propositions are things like min/max/sorted/distinct/</>.

👍 1
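To make the "propositional logic over atomic propositions" idea concrete, here is a tiny evaluator in plain Clojure. It is only a sketch under assumptions: `holds?` is a hypothetical name, it handles integers only, and the vector syntax mimics the examples in this thread rather than the PR's actual internals.

```clojure
(defn holds?
  "True when the integer `x` satisfies the proposition `prop`.
  Atomic propositions compare `x` to a bound; :min/:max are the inclusive
  forms, :< and :> the exclusive ones."
  [prop x]
  (let [[op & args] prop]
    (case op
      :and (every? #(holds? % x) args)
      :or  (boolean (some #(holds? % x) args))
      :not (not (holds? (first args) x))
      :<   (< x (first args))
      :<=  (<= x (first args))
      :>   (> x (first args))
      :>=  (>= x (first args))
      :min (>= x (first args))
      :max (<= x (first args))
      (throw (ex-info "unknown proposition" {:prop prop})))))

;; Both spellings of the schema from the first message accept only 740.
(holds? [:and [:> 739] [:< 741]] 740)                  ;=> true
(holds? [:and [:not [:<= 739]] [:not [:>= 741]]] 740)  ;=> true
(holds? [:and [:not [:<= 739]] [:not [:>= 741]]] 739)  ;=> false
```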
ambrosebs17:04:22

It would be trivial to add </> support in the same way as min/max, but I think this abstracts over that entire idea.

Ben Sless17:04:30

Which means it has algebra, neat

ambrosebs17:04:53

I thought you would enjoy 🙂

catjam 1
Ben Sless17:04:10

btw, if you're doing strings, it's worth it to add some category predicates, such as alpha, alnum, numeric, etc

ambrosebs17:04:48

oh yeah, and there's already generators for that. that would work nicely.

Ben Sless17:04:32

Exactly, and it can be implemented efficiently using https://docs.oracle.com/javase/8/docs/api/java/lang/Character.html predicates

👏 1
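A possible shape for such category predicates on the JVM, using static methods of `java.lang.Character`. The function names are hypothetical, and note that `every?` makes all three return true for the empty string, which a real constraint might want to treat differently.

```clojure
;; Hypothetical string-category predicates built on java.lang.Character.

(defn alpha?
  "True when every character of `s` is a letter."
  [s]
  (every? #(Character/isLetter (char %)) s))

(defn numeric?
  "True when every character of `s` is a digit."
  [s]
  (every? #(Character/isDigit (char %)) s))

(defn alnum?
  "True when every character of `s` is a letter or a digit."
  [s]
  (every? #(Character/isLetterOrDigit (char %)) s))

(alpha? "abc")   ;=> true
(alpha? "ab1")   ;=> false
(numeric? "123") ;=> true
(alnum? "ab1")   ;=> true
```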
Ben Sless17:04:20

Then just for fun you could add :re and we could get rid of regex schemas which aren't comparable because java regexes don't have value semantics (boo)
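The value-semantics complaint can be seen directly at a JVM Clojure REPL: `java.util.regex.Pattern` does not override `equals`, so two patterns with identical source text are different values, while their string forms compare equal.

```clojure
;; Two regex literals with the same source are distinct Pattern objects.
(= #"a+" #"a+")              ;=> false

;; Comparing the pattern source as strings behaves like a value.
(= (str #"a+") (str #"a+"))  ;=> true
```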

ambrosebs17:04:55

oh, and test.chuck has a re string generator.

ambrosebs17:04:07

is there a generator for :re?

Ben Sless17:04:16

It uses test.chuck if it's on the class path

👍 1
Ben Sless17:04:33

And sorry to make more work for you but don't forget time schemas

ambrosebs17:04:45

elaborate your ideas

ambrosebs17:04:08

I want to really test this abstraction.

Ben Sless17:04:15

I mean these. I worked really hard on that sh*t, would be a shame if they didn't get to enjoy the new constraints features: https://github.com/metosin/malli/blob/master/src/malli/experimental/time.cljc

ambrosebs17:04:43

ah, and which constraints are you thinking?

ambrosebs17:04:59

like before/after?

Ben Sless17:04:55

Yes. They already support min and max, so mostly make sure they aren't left out of the party

ambrosebs17:04:50

Nice. are min/max inclusive here?

Ben Sless17:04:44

Check line 30

ambrosebs17:04:10

and there's no generators currently hooked up?

Ben Sless17:04:57

There are. Getting them right was a pain

ambrosebs17:04:53

nice. yes I've been there.

ambrosebs17:04:42

This propositional logic might get us out of :fn guard hell. pretending we don't have nat-int? for a second, this could be the schema for nth:

[:=> {:refine [v i]
      [:and
       [:<= 0 i]
       [:< i [:count v]]]}
 [:cat :vector :int]
 :any]
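For comparison, the relation that the `:refine` clause above states declaratively is roughly the following hand-written predicate, the kind of opaque function a `:fn` guard would wrap. The name `nth-args-ok?` is made up for this sketch.

```clojure
(defn nth-args-ok?
  "True when the argument pair [v i] is a vector and an in-bounds index."
  [[v i]]
  (and (vector? v)
       (int? i)
       (<= 0 i)
       (< i (count v))))

(nth-args-ok? [[:a :b :c] 2]) ;=> true
(nth-args-ok? [[:a] 1])       ;=> false
(nth-args-ok? [[:a] -1])      ;=> false
```

Unlike the propositional form, nothing about the bounds can be recovered from this function by a generator or an error reporter.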

ambrosebs17:04:47

I have no idea what implications that has but just an idea that popped up.

Ben Sless17:04:11

This is starting to look like a where clause in datomic 🙂

Ben Sless18:04:46

Though it loses some information, the return type isn't any, it's a function of the vector index

ambrosebs18:04:09

right, but I'm still working on polymorphic schemas 🙂

ambrosebs18:04:35

If I'm unabashedly wishfully thinking:

(m/all [x]
       [:=> {:and (m/refine [v i]
                            [:and
                             [:<= 0 i]
                             [:< i [:count v]]])}
        [:cat [:vector x] :int]
        x])

Ben Sless18:04:12

That doesn't handle a heterogeneous vector, right? Still, I'd buy it

ambrosebs18:04:15

might be more like:

(m/all [x :*, n :- nat-int?]
       [:=> 
        [:cat [:tuple [:.. x]] [:= n]]
        [:..nth x n]])

ambrosebs18:04:46

where [:..] is regex that expands at "instantiation time" of the all.

ambrosebs18:04:39

[:inst ::nth [:cat :a :b :c] 2] =>

[:=> [:cat [:tuple :a :b :c] [:= 2]] :c]

ambrosebs18:04:10

and the [:..nth schema implicitly relates x and n?

ambrosebs18:04:21

there's a bunch of details I'm working out to get something like this, but I think we can look to type theory for inspiration on how to create an expressive specification language.

ambrosebs18:04:45

the main caveat is that you can't really instrument a polymorphic fn, so you'd need to compile a monomorphic one for instrumentation, and just use the polymorphic one for generation. Still working on how to do that. like, one detail here is that if you instantiate n to nat-int to generate the most general version of the schema for instrumentation purposes, you get [:= nat-int] as the second arg. Obviously not what you want. Perhaps n is really a spec here instead of a nat. Stuff like that.

ambrosebs18:04:59

Would be curious to know what this would look like starting from your observation that everything resembles datalog, but I don't know how to create a verification system from that foundation so that's for someone else to explore 🙂

ambrosebs18:04:48

my formal training in type theory is leaking through, exposing my biases/limitations. it's my one trick.

ambrosebs18:04:20

fleshed it out a bit, I think this should check length bounds for instrumentation and additionally generators of this schema will know how to generate good returns.

(m/all [x :*, n :< nat-int?]
       [:=> {:and (m/refine {{:keys [v i]} :args}
                            [:< i [:count v]])}
        [:catn
         [:v [:schema x]]
         [:i n]]
        [:..nth x n]])

[:inst ::nth [:cat :a :b :c] [:= 2]]
;=> [:=> {:and ..} [:cat [:schema [:cat :a :b :c]] [:= 2]] :c]
[:inst ::nth [:* :int] nat-int?]
;=> [:=> {:and ..} [:cat [:schema [:* :int]] nat-int?] :int]
;; for instrumentation:
[:inst ::nth [:* :any] nat-int?]
;=> [:=> {:and ..} [:cat [:schema [:* :any]] nat-int?] :any]

ambrosebs18:04:53

The trick is that [:..nth [:* :int] nat-int?] => :int and [:..nth [:cat :a :b] [:= 0]] => :a. And probably [:..nth [:cat :a :b] nat-int?]=>[:or :a :b].
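The three reductions just listed can be sketched as a small function over schema data. Everything here is hypothetical: `:..nth` is speculative syntax from this thread, `nth-schema` is an invented name, and only `:*` and `:cat` with keyword children are handled.

```clojure
(defn nth-schema
  "Computes the element schema selected from `seq-schema` by `idx-schema`.
  A [:= n] index picks one child of a :cat; any other index schema can
  select any child, so the result is their :or."
  [seq-schema idx-schema]
  (let [[op & children] seq-schema]
    (case op
      :*   (first children)
      :cat (if (and (vector? idx-schema) (= := (first idx-schema)))
             (nth children (second idx-schema))
             (into [:or] children))
      (throw (ex-info "unsupported sequence schema" {:schema seq-schema})))))

(nth-schema [:* :int] 'nat-int?)     ;=> :int
(nth-schema [:cat :a :b] [:= 0])     ;=> :a
(nth-schema [:cat :a :b] 'nat-int?)  ;=> [:or :a :b]
```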

Ben Sless19:04:36

Why aren't you speccing it like a map? I guess heterogeneous and homogeneous vectors could be specced differently

ambrosebs19:04:58

this was my only idea, what were you thinking?

Ben Sless19:04:56

Too tired to be coherent, but just however you'll describe a map if the vector is heterogeneous. A homogeneous vector could be defined like you have now. The definitions should overlap for homogeneous vector, too

millettjon02:04:07

I like where this is heading. It is much cleaner to have a :map with constraints in the properties versus having to nest it inside :and or :fn etc. The symmetry with the actual shape is maintained.

❤️ 2
millettjon02:04:08

For instance, I have been working on some visualizations of schemas and it requires walking the schema tree and removing all the nodes that are really constraints, since the `:map`s are the bits I am interested in.

Ben Sless03:04:06

Suggestion: use the format property like json schema does. Thinking forward to things like email, IP, etc

ambrosebs04:04:40

@UK0810AQ2 are you referring to the :format field in malli.json-schema? I don't follow.

Ben Sless04:04:38

For strings, instead of :alpha true, :format :alpha

Ben Sless04:04:56

A single key to look at

ambrosebs04:04:46

It's just sugar for an atomic proposition: {prop true} => {:and [[prop]]}. {:alpha true :numeric true :max 10} => {:and [[:alpha] [:numeric] [:max 10]]}
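The desugaring rule in this message can be sketched in a few lines of plain Clojure. `desugar` is a hypothetical name, and the sketch assumes every key in the map is a constraint; the actual PR distinguishes constraint keys from other properties.

```clojure
(defn desugar
  "Rewrites a property map into an explicit conjunction:
  {prop true} becomes [prop], and {prop v} becomes [prop v]."
  [props]
  {:and (mapv (fn [[k v]]
                (if (true? v) [k] [k v]))
              props)})

;; Small map literals keep insertion order, so the result reads the same
;; way as the example in the chat.
(desugar {:alpha true :numeric true :max 10})
;=> {:and [[:alpha] [:numeric] [:max 10]]}
```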

ambrosebs04:04:11

You can even do {:gen/alpha true, :alphanumeric true} to generate alpha in generators and alphanumeric in validators.

ambrosebs04:04:30

though I haven't implemented it, that's how I've set up the syntax.

ambrosebs05:04:49

I've only done the validator side atm. there [:gen/alpha] expands to the top proposition [:any] so it usually has no effect on the proposition during validation.

ambrosebs05:04:30

to give you an idea, these are the propositions I've supported for :string alone, along with their :gen version. they all use the {prop true} syntax:

:max :min :alphanumeric :non-alphanumeric :letters :non-letters :numeric :non-numeric :alpha :non-alpha :sorted :distinct :palindrome :trim :triml :trimr :trim-newline :blank :non-blank :escapes :includes

ambrosebs05:04:51

it's easily extensible

ambrosebs05:04:08

well, {:includes "foo"} and {:escapes {\- "_MINUS_"}} actually make use of their val.

ambrosebs05:04:01

and min/max of course.

Ben Sless05:04:33

What threw me for a loop is that (and alpha numeric) is an empty set

Ben Sless05:04:04

The sugar is confusing imo

ambrosebs05:04:13

{:max 10 :or [[:alpha] [:numeric]]} rather

ambrosebs05:04:07

{:distinct true :sorted true :max 10} is a better example of a conjunction.

ambrosebs05:04:57

The rule is the same as always: {:min 10 :max 12} is a conjunction. we just have more atomic propositions now.

Ben Sless05:04:29

Yes, but the code for handling them would be annoying

Ben Sless05:04:07

{:format [:and :alpha :distinct]} is clearer for the reader and simpler to implement imo

Ben Sless05:04:57

It's uniform and simple to compile to a bunch of predicates

ambrosebs06:04:00

It's a dozen lines to desugar in a fully extensible way. Plus I want to save the keyword syntax for [:contains K] sugar like s/keys.

(defn -constraint-from-properties [properties constraint-opts options]
  (let [{:keys [flat-property-keys nested-property-keys]} (->constraint-opts constraint-opts)]
    (when-some [cs (-> []
                       (into (keep #(when-some [[_ v] (find properties %)]
                                      (into [%] v)))
                             nested-property-keys)
                       (into (keep #(when-some [[_ v] (find properties %)]
                                      (conj [%] v)))
                             flat-property-keys)
                       not-empty)]
      (if (= 1 (count cs))
        (first cs)
        (into [:and] cs)))))

ambrosebs06:04:50

[:and :x :y] instead of [:and [:contains :x] [:contains :y]].

ambrosebs06:04:14

At least for schemas that support contains?. Haven't decided whether to allow unwrapped keywords in other schemas yet.

ambrosebs06:04:52

Let's say I do allow unwrapped keywords in, say, :string. I think {:and [:alpha :distinct]} is clearer than {:format [:and :alpha :distinct]}.

ambrosebs06:04:48

mainly because "format" isn't a proposition.

ambrosebs06:04:01

It's unclear how far this concept can go, but I want the user to think property == proposition. And property map == conjunction.

Ben Sless06:04:26

yeah, format isn't a proposition, it's more like a predicate

ambrosebs06:04:48

earlier I renamed :and to things like :keyset, :keys. I think once I saw the abstraction of everything is a proposition, the current philosophy started to make sense to me.

ambrosebs06:04:18

every single schema can support :and, and it will make sense.

ambrosebs06:04:19

I think :format for :string is similar to my thinking of :keyset constraints for :map. We can abstract over the concept of "thing is true for schema".

Ben Sless06:04:10

my thinking about it is tainted by how json schema does it

ambrosebs06:04:08

yeah and my thinking about properties was tainted by min/max. It was tailored for each schema in different ways and didn't seem like a coherent abstraction beyond syntax.

ambrosebs06:04:41

but that was my thinking too: a property is something tied to the schema type.

ambrosebs06:04:57

until like yesterday. so it's a new idea.

ambrosebs07:04:49

here's a good existing example of using true.

(mg/generate
  [:double {:gen/infinite? true, :gen/NaN? true}]
  {:seed 1})
They are both atomic propositions. Could use them like {:gen/xor [:gen/infinite? :gen/NaN?]} to never mix ##Inf and ##NaN in the same run (in theory, let's see when the rubber meets the road).

Oliver Marks15:04:16

I am using the malli reitit ring middleware. All is working, but there is one thing I am curious about: can you adjust the humanized response to include keys?

"humanized":["invalid type","invalid type"]
In the above example it tells me there are 2 errors, but it does not include the key in the response, so I don't know which field had the error. Is there a way to add this information, or map it to the submitted data, so that I can put an error marker on the form fields on the frontend? Wanted to check before I roll some kind of workaround. This sounds like it may do the job, but the example is for manual validation, so I'm not sure if I can modify the reitit ring middleware in the same way: https://github.com/metosin/malli/blob/master/docs/tips.md#getting-error-values-into-humanized-result