#malli
2020-09-03
ikitommi05:09:30

@jeroenvandijk wanted to test the lazy registries.

(require '[malli.core :as m])
(require '[malli.registry :as mr])
Given a data-source that can map names to schemas:
(def schema-provider
  {"int" :int
   "map" [:map [:x "int"]]
   "maps" [:vector "map"]})
We can compose a registry that uses both local and lazy/external resolving:
(defn LazyRegistry [default-registry]
  (let [cache* (atom {})
        registry* (atom nil)]
    (reset!
      registry*
      (mr/composite-registry
        default-registry
        (reify
          mr/Registry
          (-schema [_ name]
            (or (@cache* name)
                (do (println "loading" (pr-str name))
                    (when-let [schema (schema-provider name)]
                      (swap! cache* assoc name (m/schema schema {:registry @registry*}))
                      schema))))
          (-schemas [_] @cache*))))))

(def registry (LazyRegistry m/default-registry))
Using the registry (either swap the m/default-registry or pass it as an argument):
(count (mr/-schemas registry))
; => 125

(m/validate "map" {:x 1} {:registry registry})
;loading "map"
;loading "int"
; => true

(m/validate "map" {:x 1} {:registry registry}) ;; cached
; => true

(count (mr/-schemas registry))
; => 127

(m/validate "maps" [{:x 1}] {:registry registry})
;loading "maps"
; => true

(count (mr/-schemas registry))
; => 128
Schemas are first-class :refs:
(m/schema "map" {:registry registry})
; => "map"

(m/-deref (m/schema "map" {:registry registry}))
; => [:map [:x "int"]]
Hope this helps.

jeroenvandijk07:09:28

@ikitommi Thanks for sharing. I think it’s almost what I need. I’m puzzling how to deal with the (lazy) dispatch on a map key. In clojure.spec I would use multimethods and multispec:

(defmulti resource-type :Type)

(s/def :aws.cfn/resource (s/multi-spec resource-type :Type))

;; Some random examples
(defmethod resource-type "AWS::AmazonMQ::Broker" [_] :aws.amazon-mq/broker)
(defmethod resource-type "AWS::AmazonMQ::Configuration" [_] :aws.amazon-mq/configuration)
(defmethod resource-type "AWS::ApiGateway::Account" [_] :aws.api-gateway/account)
(defmethod resource-type "AWS::ApiGateway::ApiKey" [_] :aws.api-gateway/api-key)
...
If I can do this dispatch somehow, with your suggestion I think I have all I need

jeroenvandijk08:09:49

I’ll study the :multi schema and see if that is the missing piece

ikitommi08:09:46

s/multi-spec is open & mutable, :multi is closed & immutable.
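To make the "closed" part concrete, here is a minimal sketch of a plain :multi whose branches are all listed up front. The branch names and fields are borrowed from the map1/map2 example later in this log and are illustrative only; nothing here is lazy.

```clojure
(require '[malli.core :as m])

;; All branches are part of the schema value itself, so the set of
;; dispatch keys is fixed once the schema is written.
(def Resource
  [:multi {:dispatch :type}
   ["map1" [:map [:type [:= "map1"]] [:x :int]]]
   ["map2" [:map [:type [:= "map2"]] [:y :int]]]])

(m/validate Resource {:type "map1", :x 1})
; => true

(m/validate Resource {:type "map2", :y "1"})
; => false, :y is not an int

(m/validate Resource {:type "map3", :z 1})
; => false, there is no "map3" branch and none can be added afterwards
```

With s/multi-spec, a later defmethod could add the missing branch; here the only way is to build a new schema value.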

ikitommi08:09:40

so here, I think a lazy multi variant would be needed.

ikitommi08:09:15

a) lazy multi, with immutable values

[:multi {:dispatch :type, :children children-fn}]
b) mutable multi, backed by a custom (mutable) multimethod:
[:multi {:dispatch :type, :children my-multimethod}]

ikitommi08:09:58

… actually would be the same code, it’s in user-space whether to allow overriding the keys.

ikitommi08:09:16

should not be many loc to implement

ikitommi08:09:55

If you make a PR, I would like the default case (e.g. no :children key set) not to slow down: there, the entry parsing would still happen at schema creation time. For the case of dynamic children, it would happen at runtime.

ikitommi08:09:49

one question is: what happens if you create a validator, explainer or generator out of that schema: should the current children be used, or should those be dynamic too?

ikitommi08:09:28

e.g. if you add a branch after creating a validator, will the validators created before that see it or not?
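The two possible answers can be sketched in plain Clojure, without any malli API. Everything below (branches*, snapshot-validator, dynamic-validator) is a hypothetical stand-in for a mutable set of :multi branches, not how malli implements or would implement it.

```clojure
;; Hypothetical stand-in for mutable :multi branches:
;; dispatch value -> predicate. Not malli API.
(def branches* (atom {"map1" (fn [m] (int? (:x m)))}))

(defn snapshot-validator
  "Captures the branches that exist when the validator is created."
  []
  (let [branches @branches*]
    (fn [m]
      (if-let [valid? (branches (:type m))]
        (boolean (valid? m))
        false))))

(defn dynamic-validator
  "Looks the branches up again on every call."
  []
  (fn [m]
    (if-let [valid? (@branches* (:type m))]
      (boolean (valid? m))
      false)))

(def v-snapshot (snapshot-validator))
(def v-dynamic (dynamic-validator))

;; Add a branch after both validators were created.
(swap! branches* assoc "map2" (fn [m] (int? (:y m))))

(v-snapshot {:type "map2", :y 1})
; => false, the branch did not exist when this validator was built

(v-dynamic {:type "map2", :y 1})
; => true, the branch is looked up at call time
```

The snapshot variant keeps validators predictable and lets the lookup be prepared once; the dynamic variant pays a deref per call but sees later additions.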

jeroenvandijk08:09:38

With clojure.spec I have one spec that contains all types. This gives you a suggestion in case the dispatch on type fails. E.g.

(s/def :cfn.all/Type #{"AWS::AmazonMQ::Broker" "AWS::AmazonMQ::Configuration" "AWS::ApiGateway::Account" "AWS::ApiGateway::ApiKey" "AWS::ApiGateway::Authorizer" "AWS::ApiGateway::BasePathMapping" "AWS::ApiGateway::ClientCertificate" .....})
This is not ideal either because it doesn’t have spell-check functionality. But to answer your question, I don’t think, at least for my use case, everything has to be dynamic

jeroenvandijk14:09:40

@ikitommi The start of this seems to be simple indeed: https://gist.github.com/jeroenvandijk/59d22a726cda2158c01b9d63790aec50#file-malli_lazy-clj-L80 I’ve only added the validator part; not sure if the transformers and explainers will make things more painful.

ikitommi15:09:13

@jeroenvandijk just to make sure: you do know all the possible dispatch keys in advance?

ikitommi15:09:33

(if so, there might be a simpler solution)

jeroenvandijk16:09:16

Yeah, all the dispatch types are known in this case. The raw schema data is close to 1 MB, so that's the main reason to do it lazily.

ikitommi19:09:49

@jeroenvandijk This would be a small change in :ref impl:

(defn LazyRegistry [default-registry f]
  (let [cache* (atom {})
        registry* (atom nil)]
    (reset!
      registry*
      (mr/composite-registry
        default-registry
        (reify
          mr/Registry
          (-schema [_ name]
            (or (@cache* name)
                (do (println "loading" (pr-str name))
                    (when-let [schema (f name)]
                      (swap! cache* assoc name (m/schema schema {:registry @registry*}))
                      schema))))
          (-schemas [_] @cache*))))))

(def registry
  (LazyRegistry
    m/default-registry
    {"map1" [:map [:type [:= "map1"]] [:x :int]]
     "map2" [:map [:type [:= "map2"]] [:y :int]]
     "map3" [:map [:type [:= "map3"]] [:z :int]]}))

(m/validate
  [:multi {:dispatch :type}
   ["map1" [:ref "map1"]]
   ["map2" [:ref "map2"]]
   ["map3" [:ref "map3"]]]
  {:type "map3", :z 1}
  {:registry registry
   ::m/lazy-refs true})
;loading "map3"
;=> true

ikitommi19:09:40

new option :malli.core/lazy-refs that would control if the :refs are checked eagerly or lazily

ikitommi19:09:10

or there could be a :lazy variant of :ref to make things explicit.

ikitommi19:09:23

or a new property :lazy to :ref to mark it being lazy:

[:ref "map1"]

[:ref {:lazy true} "map1"]

ikitommi19:09:50

I think that’s actually good.

ikitommi20:09:51

actually, we can push all the changes from the user API (e.g. schema props) into the extender API (here: the lazy registry impl). This allows writing fully lazy multis:

[:multi {:dispatch :type}
 "AWS::AmazonMQ::Broker"         
 "AWS::AmazonMQ::Configuration"
 "AWS::ApiGateway::Account"
 "AWS::ApiGateway::ApiKey"
 "AWS::ApiGateway::Authorizer"]

ikitommi20:09:53

(`:multi` uses the entry syntax, like :map, which allows single-value elements if they are valid schema reference types. Right now those are just qualified keywords; strings should be allowed too.)