#malli
2020-11-10
lmergen07:11:54

@ikitommi for what it's worth, i still got a huge performance increase by actually caching the validators as well.

(require '[malli.core :as m]
         '[criterium.core :as crit])

(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema value)))
;; => Execution time mean : 297.880813 ms

(def schema' (m/schema schema))
(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema' value)))
;; => Execution time mean : 533.885193 µs

(def validator (m/validator schema))
(crit/with-progress-reporting
  (crit/quick-bench (validator value)))
;; => Execution time mean : 1.830348 µs
so it looks like roughly a 550x improvement from caching the schema, and then another ~290x improvement from caching the validator

lmergen07:11:25

i suspect that in your specific benchmark the schema is fairly simple, so a larger share of the measured time is spent on the validation itself

ikitommi09:11:57

@lmergen there was a cljs-issue, just merged the cached satisfies. could you retry with the latest master?

ikitommi09:11:40

there is still a lot of room for improvement for maps (`-parse-entries` is really slow) and for handling property-based registries. I would guess we can make schema creation 2-5 times faster. But then again, once malli is used to validate schema properties & children, it will slow things down again.

lmergen09:11:35

(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema value)))
;; before: => Execution time mean : 297.880813 ms
;; after:  => Execution time mean : 12.194964 ms

(def schema' (m/schema schema))
(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema' value)))
;; before: => Execution time mean : 533.885193 µs
;; after:  => Execution time mean : 517.890217 µs


(def validator (m/validator schema))
(crit/with-progress-reporting
  (crit/quick-bench (validator value)))
;; before: => Execution time mean : 1.830348 µs
;; after:  => Execution time mean : 1.952607 µs
so while m/validate on the raw schema got ~24x faster, caching the actual validator is still much, much faster

👍 3
lmergen09:11:09

i'm caching the explainers in my own defn macro, but it requires quite a bit of macro magic to make this work, so i was looking for a more generic way to make this happen -- possibly some kind of registry
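
One generic alternative to macro-level caching, as a minimal sketch: memoize the constructor functions on the schema form. `cached-validator`, `cached-explainer` and `valid?` are hypothetical names and not part of malli; only `m/validator` and `m/explainer` are malli functions. The sketch assumes schemas are plain data resolved against the default registry.

```clojure
(require '[malli.core :as m])

;; Hypothetical helpers, not part of malli: build each function once per
;; distinct schema form and reuse it on later calls.
(def cached-validator (memoize (fn [schema] (m/validator schema))))
(def cached-explainer (memoize (fn [schema] (m/explainer schema))))

(defn valid?
  "Validates value against schema, reusing a previously built validator."
  [schema value]
  ((cached-validator schema) value))

(valid? [:map [:x int?]] {:x 1})   ;; => true
(valid? [:map [:x int?]] {:x "1"}) ;; => false
```

Note that `memoize` never evicts entries and each lookup still hashes the schema form, so this suits a bounded set of schemas rather than schemas generated at runtime.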

ikitommi09:11:50

what should be in the registry? validator + explainer + generator + decoder(s) + encoder(s) + …?

lmergen09:11:18

if possible, i'd say all of them yes

lmergen09:11:20

right, so then you lazily cache things

lmergen09:11:31

which would be the best middle-ground

ikitommi09:11:00

one would be to add a wrapper Schema impl, that is returned from registry instead of the real one. And that impl would have a cache -> first call to -validate would store the validator.
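
The "store it on first use" idea can be sketched without touching malli's protocols. This is not the wrapper Schema impl described above, only a plain map that stands in for it: `lazy-fns` and `entry` are hypothetical names, and each derived function sits behind a `delay` so it is built on first dereference and reused afterwards.

```clojure
(require '[malli.core :as m])

(defn lazy-fns
  "Hypothetical helper, not part of malli: pairs a schema with lazily
  built derived functions. Nothing is constructed until first deref."
  [schema]
  (let [s (m/schema schema)]
    {:schema    s
     :validator (delay (m/validator s))
     :explainer (delay (m/explainer s))}))

(def entry (lazy-fns [:map [:x int?]]))

(realized? (:validator entry))   ;; => false, nothing built yet
(@(:validator entry) {:x 1})     ;; => true, builds and caches the validator
(realized? (:validator entry))   ;; => true
(realized? (:explainer entry))   ;; => false, explainer was never requested
```

A real caching wrapper would instead implement the Schema protocol and delegate to the wrapped schema, which depends on the protocol methods of the malli version in use.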

ikitommi09:11:30

could be just an option to the registry to return caching proxies instead of normal ones…

lmergen09:11:45

this would be very effective

lmergen09:11:20

i'll experiment with this approach

ikitommi09:11:15

… actually, just a new option key that m/schema uses would do fine (to wrap the returned thing if the option is present)

lmergen10:11:14

so then you cache it inside the actual schema, rather than a wrapper around it ?

ikitommi10:11:09

I would wrap it outside, e.g. the return value wrapped

lmergen10:11:47

ah, right -- and the option to m/schema would then tell it whether to return the wrapped schema or the "regular" schema

ikitommi10:11:01

yes. Or there could be a memoized-schema etc. as a separate fn? (-> :string m/schema m/memoized)

lmergen10:11:18

well that's a detail

lmergen10:11:38

let me experiment with creating that memoized / cached schema in the first place

👍 3
lmergen16:11:29

(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema value)))
;; => Execution time mean : 11.782073 ms

(def schema' (memoized-schema (m/schema schema)))
(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema' value)))
;; => Execution time mean : 2.095245 µs
@ikitommi conceptually it seems to be working like a charm

lmergen16:11:43

exactly which schema am i supposed to wrap here -- it's just the regular malli.core/Schema, right ? the into-schema is meant more for building a hierarchy of parent/child schemas ?

ikitommi16:11:08

yes, Schema. IntoSchema is the factory protocol for creating a Schema out of the Schema AST; each Schema is responsible for its own props & children.

lmergen16:11:35

i'll send a PR once i have all the functions working