datahike

whilo 2026-06-26T08:02:11.957959Z

๐Ÿงช wandler โ€” a query planner for your in-app data pipelines. On Clojars now (experimental). Your database has a query planner: it picks join orders, pushes down filters, chooses indexes โ€” so you never hand-tune a query. Your in-memory map/`filter`/`reduce`/`group-by`/`join` pipelines get none of that. wandler is that planner for your pipeline code. Plug one in and it fuses the passes, drops intermediate collections, factorizes a group-by-over-join so the join is never built (O(Nยฒ)โ†’O(N)), or runs it incrementally โ€” and it re-plans automatically as you edit, so you never sit and reason about the optimal execution for each hot loop. Types are just malli โ€” the m/=> schema you may already write; change defn โ†’ a/defn and the pipeline is checked, optimized, and compiled:

clojure
(m/=> big-squares [:=> [:cat [:sequential :int]] :int])
(a/defn big-squares [xs]
  (reduce + 0 (map (fn [x] (* x x)) (filter (fn [x] (< 10 x)) xs))))

(:stages-after (w/explain 'big-squares))  ;=> [foldl]   ; 3 passes โ†’ 1
(:verified?    (w/explain 'big-squares))  ;=> true      ; proved equal to your original
The usual catch with an aggressive optimizer is "did it change my results?" wandler's answer: every rewrite carries a machine-checked proof that it preserves them (checked by a Lean-4-style kernel) โ€” a rewrite that isn't provably equivalent is rejected, never shipped. And the practical part: you don't write the proofs. Coding assistants already know Lean 4 + ansatz; point one at your domain's operators and it writes the certified rewrite rules โ€” and since the kernel checks them, an AI-written rule is exactly as safe as a hand-written one (verified, or rejected). You extend the optimizer to your pipelines by asking an assistant, not by learning a proof assistant. ๐Ÿงต the e-graph planner, joint planning with datahike/stratum, and the ansatz 0.2 kernel release โ€” in the thread. โš ๏ธ Early-stage research software โ€” evolving, not production-hardened. Tutorial: https://github.com/replikativ/wandler Join #C09622F337D for more general discussion.

whilo 2026-06-26T08:04:30.708549Z

๐Ÿงต Under the hood. The optimizer is an e-graph (equality-saturation, egg-style) rewriter: it builds out the space of forms your pipeline is provably equal to under the rule set, then extracts the cheapest โ€” each adopted rewrite carrying its kernel proof. Same idea as a modern query optimizer, except the rewrites are certified and the IR is your pipeline. It also plans jointly with datahike and stratum: embed a d/q or stratum query right in a pipeline and the planner optimizes across the boundary โ€” pushing the aggregation through the join, factorizing, choosing the drive direction โ€” so the same planning + optimized execution those engines bring to your queries reaches the code around them too. Extending it to your domain: the rewrite rules are just kernel-checked laws. Have a coding assistant add the laws for your operators (a custom fold, a domain join, a bespoke aggregation); once they're in, every pipeline that uses them gets planned automatically โ€” even as you keep editing.

whilo 2026-06-26T08:04:47.998839Z

๐Ÿงต ansatz 0.2 โ€” "Verified Clojure via Lean 4" โ€” the kernel + DSL underneath (and what the assistant uses to check the rules it writes), now much easier to pick up. If you already write malli-instrumented Clojure, it's the gradual next step โ€” keep your schema, change defn โ†’ a/defn:

clojure
(require '[ansatz.core :as a] '[malli.core :as m])
(a/load-init!)                          ; bundled Lean Init, zero setup (straight from the jar)

(m/=> dbl [:=> [:cat :int] :int])       ; just a malli schema
(a/defn dbl [x] (+ x x))                ; defn โ†’ a/defn โ€” now machine-checked, still plain Clojure
(dbl 21)                                ; => 42
โ€ข Batteries included, from Clojars โ€” (a/load-init!) loads a bundled Lean Init environment from the jar, no store to build; full Init / Mathlib fetched on demand. โ€ข malli support โ€” m/=> schemas become kernel signatures; [:map โ€ฆ] โ†’ named-field records (keyword access verifies, runtime stays plain maps); [:int {:min 1}] โ†’ refinements you still use as plain numbers. โ€ข Better tactics โ€” faithful funext, simp only / simp_all only, a split cluster, omega improvements, ac_rfl. โ€ข Bugfixes โ€” omega hypothesis handling, codegen, several simp universe fixes. Repos: https://github.com/replikativ/ansatz ยท https://github.com/replikativ/wandler