Fork me on GitHub
#instaparse
<
2024-03-30
>
misha19:03:22

greetings! not exactly an instaparse question, but somewhat related: how do you "work" with parse-result tree after parsing? I found it still tiresome and custom every time to walk and modify result tree. For example, imagine you have a compiler, and want to implement a bunch of different optimizations based on (sub)tree shape/node types. Generally, you'd need to match several levels deep into tree to make a decision. So it'd be cool to have some compact DSL/macro to do this, to make it feasible to accept optimizators as plugins, which do not require 10 pages of custom clojure/walk implementation. One candidate I thought of is core.match, but first you'd need to "materialize" all the optional nodes, to manage combinatorial explosion of valid patterns you need to write clauses for. (Think def syntax patterns: from (def foo) to (def ^:a ^:b ^:c ... ^:N foo "optional doc str" ^:a ... ^:N some-value) – you don't want to write 1 pattern for each possible number of meta nodes). So core.match would need some preprocessor, which can data-reg-ex: malli or spec etc, e.g.

(cat
 :var-metas (* meta-node?)
 :sym       symbol?
 :doc       (? string?)
 :val-metas (* meta-node?)
 :expr      any?)
But even then it limits (?) you to making decision after entire pattern is "matched", so you cannot (and I mean just "in readable or declarative way") make decisions while you reduce. Meaning "making decisions inside a reduce function" (assuming you walk/reduce tree in DFS/BFS order, think "this token is a new local binding name, and I need to add it to other ahead/behind tokens' context"), because in general case you need both to look ahead and to look behind. So I guess my question is: is there something between "100% bespoke tree walking" and "limited-in-power context free reduce functions"? Because the way I did it before seems to be too verbose and noisy, and I need to do it again, and even more important – provide extension point. https://github.com/akovantsev/blet/blob/main/src/com/akovantsev/blet/impl2.cljc#L210-L262 Oh, and then there is a challenge to put changes back into originally shaped tree (unform ?).

respatialized20:03:37

Lookahead and lookbehind sound challenging– I don’t know if malli transformers are expressive enough to achieve them, for example. But potentially doable with https://clojure.github.io/clojure/clojure.zip-api.html - you can conditionally go up, down, left, or right depending on your matching criteria.

respatialized20:03:16

Also, consider #CFFTD7R6Z , which is a term rewriting system expressive enough to do matching, reducing, and replacement in a single operation. https://jimmyhmiller.github.io/meander-practical

👍 2
misha20:03:41

"making decision as you go through tree" + "having entire pattern overview" together scream "FSM" at me, so I am rewatching everything related I can recall or find, like https://www.youtube.com/watch?v=AEhULv4ruL4

misha20:03:51

I like meander's destructuring, I wish clojure had it as default I have mine almost exactly like it: https://github.com/akovantsev/destruct

misha21:03:23

this is pretty close to what I imagine: https://github.com/noprompt/meander/?tab=readme-ov-file#gaining-control

(def eliminate-zeros
  (m*/rewrite
   (+ ?x 0) ?x
   (+ 0 ?x) ?x))
although specifically here it requires you to describe 100% of the return structure. so maybe datalog would be the way to go, because: • tree would be flat db (think datomic) • lookaround would be just datalog query, potentially recursive, like "has ancestor = x" to scope the rule application to particular subtree. • uniformal format (datoms) is easy to update "inplace" • datalog is perfect for plugin API and there is a "prior art" of that (for some reason I have completely blanked out on it): https://petevilter.me/post/datalog-typechecking/ https://www.hytradboi.com/2022/codebase-as-database-turning-the-ide-inside-out-with-datalog one thing datalog is not - visual. but maybe there is a way to DSL over it to have meander-like "visual" shape description/matching, but datalog's patches.

respatialized21:03:16

asami has the support for transitive graph relations you may need for this

misha21:03:16

I think (->> id (iterate #(-> % by-id :parent-id)) (take-while some?) set) will do (assuming hand rolled PoC)

misha13:03:13

re zippers: https://youtu.be/5VAEXYYdfWY?t=287 I like trivial api but don't like that I still play computer while mentally mapping next next next down down onto a datastructure (hopefully written somewhere near in source code). (again, depends on an actual goal one (me) is trying to achieve)

respatialized19:03:11

I think the other thing to consider with matching and modifying complex parser output is trying to solve as many of these problems as you can in the grammar specification itself. Things like lookahead and lookbehind are expressible in Instaparse itself, and you can use the ordered choice operator to make sure the complex cases get considered before the simpler, more ambiguous ones and treated as a single syntactic unit.

misha11:04:25

need something as close as possible to this: https://youtu.be/nCR9zMU2Q_M?t=456