meander

lvh 2022-06-13T14:07:07.703269Z

Hi! I have an instaparse parser for ini files (specifically intended for AWS configs). It understands kv pairs within sections (under a section header) and kv pairs without a section header, but I’m trying to figure out whether I can extract both with 1 meander expression (or at least 2 minimally duplicative ones).

lvh 2022-06-13T14:07:16.733569Z

Grammar 🧵

lvh 2022-06-13T14:07:20.040929Z

ini = body section*.

section = header (eol body)?.
header = "[" wsp name wsp "]" wsp comment?.
name = #"[\w ]+(?<! )"

body = (kv? wsp comment? eol)* (kv wsp comment?)?

kv = key wsp "=" wsp val.
key = #"[\w ]*(?<! )"
val = #"[\w ]*(?<! )"

blank-line = wsp comment? eol.

comment = #"#.*".
wsp = #"( |\t)*".
eol = #"\R".

lvh 2022-06-13T14:07:50.754229Z

Example parses 🧵

lvh 2022-06-13T14:07:58.543099Z

(def no-section
  (i/ini-parser "x=1"))

[:ini
 [:body
  [:kv [:key "x"] [:wsp ""] "=" [:wsp ""] [:val "1"]] [:wsp ""]]]

(def empty-section
  (i/ini-parser "[x]"))

[:ini
 [:body]
 [:section
  [:header "[" [:wsp ""] [:name "x"] [:wsp ""] "]" [:wsp ""]]]]

(def empty-section-with-newline
  (i/ini-parser "[x]\n"))

[:ini
 [:body]
 [:section
  [:header "[" [:wsp ""] [:name "x"] [:wsp ""] "]" [:wsp ""]]
  [:eol "\n"]
  [:body]]]


(def one-section-with-one-kv
  (i/ini-parser "[xyzzy]\nx = 1"))

[:ini
 [:body]
 [:section
  [:header "[" [:wsp ""] [:name "xyzzy"] [:wsp ""] "]" [:wsp ""]]
  [:eol "\n"]
  [:body [:kv [:key "x"] [:wsp " "] "=" [:wsp " "] [:val "1"]] [:wsp ""]]]]

(def one-section-with-two-kvs
  (i/ini-parser "[xyzzy]\nx = 1\ny = 2"))

[:ini
 [:body]
 [:section
  [:header "[" [:wsp ""] [:name "xyzzy"] [:wsp ""] "]" [:wsp ""]]
  [:eol "\n"]
  [:body
   [:kv [:key "x"] [:wsp " "] "=" [:wsp " "] [:val "1"]]
   [:wsp ""]
   [:eol "\n"]
   [:kv [:key "y"] [:wsp " "] "=" [:wsp " "] [:val "2"]]
   [:wsp ""]]]]


(def two-sections
  (i/ini-parser "[xyzzy]\nx = 1\n[iddqd]\ny=2"))

[:ini
 [:body]
 [:section
  [:header "[" [:wsp ""] [:name "xyzzy"] [:wsp ""] "]" [:wsp ""]]
  [:eol "\n"]
  [:body
   [:kv [:key "x"] [:wsp " "] "=" [:wsp " "] [:val "1"]]
   [:wsp ""]
   [:eol "\n"]]]
 [:section
  [:header "[" [:wsp ""] [:name "iddqd"] [:wsp ""] "]" [:wsp ""]]
  [:eol "\n"]
  [:body [:kv [:key "y"] [:wsp ""] "=" [:wsp ""] [:val "2"]] [:wsp ""]]]]

lvh 2022-06-13T14:08:20.125679Z

Meander code I have 🧵

noprompt 2022-06-17T20:14:13.679139Z

Can you give an example of what you want the output to look like?

noprompt 2022-06-17T20:15:09.078209Z

Also, sorry for the lag! I’ve had a wacky week.

lvh 2022-06-13T14:08:25.377279Z

(defn get-kvs
  [ini-parse]
  (m/search
   ini-parse
   [:ini
    [:body . _ ...] ;; <= headerless body: the kv pattern from the section body below can occur here too
    &
    (m/scan
     [:section
      [:header & (m/scan [:name ?h])]
      . _ ...
      [:body & (m/scan [:kv [:key ?k] [:wsp _] "=" [:wsp _] [:val ?v]])]
      . _ ...])]
   [?h ?k ?v]))

lvh 2022-06-13T14:09:00.494619Z

Basically, this works great for finding [?header ?key ?value] triples, and I can separately find kvs in the headerless body, but I’m trying to see if I can extract both at once.

lvh 2022-06-13T14:09:12.376049Z

I assume the way to write this is recursively
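
For comparison, here is a non-recursive sketch of the "2 minimally duplicative expressions" option, using only the meander operators already in the snippet above. The names get-top-level-kvs, get-section-kvs and get-all-kvs are made up for illustration, and tagging headerless kvs with a nil header is an assumption about the desired output, not something settled in this thread. The parse trees are copied from the example parses so the block does not need the parser itself.

```clojure
(require '[meander.epsilon :as m])

;; Parse trees copied verbatim from the example parses earlier in the thread.
(def no-section
  [:ini
   [:body
    [:kv [:key "x"] [:wsp ""] "=" [:wsp ""] [:val "1"]] [:wsp ""]]])

(def two-sections
  [:ini
   [:body]
   [:section
    [:header "[" [:wsp ""] [:name "xyzzy"] [:wsp ""] "]" [:wsp ""]]
    [:eol "\n"]
    [:body
     [:kv [:key "x"] [:wsp " "] "=" [:wsp " "] [:val "1"]]
     [:wsp ""]
     [:eol "\n"]]]
   [:section
    [:header "[" [:wsp ""] [:name "iddqd"] [:wsp ""] "]" [:wsp ""]]
    [:eol "\n"]
    [:body [:kv [:key "y"] [:wsp ""] "=" [:wsp ""] [:val "2"]] [:wsp ""]]]])

(defn get-top-level-kvs
  "Hypothetical helper: kvs in the headerless body, with nil as the header
  (an assumed convention)."
  [ini-parse]
  (m/search
   ini-parse
   [:ini
    [:body & (m/scan [:kv [:key ?k] [:wsp _] "=" [:wsp _] [:val ?v]])]
    . _ ...]
   [nil ?k ?v]))

(defn get-section-kvs
  "Same pattern as get-kvs above: kvs that sit under a section header."
  [ini-parse]
  (m/search
   ini-parse
   [:ini
    [:body . _ ...]
    &
    (m/scan
     [:section
      [:header & (m/scan [:name ?h])]
      . _ ...
      [:body & (m/scan [:kv [:key ?k] [:wsp _] "=" [:wsp _] [:val ?v]])]
      . _ ...])]
   [?h ?k ?v]))

(defn get-all-kvs
  "Hypothetical helper: headerless kvs followed by section kvs."
  [ini-parse]
  (concat (get-top-level-kvs ini-parse)
          (get-section-kvs ini-parse)))
```

Calling (get-all-kvs no-section) or (get-all-kvs two-sections) runs both searches and concatenates whatever each one finds. The only duplicated piece is the [:kv ...] pattern; whether that can be folded into a single search is still the open question here.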

noprompt 2022-06-13T18:47:57.183319Z

@lvh When I get a free moment I'll take a look at this. 👍