This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-12-08
Hello all.
I’m starting to learn parsing and EBNF and I am struggling to remove the ambiguity from my parser. I have constructed a simple example to demonstrate the problem I am having.
The following parser tags text marked as emphasis, like **emphasis**:
(def remove-ambiguity
  (insta/parser
    "S = (em / char)+ | epsilon
     em = <'*' '*'> char* <'*' '*'>
     <char> = #'.'"))
However, with an input such as **em** **em** there are many possible parse results:
([:S [:em "e" "m" "*" "*" " " "*" "*" "e" "m"]]
 [:S "*" "*" "e" "m" "*" "*" " " "*" "*" "e" "m" "*" "*"]
 [:S [:em "e" "m" "*" "*" " "] "e" "m" "*" "*"]
 [:S "*" "*" "e" "m" "*" "*" " " [:em "e" "m"]]
 [:S "*" "*" "e" "m" [:em " " "*" "*" "e" "m"]]
 [:S "*" "*" "e" "m" [:em " "] "e" "m" "*" "*"]
 [:S [:em "e" "m"] " " "*" "*" "e" "m" "*" "*"]
 [:S [:em "e" "m"] " " [:em "e" "m"]]) ;; <-- This is the one I want
This makes sense, since there are several ways to pair up the asterisks to satisfy the rule. However, I only ever want to allow results like this: `[:S [:em "e" "m"] " " [:em "e" "m"]]`
It’s almost like I want it to greedily take the first possible match and then ignore all others, but I have no idea how to express this. Any help would be greatly appreciated 😄.
your grammar says '**' is both the start of an em sequence and two chars, and that is the ambiguity
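One possible way to act on that observation, offered as a sketch and not as something proposed in this thread: use instaparse's negative lookahead (`!`) so that `char` refuses to match wherever a `**` delimiter begins. The name `lookahead-parser` is hypothetical, and the grammar assumes that a lone `*` should still count as an ordinary character.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical variant of the parser above. A char may not begin at a
;; position where "**" starts, so a pair of asterisks can only be read
;; as an em delimiter and never as two plain characters.
(def lookahead-parser
  (insta/parser
    "S = (em / char)+ | epsilon
     em = <'**'> char* <'**'>
     <char> = !'**' #'.'"))

;; insta/parses returns every possible parse, which makes it easy to
;; check whether any ambiguity is left for a given input.
(def all-parses (insta/parses lookahead-parser "**em** **em**"))
```

With this change the intent is that `all-parses` contains only `[:S [:em "e" "m"] " " [:em "e" "m"]]`. The trade-off is that an unmatched `**` no longer falls back to plain text; it makes the whole parse fail.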