This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-06-29
I was trying to make this ebnf grammar work with instaparse: https://github.com/cbeust/kash/blob/master/src/main/resources/bash.ebnf But so far it didn't work out
Here's what I got: https://gist.github.com/borkdude/98c5d9e2bf598b227e8e643e4271e61e
user=> (def parser (insta/parser "/Users/borkdude/Downloads/bash.ebnf"))
#'user/parser
user=> (parser "foo")
Parse error at line 1, column 1:
foo
^
Expected:
#"[0-9]"
Hi, when you do not specify a starting rule for the grammar, instaparse uses the first rule as the starting point. In your case that is the number rule. https://github.com/engelberg/instaparse#parsing-from-another-start-rule This should work:
(parser "foo" :start :command)
(NB: surrounding a rule with angle brackets makes it hidden; since all commands are hidden, you will probably only get an empty list on a successful parse)
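To make the start-rule and hiding behaviour concrete, here is a minimal sketch using a made-up three-rule grammar (not the bash grammar from the gist; the names toy and hidden-toy are hypothetical). It assumes instaparse is on the classpath.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical toy grammar; the first rule (number) is the default start rule.
(def toy
  (insta/parser
    "number = #'[0-9]+'
     word = letter+
     letter = #'[a-z]'"))

;; No :start given, so instaparse starts from number and fails on "foo".
(insta/failure? (toy "foo"))
;; => true

;; Starting from word instead succeeds.
(toy "foo" :start :word)
;; => [:word [:letter "f"] [:letter "o"] [:letter "o"]]

;; Same grammar, but with letter wrapped in angle brackets on the
;; right-hand side, which hides its content from the output tree.
(def hidden-toy
  (insta/parser
    "number = #'[0-9]+'
     word = <letter>+
     letter = #'[a-z]'"))

(hidden-toy "foo" :start :word)
;; => [:word]
```

When debugging a grammar like this, passing :unhide :content (or :unhide :all) to the parse call temporarily disables hiding, which shows what the tree would look like without the angle brackets.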
btw, it wasn't my choice to use angle brackets, I just copied that from the original ebnf
why does this succeed if I have set :partial to false:
user=> (parser "foo" :start :word :partial false)
[:word]
That was my hunch, which is why I thought a heads-up was in order :) Yes, there is a good example of hiding here: https://github.com/engelberg/instaparse#hiding-content As mentioned, it is usually used for hiding whitespace and other tokens you do not care about in the final output, but if you hide the top rule, everything disappears
ah I see, it was because of the hiding again:
user=> (parser "foo" :start :word :partial false)
[:word [:word [:word [:letter "f"]] [:letter "o"]] [:letter "o"]]
:partial allows a parse that matches only an initial portion of the input to succeed. Embedding a failure node in the AST at the point where the parse failed is what :total does
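A small sketch of how the default, :partial and :total modes differ, going by the instaparse README. The letters grammar is hypothetical and only serves to illustrate the options.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical grammar: one or more lowercase letters.
(def letters (insta/parser "S = #'[a-z]'+"))

;; Default: the whole input must match, so the trailing "1" is a failure.
(insta/failure? (letters "ab1"))
;; => true

;; :partial true accepts a parse that covers only a prefix of the input.
(insta/failure? (letters "ab1" :partial true))
;; => false

;; :total true returns a tree even on failure; the failure is embedded
;; in the result and insta/failure? still reports it.
(def total-result (letters "ab1" :total true))
(insta/failure? total-result)
```

insta/parses with :partial true lists every prefix parse, which can be useful when it is unclear how far the parser got.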
It seems the original ebnf works a bit differently than instaparse. e.g.:
<for_command> ::= 'for' <word> <newline_list> 'do' <compound_list> 'done'
| 'for' <word> <newline_list> '{' <compound_list> '}'
| 'for' <word> ';' <newline_list> 'do' <compound_list> 'done'
| 'for' <word> ';' <newline_list> '{' <compound_list> '}'
| 'for' <word> <newline_list> 'in' <word_list> <list_terminator>
<newline_list> 'do' <compound_list> 'done'
| 'for' <word> <newline_list> 'in' <word_list> <list_terminator>
<newline_list> '{' <compound_list> '}'
if I have to rewrite the grammar anyway I'm more inclined to hand-roll my own parser
I think that for most yacc/bison parsers rules are separated by whitespace by default, yes. Instaparse supports adding this via the auto-whitespace feature, which has worked well for me: https://github.com/Engelberg/instaparse/blob/master/docs/ExperimentalFeatures.md#auto-whitespace
I don't know what you are using this parser for, but in my experience a proper grammar-based parser is more maintainable and flexible in the long run. Of course, for small use cases it can be a lot to get into and learn
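A rough sketch of what auto-whitespace changes, using a heavily simplified, hypothetical for rule rather than the real bash grammar (strict and lenient are made-up names):

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical, heavily simplified stand-in for a for_command rule.
(def grammar
  "for = 'for' word 'do' word 'done'
   word = #'[a-z]+'")

;; Without auto-whitespace the grammar has no rule that consumes spaces.
(def strict (insta/parser grammar))

;; With :auto-whitespace :standard, optional whitespace is allowed
;; between tokens and dropped from the output.
(def lenient (insta/parser grammar :auto-whitespace :standard))

(insta/failure? (strict "for x do y done"))
;; => true

(lenient "for x do y done")
;; => [:for "for" [:word "x"] "do" [:word "y"] "done"]
```

The whitespace is optional rather than required, so this alone does not enforce a separator after keywords; that would still need to be expressed in the grammar.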
This is the original: https://github.com/cbeust/kash/blob/master/src/main/resources/bash.ebnf
Then I guess it comes down to which approach you find most fun :) I think instaparse is quite amazing once you grok it, but again I understand it can be a hassle to get into. On the other hand, hand-written parsers can also be painful to get correct
There isn't really a single EBNF syntax specification or RFC, so every "EBNF grammar" you'll find in the wild will have a slightly varied flavor of the syntax. Sometimes because a certain parser library chose a unique metasyntax, or sometimes because the grammar is meant to serve as documentation rather than compiled and executed.
Instaparse attempts to support most of the different flavors, which is why you can use either x? or [x] syntax for example
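For illustration, the two optional notations can be checked against each other with a pair of hypothetical one-rule grammars:

```clojure
(require '[instaparse.core :as insta])

;; Two hypothetical grammars that differ only in how "optional" is spelled.
(def with-question-mark (insta/parser "S = 'a' 'b'?"))
(def with-brackets      (insta/parser "S = 'a' ['b']"))

(with-question-mark "ab") ;; => [:S "a" "b"]
(with-brackets "ab")      ;; => [:S "a" "b"]

;; Both also accept the input with the optional part left out.
(with-question-mark "a")  ;; => [:S "a"]
(with-brackets "a")       ;; => [:S "a"]
```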
But sometimes a grammar or a different parser library will make a particularly unusual syntax choice, like using angle brackets in rule names
Or a grammar will make an implicit logical assumption that Instaparse has no way to act upon, like whitespace being parsed between tokens
The angle brackets are particularly unfortunate since Instaparse chose to use angle brackets for an instaparse-specific feature (hiding data from the output parse tree)
ABNF, on the other hand, seems to be a much more regulated metasyntax, so copying and pasting ABNF grammars into instaparse (using :input-format :abnf) tends to be safer
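A minimal sketch of the ABNF mode, with a made-up two-rule grammar (greeting and name are hypothetical). It assumes, as the instaparse ABNF docs describe, that the RFC 5234 core rules such as SP and ALPHA are available without being defined in the grammar.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical two-rule ABNF grammar. SP and ALPHA are ABNF core rules,
;; and 1*ALPHA is ABNF's notation for "one or more ALPHA".
(def greeting
  (insta/parser
    (str "greeting = \"hello\" SP name\n"
         "name = 1*ALPHA\n")
    :input-format :abnf))

;; Check that the input is accepted rather than relying on the exact tree shape.
(insta/failure? (greeting "hello bob"))
```

Note that ABNF string literals are case-insensitive by specification, so this grammar would be expected to accept "HELLO bob" as well.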