instaparse

diego.videco 2023-04-22T16:36:58.379379Z

Need some help with my grammar. I have this (simplified):

pattern = (<whitespace>? token)*
<token> = word | cat
cat = pattern <whitespace> <'.'> <whitespace> pattern
whitespace = #"\s+"
word = #"[a-zA-Z]+"
And with this test string "a b . c" I get this result:
[:pattern
 [:word "a"]
 [:cat
  [:pattern [:word "b"]]
  [:pattern]]
 [:word "c"]]
But I am expecting to get this:
[:pattern
 [:cat
  [:pattern [:word "a"] [:word "b"]]
  [:pattern [:word "c"]]]]
However, the string " a b . c" (note the leading whitespace) does produce the expected result. Any ideas how I can get the expected result for the original string?

2023-04-22T17:20:59.504019Z

Make your grammar unambiguous. Given that input, both what instaparse gives and what you expect are valid parses according to your grammar. I would expect that if you asked instaparse for all the parses, instead of just taking the first one it comes up with, you would get both trees.

👍 1
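
A minimal sketch of how to check this, assuming instaparse is on the classpath and using the simplified grammar from the question. The var name pattern-parser is made up for illustration. insta/parse returns a single parse, while insta/parses returns a lazy sequence of every parse the grammar allows:

```clojure
(require '[instaparse.core :as insta])

;; The simplified grammar from the question. Single-quoted regexes
;; (#'...') avoid escaping double quotes inside the Clojure string.
(def pattern-parser
  (insta/parser
   "pattern = (<whitespace>? token)*
    <token> = word | cat
    cat = pattern <whitespace> <'.'> <whitespace> pattern
    whitespace = #'\\s+'
    word = #'[a-zA-Z]+'"))

;; One parse only: whichever tree instaparse finds first.
(prn (insta/parse pattern-parser "a b . c"))

;; Every parse. Different derivations can yield the same tree because
;; whitespace is hidden, so duplicates are removed before printing.
(def all-trees (distinct (insta/parses pattern-parser "a b . c")))

(run! prn all-trees)
```

If more than one distinct tree is printed, the grammar is ambiguous for that input, and which tree insta/parse returns should not be relied on.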
2023-04-22T17:23:29.674829Z

You have essentially two loops in your grammar: one is implicit in using * at the end of pattern, and the other goes pattern -> token -> cat -> pattern.

2023-04-22T17:23:40.653549Z

And those overlap in what they recognize
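
The thread does not show the grammar that ended up working, so the following is only one possible way to remove the overlap, not the fix that was actually used. It assumes a cat has plain word lists on both sides of the dot. The rule names operand and words and the var name unambiguous-parser are made up for illustration, and the resulting tree uses :operand where the original expected nested :pattern nodes. The repetition now lives only in words, and cat no longer refers back to pattern:

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical restructuring: only one rule repeats (words), and cat
;; does not recurse into pattern, so the two loops can no longer overlap.
(def unambiguous-parser
  (insta/parser
   "pattern = cat | words
    cat = operand <whitespace> <'.'> <whitespace> operand
    operand = words
    <words> = word (<whitespace> word)*
    whitespace = #'\\s+'
    word = #'[a-zA-Z]+'"))

;; Collect every distinct parse, as a check that only one remains.
(def unambiguous-trees
  (distinct (insta/parses unambiguous-parser "a b . c")))

(run! prn unambiguous-trees)
```

This sketch does not accept leading whitespace or chained dots such as "a . b . c"; supporting those would need further rules.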

diego.videco 2023-04-22T19:48:18.544819Z

Thanks, that worked