#instaparse
2023-03-02
Markus 20:03:00

Hello y’all! I’m working on generating regular grammars using genetic programming. I’m not completely sure how to use instaparse or, more specifically, how to write my rules so that instaparse produces the desired output. I’m testing instaparse with the Tomita grammars, and for that I’m converting some FSMs to rulesets. I got really confused with tomita-3, since the rules would be

A -> 'a'A
A -> 'b'B
A -> eps
B -> 'b'A
B -> 'a'C
B -> eps
C -> 'a'D
D -> 'a'C
D -> 'b'E
D -> eps
but when I typed the ruleset into instaparse using = instead of the -> of my specification, it wasn’t parsing anything. What would be the correct way to rewrite these rules for instaparse?

Markus 20:03:16

When I say it wasn’t parsing anything, I actually mean that it was either parsing valid words of tomita-3 as empty arrays or returning a failure.

Markus 15:03:37

For anyone else looking for a solution: it seems that instaparse strongly adheres to EBNF, which means that each non-terminal can only be defined once. This means that all possible production rules for one non-terminal have to be joined with | into a single definition, like this:

A = 'a'A | 'b'B | eps
B = 'b'A | 'a'C | eps
C = 'a'D
D = 'a'C | 'b'D | eps
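
For generated rulesets, the grouping step described above can be automated. The following is a minimal Python sketch, not part of instaparse: the helper name `merge_productions` is hypothetical, and it assumes one `X -> rhs` production per line, with the input rules taken from the merged grammar above. It only rewrites the text and does not call instaparse itself.

```python
def merge_productions(rules):
    """Group 'X -> rhs' lines by non-terminal and join the alternatives with '|'.

    Hypothetical helper: assumes one production per line and that the
    left-hand side never contains the '->' separator.
    """
    grouped = {}  # dicts keep insertion order, so rule order is preserved
    for line in rules.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only on the first '->' so the right-hand side stays intact.
        lhs, rhs = (part.strip() for part in line.split("->", 1))
        grouped.setdefault(lhs, []).append(rhs)
    return "\n".join(
        f"{lhs} = {' | '.join(alts)}" for lhs, alts in grouped.items()
    )


rules = """
A -> 'a'A
A -> 'b'B
A -> eps
B -> 'b'A
B -> 'a'C
B -> eps
C -> 'a'D
D -> 'a'C
D -> 'b'D
D -> eps
"""

merged = merge_productions(rules)
print(merged)
```

The printed text has one definition per non-terminal, which is the shape the message above says instaparse expects.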