#instaparse
2022-09-26
r0man14:09:42

Hello, I have defined a grammar [1] to parse Java stack traces with Instaparse. The grammar seems to work if I pass well-formed input to it. What I would like to do next is use the grammar to also parse input that has "garbage" at the beginning or the end of the input string. So I would go from something like this:

S = exception causes
exception = ...
causes = ...
to something like this:
S = <garbage?> exception causes <garbage?>
exception = ...
causes = ...
Now, the issue I am facing is how to define <garbage>. I tried to define it as #"[\s\S]*" but I believe it is too greedy and it messes up my grammar. For example, sometimes parsing succeeds with garbage, but most of the input is eaten by <garbage> and not by my actual stack trace grammar. I'm starting to wonder whether I should include <garbage> in my grammar at all, or use some other functionality of Instaparse. I saw I can use insta/parses to get access to all possible parses, but there are quite a lot of them, and I am not sure which one to pick (I guess it depends on my application). How do you deal with garbage, or rules that are too greedy, in Instaparse? Thanks for your help.

[1] https://github.com/r0man/orchard/blob/stacktrace-at-point/resources/orchard/stacktrace/parser/java.bnf
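One possible direction, as a minimal sketch and not a confirmed answer: rather than a single regex that swallows everything, garbage can be consumed one character at a time, with each character guarded by Instaparse's negative lookahead (`!`) so that garbage cannot run past a point where `exception` would match. The `exception`, `class`, and `message` rules below are hypothetical toy stand-ins, not the rules from the linked java.bnf, and the sample input is made up.

```clojure
(require '[instaparse.core :as insta])

;; Toy grammar: `exception`, `class` and `message` are simplified,
;; hypothetical stand-ins for the real stack trace rules.
(def parser
  (insta/parser
   "S         = <garbage> exception <garbage>
    (* consume one character at a time, but only where no exception starts *)
    garbage   = (!exception #'[\\s\\S]')*
    exception = class <': '> message
    class     = #'[A-Za-z][\\w.$]*'
    message   = #'[^\\n]*'"))

;; Made-up input with junk before and after the exception line.
(def result
  (parser "INFO starting\njava.lang.RuntimeException: boom\nbye"))
```

The lookahead re-tries `exception` at every garbage character, so this may be slow on large inputs with a complex `exception` rule; it is meant to show the shape of the idea, not a tuned solution.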