#instaparse
2023-11-09
Giles Alexander18:11:05

Hi, I’m having a weird issue with Instaparse. I’ve written a grammar using EBNF. When parsing moderately long (~200 lines) documents with that grammar, if the document has an error then Instaparse goes into an infinite loop, but only if the document uses DOS line endings. Huh? If the document uses UNIX line endings, then Instaparse reports the error. EOL is part of the syntax of the grammar, and I have defined a terminal for it ('\r\n' | '\n' | '\r' | #"$"). There’s got to be something wrong with my grammar, and I want to fix that. But it seems odd that I can drop Instaparse into an infinite loop just by changing the line endings. Anyone have any ideas where I should start looking to produce a simpler test case? Thanks 🙏

aengelberg18:11:43

So it infinite loops if the input has \r\n ?

Giles Alexander18:11:36

Not exactly. It infinite loops if the document has \r\n and also contains a parse error.

👍 1
aengelberg18:11:25

I think #"$" might be dicey because it detects the end of a line without consuming it (it matches zero characters), meaning you could parse infinitely many empty lines in a row
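
(Aside: a minimal sketch of the zero-width concern, using Python's re module as a stand-in. Instaparse on the JVM uses Java regular expressions, whose `$` handling of line terminators differs in detail, and this does not reproduce instaparse's own behavior. The names `EOL_WITH_END`, `EOL_ONLY`, and `count_eols` are hypothetical. It only shows that an EOL pattern with a `$` alternative can keep matching at the same position without advancing.)

```python
import re

# Hypothetical stand-ins for the EOL terminal discussed above.
EOL_WITH_END = re.compile(r"\r\n|\n|\r|$")  # includes the zero-width `$` alternative
EOL_ONLY = re.compile(r"\r\n|\n|\r")        # always consumes at least one character


def count_eols(pattern, text, pos, limit=5):
    """Match `pattern` repeatedly starting at `pos`, giving up after `limit` matches.

    Returns (number_of_matches, final_position).
    """
    count = 0
    while count < limit:
        m = pattern.match(text, pos)
        if m is None:
            break
        pos = m.end()  # a zero-width match leaves pos unchanged
        count += 1
    return count, pos


text = "a\r\n"
print(count_eols(EOL_WITH_END, text, 1))  # (5, 3): hits the limit, stuck at position 3
print(count_eols(EOL_ONLY, text, 1))      # (1, 3): one EOL, then no further match
```

Without the `limit` guard, the first call would never terminate, because every match after the first is empty and leaves the position where it was.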

Giles Alexander18:11:58

Ahhh… I’m trying to match end of input as the same as an end of line

aengelberg18:11:04

Also, $ detects the end of a line, not the end of the file

aengelberg18:11:16

Yeah, that seems closer to what you want

Giles Alexander18:11:40

Thanks! I’ll give it a try, and see if I can produce a repro of the infinite loop both with and without that change

👍 1
aengelberg18:11:07

I'd still be a little concerned at the potential for matching infinite EOFs

Giles Alexander18:11:17

OK. I see what you mean. I’ll have a think about a different way of expressing this

aengelberg18:11:05

It's possible you don't need to explicitly match on EOF, because instaparse will only consider a parse valid if it consumes the whole string
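
(Aside: a rough analogy in Python rather than instaparse itself. `re.fullmatch` also accepts a match only when the whole input is consumed, so the pattern needs no explicit end-of-input marker. The line format, `DOCUMENT`, and `parses` are hypothetical and are not the grammar from this thread.)

```python
import re

# Hypothetical line-oriented format: runs of word characters, each followed by
# an EOL, with an optional final line that has no EOL. There is no `$` or other
# end-of-input marker anywhere in the pattern.
DOCUMENT = re.compile(r"(?:\w+(?:\r\n|\n|\r))*(?:\w+)?")


def parses(text):
    """Return True only if the pattern consumes the entire input."""
    return DOCUMENT.fullmatch(text) is not None


print(parses("a\r\nb\r\n"))  # True: both lines end with a DOS EOL
print(parses("a\nb"))        # True: the last line has no EOL
print(parses("a\r\n!b"))     # False: "!" cannot be consumed, so the match is rejected
```

Because the whole-input check does the work of detecting the end, the EOL rule can be limited to alternatives that consume at least one character.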

👍 1