
With instaparse, how do you elegantly match any sequence of characters up until a specific sequence of characters?


Currently I do that in a cumbersome way, given the impedance mismatch between grammars and regular expressions:

WrappedLabel = UnderscorePair Word UnderscorePair
UnderscorePair = "__"
Word = #".+(?=__)" (* a valid word is hereby constrained to anything that does not include an UnderscorePair *)


__ appears both as a grammar rule (`UnderscorePair`) and as the stop expression in the regex


:thinking_face: I'm curious whether there's a solution I haven't thought of


The above is supposed to catch and parse anything of the form __foo__, so that foo can be extracted (as part of a larger parse the details of which are quite plain and uninteresting)


@matan that's how I would do it. I'm not aware of a more elegant or efficient solution, besides making the underscore pairs part of the regex.
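
For reference, one way to read "making the underscore pairs part of the regex" is sketched below. This is an untested illustration, not a grammar posted in the thread. It folds both delimiters and the body into a single regex terminal, and it swaps in a reluctant quantifier (`.+?`) so the match stops at the first closing `__`; the greedy `.+` in the original would run to the last `__` on the line if the input held more than one label.

```
WrappedLabel = #"__.+?__" (* delimiters and body in one regex; .+? stops at the first closing pair *)
```

The trade-off is that the terminal now returns the whole token, underscores included, so the caller would still have to strip the first two and last two characters to recover `foo`. That post-processing step is an assumption of this sketch.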