#instaparse
2017-07-05
matan 09:07:25

With instaparse, how do you elegantly match any sequence of characters up until a specific sequence of characters?

matan 09:07:34

Currently I do that in a cumbersome way, given the impedance mismatch between grammars and regular expressions:

WrappedLabel = UnderscorePair Word UnderscorePair
UnderscorePair = "__"
Word = #".+(?=__)" (* a valid word is hereby constrained to anything that does not include an UnderscorePair *)

matan 09:07:08

__ appears both as a grammar definition (`UnderscorePair`) and as a stop expression in the regex

matan 09:07:02

:thinking_face: I am curious whether there's a solution I've not thought of

matan 09:07:06

The above is supposed to catch and parse anything of the form __foo__, so that foo can be extracted (as part of a larger parse, the details of which are quite plain and uninteresting)

aengelberg 16:07:31

@matan that's how I would do it. I'm not aware of a more elegant or efficient solution, besides making the underscore pairs part of the regex.
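
A rough sketch of the "underscore pairs as part of the regex" idea, checked here with Python's `re` module as a stand-in (the lookahead constructs used behave the same way in Java regexes, but that equivalence is an assumption for any other engine). It also shows one thing to watch for: the greedy `.+(?=__)` runs to the last `__` it can find, so a word can end up containing an underscore pair. The names `TEMPERED_WORD`, `WRAPPED_LABEL` and `extract_label` are made up for illustration and are not from the discussion.

```python
import re

# Pattern from the discussion: ".+" is greedy, so it backtracks only as far
# as the LAST position that is followed by "__".
GREEDY_WORD = re.compile(r".+(?=__)")

# Hypothetical alternative: take one character at a time, but only while the
# next two characters are not "__".
TEMPERED_WORD = re.compile(r"(?:(?!__).)+")

# Underscore pairs folded into a single regex, with the word in group 1.
WRAPPED_LABEL = re.compile(r"__((?:(?!__).)+)__")


def extract_label(text):
    """Return the word of a leading __word__ label, or None if there is none."""
    m = WRAPPED_LABEL.match(text)
    return m.group(1) if m else None


# What follows the opening "__" in the input "__foo__bar__".
body = "foo__bar__"
greedy = GREEDY_WORD.match(body).group(0)      # "foo__bar"
tempered = TEMPERED_WORD.match(body).group(0)  # "foo"

print(greedy, tempered, extract_label("__foo__"))
```

Inside a grammar, a regex terminal typically yields the whole matched text rather than a capture group, so with the single-regex form the surrounding underscores would probably still need stripping in a later step.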