#instaparse
2015-06-17
aengelberg00:06:34

That part of the abnf namespace doesn't seem to be properly designed for supplementary characters at all...

aengelberg00:06:49

It isn't using any utility (unlike the clj version) to turn an oversized integer (a code point above 0xFFFF) into a series of two characters.

aengelberg02:06:38

The get-char-combinator function needs a rework, so ABNF terminals like %x5D-10FFFF can work.

aengelberg02:06:43

JavaScript, unlike Java, does not seem to support regular expressions with \x{10FFFF}.

aengelberg02:06:24

In instaparse for Clojure, single characters are represented as a string combinator with the surrogate pair (two 16-bit chars side by side), and a character range uses the regex \x{10FFFF} syntax. Neither ClojureScript nor JavaScript appears to have much support for either of these things. It may be impossible to support Unicode character ranges in ABNF without introducing third-party js libraries.

aengelberg02:06:12

OK, the former is doable via goog.i18n.uChar/fromCharCode.
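For reference, the surrogate-pair arithmetic behind "the former" can be sketched in plain JavaScript. This is only an illustration of the conversion, not the goog.i18n.uChar code or anything in instaparse; the function name `codePointToString` is made up for this example.

```javascript
// Hypothetical helper (not part of instaparse or Closure): turn a Unicode
// code point into a JavaScript string, using a surrogate pair when the
// code point does not fit in a single 16-bit code unit.
function codePointToString(cp) {
  if (cp < 0 || cp > 0x10FFFF) {
    throw new RangeError("Not a valid Unicode code point: " + cp);
  }
  if (cp <= 0xFFFF) {
    // Fits in one UTF-16 code unit.
    return String.fromCharCode(cp);
  }
  // Supplementary character: split the 20-bit offset into two 10-bit halves.
  var offset = cp - 0x10000;
  var high = 0xD800 + (offset >> 10);   // high (lead) surrogate
  var low = 0xDC00 + (offset & 0x3FF);  // low (trail) surrogate
  return String.fromCharCode(high, low);
}
```

With this, `codePointToString(0x10FFFF)` yields a two-unit string, which is what a string combinator for a supplementary character would need to match.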

lucasbradstreet04:06:46

Nice! Yeah, this character support code is probably the weakest part of the port.

lucasbradstreet04:06:38

I'm glad that you're finding these issues. I had a feeling there were some lurking issues there.

aengelberg04:06:43

goog has some utils to work with surrogate strings, but the regex (char range) seems impossible without pulling in an external dependency like Regenerate. https://github.com/mathiasbynens/regenerate

lucasbradstreet05:06:18

Ah, yeah, I think I’d rather recreate the functionality internally than pull in extra deps. Definitely a bit of a pain though.
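One dependency-free way to handle a range like %x5D-10FFFF, sketched here purely as an assumption about what "recreating the functionality internally" could look like, is to skip the regex and decode the code point at the current position by hand. The names `codePointAtIndex` and `matchCodePointRange` are hypothetical and do not come from instaparse.

```javascript
// Hypothetical sketch: decode the code point starting at `index`,
// combining a high and low surrogate into one value when both are present.
function codePointAtIndex(text, index) {
  var first = text.charCodeAt(index);
  if (first >= 0xD800 && first <= 0xDBFF && index + 1 < text.length) {
    var second = text.charCodeAt(index + 1);
    if (second >= 0xDC00 && second <= 0xDFFF) {
      return {
        codePoint: ((first - 0xD800) << 10) + (second - 0xDC00) + 0x10000,
        length: 2
      };
    }
  }
  // BMP character (or an unpaired surrogate, returned as-is).
  return { codePoint: first, length: 1 };
}

// Return the matched character (one or two code units) if the code point
// at `index` lies within [lo, hi]; otherwise return null.
function matchCodePointRange(text, index, lo, hi) {
  if (index >= text.length) return null;
  var decoded = codePointAtIndex(text, index);
  if (decoded.codePoint < lo || decoded.codePoint > hi) return null;
  return text.substring(index, index + decoded.length);
}
```

For example, `matchCodePointRange("\uD83D\uDE00x", 0, 0x5D, 0x10FFFF)` returns the full two-unit character, while `matchCodePointRange("A", 0, 0x5D, 0x10FFFF)` returns `null` because 0x41 is below the range.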

lucasbradstreet05:06:44

I wonder if this issue holds in all browsers.

lucasbradstreet05:06:49

Actually, if you could create a PR with a failing cljs test case, that would be a good place to start.

aengelberg17:06:37

Hmm, now I'm mildly concerned because circleci is passing... ;)

aengelberg17:06:06

Hmm, I think that's because there isn't really a notion of the cljs tests "passing" or "failing" (no exit codes)