Fork me on GitHub
#cljs-dev
<
2022-06-08
>
p-himik17:06:01

Seems like some RegEx patterns are broken. This just popped up: https://clojurians.slack.com/archives/C03S1L9DN/p1654705094721129 And there have been questions just like this throughout the years - just do a Slack search for "Invalid regular expression flag". The culprit seems to be this code:

(defmethod emit-constant* #?(:clj java.util.regex.Pattern :cljs js/RegExp) [x]
  (if (= "" (str x))
    (emits "(new RegExp(\"\"))")
    (let [[_ flags pattern] (re-find #"^(?:\(\?([idmsux]*)\))?(.*)" (str x))]
      #?(:clj  (emits \/
                 (.replaceAll (re-matcher #"/" pattern) "\\\\/")
                 \/ flags)
         :cljs (emits pattern)))))
In the context of the linked thread, what once was #"(\/\/).*" in CLJS becomes /(\\/\\/).*/ in JS, which is obviously invalid.

dnolen17:06:31

I don't think there is much actionable here though - unless you trying are to propose something

dnolen17:06:16

Any change here is probably a backwards compatibility chaos generator

p-himik17:06:14

Would replacing /.../ with new RegExp("...") be an acceptable solution?

dnolen18:06:51

I think you would need to review what Google Closure does here under advanced compilation - I thought it optimized some obvious cases. Also the ticket should probably at least reference what the VMs optimize if at all.

p-himik18:06:31

How is Closure relevant here if the current approach doesn't work without any optimizations?

dnolen18:06:10

some cases that some people don't understand don't work

dnolen18:06:13

but the cases that do work are optimized - does your proposal impact that

dnolen18:06:48

so just suggested what you're proposing if you'd like try it out, then make an issue a patch and back it up w/ some facts about the performance impact

dnolen18:06:24

it maybe all be negligible - but this is just due diligence

p-himik18:06:47

Oh, so GCC rewrites regular expressions themselves, like e.g. turning aa|ab into a[ab]? Huh!

dnolen18:06:03

it might, I don't know! they have a pass for regexes that's all I know 🙂

borkdude18:06:39

So if you dynamically construct the regex, you might circumvent that?

dnolen18:06:52

also I don't know

dnolen18:06:58

if everything is static maybe it is equivalent

dnolen18:06:19

all these things seem pretty simple to verify for the patch creator

dnolen18:06:07

so it does stuff for sure