#lambdaisland
2022-01-28
jjttjj 00:01:44

I'm trying to replicate this in regal (https://github.com/lambdaisland/regal), using a backreference to match 3 or more consecutive occurrences of the same character:

(re-find #"(.)\1{3,}" "111123")
;;=> ["1111" "1"]
But can't seem to get it:
(regal/regex
  [:cat [:capture  :any] [:repeat ::_ 3 nil]]
  {:resolver (fn [x] "\\1")})

;;=>  #"(.)(?:\\1){3,}"

(regal/regex
  [:capture [:capture  :any] [:repeat ::_ 3 nil]]
  {:resolver (fn [x] "\1")})

;;=> #"((.){3,})"


(regal/regex
  [:capture [:capture  :any] [:repeat ::_ 3 nil]]
  {:resolver (fn [x] "\\\\1")})

;;=> #"((.)(?:\\\\1){3,})"
Any tips?

plexus 15:01:44

(ns repl-sessions.poke
  (:require [lambdaisland.regal :as regal]
            [lambdaisland.regal.parse :as regal-parse]))

(regal-parse/parse #"(.)\1{3,}")
;; => [:cat
;;     [:capture :any]
;;     [:repeat [:lambdaisland.regal.parse/not-implemented [:BackReference "1"]] 3]]
Backreferences aren't implemented, but they seem like a common enough feature that they should be. Would you mind creating a ticket?

jjttjj 15:01:20

Sure, I'll get one in towards the end of the day, thanks!

plexus 17:01:11

First question would be if all engines support this (Java, ECMA, Re2)

jjttjj 17:01:31

A Google search suggests that Re2 does not: https://github.com/google/re2/issues/101

plexus 22:01:07

ok, that's not the end of the world. For Re2 we don't strive for 100% compatibility, since they deliberately don't handle certain features

plexus 17:01:58

@jjttjj here's a workaround you can do yourself:

(defmethod regal/-regal->ir [:ref :common] [[op idx] opts]
  `^::regal/grouped ("\\" ~(str idx)))

(regal/regex
 [:cat
  [:capture  :any]
  [:repeat [:ref 1] 3 nil]])
;; => #"(.)\1{3,}"
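
The resulting pattern can be sanity-checked with plain `re-find`, without Regal. This is only a small sketch: the name `four-or-more` and the input strings are made up for illustration. Note that one captured character followed by `{3,}` repeats needs four in a row; a count of 2 would correspond to "3 or more".

```clojure
;; Plain clojure.core check of the backreference pattern (no Regal required).
;; `four-or-more` is a hypothetical name used only for this example.
(def four-or-more #"(.)\1{3,}")   ; one captured char plus 3+ repeats = 4+ in a row

(re-find four-or-more "111123")   ;=> ["1111" "1"]
(re-find four-or-more "11123")    ;=> nil, only three 1s in a row

;; For "3 or more of the same character" the repeat count would be 2:
(re-find #"(.)\1{2,}" "11123")    ;=> ["111" "1"]
```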

jjttjj 17:01:58

Awesome, thanks!