#off-topic
2018-11-23
jaihindhreddy-duplicate09:11:01

This is probably off-topic even for #off-topic, but: I can specify an alphanumeric string of at most 50 chars with this regex: ^\w{1,50}$. How do I also specify that the string should contain exactly one @?

gklijs09:11:24

What you want is something like ^[^@]*@[^@]*$ so: the start, then anything but '@', then '@', then anything but '@', then the end.
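
A quick REPL check of the pattern above, as a sketch. The helper name `one-at?` is made up for illustration. Note that `re-matches` already requires the whole string to match, so the `^` and `$` anchors are redundant here but harmless.

```clojure
;; Hypothetical helper name; the pattern is the one suggested above.
(defn one-at? [s]
  (some? (re-matches #"^[^@]*@[^@]*$" s)))

(one-at? "user@bank") ;; => true  (exactly one @)
(one-at? "userbank")  ;; => false (no @)
(one-at? "a@b@c")     ;; => false (two @)
```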

jaihindhreddy-duplicate11:11:42

Exactly. But I also want the constraint of max string length encoded in the same regex. Is that possible with regexes?

andy.fingerhut11:11:58

It is possible, but I don't know how to write it any shorter than a 50-way alternative ('or', using '|' as the usual syntax for separating alternatives).

andy.fingerhut11:11:17

One alternative for each of the possible 50 positions of the different character

andy.fingerhut11:11:31

You could write a small bit of code to help you generate such a regex. A question is: do you really need it to be a single regex, or can it be a small function instead?

☝️ 4
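
The generator idea above could look roughly like this. It is only a sketch: `one-at-pattern` and `vpa-re` are hypothetical names, and it assumes `\w` is an acceptable stand-in for "alphanumeric" (it also admits the underscore). Each alternative fixes the position of the @ and caps what may follow so that the total never exceeds the limit.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: one alternative per possible position of the @.
;; Alternative i means: exactly i word characters, the @, then at most
;; (max-len - 1 - i) more word characters.
(defn one-at-pattern [max-len]
  (re-pattern
   (str "^(?:"
        (str/join "|"
                  (for [i (range max-len)]
                    (str "\\w{" i "}@\\w{0," (- max-len 1 i) "}")))
        ")$")))

(def vpa-re (one-at-pattern 50))

(re-matches vpa-re "user@bank")                            ;; => "user@bank"
(re-matches vpa-re (str (apply str (repeat 50 "a")) "@"))  ;; => nil (51 chars)
```

As written, a lone "@" is accepted; whether that should be allowed is a separate decision.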
bskinny13:11:31

Perhaps a simpler, less performant, approach is to break it into two pattern matches with the first doing the length check and on successful match, the second doing the single-@ check.
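
A minimal sketch of the two-pass idea above. The function name is hypothetical, and the first pattern assumes the allowed characters are word characters plus the @.

```clojure
;; Hypothetical helper: the first match checks length and character set,
;; the second runs only if the first succeeded and checks for a single @.
(defn short-with-one-at? [s]
  (boolean
   (and (re-matches #"[\w@]{1,50}" s)
        (re-matches #"[^@]*@[^@]*" s))))

(short-with-one-at? "user@bank") ;; => true
(short-with-one-at? "a@b@c")     ;; => false
```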

runswithd6s18:11:33

Clojure Spec looks promising for this type of validation check: https://clojure.org/guides/spec

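(require '[clojure.spec.alpha :as s] '[clojure.string :as str])
;; [\w@] instead of \w, so the length check does not reject the @ itself
(s/def ::one-at-50max
    (s/and #(re-matches #"^[\w@]{1,50}$" %) #(= 1 (count (str/replace % #"[^@]" "")))))
(s/conform ::one-at-50max "some@string")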

jaihindhreddy-duplicate18:11:00

I was writing a tiny subset of spec (no generators, no destructuring, no global semantics, only validation) where specs are pure data. I'm currently using a regex and a length bound to do this. Was just wondering if I could push the length check into the regex.

👍 4
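
To make the "regex plus length bound as pure data" idea concrete, here is one possible shape. Everything in it is an assumption for illustration (the map keys, `vpa-spec`, and `valid?` are hypothetical, not the poster's actual design). The pattern is kept as a string so the spec stays plain, serializable data.

```clojure
;; Hypothetical spec shape: a plain map with a pattern string and a length bound.
(def vpa-spec {:pattern "\\w*@\\w*" :max-length 50})

;; Hypothetical validator: checks the length bound first, then the pattern.
(defn valid? [{:keys [pattern max-length]} s]
  (boolean
   (and (string? s)
        (<= (count s) max-length)
        (re-matches (re-pattern pattern) s))))

(valid? vpa-spec "user@bank") ;; => true
(valid? vpa-spec "a@b@c")     ;; => false
```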
emccue18:11:22

@jaihindh.reddy unless you are code golfing, just dont use regex

emccue18:11:14

okay im wrong if you want speed yeah regex

jaihindhreddy-duplicate18:11:46

Are you saying regex is faster than your code above?

emccue18:11:58

nope i havent done benchmarks

emccue18:11:05

so i have no clue

dominicm18:11:09

regex compiles to some pretty fast fsm though, be hard to beat.

emccue18:11:31

i just suspect it would be

dominicm18:11:34

There's faster libs out there though, because java's regex fsm isn't the most optimized.

dominicm18:11:09

you could definitely do this if you had direct access to the fsm. But I don't know that regex can express this easily.
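
For what it is worth, engines that support lookahead can fold the length bound into a single pattern. The sketch below assumes JVM Clojure, where regex literals compile to `java.util.regex.Pattern`. Lookahead goes beyond textbook regular expressions, so this does not settle the formal question, and the name `one-at-max-50` is made up.

```clojure
;; (?=[\w@]{1,50}\z) looks ahead without consuming input and succeeds only
;; if the whole input is 1 to 50 characters drawn from word characters and @.
;; The rest of the pattern then requires exactly one @.
(def one-at-max-50 #"(?=[\w@]{1,50}\z)\w*@\w*")

(re-matches one-at-max-50 "user@bank") ;; => "user@bank"
(re-matches one-at-max-50 "a@b@c")     ;; => nil
```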

emccue18:11:23

well, im in a theory of comp class

emccue18:11:28

hooray college

manutter5118:11:31

I would think you’d want the regex that gklijs proposed, plus (<= (count s) 50)

emccue18:11:10

so i can probably write up a FSM separate from a regex language

emccue18:11:36

and we can manually compile that to a simple regex
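
A hand-written state machine for this check might look like the sketch below. The names `step` and `accepts?` are hypothetical, and it assumes "alphanumeric" means `Character/isLetterOrDigit`. The state is a pair of counters with small fixed bounds, so the number of states is finite, which is why the language is regular for a fixed limit such as 50.

```clojure
;; Hypothetical state machine. The state is [length ats], and :reject is a
;; dead state that no input can leave.
(defn step [state ch]
  (if (= state :reject)
    :reject
    (let [[len ats] state
          len' (inc len)
          ats' (if (= ch \@) (inc ats) ats)]
      (if (and (<= len' 50)
               (<= ats' 1)
               (or (= ch \@) (Character/isLetterOrDigit ch)))
        [len' ats']
        :reject))))

;; Accept only if the machine never rejected and saw exactly one @.
(defn accepts? [s]
  (let [final (reduce step [0 0] s)]
    (and (not= final :reject)
         (= 1 (second final)))))

(accepts? "user@bank") ;; => true
(accepts? "a@b@c")     ;; => false
```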

jaihindhreddy-duplicate18:11:22

That's what I was thinking. \w{a}@\w{b} where (< (+ a b) 50) is not a regular grammar. A context-free grammar can represent that, of course.

emccue18:11:55

that is definitely regular

emccue18:11:13

not in general for any n

emccue18:11:17

but for 50 it is

jaihindhreddy-duplicate18:11:35

Oops. My bad. For a general n

emccue18:11:00

since your problem is for 50 though, we can definitely do it

emccue18:11:08

gimmie a few minutes to write something up

jaihindhreddy-duplicate18:11:52

I guess

\w{0,0}@{0,49}|\w{0,1}@{0,48}|\w{0,2}@{0,47}|\w{0,3}@{0,46}|\w{0,4}@{0,45}|\w{0,5}@{0,44}|\w{0,6}@{0,43}|\w{0,7}@{0,42}|\w{0,8}@{0,41}|\w{0,9}@{0,40}|\w{0,10}@{0,39}|\w{0,11}@{0,38}|\w{0,12}@{0,37}|\w{0,13}@{0,36}|\w{0,14}@{0,35}|\w{0,15}@{0,34}|\w{0,16}@{0,33}|\w{0,17}@{0,32}|\w{0,18}@{0,31}|\w{0,19}@{0,30}|\w{0,20}@{0,29}|\w{0,21}@{0,28}|\w{0,22}@{0,27}|\w{0,23}@{0,26}|\w{0,24}@{0,25}|\w{0,25}@{0,24}|\w{0,26}@{0,23}|\w{0,27}@{0,22}|\w{0,28}@{0,21}|\w{0,29}@{0,20}|\w{0,30}@{0,19}|\w{0,31}@{0,18}|\w{0,32}@{0,17}|\w{0,33}@{0,16}|\w{0,34}@{0,15}|\w{0,35}@{0,14}|\w{0,36}@{0,13}|\w{0,37}@{0,12}|\w{0,38}@{0,11}|\w{0,39}@{0,10}|\w{0,40}@{0,9}|\w{0,41}@{0,8}|\w{0,42}@{0,7}|\w{0,43}@{0,6}|\w{0,44}@{0,5}|\w{0,45}@{0,4}|\w{0,46}@{0,3}|\w{0,47}@{0,2}|\w{0,48}@{0,1}|\w{0,49}@{0,0}

jaihindhreddy-duplicate18:11:14

Oops. Off by one error

emccue19:11:10

yeah i did n=3 and simplifying isnt the easiest task

emccue19:11:09

point being, if it isnt too slow, just take the readable code

emccue19:11:17

regexes as a last resort or a quick hack

emccue19:11:09

yep i give up on simplifying. got an 80 on the midterm for a reason

jaihindhreddy-duplicate19:11:52

Came up with this:

(defn valid-VPA? [vpa]
  (and (<= (count vpa) 50)
       (= [\@] (filter #(not (Character/isLetterOrDigit %)) vpa))))

emccue19:11:07

can a vpa be empty?

emccue19:11:09

@(\*\*+\*)+\*(@\*+\*@+@)+@

emccue19:11:12

thats for 3

andy.fingerhut19:11:03

I am not sure what 'vpa' is referring to in that figure, but if by that you mean "a set of states in a nondeterministic finite automaton, initially containing only the start state(s), and at each step updated to a new set as symbols of input are consumed", then yes, I think it can become empty, depending on how the state machine is constructed.

emccue19:11:19

its what he called his function

andy.fingerhut19:11:39

oh, see it now. oops.

jaihindhreddy-duplicate19:11:28

nice catch. Shouldn't be empty
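
Checking the function from earlier in the thread against the empty case: the empty string is already rejected, because it contains no @, but a lone "@" passes. The stricter variant below is a sketch under an assumption the thread does not state, namely that both sides of the @ must be non-empty. `strict-VPA?` is a hypothetical name.

```clojure
(require '[clojure.string :as str])

;; Copied from the message above.
(defn valid-VPA? [vpa]
  (and (<= (count vpa) 50)
       (= [\@] (filter #(not (Character/isLetterOrDigit %)) vpa))))

(valid-VPA? "")          ;; => false, there is no @ at all
(valid-VPA? "@")         ;; => true, although nothing surrounds the @

;; Assumption: both sides of the @ must be non-empty.
(defn strict-VPA? [vpa]
  (and (valid-VPA? vpa)
       (not (str/starts-with? vpa "@"))
       (not (str/ends-with? vpa "@"))))

(strict-VPA? "@")         ;; => false
(strict-VPA? "user@bank") ;; => true
```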

jaihindhreddy-duplicate19:11:39

A VPA is like an email address: a unique handle you can use to transact money via UPI (Unified Payments Interface), something the Indian govt. came up with.

emccue19:11:36

my only issue with your code is that (= [\@] (filter #(not (Character/isLetterOrDigit %)) vpa)) doesn't really read that well
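
One way the same check might be made easier to read is to name the intermediate values. This is only a sketch with hypothetical names, intended to behave the same as the original expression.

```clojure
;; Hypothetical rewrite with named intermediate values.
(defn valid-vpa? [vpa]
  (let [alnum? #(Character/isLetterOrDigit %)
        others (remove alnum? vpa)]   ; every character that is not a letter or digit
    (and (<= (count vpa) 50)
         (= [\@] others))))           ; the only such character is a single @

(valid-vpa? "user@bank") ;; => true
(valid-vpa? "user.name@bank") ;; => false, the dot is not alphanumeric
```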

jaihindhreddy-duplicate19:11:21

True. I'm writing PHP though for work. And this is far better 🙂

jaihindhreddy-duplicate19:11:31

Do you think a team that adopted Elixir without anyone having significant experience in it will do the same with clj?

emccue19:11:25

if there is a business need that is well served then yeah

Renan20:11:21

I'm a beginner at Clojure. Where is Clojure "well served"?

emccue05:11:13

Someone else should take this question. I could give an answer but it would be far from complete.

emccue19:11:01

but "PHP is bad clojure is better" doesn't help the fact that you still have a php codebase you probably cant just throw away

emccue19:11:00

i am in no position to give advice, but first just try introducing it on a small project and see how it goes

jaihindhreddy-duplicate20:11:13

We got some near real-time stream processing to do, and Riemann seems like just the thing.
