#datomic
2023-06-07
Eduardo Lopes19:06:30

I am trying to translate the query below to use re-find, and I am not having success:

[:find ?e ?id
 :where
 [?e :foo/identifier ?id]
 [(.contains ?id "2022")]
]
On my REPL I am executing something like this:
(some? (re-find #"2022" "2022_foo_bar"))
And it returns true. Changing the query to something like this returns an error:
[:find ?e ?id
 :where
 [?e :foo/identifier ?id]
 [(some? (re-find #"2022" ?id))]
]
Does anyone know how to do this with re-find?

bridget20:06:45

I'd be curious to hear what your goal is with this query

Eduardo Lopes20:06:37

System debugging basically

bridget20:06:44

Well, I guess I'm looking for what your intention is with the query plus the re-find themselves. Is it something like do my ?id's have the values 2022 and 2022_foo_bar?

Eduardo Lopes20:06:48

It is a tax system that very often needs individual analysis for each customer; the identifier is a concatenation of many strings.

bridget20:06:24

Ok: so do my ?ids contain any values that include 2022 or 2022_foo_bar?

Eduardo Lopes20:06:14

It is whether the ?id values match the regex #"2022".

Eduardo Lopes20:06:38

2022_foo_bar is just for REPL debugging and validating re-find.

bridget20:06:49

So I'd start with just pulling all of the ids via the datalog, then using Clojure to look at the returned set to find the values you are looking for

bridget20:06:52

The datalog:

[:find ?e ?id
 :where 
 [?e :foo/identifier ?id]]

bridget20:06:46

Then with the result do some sort of list comprehension including the regex

(mapv (fn [[x id]] (re-find #".*2022.*" id)) query-results)
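
A minimal sketch of this post-processing step, assuming the query result is a set of [entity-id identifier] tuples. The sample data and the name `matching-ids` are hypothetical, and `filterv` is used here in place of `mapv` so that only the matching tuples are kept:

```clojure
;; Hypothetical stand-in for the result of the query above:
;; a set of [entity-id identifier] tuples.
(def query-results
  #{[1 "2022_foo_bar"]
    [2 "2021_foo_bar"]
    [3 "baz_2022"]})

;; Keep only the tuples whose identifier matches the regex.
(defn matching-ids [results]
  (filterv (fn [[_e id]] (re-find #"2022" id)) results))

(set (matching-ids query-results))
;; => #{[1 "2022_foo_bar"] [3 "baz_2022"]}
```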

Eduardo Lopes20:06:09

Yes, this is achievable, but my intention was to do this using only the Datomic console.

Eduardo Lopes20:06:06

I think I found a way to do this, but I'm not proud of it since it looks terrible and is actually far too slow 😅

[:find ?e ?id ?pattern ?matcher ?find
 :where
 [?e :foo/identifier ?id]
 [(java.util.regex.Pattern/compile "2022" java.util.regex.Pattern/CASE_INSENSITIVE) ?pattern]
 [(.matcher ?pattern ?id) ?matcher]
 [(.find ?matcher) ?find]
]

Eduardo Lopes20:06:07

Actually, it worked up to the matcher; after adding ?find I got an exception after a couple of minutes.

bridget20:06:20

So the original query is too slow? Or doesn't quite get the results you are looking for?

Eduardo Lopes20:06:58

Too slow, and I got an exception after some minutes of processing

👍 2
Eduardo Lopes20:06:32

Compared to SQL, it would be something like

SELECT identifier FROM foo
  WHERE identifier LIKE "%2022%";

Eduardo Lopes20:06:00

Actually the above example is similar to .contains, this is better:

SELECT name FROM users
  WHERE name REGEXP '^[sp][aeiou]';

Eduardo Lopes20:06:40

This actually worked and is really fast. Maybe the problem has something to do with using the #"2022" notation.

[:find ?e ?id ?pattern ?find
 :where
 [?e :foo/identifier ?id]
 [(java.util.regex.Pattern/compile "2022" java.util.regex.Pattern/LITERAL) ?pattern]
 [(re-find ?pattern ?id) ?find]
]

bridget20:06:49

One thought is to use d/since as it seems to have a time period element

Eduardo Lopes20:06:17

Since on REPL

user=> (java.util.regex.Pattern/compile "2022" java.util.regex.Pattern/LITERAL)
#"2022"

favila20:06:57

The problem with your original query is the nesting

favila20:06:09

[(some? (re-find #"2022" ?id))]

favila20:06:23

datalog parser only looks for vars one level down

🆒 4
favila20:06:01

[(re-find #"2022" ?id)] should be just as good, because re-find returns nil if it doesn’t match

favila20:06:56

if you don’t need regex you can also [(clojure.string/includes? ?id "2022")]
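
As a rough illustration outside of Datomic, the plain-Clojure behavior behind both suggestions can be checked at a REPL. The sample strings are made up for this sketch, and nothing here says how the console itself will parse these forms:

```clojure
(require '[clojure.string :as str])

;; re-find returns the matched text (truthy) or nil (falsey),
;; so wrapping it in some? adds nothing for a predicate clause.
(re-find #"2022" "2022_foo_bar")      ; => "2022"
(re-find #"2022" "2021_foo_bar")      ; => nil

;; Plain substring test, no regex involved.
(str/includes? "2022_foo_bar" "2022") ; => true
(str/includes? "2021_foo_bar" "2022") ; => false
```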

favila20:06:08

which will probably work? but console is weird

favila20:06:41

its not-so-helpful query parser sometimes makes legal things impossible to type

Eduardo Lopes20:06:12

Using [(some? (re-find #"2022" ?id))] generates an exception 😕 I think it is because of the #

favila20:06:22

it’s the some?

favila20:06:26

just take it out

favila20:06:43

ah, then it’s console messing you up

favila20:06:47

this is perfectly fine in a repl

Eduardo Lopes20:06:44

Even with the workaround [(java.util.regex.Pattern/compile "\\d{3}\\." java.util.regex.Pattern/LITERAL) ?pattern], I think it is not getting the regex correctly

Eduardo Lopes21:06:02

It worked with raw characters, but using regex notation does not return what is expected. I even tried creating a regex of "\\d" (which should match, because all identifiers contain at least one digit), but it returns nothing

favila21:06:19

matchers are stateful, maybe that’s tripping you up

favila21:06:53

although I would still expect that to work, hm

favila21:06:06

more backslashes?

Eduardo Lopes21:06:16

If you look at the screenshot I sent here, it shows a different value for each one

Eduardo Lopes21:06:47

hmm, but actually I only took a screenshot of the matchers when I manually wrote [(.matcher ?pattern ?id) ?matcher]

Eduardo Lopes21:06:07

Maybe re-find is using the same matcher

favila21:06:06

oh is your use of LITERAL deliberate?

favila21:06:41

that means it will match the literal value \d{3}\.

favila21:06:52

regex metachars lose their meaning
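
A small REPL sketch of the difference, using plain Clojure interop rather than a Datomic query. The var names and sample strings are only for illustration:

```clojure
(import 'java.util.regex.Pattern)

;; Same source string, compiled with and without the LITERAL flag.
(def literal-pattern (Pattern/compile "\\d{3}\\." Pattern/LITERAL))
(def regex-pattern   (Pattern/compile "\\d{3}\\."))

;; As a regex: three digits followed by a literal dot.
(re-find regex-pattern "abcd123.")      ; => "123."

;; With LITERAL: only the exact characters \d{3}\. can match.
(re-find literal-pattern "abcd123.")    ; => nil
(re-find literal-pattern "x\\d{3}\\.y") ; => "\\d{3}\\."
```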

Eduardo Lopes21:06:59

You are absolutely right, thanks a lot.

Eduardo Lopes21:06:13

(it worked) 🥳

favila21:06:35

fwiw this works in a repl

(d/q '[:find ?e ?id
       :where
       [(java.util.regex.Pattern/compile "\\d{3}\\.") ?pattern]
       [?e :foo/identifier ?id]
       [(.matcher ?pattern ?id) ?matcher]
       [(.find ?matcher)]
       ]
     [[1 :foo/identifier "abcd123."]
      [2 :foo/identifier "abcd123f"]])
=> #{[1 "abcd123."]}
and if you can convince the console to take it, it should work there too

Eduardo Lopes21:06:53

[:find ?e ?id ?pattern
 :where
 [?e :foo/identifier ?id]
 [(java.util.regex.Pattern/compile "\\d{3}\\.123" java.util.regex.Pattern/CASE_INSENSITIVE) ?pattern]
 [(re-find ?pattern ?id) ?find]
]

bridget21:06:55

I found something in my old code with these where clauses. I wonder if they get around the above issues:

[(str "AA") ?matcher]
[(re-pattern ?matcher) ?regex]
[(re-find ?regex ?id)]

favila21:06:33

yeah I was about to suggest that because it applies type hints

😁 2
favila21:06:35

so should be faster

favila21:06:39

IIRC the console eats type hints

Eduardo Lopes21:06:41

This looks so much better

favila21:06:13

(d/q '[:find ?e ?id
       :where
       [(re-pattern "\\d{3}\\.") ?pattern]
       [?e :foo/identifier ?id]
       [(re-matcher ?pattern ?id) ?matcher]
       [(re-find ?matcher)]
       ]
     [[1 :foo/identifier "abcd123."]
      [2 :foo/identifier "abcd123f"]])
is the equivalent

Eduardo Lopes21:06:45

Is the (re-matcher) needed?

favila21:06:10

nope, you can do (re-find ?pattern ?id)

favila21:06:31

it will make a matcher then discard it
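
To make the two forms concrete, here is a plain-Clojure sketch (no Datomic involved; the input string is invented for the example). It also shows the statefulness of matchers mentioned earlier:

```clojure
(def pattern (re-pattern "\\d{3}"))

;; Two-argument form: a fresh matcher is created for each call,
;; so repeated calls give the same first match.
(re-find pattern "123 456") ; => "123"
(re-find pattern "123 456") ; => "123"

;; One-argument form on a shared matcher: each call advances it.
(def m (re-matcher pattern "123 456"))
(re-find m) ; => "123"
(re-find m) ; => "456"
(re-find m) ; => nil
```

For a yes/no filter on each ?id, the two-argument form is usually the simpler choice, since no matcher state is carried between calls.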

Eduardo Lopes21:06:48

[:find ?e ?id
 :where
 [?e :foo/identifier ?id]
 [(re-pattern "\\d{3}\\.") ?pattern]
 [(re-find ?pattern ?id) ?find]
]

👍 2
favila21:06:28

it may not return something sensible for ?pattern

favila21:06:31

in your :find

Eduardo Lopes21:06:55

Yes, it was just for debugging, actually. Will edit it

Eduardo Lopes21:06:19

Thank you so much @U0G3Y3NA0 and @U09R86PA4, I learned a lot from you. :gratitude:

🥳 6