#datomic
2022-10-20
emil0r12:10:31

How does one specify a transactor when using Postgres-backed storage for Datomic Pro? I get an exception saying that I am trying to connect to localhost as the transactor, but the transactor is running on another machine. Looking over the documentation, I cannot find anything about that for SQL.

Kris C12:10:57

See the sample configuration in the file DATOMIC_ROOT/config/samples/sql-transactor-template.properties

Kris C12:10:46

Relevant configuration:

protocol=sql
host=POSTGRES_SQL_HOST
port=POSTGRES_SQL_PORT
sql-url=jdbc:
sql-user=YOUR_DATABASE_USER
sql-password=YOUR_DATABASE_PASS
sql-driver-class=org.postgresql.Driver

emil0r08:10:21

Solved the problem. For future reference, in case someone else runs into the same problem, I’ll detail it here.

1. I had a Postgres-backed Datomic database running on one server. Both the transactor and the Postgres RDBMS were running on the same server.
2. I had to switch to a public IP later in the process, when I wanted to access it from another machine.
3. Doing so broke the transactor, but since it was by then running as a service, I didn’t see that until much later, when I started looking at the logs.
4. Trying to connect to Datomic from my peer didn’t work.
   a. It helpfully threw an exception, telling me that it could not access the transactor on the host “localhost”.
   b. What the exception did not tell me was that it had also tried the server the RDBMS was running on as a transactor host. This sent me on a wild goose chase, trying to find the error on the peer side, when the peer could not connect to the transactor at all, including at the correct transactor address.
   c. The documentation is very unclear on how you connect to the transactor when you have SQL-backed storage, and does not tell you at all that it re-uses the storage address as a transactor address.
      i. I might be missing where the documentation explains how to specify a transactor, but from what I could read about the connect function in the documentation for datomic.api, everything is baked into the connect function.
5. Finally looking at the transactor logs, once I had made sure I could connect to the storage, I saw that the transactor was no longer connecting to the storage. Running everything manually surfaced the faults quickly and made the issue easy to debug.

Hope it helps someone 🙂

emil0r08:10:10

This is for on-prem
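
For readers hitting the same error, here is a minimal sketch of the peer side, as far as I understand on-prem SQL storage. The connection URI names only the database and the JDBC storage URL. The peer then looks up the transactor's advertised address (the `host` and `alt-host` values from the transactor properties file) in storage. The helper `sql-uri`, the host `db.example.com`, the database name `my-db`, and the credentials are all hypothetical.

```clojure
(defn sql-uri
  "Builds a peer connection URI for SQL storage.
  Hypothetical helper, not part of datomic.api."
  [db-name jdbc-url]
  (str "datomic:sql://" db-name "?" jdbc-url))

;; Hypothetical storage host, database name, and credentials.
(def uri
  (sql-uri "my-db"
           "jdbc:postgresql://db.example.com:5432/datomic?user=datomic&password=datomic"))

(comment
  ;; Requires the Datomic peer library on the classpath.
  (require '[datomic.api :as d])
  (def conn (d/connect uri)))
```

If the transactor's `host` is still `localhost`, a peer on another machine would presumably be told to connect to its own localhost, which would match the exception described above.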

JohnJ17:10:28

What am I missing here? I'm trying to use a peer classpath function in a transaction:

(defn create-movie [db e title genre release-year]
  [[:db/add e :movie/title title]
   [:db/add e :movie/genre genre]
   [:db/add e :movie/release-year release-year]])

(d/transact conn [[bar/create-movie "foo" "The Goonies" "action/adventure" 1985]])

#object[datomic.promise$settable_future$reify__7837 0x1d654b5f {:status :failed, :val #error {
 :cause "Cannot write bar$create_movie@51468039 as tag null"
 :via
 [{:type java.util.concurrent.ExecutionException
   :message "java.lang.IllegalArgumentException: Cannot write bar$create_movie@51468039 as tag null"
   :at [datomic.promise$throw_executionexception_if_throwable invokeStatic "promise.clj" 10]}
  {:type java.lang.IllegalArgumentException
   :message "Cannot write bar$create_movie@51468039 as tag null"...

favila17:10:03

(d/transact conn [['bar/create-movie "foo" "The Goonies" "action/adventure" 1985]]) ?

favila17:10:30

(assuming bar is the full namespace not an alias)

JohnJ17:10:31

yes bar is the full ns and I'm running transact from bar too; quoting gives :cause "Could not locate bar__init.class, bar.clj or bar.cljc on classpath."

favila17:10:13

You’re transmitting a symbol to the transactor, not a function object (regardless of what the docs look like)
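
The difference between the two calls can be seen without Datomic at all. Here is a small sketch, using `clojure.string/upper-case` as a stand-in for `bar/create-movie`. Unquoted, the name evaluates to a function object on the peer. Quoted, it stays a symbol, which is plain data that can be serialized and sent.

```clojure
(require 'clojure.string)

;; Unquoted: evaluated on the peer, so the vector holds a function object.
(def tx-with-fn [[clojure.string/upper-case "foo"]])

;; Quoted: the vector holds a symbol, which is ordinary serializable data.
(def tx-with-symbol [['clojure.string/upper-case "foo"]])

(fn? (ffirst tx-with-fn))          ; true
(symbol? (ffirst tx-with-symbol))  ; true
```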

favila17:10:39

so is this on-prem? does the transactor have bar.clj in its classpath?

localshred17:10:08

Missing {:tx-data [[...]]}

JohnJ17:10:37

oh it needs to be available on the transactor classpath? the docs make it look like you can set them up with the peer only if you want, like you can set them in the transactor or peer

JohnJ17:10:03

@U5RFD1733 the peer lib differs from the client lib in syntax

👍 1
favila18:10:54

> To add a classpath function for use by transactors, set the DATOMIC_EXT_CLASSPATH environment variable before launching the transactor, e.g. if you added your code in mylibs/mylib.jar:

JohnJ18:10:58

yeah but the whole thing makes it sound like you can set them in the peer only too, maybe I'm misreading:

> A classpath function is an ordinary Clojure function added to the classpath of a Datomic peer or transactor. To add a classpath function for use by peers, use your ordinary classpath-building tools, e.g. tools.deps, leiningen, or maven.

favila18:10:49

In the peer it’s just normal code. You need it in the peer to do e.g. d/with on a tx that uses that fn

favila18:10:38

the contrast here is between funs installed into the db, which are addressed by keywords/idents, and fns addressed by symbol, which you just make sure are in the environment of whatever will execute them.

JohnJ18:10:00

thx, do you know if the env var only receives jars and does it have to be compiled?

favila18:10:37

It’s just adding it into the classpath, so it can be anything java/clojure can accept

favila18:10:00

e.g. you don’t have to AOT anything

favila18:10:24

this is literally just (require 'the-txfn-symbol-I-saw) at the end of the day
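
As an illustration of that last point (not Datomic's actual implementation), resolving a symbol to a function comes down to loading its namespace from the classpath and looking up the var. The sketch below uses `requiring-resolve` (Clojure 1.10+), with `clojure.string/upper-case` standing in for a transaction function. `call-by-symbol` is a hypothetical helper.

```clojure
(defn call-by-symbol
  "Loads the namespace of the qualified symbol sym if needed,
  then applies the var it names to args."
  [sym & args]
  (if-let [f (requiring-resolve sym)]
    (apply f args)
    (throw (ex-info "Var not found" {:sym sym}))))

(call-by-symbol 'clojure.string/upper-case "foo") ; "FOO"
```

If the namespace is not on the classpath of the process doing the resolving, the require step fails with the same kind of "Could not locate ... on classpath" error shown earlier in this thread.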

JohnJ18:10:28

ok, was wondering if it has to be a jar or just a dir can do since it's not clear how DATOMIC_EXT_CLASSPATH extends the CP

favila18:10:43

it’s all in bash--it’s just an append

favila18:10:50

or a prepend, don’t remember which

JohnJ18:10:15

ok, just normal java classpath stuff thx

JohnJ18:10:44

is 'Database functions' the old method in case you needed to use java?

favila19:10:38

‘database functions’ is older, but the difference is just where the code is stored. Is it stored-procedure-like, or code-like?

JohnJ19:10:11

I see, so classpath functions are not really stored procedures

favila19:10:39

right they are just accessible in the environment, which you can change. They’re not versioned with schema.

👌 1
favila19:10:36

no transaction or other data change makes them available

Dustin Getz18:10:00

What is the fastest way to do a substring check in a :where clause? essentially clojure.string/includes? or re-matches

favila18:10:02

[(.contains ^String ?s "substr")]

Dustin Getz18:10:24

and this is a fullscan of all datoms under consideration?

favila18:10:33

?s must be bound already

favila18:10:50

predicates like this can only reduce the result set
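
To make the ordering concrete, here is a sketch of such a query written as plain data. `:order/email` is an illustrative attribute and `substring-query` is a hypothetical name. The data clause binds `?email`, and the predicate then filters those bindings.

```clojure
(def substring-query
  '[:find [?e ...]
    :in $ ?needle
    :where
    [?e :order/email ?email]               ; binds ?email for every matching datom
    [(.contains ^String ?email ?needle)]]) ; keeps only rows where the call is truthy

(comment
  ;; Requires the Datomic peer library and a database value `db`.
  (require '[datomic.api :as d])
  (d/q substring-query db "example"))
```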

Dustin Getz18:10:02

How bad is this naive query then, are you saying it's not so bad?

(d/q '[:find [?e ...]
       :in $ ?needle
       :where
       [?e :order/email ?email]
       [(user.util/includes-str? ?email ?needle)]]
     db (or ?email ""))

(defn includes-str? [v needle]
  (clojure.string/includes? (clojure.string/lower-case (str v))
                            (clojure.string/lower-case (str needle))))

favila18:10:02

It’s scanning every order-email assertion, yes, but that’s from [?e :order/email ?email] not the predicate

Dustin Getz18:10:22

In this case, would it be generally encouraged to slug the string to lowercase at transaction time to avoid the computation in the loop, or is stuff like this generally considered idiomatic?

favila18:10:24

btw if you want a case-insensitive match I recommend using something which doesn’t force new string allocations

Dustin Getz18:10:32

i see, so the idea is to reduce memory pressure more so than to optimize the speed

favila18:10:46

e.g. org.apache.commons.lang3.StringUtils.containsIgnoreCase() which uses String.regionMatches under the hood

🙏 1
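
Here is a minimal sketch of the same idea in plain Clojure, for anyone who would rather not add commons-lang3. It slides `String.regionMatches` (with `ignoreCase` set) over the haystack, so no lowercased copies are allocated. `contains-ignore-case?` is a hypothetical name, and this is not the library's implementation. It does not handle nil arguments.

```clojure
(defn contains-ignore-case?
  "True if needle occurs in s, ignoring case, without allocating new strings."
  [^String s ^String needle]
  (let [n          (.length needle)
        last-start (- (.length s) n)]   ; last index where needle could still fit
    (loop [i 0]
      (cond
        (> i last-start) false          ; ran out of room (also covers needle longer than s)
        (.regionMatches s true (int i) needle (int 0) (int n)) true
        :else (recur (inc i))))))

(contains-ignore-case? "Alice@Example.COM" "example") ; true
(contains-ignore-case? "abc" "abcd")                  ; false
```

It can be called from a `:where` clause by its fully qualified name, in the same way as `user.util/includes-str?` above.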
favila18:10:03

I mean, it is faster too

favila18:10:19

Also if this is really all you are doing, and the number of emails is very large, it may be better to use d/datoms + filter directly because that will use much less memory (maybe passing all or part of the result as input to another query). Queries need to realize their result sets and can’t be computed lazily or incrementally.
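
Here is a rough sketch of that approach, assuming the peer API, where datoms support keyword lookup (`:e`, `:v`). `matching-entities` and `includes-ci?` are hypothetical names. The transducer works over any sequence of datom-like values, which is why plain maps are used in the example.

```clojure
(require '[clojure.string :as str])

(defn includes-ci?
  "Case-insensitive substring check (allocates lowercased copies)."
  [v needle]
  (str/includes? (str/lower-case (str v)) (str/lower-case (str needle))))

(defn matching-entities
  "Returns the :e of every datom whose :v contains needle, in one pass."
  [needle datoms]
  (into []
        (comp (filter #(includes-ci? (:v %) needle))
              (map :e))
        datoms))

;; Plain maps standing in for datoms:
(matching-entities "EXAMPLE"
                   [{:e 1 :v "alice@example.com"}
                    {:e 2 :v "bob@elsewhere.org"}])
;; => [1]

(comment
  ;; With the peer library: scan the :aevt index for one attribute.
  (require '[datomic.api :as d])
  (matching-entities "example" (d/datoms db :aevt :order/email)))
```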

👍 4
thumbnail18:10:30

I want to find the first t-value where one of a set of attributes is asserted (any of them is fine). Right now I query for any of the attributes, convert the tx->t and find the lowest number. But this scales pretty badly as the database grows. AFAIK there's no index I can use, so I'm considering other options. (Datomic client btw)

Dustin Getz18:10:14

tx is ordered as well isn't it?

Dustin Getz18:10:12

how about something like

(d/datoms (d/history db) {:index :aevt :components ...})

Dustin Getz18:10:54

first datom for attr in :aevt index might be what you want assuming the e is all in the same partition and thus the ids increase (not sure how this works in cloud)

Dustin Getz18:10:58

the history db includes retractions, perhaps you want a regular db

thumbnail19:10:44

for context, right now i have this query:

(d/q {:query '{:find  [(min ?t)]
               :in    [$ [?attr ...]]
               :where [[?e ?attr _ ?tx]
                       [(datomic.api/tx->t ?tx) ?t]]}
      :args  [db relevant-keys]})

thumbnail19:10:57

I’ll give d/datoms a try 🙂

Dustin Getz19:10:53

on second thought, the history db may not be indexed, so hopefully the vanilla :aevt index has what you want and can answer this query efficiently – please report back what you find

thumbnail19:10:40

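(time (->> relevant-keys
           ;; first datom per attribute in :aevt; keep skips attributes with no datoms
           (keep #(first (d/datoms db {:index :aevt :components [%]})))
           ;; datoms destructure as [e a v tx added], whatever the index
           (reduce (fn [tt [_e _a _v tx]]
                     (min tt (datomic.api/tx->t tx)))
                   Long/MAX_VALUE)))

This is working about 200x faster (30s to 150ms)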

Dustin Getz19:10:56

does it return correct answers? Lol

thumbnail19:10:04

Haha yeah it returned the right answer 😅

Dustin Getz19:10:46

i don't think you need tx->t either

Dustin Getz19:10:18

(or rather you can call it once at the end if you need the basis in that form)

thumbnail19:10:44

it’s only called once for every attribute (and it’s just a pure function)

thumbnail19:10:01

but i’ll clean this up at some point for sure 🙂.

onetom01:10:49

@UHJH8MG6S u said u r using datomic client, right? i can't find tx->t in that:

Unable to resolve var: datomic.api/tx->t in this context
Unable to resolve var: datomic.client.api/tx->t in this context

onetom01:10:57

also, the tx values contain some partition number, which might not be monotonic as time goes on, so i think we can't do min on the :tx dimension of datoms. (i haven't checked this claim personally, just heard it from my colleague)
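
On whether min over tx agrees with min over t, here is a sketch under an assumption that is not an official API: an on-prem entity id is laid out as partition bits above a 42-bit counter, and tx->t simply masks off the partition. `tx->t*` is a hypothetical stand-in for `datomic.api/tx->t`, and the ids are made up. Under that assumption, ids within one partition order the same way as their t values, so taking the min of tx and converting once gives the same answer.

```clojure
;; ASSUMPTION: the low 42 bits of a tx id hold t; the higher bits hold the partition.
(def t-mask (dec (bit-shift-left 1 42)))

(defn tx->t*
  "Hypothetical stand-in for datomic.api/tx->t under the assumed id layout."
  [tx]
  (bit-and tx t-mask))

;; Made-up tx ids that share the same high (partition) bits.
(def txs [13194139534412 13194139534312 13194139535000])

(= (apply min (map tx->t* txs))
   (tx->t* (apply min txs)))  ; true
```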

thumbnail06:10:20

🤔 think I have datomic-pro on the classpath, but I'm using client. I need t because I have to feed it into tx-range. This code is used in a synchronization mechanism to Elasticsearch (Datomic's fulltext isn't GDPR compliant). I could run tx->t inside a query to keep it “pure” Datomic client.