#datomic
2020-03-12
onetom05:03:19

Is there a Clojure client library which provides the same interface as the Datomic Client API, but for accessing Datomic REST APIs? My use-case is that I would like to collaborate with someone who would be using a Datomic REST API from Python, while I'm using the same database from Clojure code.

onetom05:03:24

I guess my main motivation is the ability to share an in-memory Datomic DB, which is possible via the REST server, but I have no convenient Clojure interface to it. OR I can run a peer-server with an in-memory database, which I can connect to conveniently from Clojure via the Datomic Client API, BUT there are no Datomic Client API libraries for other languages, like Python. A 3rd option would be to just run a transactor with protocol=mem in its config.properties file, but that throws this error:

java.lang.IllegalArgumentException: :db.error/invalid-storage-protocol Unsupported storage protocol [protocol=mem] in transactor properties /dev/fd/63
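For the second option above, the on-prem peer-server invocation would look roughly like this (the access key, secret, port, and the database name `test` are placeholders, not values from the chat):

```shell
# Start a peer server backed by an in-memory database, run from the
# Datomic distribution directory. A Client API consumer on the same
# network can then connect using the access-key/secret pair below.
bin/run -m datomic.peer-server \
        -h localhost -p 8998 \
        -a myaccesskey,mysecret \
        -d test,datomic:mem://test
```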

onetom05:03:00

The reason for wanting to share an in-memory Datomic DB is to have a really tight feedback loop within our office, where we have 1 machine with 80GB RAM, while the other machines have only 16GB

fmnoise07:03:02

hi everyone, lately I've been periodically seeing the following error in the logs

org.apache.activemq.artemis.api.core.ActiveMQNotConnectedException: AMQ119010: Connection is destroyed
	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:335)
	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:315)
	at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQClientProtocolManager.createSessionContext(ActiveMQClientProtocolManager.java:288)
	at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQClientProtocolManager.createSessionContext(ActiveMQClientProtocolManager.java:237)
	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.createSessionChannel(ClientSessionFactoryImpl.java:1284)
	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.createSessionInternal(ClientSessionFactoryImpl.java:670)
	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.createSession(ClientSessionFactoryImpl.java:295)
	at datomic.artemis_client.SessionFactoryBundle.start_session_STAR_(artemis_client.clj:81)
	at datomic.artemis_client$start_session.invokeStatic(artemis_client.clj:52)
	at datomic.artemis_client$start_session.doInvoke(artemis_client.clj:49)
	at clojure.lang.RestFn.invoke(RestFn.java:464)
	at datomic.connector.TransactorHornetConnector$fn__10655.invoke(connector.clj:228)
	at datomic.connector.TransactorHornetConnector.admin_request_STAR_(connector.clj:226)
	at datomic.peer.Connection$fn__10914.invoke(peer.clj:239)
	at datomic.peer.Connection.create_connection_state(peer.clj:225)
	at datomic.peer$create_connection$reconnect_fn__10989.invoke(peer.clj:489)
	at clojure.core$partial$fn__5839.invoke(core.clj:2623)
	at datomic.common$retry_fn$fn__491.invoke(common.clj:533)
	at datomic.common$retry_fn.invokeStatic(common.clj:533)
	at datomic.common$retry_fn.doInvoke(common.clj:516)
	at clojure.lang.RestFn.invoke(RestFn.java:713)
	at datomic.peer$create_connection$fn__10991.invoke(peer.clj:493)
	at datomic.reconnector2.Reconnector$fn__10256.invoke(reconnector2.clj:57)
	at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2030)
	at clojure.lang.AFn.call(AFn.java:18)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

fmnoise07:03:49

Datomic on-prem 0.9.6045, running in a Docker container in k8s

onetom08:03:41

I was also on the verge of trying a Docker-containerized on-prem Datomic. If I understand correctly, that's not an officially recommended way to run a Datomic system. Can you share some setup instructions for it, please?

fmnoise07:03:55

sometimes it results in Datomic restarting with {:message "Terminating process - Heartbeat failed", :pid 13, :tid 223}

fmnoise09:03:36

oh, just found out that the peer version is lower than the transactor's, 5951 vs 6045; probably that's the issue 😞

fmnoise09:03:15

will be back if the issue happens again after the update

grounded_sage12:03:11

How do you do the equivalent of a SQL left-join?

favila13:03:29

one possibility is get-else with a sentinel value (not nil)

favila13:03:44

if the join is just for selecting, consider using pull instead

favila13:03:04

in which case the “nil” will be just a missing map entry
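The get-else suggestion above might be sketched like this (the `:customer/bank-name` attribute and the sentinel string are hypothetical, just to show the shape):

```clojure
;; get-else binds ?bank-name to the attribute value when present,
;; or to the sentinel "No Value" otherwise, so customers without a
;; bank are not dropped from the result set.
(def left-join-ish-query
  '[:find ?id ?bank-name
    :where
    [?c :customer/id ?id]
    [(get-else $ ?c :customer/bank-name "No Value") ?bank-name]])
```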

grounded_sage13:03:13

(d/q '[:find (pull ?e [*]) (pull ?e1 [*]) (pull ?e2 [:bank/name])
       :where
       [?e  :customer/id ?id]
       [?e1 :address/id ?id]
       [(get-else $ ?e2 :bank/id "No Value") ?id]
       [(get-else $ ?e2 :bank/name "No Value") ?name]]
     @conn)

grounded_sage13:03:32

That is what I am trying to do @U09R86PA4

favila13:03:11

You mean you want the map result of pull to say “No Value”?

grounded_sage13:03:00

That’s just the default value?

favila13:03:20

what you put here doesn’t make sense

favila13:03:47

I don’t know what your desired output is

grounded_sage13:03:21

Basically :customer/id and :address/id returns 300,000 results. When I join on bank it returns 10,000

favila13:03:21

what is ?e2 unifying against?

grounded_sage13:03:58

I effectively want to have the most possible results with the others kind of enriched with extra data when it is found

favila13:03:35

can [?e :customer/id] be missing?

favila13:03:17

Can [?e :customer/id "No Value"] happen?

favila13:03:40

or [?e1 :address/id "No Value"]?

grounded_sage13:03:19

Like I want it to start with :customer and effectively merge each result onto it with default values if they are not present.

grounded_sage13:03:43

Like if I joined a table on a column

favila13:03:55

the only thing unifying these clauses is ?id so I don’t see what the join is

favila13:03:57

is the value of :bank/id and :customer/id supposed to unify?

grounded_sage13:03:10

It only returns me results that have a :bank/id so I lose all the previous finds.

favila13:03:29

is there any ref attribute you can use to walk between a customer, address, and bank?

grounded_sage13:03:39

Yes I am unifying those and then wanting to add some extra data if it is there

grounded_sage13:03:08

I don’t quite understand refs yet so no.

favila13:03:18

ok, so you’re joining by some concrete value “id” (I guess a string?) which happens to be common to :customer/id :bank/id, and :address/id?

favila13:03:50

you can do this but it’s not the natural way of modeling relationships in datomic

grounded_sage13:03:58

Yes. But there is very little bank data.

grounded_sage13:03:16

What is the natural way of doing things in datomic? Using the refs etc?

favila13:03:30

using refs

favila13:03:52

(I would normally assume a :customer/id was scoped to customers)

favila13:03:11

so :customer/address -> some entity with address attributes on it

favila13:03:25

:customer/bank -> some attribute with bank entities on it

favila13:03:43

at this point you have a single pull expression, because it can see all the joins

favila13:03:08

you need two steps:

grounded_sage13:03:19

This is kind of a dirty-data CSV import, so I haven’t got much luxury in the naming of things. I was just assessing whether Datalog would help me with pre-processing, but it’s harder than I thought

favila13:03:20

[?c :customer/id ?id][?a :address/id ?id][?b :bank/id ?id] for those with banks

favila13:03:13

[?c :customer/id ?id][?a :address/id ?id](not [_ :bank/id ?id]) for those without

favila13:03:56

get-else isn’t good here because it can’t make use of a value index for :bank/id

grounded_sage13:03:20

So you would split it into 2 queries?

favila13:03:00

You could unify to one query using or and a sentinel for the missing bank case, but you can’t pull off that sentinel
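The single-query version described here could be sketched with `or-join` (attribute names taken from the discussion; the sentinel string is arbitrary):

```clojure
;; Each or-join branch must bind ?bank: either from a real bank
;; entity sharing the id, or from a ground sentinel when none does.
(def unified-query
  '[:find ?id ?bank
    :where
    [?c :customer/id ?id]
    [?a :address/id ?id]
    (or-join [?id ?bank]
      (and [?b :bank/id ?id]
           [?b :bank/name ?bank])
      (and (not [_ :bank/id ?id])
           [(ground "No Bank") ?bank]))])
```

As noted above, though, the sentinel rows carry no entity id, so there is nothing to `pull` off in the missing-bank case.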

grounded_sage13:03:18

Thanks. I was trying to avoid that, but I guess it requires a stronger schema.

favila13:03:59

up to you if it’s worth augmenting your data

favila13:03:26

it’s easy to find the common values via query and build the tx you need to make the refs

favila13:03:01

[?c :customer/id ?id][?a :address/id ?id] -> [:db/add ?c :customer/address ?a] for example
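That query-to-tx step could be sketched as a pure function over the query result (names are the ones from the chat; this assumes `:customer/address` is declared as a ref attribute):

```clojure
;; Turn [?c ?a] pairs, e.g. the result of
;;   (d/q '[:find ?c ?a
;;          :where [?c :customer/id ?id] [?a :address/id ?id]] db)
;; into the :db/add tx-data that creates the refs.
(defn address-ref-tx [customer-address-pairs]
  (mapv (fn [[c a]] [:db/add c :customer/address a])
        customer-address-pairs))

;; (address-ref-tx [[17 42]]) ;=> [[:db/add 17 :customer/address 42]]
```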

grounded_sage13:03:29

The problem I have is that this is a CSV ETL job: daily drops with millions of rows and tens of columns. So doing these checks doesn’t seem that efficient.

grounded_sage13:03:35

Though I could be wrong….

grounded_sage14:03:12

The CSVs are pretty much always the same… with just a bit of new data.

favila14:03:31

I hate to say it, but sql might be a better fit, depending on what you do

grounded_sage14:03:53

I’m starting to believe it.

favila14:03:11

if you aren’t becoming the source of truth for what you ingest, and that stuff is stable-shaped already and you just want to make some joins, sql may be better

favila14:03:51

datomic shines with graph-shaped data you’re growing with (i.e. history of changes is important) and a primary datastore you add to incrementally with a live application

👍 4
favila14:03:22

it doesn’t do giant bulk imports well, and it can join by value but you miss out on a lot of graph-shaped niceties

favila14:03:02

meanwhile some sql engines can query csv directly, import csv with magical fast bulk importers, and are already used to joining by value

favila14:03:12

e.g. have you considered redshift or athena (for cloud things at huge scale)? I think they both work by sticking table-shaped files (e.g. csv) into s3 and then “just working”

grounded_sage14:03:30

It’s a shame, because I was using Meander to transform the CSVs before entering them into the DB, then using the DB for the joins and pulling it all out. The semantics are pretty much the same because they both use logic programming.

grounded_sage14:03:46

We haven’t got huge scale

favila14:03:23

can you use meander to make the refs ahead of time?

grounded_sage14:03:11

You mean keeping them all to the same ns of the keyword?

favila14:03:39

I mean, is there something from the unit of work you can use to mint a unique upserting attribute

grounded_sage14:03:40

I don’t think so. It’s all good 🙂

favila14:03:57

e.g. {:customer/bank [:bank/unique-id customer-id]}

favila14:03:23

well it wouldn’t be like that exactly

favila14:03:31

[{:db/id "bank-123" :bank/unique-id "value-derived-from-customer"} {:db/id "customer-123" :customer/bank "bank-123" :other/customer "stuff" ,,,}]

favila14:03:59

that said, I think whether you use datomic or not should be driven entirely by what you plan to do after you ingest these CSVs

favila14:03:29

iirc there’s something that can do pulls against postgres

grounded_sage19:03:33

@U09R86PA4 yea I find it interesting though I also think I need to know SQL and tables a bit more before working with such abstractions.

Vishal Gautam16:03:23

Hi, I am trying to run a Datomic transactor. I downloaded Datomic and am following the steps from https://docs.datomic.com/on-prem/dev-setup.html. When I try to run the local transactor, I get this error: java.lang.Exception: 'protocol' property not set. Any ideas? Here is the full error log

Launching with Java options -server -Xms1g -Xmx1g -XX:+UseG1GC -XX:MaxGCPauseMillis=50
Terminating process - Error starting transactor
java.lang.Exception: 'protocol' property not set
	at datomic.transactor$ensure_args.invokeStatic(transactor.clj:116)
	at datomic.transactor$ensure_args.invoke(transactor.clj:105)
	at datomic.transactor$run$fn__22768.invoke(transactor.clj:387)
	at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2030)
	at clojure.lang.AFn.call(AFn.java:18)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:835)

Vishal Gautam16:03:44

I downloaded Datomic, added the license key in dev-transactor-template.properties, and ran this script

bin/transactor datomic_properties/dev-transactor-template.properties

favila16:03:04

is protocol=dev in that file?

Vishal Gautam16:03:30

in the dev-transactor-template.properties file @U09R86PA4?

Vishal Gautam16:03:26

@U09R86PA4 So this is what I have so far

protocol=dev
host=0.0.0.0
port=4334

license-key=<MY_KEY>

memory-index-threshold=32m
memory-index-max=256m
object-cache-max=128m

Vishal Gautam16:03:59

Now I am getting Terminating process - License not valid for this release of Datomic

favila16:03:48

so either you miscopied the key, or the key isn’t valid for the version you’re using

favila16:03:25

you can check this at your http://my.datomic.com site

favila16:03:01

probably also in the email that sent you the license key

Vishal Gautam16:03:21

Okay rookie mistake, my key was expired :face_palm:

favila16:03:02

licenses are perpetual, so you can use an older version (released before it expired)

🎉 4
favila16:03:12

you just won’t get updates

Vishal Gautam16:03:17

@U09R86PA4 thank you so much. Local Transactor is running now

jackson22:03:31

What makes the dev database not production-worthy? Are there significant performance advantages to using postgres or another SQL server? Mostly concerned about on-prem solutions at the moment.

favila01:03:49

Dev databases are embedded h2 databases, served by the same process as the transactor

onetom06:03:03

As a consequence, the data you handle with Datomic lives in files on the disks of the transactor machine. From an operations perspective it's not a great architecture to couple these two concerns: your transactor process doesn't need any persisted state, so you could run it (or them) on ephemeral machines.

mavbozo09:03:01

for one of my company's internal apps, we use datomic dev with the h2 database on a dedicated server. works fine so far.

jackson13:03:48

Good points. I had some unexpected time to play last night and already had postgres installed. For what I'm doing, my time to transact a fair amount of data went from 360s with dev to 40s with postgres. Hardly scientific testing, but indicative still.