#datomic
2017-03-06
mx200005:03:14

Hi, why do I get "All clauses in 'or' must use same set of vars" in a Datomic 'or' query?

mx200005:03:26

I want to make a different query based on a boolean input.
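
A minimal sketch of what triggers that error and one common fix (attribute names here are hypothetical): every branch of 'or' must bind the same set of variables, while or-join lets you declare explicitly which variables unify across branches.

;; Fails with "All clauses in 'or' must use same set of vars":
;; the first branch binds #{?e ?email}, the second only #{?e}.
(d/q '[:find ?e
       :where (or [?e :user/email ?email]
                  [?e :user/active? true])]
     db)

;; Works: or-join declares that only ?e unifies across branches.
(d/q '[:find ?e
       :where (or-join [?e]
                [?e :user/email "someone@example.com"]
                [?e :user/active? true])]
     db)

For switching between two shapes of query on a boolean input, it can also be simpler to choose between two prebuilt queries in plain Clojure before calling d/q at all.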

Kira Sotnikov07:03:50

I'm reading the docs and I see three SQL engines: PostgreSQL, MySQL, and Oracle. SQLite seems to be missing, so could I configure SQLite as the backend storage engine for a transactor?

favila14:03:10

@lowl4tency I have done this just for fun, but why not use the dev transactor?

wilkerlucio17:03:02

hello, is it possible to increase the transaction timeout for a specific transaction in Datomic?

wilkerlucio17:03:46

we are having an issue where a particular transaction (which has to be atomic; we can't break it up) needs more time to process. We are wondering if it's possible to raise the timeout just for this specific tx, without changing the general configuration. Is that possible?

marshall17:03:04

@wilkerlucio Unfortunately the timeout is system-wide
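
One workaround sketch, assuming it's the peer-side wait that's biting: d/transact-async returns a future immediately, so the caller can deref it with its own, longer wait. This doesn't change any transactor-side limit; it only controls how long this particular caller blocks. conn and tx-data are placeholders.

;; Block up to 5 minutes for this one tx instead of the default wait.
(let [fut (d/transact-async conn tx-data)]
  (deref fut 300000 ::timed-out))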

Matt Butler21:03:09

Getting the following error when frequently querying a Datomic db, mapping over d/datoms (lazily, I believe)

2017-03-06 20:43:49,580[ISO8601] [clojure-agent-send-off-pool-56] WARN  datomic.slf4j - {:message "Caught exception", :pid 1297, :tid 202}
	org.hornetq.api.core.HornetQNotConnectedException: HQ119010: Connection is destroyed
	at org.hornetq.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:296)
	at org.hornetq.core.client.impl.ClientSessionImpl.deleteQueue(ClientSessionImpl.java:365)
	at org.hornetq.core.client.impl.ClientSessionImpl.deleteQueue(ClientSessionImpl.java:375)
	at org.hornetq.core.client.impl.DelegatingSession.deleteQueue(DelegatingSession.java:326)
	at datomic.hornet$delete_queue.invokeStatic(hornet.clj:256)
	at datomic.hornet$delete_queue.invoke(hornet.clj:252)
	at datomic.connector$create_hornet_notifier$fn__8108$fn__8111.invoke(connector.clj:210)
	at datomic.connector$create_hornet_notifier$fn__8108.invoke(connector.clj:206)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.core$apply.invokeStatic(core.clj:657)
	at clojure.core$apply.invoke(core.clj:652)
	at datomic.error$runonce$fn__48.doInvoke(error.clj:148)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at datomic.connector$create_hornet_notifier$fn__8085$fn__8086$fn__8089$fn__8090.invoke(connector.clj:204)
	at datomic.connector$create_hornet_notifier$fn__8085$fn__8086$fn__8089.invoke(connector.clj:192)
	at datomic.connector$create_hornet_notifier$fn__8085$fn__8086.invoke(connector.clj:190)
	at clojure.core$binding_conveyor_fn$fn__6772.invoke(core.clj:2020)
	at clojure.lang.AFn.call(AFn.java:18)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Code looks something like this, causing roughly 20 Datomic queries a second against a single db snapshot:
(let [db (d/db (d/connect uri))]
  (doseq [element (->> (d/datoms db)  ; pseudocode; the real call passes an index, e.g. (d/datoms db :eavt)
                       (map #(.e %))
                       (filter #(some-query db %))
                       (map #(some-other-query db %))
                       (map #(another-query db %)))]
    (try
      (transact-and-http-io element)
      (catch Exception e
        (log-error e)))))
Do I need to look at refactoring my code to reduce the frequency of queries? It happens at different points while processing the datoms. Thanks 🙂

favila21:03:06

queries should not impact this at all

favila21:03:34

are you creating and destroying connections frequently, maybe inadvertently?

favila21:03:11

or network problems?

Matt Butler21:03:01

This single function call uses the same db snapshot created from a single connection

timgilbert21:03:58

Say, is there a (prev-t) function to get the t value of the transaction immediately before (basis-t), or should I just grab it out of (d/tx-range)?
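
As far as I know there is no built-in prev-t; a sketch of the tx-range approach (note this realizes the whole log range up to t, so it can be expensive on a large log):

;; tx-range's end is exclusive, so this yields every tx strictly before
;; the db's basis-t; the last one is the immediately preceding tx.
(let [db (d/db conn)
      t  (d/basis-t db)]
  (:t (last (seq (d/tx-range (d/log conn) nil t)))))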

Matt Butler21:03:05

Added the db binding for clarity, in case it's doing something I don't understand fully

favila21:03:36

@mbutler the d/datoms call is pseudocode, right? You are including the index and components, etc.? (Otherwise I would expect a different error)

Matt Butler21:03:12

sorry, yes. The code works perfectly in local dev

favila21:03:02

and so this works only sometimes in prod? I am still suspecting a network problem

favila21:03:25

is it any different if you d/connect + d/db only once at startup?

Matt Butler21:03:38

I would think maybe I'm accidentally realising the sequence (which is ~700k datoms) and running out of memory, but the JVM doesn't die

Matt Butler21:03:03

I don't understand; if I d/db once at startup, my entire app would use the same snapshot, no?

favila21:03:26

you say you are running this code many times a second?

Matt Butler21:03:11

no, this code runs once, and maps over a seq of datoms once, calling Datomic queries on the sequence as I go.

favila21:03:19

(-> (d/connect uri) (d/db) (d/datoms :eavt 1000) (->> (take 5))) does that run?

favila21:03:32

(I am seeing if you ever connect at all)

Matt Butler21:03:23

I should update the code snippet, as I think it's not quite clear

favila21:03:47

I understand. But the stacktrace is related to connection failures with hornetq, which is transactor communication

favila21:03:55

queries do not communicate with the transactor

favila21:03:03

So I suspect you are not even connecting

favila21:03:17

or you connect, but it fails later

Matt Butler21:03:29

I think that error might actually be unrelated

Matt Butler21:03:52

So in short, I see the seq being consumed and performing some-io, but after a random amount of time the code just stops running

Matt Butler21:03:46

That error appeared at the same time in my logs and was Datomic related; if it's transactor related, then it's likely something else in my app being unhappy that it can't transact.

Matt Butler21:03:07

However, the above code stopped running again, but there was no hornet error this time (in the logs).

Matt Butler21:03:26

What would happen if there was a network issue when querying a datomic db?

favila21:03:01

I am not sure if queries continue when connection dies

favila21:03:27

the transactor connection uses hornetq, keepalives, reconnects, etc

favila21:03:45

it is used to send transactions and receive the transaction queue

favila21:03:01

the queries use the storage connection

favila21:03:15

an error there would look storage-engine specific

favila21:03:34

e.g. if SQL is the engine, you would see jdbc in the stacktrace

Matt Butler21:03:24

But one would expect an error? This is run in a Java thread surrounded with a try/catch, and my system has an uncaughtExceptionHandler

Lambda/Sierra22:03:29

If there is network disruption between the Peer and Transactor, you will see an exception only when connecting or transacting. If there is network disruption between the Peer and Storage, you will see exceptions when reading (querying).

favila22:03:31

the exception you pasted is probably in one of Datomic's threads, not yours

Matt Butler22:03:17

Yes, I am now assuming that the stack trace was an anomaly

Lambda/Sierra22:03:38

In addition, the Peer's background threads will log connection errors with the Transactor, which may show up as HornetQ / Artemis exceptions.

Matt Butler22:03:55

Maybe suggesting that there is a connection issue, but not in the code above, which is the code that fails.

favila22:03:20

how does it fail? not by exception?

Matt Butler22:03:22

So I am left with no error, but code that silently dies

Matt Butler22:03:32

The CPU usage on the box drops to idle levels, and the (some-io) that happens in the doseq stops being logged

wei22:03:46

has anyone written a way to serialize EntityMaps by db/id and entity-t? or is that a bad idea?
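
One sketch of the id-plus-t idea (conn is a placeholder): persist just the entity id and the db's basis-t, then rehydrate later against an as-of view. Whether it's a good idea mostly depends on whether the as-of view you need will still be reachable (e.g. no history excision) when you thaw.

;; Freeze: capture enough to reconstruct the same entity view later.
(defn freeze-entity [db eid]
  {:eid eid :t (d/basis-t db)})

;; Thaw: rebuild an entity map against the db as of that t.
(defn thaw-entity [conn {:keys [eid t]}]
  (d/entity (d/as-of (d/db conn) t) eid))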

favila22:03:49

no return value from the expression?

favila22:03:18

how do you know it is not finished rather than dead?

Matt Butler22:03:19

so due to the doseq, nil is returned. Not sure where I'd expect this to end up anyway.

Matt Butler22:03:50

My reasoning for the code not being finished is that if I call that function again, it starts up again and carries on a little further

Matt Butler22:03:26

based on the I/O at the end, elements would be filtered out at the filter stage

favila22:03:08

does it work if you replace some-io with something trivial? like printing progress?

favila22:03:30

or can you log that the doseq actually finished?
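
A sketch of the instrumentation being suggested (elements stands for the filtered/mapped seq from the snippet above): count items as they pass through and log both periodic progress and explicit completion, so a hang is distinguishable from a quiet finish.

(let [n (atom 0)]
  (doseq [element elements]
    ;; log every 1000th element so progress is visible without flooding
    (when (zero? (mod (swap! n inc) 1000))
      (println "processed" @n "elements"))
    (transact-and-http-io element))
  (println "doseq finished; total:" @n))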

Matt Butler22:03:21

Yes, those might be good steps. One of those unfortunate prod-only bugs that happen when dealing with a large (d/datoms)

Matt Butler22:03:02

I forgot something that might matter: the (some-io) does transact, but again, it only produced the hornet error the first time it ran/failed. Subsequent times, no error was logged.

favila22:03:44

does some-io apply backpressure?

favila22:03:11

maybe you overwhelmed a downstream system

favila22:03:37

(consistent with larger input)

Matt Butler22:03:55

Is it possible something is catching the error?

favila22:03:49

I don't know what's in some-io. The behavior of things when flooded is not always sensible

favila22:03:11

could be just hanging, retrying forever, whatever

favila22:03:18

no exception in that case

Matt Butler22:03:25

it constructs a map, does another query, and performs an http request

Matt Butler22:03:14

I think that maybe removing the I/O part would be a good idea. Maybe my (sync) http request is hanging forever

favila22:03:54

Yeah, my first instinct is to simplify this and get solid confirmation that the process is hanging

Matt Butler22:03:02

I'm using clj-http, which you'd expect to time out.
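
Worth noting: as far as I know, clj-http sets no timeouts by default, so a synchronous request really can hang indefinitely. A sketch with explicit limits (URL hypothetical):

(require '[clj-http.client :as http])

;; :conn-timeout bounds connection setup; :socket-timeout bounds waiting
;; for response data. Both are in milliseconds; without them the call
;; can block forever on a wedged server.
(http/get "http://example.com/endpoint"
          {:conn-timeout   10000
           :socket-timeout 10000})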

favila22:03:13

if you suspect datomic, take everything non-datomic out

Matt Butler22:03:44

probably a wise bit of advice

favila22:03:48

and get some observable indicator of progress or doneness

Matt Butler22:03:22

The some-io + doseq log, but none of the map/filter stages do

Matt Butler22:03:53

which is probably a mistake

favila22:03:54

do you know the last datom that should appear after all filtering?

favila22:03:02

or approx how many datoms to expect?

Matt Butler22:03:09

actually I can take a specific number of datoms off the front for now

Matt Butler22:03:21

so I can say exactly which datom + how many

favila22:03:08

could storage be throttling? (just a crazy idea)

favila22:03:38

that would explain the low CPU: queries waiting for their I/O requests to come back

favila22:03:52

I would expect timeout eventually though

Matt Butler22:03:00

this is between an EC2 node and DynamoDB

Matt Butler22:03:24

and the storage metrics seem super low vs my provisioned capacity

favila22:03:03

well good luck figuring this one out

Matt Butler22:03:52

splitting it into generating the seq and consuming it was a good idea, so thanks 🙂

marshall22:03:10

@mbutler check DDB metrics and alarms. DDB throttling on reads could be responsible

Matt Butler23:03:32

yeah, it seems to be super low, single-digit % of provisioned reads. Getting late here; going to do some further testing tomorrow and report back 🙂