#datomic
2016-05-26
arthur.boyer02:05:49

I’m getting this intermittent error:

Critical failure, cannot continue: Could not write log
java.lang.Error: Conflict updating log tail
	at datomic.log.LogImpl$fn__7023.invoke(log.clj:484)
	at datomic.log.LogImpl.append(log.clj:476)
	at datomic.log$fn__6684$G__6596__6688.invoke(log.clj:59)
	at datomic.log$fn__6684$G__6595__6693.invoke(log.clj:59)
	at clojure.lang.Atom.swap(Atom.java:65)
	at clojure.core$swap_BANG_.invoke(core.clj:2234)
	at datomic.update$writer$log_block__18834.invoke(update.clj:335)
	at datomic.update$writer$proc__18842.invoke(update.clj:355)
	at datomic.update$writer$fn__18845$fn__18847.invoke(update.clj:364)
	at clojure.lang.AFn.run(AFn.java:22)
	at java.lang.Thread.run(Thread.java:745)
I’m not sure how to go about debugging this. I think it’s caused by some data and schema migration code, but at the application end I just get:
May 26, 2016 2:14:37 PM org.hornetq.core.protocol.core.impl.RemotingConnectionImpl fail
WARN: HQ212037: Connection failure has been detected: HQ119015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
ExceptionInfo :db.error/transactor-unavailable Transactor not available  datomic.peer/transactor-unavailable (peer.clj:186)
Has anyone got any ideas on how to track the cause of this down?

taylor.sando12:05:13

Would there be a way to ask for a recursive attribute, but only follow it up to a certain number of results? For example, with person/friend you'd be asking the system to get this person and 100 people related to him through person/friend. So if the person has 50 friends, and all his friends have 50 friends, you'd get the 50 friends, then it would grab the next friend and his 50 friends, and stop there. I feel like you'd have to do that manually through entity, rather than a query. You'd get the person, and then you'd have a local transient/atom which would hold friends while you walked the entity. Seems like you'd have to do it with reduce, so you could call reduced and stop the function early once you had found the 100 people.

taylor.sando12:05:11

I guess it would be loop/recur, not reduce
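
For reference, the loop/recur idea above could look roughly like the sketch below. It is only an illustration, not something posted in the channel: `bounded-walk` and its `neighbors` argument are hypothetical names, and the walk is a plain breadth-first traversal over any function that returns a node's adjacent nodes, so it runs without a database.

```clojure
(defn bounded-walk
  "Breadth-first walk from `start`, following `neighbors` (a function from a
  node to a collection of adjacent nodes). Stops as soon as `limit` nodes
  other than `start` have been collected and returns them in discovery order."
  [neighbors start limit]
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY start)
         seen  #{start}
         found []]
    (if (or (empty? queue) (>= (count found) limit))
      found
      (let [node  (peek queue)
            ;; Unvisited neighbours only, trimmed to the remaining budget.
            fresh (->> (neighbors node)
                       (remove seen)
                       distinct
                       (take (- limit (count found))))]
        (recur (into (pop queue) fresh)
               (into seen fresh)
               (into found fresh))))))

;; A plain map works as the `neighbors` function for a quick check.
(def graph {:a [:b :c] :b [:a :d :e] :c [:f] :d [] :e [] :f []})

(bounded-walk graph :a 3) ; => [:b :c :d]
```

Against Datomic, `neighbors` might be something like `#(map :db/id (:person/friend (d/entity db %)))`, assuming `d` aliases `datomic.api`, `db` is a database value, and `:person/friend` is a cardinality-many ref attribute. Walking entity ids rather than entities keeps the `seen` set simple.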

taylor.sando12:05:58

I'll have a look at it.

bkamphaus13:05:57

@arthur.boyer: is the storage responsive, or do you see a trend toward spikes in storage latency? (StoragePutMsec metric) — the log write is a write to storage. Basically if the transactor can’t update the log tail it’s usually a symptom of not being able to write to storage. Since this is how transactions persist/are made durable, the transactor will fail (Datomic is a CP system in CAP terms).

bkamphaus13:05:49

@arthur.boyer: didn’t read that carefully enough. “Conflict updating log tail” usually comes up if something changes storage out from under the transactor: either a manual write or truncation to Datomic’s table/keyspace, or restoring a database in place (while the transactor is up).

bkamphaus13:05:11

It can also be consistency guarantees not being met by storage, e.g. during a Riak or Cassandra node rebalancing.

arthur.boyer21:05:04

@bkamphaus: Restoring a database in place. It’s happening in my dev environment when I restore a database backup. Restarting the transactor prevents the problem. Thanks.