#datomic
2016-07-14
zentrope04:07:23

Oh, no. 0.9.5385 on FreeBSD: client can’t talk to host!

zentrope04:07:28

{:type clojure.lang.ExceptionInfo
   :message Error communicating with HOST 127.0.0.1 on PORT 4334
   :data {:alt-host nil, :peer-version 2, :password cvDgzartcCIkdLS6YW2s0M7IJd5zG1p0BEPfo28Dn/c=, :username Knqi3QCMoTYAPzQZgsRpbX9IcqLjhjgbvArvmUeOv64=, :port 4334, :host 127.0.0.1, :version 0.9.5344, :timestamp 1468471153847, :encrypt-channel true}

zentrope04:07:38

version 5344?

zentrope04:07:55

Ah. My pkg that updates datomic is … deficient.

isaac14:07:36

How many datoms in one transaction are recommended?

hans15:07:55

depends on how long you're willing to let other transactions wait. if you're committing very large transactions, you'll also have to tune the transactor and the peers to tolerate long pauses.

hans15:07:13

100s are safe

isaac15:07:09

I have 272 new entities (about 700 datoms). I want to save them in one transaction, but I got a tempid conflict.

isaac15:07:58

how many tempids in one transaction is safe?

marshall15:07:17

@isaac: 700 datoms should not be an issue. Can you post the exception here?

isaac15:07:24

1. Caused by datomic.impl.Exceptions$IllegalArgumentExceptionInfo
   :db.error/datoms-conflict Two datoms in the same transaction conflict {:d1
   [17592186046132 :project.stage/step 3 13194139535021 true], :d2
   [17592186046132 :project.stage/step 1 13194139535021 true]}

   {:d1 [17592186046132 :project.stage/step 3 13194139535021 true], :d2 [17592186046132 :project.stage/step 1 13194139535021 true], :db/error :db.error/datoms-conflict}

                 error.clj:  124  datomic.error/deserialize-exception
                  peer.clj:  400  datomic.peer.Connection/notify_error
             connector.clj:  169  datomic.connector/fn
              MultiFn.java:  233  clojure.lang.MultiFn/invoke
             connector.clj:  194  datomic.connector/create-hornet-notifier/fn/fn/fn/fn
             connector.clj:  189  datomic.connector/create-hornet-notifier/fn/fn/fn
             connector.clj:  187  datomic.connector/create-hornet-notifier/fn/fn
                  core.clj: 1938  clojure.core/binding-conveyor-fn/fn
                  AFn.java:   18  clojure.lang.AFn/call
           FutureTask.java:  266  java.util.concurrent.FutureTask/run
   ThreadPoolExecutor.java: 1142  java.util.concurrent.ThreadPoolExecutor/runWorker
   ThreadPoolExecutor.java:  617  java.util.concurrent.ThreadPoolExecutor$Worker/run
               Thread.java:  745  java.lang.Thread/run

marshall15:07:34

You’re attempting to assert that :project.stage/step is both 1 and 3 for the same entity (17592186046132) in the same transaction. If it is cardinality one, it can only get a single value at a time.
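
To make the conflict concrete, here is a minimal sketch (assuming a peer connection `conn` and a schema where :project.stage/step is :db.cardinality/one) of tx-data that triggers the same :db.error/datoms-conflict:

```clojure
;; Minimal sketch: two different values for the same entity/attribute pair
;; in one transaction cannot both win, so the transactor rejects the whole
;; transaction. `conn` is an assumed, already-established connection.
(require '[datomic.api :as d])

@(d/transact conn
             [[:db/add 17592186046132 :project.stage/step 3]
              [:db/add 17592186046132 :project.stage/step 1]])
;; => ExceptionInfo ... :db.error/datoms-conflict
```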

isaac15:07:17

Yeah! The message told us that. But I store them in two different entities; every entity gets its :db/id via (d/tempid :db.part/project)

marshall15:07:19

are you using the (d/tempid :db.part/project <SomeNegativeNumber>) arity anywhere? If so, you may have inadvertently used the same negative indicator for what should have been two different entities
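
For illustration, a sketch of that pitfall, reusing the partition name from this thread with a hypothetical attribute; the fix is simply to use distinct negative indicators (or the one-argument form):

```clojure
(require '[datomic.api :as d])

;; problematic: -1 is reused, so both maps refer to the *same* entity
[{:db/id (d/tempid :db.part/project -1) :project.stage/step 1}
 {:db/id (d/tempid :db.part/project -1) :project.stage/step 3}]

;; fine: distinct indicators (or (d/tempid :db.part/project) with no
;; indicator) yield distinct tempids, hence two separate entities
[{:db/id (d/tempid :db.part/project -1) :project.stage/step 1}
 {:db/id (d/tempid :db.part/project -2) :project.stage/step 3}]
```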

isaac15:07:53

no, I just use (d/tempid :db.part/project)

isaac15:07:06

(d/tempid :db.part/project)

isaac15:07:31

Oh, I found the bug! There are two entities with the same attribute value, and that attribute is :db.unique/identity
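
A sketch of that failure mode, with a hypothetical :project.stage/name standing in for the :db.unique/identity attribute isaac mentions:

```clojure
;; With a :db.unique/identity attribute, two entity maps carrying the same
;; identity value are upserted onto the same entity, even though each has
;; its own fresh tempid - and then the two :project.stage/step values
;; collide exactly as in the exception above.
(require '[datomic.api :as d])

[{:db/id (d/tempid :db.part/project)
  :project.stage/name "design"   ; same unique-identity value...
  :project.stage/step 1}
 {:db/id (d/tempid :db.part/project)
  :project.stage/name "design"   ; ...so both maps resolve to one entity
  :project.stage/step 3}]
```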

isaac15:07:03

The data is too large? 🙂

cezar18:07:28

Playing with Datomic Pro and it appears to me that every transaction is saved in one (or more) rows in the DATOMIC_KVS table. I think this could pose problems for me given that I intend to have 1B+ datoms in a single database and I doubt Postgres or MySQL will cope well with billions of rows in a single table. Especially given that Datomic appears to add at least one index to the DATOMIC_KVS table (the primary key). Should I try to bunch up my datoms into transactions as coarse as possible or should I look for a different underlying storage? (might be a pain given that I'd need to convince enterprise architects that we need a new NoSQL system deployed)

marshall18:07:55

@cezar: The ‘soft limit’ for Datomic is 10 billion datoms per database. There should be no problem using Postgres or MySQL for a 1B-datom DB. Transactions should be sized ‘transactionally’ - that is, put things that need to be atomically combined into the same transaction. It’s best to limit transactions to 100s or low 1000s of datoms if possible, but there is no harm in having small, granular transactions.
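
For imports with no natural transaction boundary, a common pattern (sketched here with a hypothetical `entity-maps` seq and an assumed connection `conn`) is to batch the data and transact the batches sequentially:

```clojure
;; Keep transactions in the "100s or low 1000s of datoms" range by
;; partitioning a large collection of entity maps; derefing each future
;; applies back-pressure between batches.
(require '[datomic.api :as d])

(defn import-in-batches [conn entity-maps batch-size]
  (doseq [batch (partition-all batch-size entity-maps)]
    @(d/transact conn (vec batch))))

;; e.g. (import-in-batches conn entity-maps 1000)
```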

cezar18:07:06

Yeah, I understand this @marshall but I'm trying to figure out if Datomic won't inadvertently slaughter the performance of the underlying RDBMS by simply inserting billions of rows into DATOMIC_KVS. In the past I have not had good experiences with tables that have huge row counts on traditional RDBMS engines like Postgres. Their B+ trees became progressively slower to the point where beyond ~100 million rows inserts became unusably slow.

cezar18:07:12

So my question is: has anyone had good success using an RDBMS for a large Datomic database, or should I go with Cassandra/Couchbase etc. as the backing store?

marshall18:07:10

The backend storage is strictly used as k/v storage and all data are immutably written - in practice, yes, we have multiple customers running very large databases (>1B datoms) in production using RDBMS backing storage

cezar18:07:55

I see. Maybe I'm just not configuring my Postgres properly... on a related note does Datomic actually need the primary key on DATOMIC_KVS?

cezar18:07:17

because I'd be a lot more comfortable if there weren't any indexes on the DATOMIC_KVS table

marshall18:07:49

The primary key is required.

cezar19:07:52

Is the expectation, though, that NoSQL backing storages (e.g. Cassandra, Couchbase) are more likely to provide more consistent performance for large Datomic databases, or is it a wash in your experience?

marshall19:07:02

We’ve seen very good and consistent perf with RDBMS backing storage; if you absolutely need the highest level of performance and throughput, yes, something like DynamoDB or Cassandra is merited

hans19:07:51

@cezar: We're running our 0.25 bn datom database against a basically untuned Postgres and have no performance issues.

cezar19:07:20

@hans what is the ingestion rate that you observe if you fire your datoms at a rapid speed? Also how large or small are your transactions? I get about 34,000 datoms/sec which I'm very happy with but I still have a fairly small set and I'm sending them within large transactions (not ideal for my case)
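
For reference, one rough way to measure that rate yourself (sketched with a hypothetical `tx-batches` seq of tx-data vectors and an assumed connection `conn`):

```clojure
;; Counting :tx-data on each transaction result counts the datoms actually
;; written (including the transaction entity's own datom).
(require '[datomic.api :as d])

(defn datoms-per-sec [conn tx-batches]
  (let [start  (System/nanoTime)
        datoms (reduce (fn [n batch]
                         (+ n (count (:tx-data @(d/transact conn batch)))))
                       0
                       tx-batches)
        secs   (/ (- (System/nanoTime) start) 1e9)]
    (double (/ datoms secs))))
```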

hans19:07:21

Large transactions are one of the pain points that made us look for alternatives. We have not measured insertion rates in a long time, but we had no reason to, because performance is sufficient for our application.

cezar19:07:19

> Large transactions are one of the pain points that made us look for alternatives. @hans: are you saying that your transactions are too large for Datomic to cope with or that you had to resort to large transactions to make the ingestion performance acceptable?

hans19:07:52

We have a lot of transactions that are too large for Datomic, and we've been engineering around that by splitting the operations up into smaller units. This is just a workaround, though, and the lack of support for larger transactions is one of the reasons why we're using another database for our next project.

cezar19:07:17

oh I see. Kind of the opposite of the issue I may have. I have lots of data with no clear transaction delineation. Which is nice because I can make them arbitrarily large or small as long as I get good performance. However, the "time travel" of Datomic is nearly ideal for my use case so I'm not eager to explore alternatives. Out of curiosity though, what alternative are you looking at?

hans19:07:58

We're now migrating to MarkLogic for most of the things that we do. It suits our application space better, and in fact it has time travel as well. We're not planning to use that feature in the same way as we used Datomic's reified transactions, though, because we've learned that it is important to us to be able to change the history if the business requires it or if we need to correct errors that occurred in the past.

cezar21:07:55

Never heard of that DB. Hard to find any real info on it besides marketing hype. Have you done a proper evaluation of its RDF capability? I'm a bit skeptical tbh from what little googling I've done on it...