
I am looking to generate some queries based on user input. I think this is doable but given that Datomic doesn’t have a query optimiser, how do I make sure that I order the clauses in my :where in the right (or at least a reasonably right way)?


Are there some guidelines that I can go by when generating the query?


Are Datomic S3 backups just the new transactions since the previous backup?


Is it possible for a transaction to have only partially completed due to java.lang.OutOfMemoryError?


@gardnervickers: Since 0.9.5130 Datomic backups have been incremental if they’re issued against the same storage location:


@sdegutis: Transactions are atomic, so they either complete successfully or fail, there is no way to have a ‘partial transaction’; did you see an OOM error on the transactor?


@casperc: Are you generating the entire query or just altering parameters based on the user input?


Phew. @marshall I just verified that it did not partially go through. Thank goodness for ACID compliance I suppose.


@marshall: I think so... I tried to d/transact-async a hundred thousand entities into existence, and got java.lang.OutOfMemoryError.


@sdegutis: create 100,000 entities within a single transaction? that is a bit on the large side for number of datoms in a single txn - do you need to be able to create those together atomically?


@marshall: Probably not, I'm devising a way of splitting this data migration into multiple transactions. Think I found a way.


@marshall: It does need to be done within the same 20 minutes though.


I’d definitely recommend splitting something that size up. I don’t think it should be particularly hard to get it through in 20 minutes, of course it will depend on the specifics of your system and schema, etc.


@marshall: Someone yesterday mentioned that Datomic recommends a maximum of 10 billion datoms in a database. After this migration we'll have gone from 5 million to 7 million, which eases my mind considering it's not even a 1000th of the max.


But I just didn't anticipate that it would be too big for a single transaction.


But yeah I've got me an idea for splitting it up.
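One common way to split a migration like this is to chunk the tx-data with `partition-all` and submit each chunk as its own transaction. A minimal sketch, where `entity-maps` stands in for the real seq of ~100,000 entity maps and the batch size of 1000 is just an assumption to tune per schema:

```clojure
;; Sketch: split the migration into transactions of a manageable size.
;; `entity-maps` is a stand-in for the real seq of entity maps;
;; `batch-size` is an assumption - tune it for your schema and datom counts.
(def batch-size 1000)

(defn batches [entity-maps]
  (partition-all batch-size entity-maps))

;; Each batch then becomes its own transaction, e.g.:
;; (doseq [b (batches entity-maps)]
;;   @(d/transact conn (vec b)))

;; 100,000 entities -> 100 transactions of 1000 entities each:
(count (batches (range 100000))) ;=> 100
```

The tradeoff is that atomicity now only holds per batch, not across the whole migration.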


Incidentally, the “Understanding and Using Reified Transactions” talk here: discusses a few approaches to large operations that span transaction boundaries


@marshall: I am generating the entire query. Our data model forms a DAG and I am generating a :where clause joining from (if that is the right way to put it) one of the leaf nodes to the root.


It might just be that it is not a problem though if I put the clauses with input parameters first and then just join up towards the root.


@casperc: If your user-parameterized clauses narrow the dataset fairly substantially, that sounds like a reasonable place to start. I’d recommend against premature optimization and tend to worry about making it faster only if you see significant perf issues
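To illustrate the ordering heuristic being discussed: put the clause that binds the input parameter first, so later clauses join over an already-narrowed set. The attribute names here (`:order/id`, `:order/customer`, `:customer/region`) are hypothetical, not from the conversation:

```clojure
;; Slower ordering: the first clause is unconstrained, so it effectively
;; scans every customer before the input parameter narrows anything.
(def unordered-q
  '[:find ?region
    :in $ ?order-id
    :where
    [?c :customer/region ?region]   ; touches all customers
    [?o :order/customer ?c]
    [?o :order/id ?order-id]])

;; Faster ordering: bind on the input parameter first, then join outward
;; toward the rest of the graph.
(def ordered-q
  '[:find ?region
    :in $ ?order-id
    :where
    [?o :order/id ?order-id]        ; narrows to one order immediately
    [?o :order/customer ?c]
    [?c :customer/region ?region]])
```

Both queries return the same result; without a query optimizer, only the execution cost differs.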

Ben Kamphaus 20:05:11

@casperc: might be helpful to look at the code in the mbrainz sample database for generating rules: and the resulting rules: for graph traversal for collaborating artists.


@marshall: Sound advice, I’ll see how it performs. 🙂 I guess I was looking for some reference material of some sort for generating the query


@bkamphaus: Perfect, I got my wish delivered 🙂


Right, the other thing I was going to say was that it sounded like a recursive rule might fit the problem, depending on your schema.
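A recursive rule for walking a DAG from a leaf toward the root might look roughly like this. This is a sketch with a hypothetical `:node/parent` attribute, not the mbrainz example itself:

```clojure
;; Sketch of a recursive Datalog rule set: `ancestor` matches the direct
;; parent, or recurses through any intermediate node.
;; `:node/parent` is a hypothetical attribute for illustration.
(def rules
  '[[(ancestor ?child ?parent)
     [?child :node/parent ?parent]]
    [(ancestor ?child ?parent)
     [?child :node/parent ?mid]
     (ancestor ?mid ?parent)]])

;; The rules are passed in via the % binding, e.g.:
;; (d/q '[:find ?root
;;        :in $ % ?leaf
;;        :where (ancestor ?leaf ?root)]
;;      db rules leaf-eid)
```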

Ben Kamphaus 20:05:30

@sdegutis: if you haven’t yet, might want to check out the transaction pipeline example here: — though that’s for the step after you break up the transaction. (you put transactions on a channel that the tx-pipeline function would take from).
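The shape Ben describes - batches submitted asynchronously with bounded concurrency - can be sketched roughly like this. This is not the linked example; `conn` is an assumed Datomic connection, and the windowed-deref approach is just one simple way to cap in-flight transactions:

```clojure
;; Rough sketch, not the linked tx-pipeline example: submit batches with
;; a bounded number of in-flight transactions by derefing in windows.
;; `conn` is an assumed Datomic connection; each element of `tx-batches`
;; is one transaction's worth of tx-data.
(require '[datomic.api :as d])

(defn pipeline-transact
  [conn tx-batches in-flight]
  (doseq [window (partition-all in-flight tx-batches)]
    ;; submit a window of transactions asynchronously...
    (let [futs (mapv #(d/transact-async conn %) window)]
      ;; ...then wait for all of them before submitting the next window.
      (doseq [f futs] @f))))
```

Derefing each future both applies backpressure (avoiding the OOM from queueing everything at once) and surfaces any transaction failure as an exception.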


is there any way that calling (d/tempid :db.part/user) in lazy-seq's would result in producing the same db/id?


(when i go to transact the lazy seq, i mean

Ben Kamphaus 21:05:34

@bvulpes: so not generally, but two possible issues: 1 - messing up the code so you generate the tempid once and repeat the generated value, and 2 - transaction functions that generate tempids can unintentionally conflict with tempids generated on the peer
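The first pitfall - generating the tempid once and reusing it - is easy to reproduce without Datomic at all. In this sketch `new-tempid` is a stand-in for `(d/tempid :db.part/user)`:

```clojure
;; Sketch of the "generate once, repeat" pitfall. `new-tempid` stands in
;; for (d/tempid :db.part/user); the mechanics of the bug are identical.
(def counter (atom 0))
(defn new-tempid [] (swap! counter inc))

;; Wrong: the tempid is hoisted out of the loop, so every entity map
;; shares one :db/id and they all collapse onto a single entity.
(def wrong-tx
  (let [tid (new-tempid)]
    (mapv (fn [n] {:db/id tid :item/n n}) (range 3))))

;; Right: a fresh tempid per entity map.
(def right-tx
  (mapv (fn [n] {:db/id (new-tempid) :item/n n}) (range 3)))

(count (distinct (map :db/id wrong-tx))) ;=> 1
(count (distinct (map :db/id right-tx))) ;=> 3
```

With lazy seqs the same hoisting bug can hide until the seq is realized at transact time, which matches the symptom described above.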


thanks bkamphaus


ran it down to a mistaken db/unique