#datomic
2018-01-30
kenny00:01:47

Why is #datom[17592186045433 87 "start" 13194139534330 false] included twice in the :tx-data for the last transaction in this snippet?

(let [conn (d/connect db-uri)]
  @(d/transact conn [{:db/valueType   :db.type/string
                      :db/cardinality :db.cardinality/one
                      :db/ident       :test/attribute}])
  (let [{:keys [tempids]} @(d/transact conn [{:db/id          "start"
                                              :test/attribute "start"}])
        id (get tempids "start")]
    (:tx-data
      @(d/transact conn [[:db/add id :test/attribute "new"]
                         [:db/retract id :test/attribute "start"]]))))
=>
[#datom[13194139534330 50 #inst"2018-01-30T00:58:22.745-00:00" 13194139534330 true]
 #datom[17592186045433 87 "new" 13194139534330 true]
 #datom[17592186045433 87 "start" 13194139534330 false]
 #datom[17592186045433 87 "start" 13194139534330 false]]

marshall01:01:05

@kenny you don't need to explicitly retract "start". The new value will 'upsert' and the retraction will be added automatically
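
For example, a minimal sketch reusing conn and id from the snippet above: for a cardinality-one attribute, transacting only the new value is enough, and the retraction of the old value shows up in :tx-data automatically.

(:tx-data
  @(d/transact conn [[:db/add id :test/attribute "new"]]))
;; => includes #datom[... 87 "new" ... true]
;;    and      #datom[... 87 "start" ... false]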

marshall01:01:43

I have a theory as to why you see it in your example. If I'm correct, it could be considered a bug, but it won't affect anything negatively

kenny01:01:55

Right but in this case the transaction data is generated based on a "test" transaction with d/with.

marshall01:01:42

The example isn't using with

kenny01:01:13

I pasted the generated transaction data so others could replicate the behavior.

kenny01:01:04

I can explicitly de-dupe to work around it but it doesn't seem like the :tx-data was meant to include duplicate datoms.

marshall01:01:05

Is the duplicate causing a problem?

kenny01:01:20

Yes. DataScript doesn't like it 🙂

kenny01:01:50

Plus we sync all transaction data to a Kafka topic and this will produce lots of duplicate data.

marshall01:01:01

I'll bring it up with the team tomorrow.

kenny01:01:10

Awesome. Thanks!

kenny01:01:50

Actually, I spoke too soon. DataScript doesn't appear to be affected by it. Duplicate data argument still holds, however.

kenny01:01:24

Data would be duplicated in a Kafka topic and use additional bandwidth to send to every connected client.

caleb.macdonaldblack03:01:07

I want to store my entity defaults in Datomic. Can I attach this to the attribute somehow? Can I create a custom attribute for my entity schema? Or do I need to have two separate attributes, like entity/attr-a and entity/attr-a-default?

robert-stuttaford06:01:54

@caleb.macdonaldblack you can assert additional facts onto the attr itself: {:db/id :your/attr :your/attr-default-value <value>}. Of course, this means you need a definition of :your/attr-default-value with the same type 🙂
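
A sketch of what that can look like (attribute names are illustrative, and :your/attr is assumed to be a string attribute):

;; install the default-value attribute with the same value type as :your/attr
@(d/transact conn [{:db/ident       :your/attr-default-value
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one}])

;; assert the default as a fact on the attribute entity itself
@(d/transact conn [{:db/id                   :your/attr
                    :your/attr-default-value "some default"}])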

caleb.macdonaldblack06:01:25

@robert-stuttaford Thanks! I think that’s what I’m looking for

sleepyfox10:01:13

Question: if I want to write black-box tests for a REST API that is powered by datomic, is it possible to run a dev instance of datomic as a docker image with a canned test data-set?

robert-stuttaford10:01:42

yep @sleepyfox - if the transactor’s storage is inside the docker image (which dev does via an h2 database)

robert-stuttaford10:01:49

not sure if docker filesystems are mutable though? the transactor would need to be able to add stuff if you’re testing transactions

robert-stuttaford10:01:58

i don’t know docker at all 😉

sleepyfox10:01:02

I was wondering whether anyone had actually tried this before, or whether I am missing a trick that actually makes this unnecessary...

sleepyfox10:01:51

Don't worry about Docker, it can do everything that I need it to, my question isn't really about Docker, but rather testing (micro)services backed by datomic

robert-stuttaford10:01:19

well, you could just use an in memory database, but that’s not black-box, because you’d need extra code to set the db up

robert-stuttaford10:01:42

it is much simpler though, because it can work basically the same as fixtures for unit tests
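
A sketch of that fixture approach with the peer API (names are illustrative):

(require '[datomic.api :as d]
         '[clojure.test :refer [use-fixtures]])

(def ^:dynamic *conn* nil)

(defn with-fresh-db [f]
  (let [uri (str "datomic:mem://" (d/squuid))]
    (d/create-database uri)
    (binding [*conn* (d/connect uri)]
      ;; transact schema and canned test data here, then run the tests
      (f))
    (d/delete-database uri)))

(use-fixtures :each with-fresh-db)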

robert-stuttaford10:01:50

one option is to prepare a database with everything you need, back it up, then restore that db to a fresh transactor and provide that transactor uri to your service
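
Roughly, with the On-Prem backup/restore tools (URIs here are illustrative):

bin/datomic backup-db datomic:dev://localhost:4334/test-db file:/path/to/backups/test-db
bin/datomic restore-db file:/path/to/backups/test-db datomic:dev://localhost:4334/test-db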

robert-stuttaford10:01:04

that at least makes the process repeatable

robert-stuttaford10:01:19

and lets you iterate on the data and the tests without touching the service

sleepyfox10:01:22

Yes, it needs to be deterministic

donmullen17:01:18

What are some of things that might cause this error - a query that pulls in a rule :

java.lang.Exception: processing rule: (q__30187 ?job-num ?latest-action-date), message: processing clause: (is-large-alt? ?e), message: java.lang.ArrayIndexOutOfBoundsException: 14, compiling:(NO_SOURCE_FILE:47:27)

donmullen17:01:38

Seems strange that I get "processing rule" and a java.lang.ArrayIndexOutOfBoundsException after the query has been running for a while.

donmullen17:01:34

Taking out the attributes in the query that are also referenced in the rule seems to remove the exception. I’m new to using rules - but that doesn’t seem like something that would be disallowed.

donmullen17:01:25

@marshall any thoughts here?

marshall18:01:10

can you share the query and rule?

marshall18:01:27

sorry just saw you did

marshall18:01:21

i think this is because you’re asking for ?prop-area twice

marshall18:01:24

in your find spec

marshall18:01:13

user=> (d/q '[:find ?e ?e :in $ :where [?e :db/doc _]] (d/db conn))
ArrayIndexOutOfBoundsException 1  clojure.lang.RT.aset (RT.java:2376)
user=>

donmullen18:01:52

Doh! 😩 Thanks @marshall.

devn17:01:35

Did some of the Datomic videos go away? Specifically, the ones from Datomic Conf?

devn17:01:56

Was looking for Tim Ewald's talk on reified transactions, specifically.

sleepyfox17:01:46

I'm trying to use a value instead of a db like so:

(d/q '[:find ?last ?first :in [?last ?first]]
     ["Doe" "John"])
ExceptionInfo Query args must include a database  clojure.core/ex-info (core.clj:4739)

sleepyfox17:01:59

And I'm using (:require [datomic.client.api :as d]) with [com.datomic/client-cloud "0.8.50"] in my :dependencies

sleepyfox17:01:43

But I get the 'Query args must include a database' error as shown above. What am I doing incorrectly?

jocrau18:01:07

@sleepyfox AFAIK the Client API does not contain query capabilities itself but sends the query to the peer server. Thus, it lacks the capability to process collections as a DB value. You might have to use the Peer library for that.
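
For comparison, a sketch of the same query with the peer library (datomic.api), which accepts plain collections and scalars as query inputs:

(require '[datomic.api :as d])

(d/q '[:find ?last ?first :in [?last ?first]]
     ["Doe" "John"])
;; => #{["Doe" "John"]}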

sleepyfox18:01:30

I was afraid you were going to say that.

sleepyfox18:01:22

Can you use the peer library with Cloud?

sleepyfox18:01:38

I'm guessing that's a 'no'...

jocrau18:01:02

Yes: No 😉

sleepyfox18:01:17

Context: we want a clean and simple way to test code, and mocking the db by passing it as a value seemed like a great way.

jocrau18:01:55

I did some experiments with the Client API. But I got stuck when I needed an easy way to mock Datomic for testing purposes, like:

(def ^:private uri (format "datomic:mem://%s" (datascript/squuid)))

(defn- scratch-conn
  "Create a connection to an anonymous, in-memory database."
  []
  (d/delete-database uri)
  (d/create-database uri)
  (d/connect uri))

sleepyfox18:01:52

Yup, this is the kind of thing that I wanted to use.

marshall18:01:46

You can use Peer Server to launch a mem database

marshall18:01:52

for local testing

marshall18:01:07

alternatively, you can easily tear off a testing db in your cloud system

marshall18:01:23

i.e. have a Datomic Cloud system for dev/testing and create a database, use it, then delete it

sleepyfox18:01:27

Yup. I was hoping to not have to switch between the Client and Peer APIs between actual code and tests

marshall18:01:44

using Peer Server would not require you to switch to the peer API

sleepyfox18:01:09

I'd rather be able to mock out a db instead of creating an actual 'test' db in Cloud

sleepyfox18:01:50

But it seems like that isn't an option using the Client API only. Ah well.

jocrau18:01:35

Launching a Peer Server locally to run a mem database would be a good compromise. But the current lack of support for delete-database and create-database makes creating a scratch-conn a bit messy.

marshall18:01:56

launching peer server with the mem database option creates the DB

marshall18:01:06

you don’t need to call create-database specifically

sleepyfox18:01:10

Thanks @marshall - I understand that I can spin up a Peer server to do this, but I'd prefer something more lightweight.

jocrau18:01:39

@marshall That’s right, but wouldn’t I have to either restart the Peer Server or retract the previous test facts to get a “clean slate” for the next test?

marshall18:01:25

or start it with a few dbs

marshall18:01:24

$ bin/run -m datomic.peer-server -h localhost -p 8998 -a myaccesskey,mysecret -d hello,datomic:mem://hello -d hello2,datomic:mem://hello2 -d hello3,datomic:mem://hello3
Serving datomic:mem://hello as hello
Serving datomic:mem://hello2 as hello2
Serving datomic:mem://hello3 as hello3

timgilbert18:01:33

There's also https://github.com/vvvvalvalval/datomock which is super useful for this kind of testing scenario

timgilbert18:01:42

Oh, but it's peer-only, never mind

marshall18:01:03

Testing / dev is definitely one intended target of Cloud solo topology

marshall18:01:20

internally we use a solo system that is up all the time to provide tear off dbs for that kind of thing

cjsauer18:01:13

Speaking of datomic:mem://... connections, what does datomic.memoryIndexMax default to here? I'm currently trying to capacity plan and having trouble understanding the balance between this and the object cache...

marshall18:01:23

mem databases don’t have a persistent index (by definition), so they don’t have indexing jobs. they are entirely “memory index”

cjsauer18:01:37

I see. I'm able to get this error to occur in my experiments: Caused by: java.lang.IllegalArgumentException: :db.error/not-enough-memory (datomic.objectCacheMax + datomic.memoryIndexMax) exceeds 75% of JVM RAM If objectCacheMax defaults to 50% of VM memory, I imagine memoryIndexMax must be set to something in order to exceed 75%. This is where my question is coming from.

marshall18:01:17

both values are set in your transactor properties file

cjsauer18:01:39

Even for an in-memory database? I'm just connecting to a datomic:mem:// URI for this test.

marshall18:01:06

what’s your xmx setting?

cjsauer18:01:37

300m I believe. Trying to "reverse engineer" how these capacity settings work, that's why it's so low.

marshall18:01:28

yeah, that’s unlikely to work to run a memory db

cjsauer18:01:08

I'm able to get it to run by setting the objectCacheMax system property really low, say 50m
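
For example (a sketch; the property name comes from the error above, and project.clj is just one place to set it):

;; project.clj
:jvm-opts ["-Xmx300m" "-Ddatomic.objectCacheMax=50m"]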

cjsauer18:01:25

So I was assuming there was some sort of implicit memory index max too

marshall18:01:45

there probably is; unsure what it is set to

cjsauer19:01:18

@marshall I'm thinking it's something arbitrary from my tests. Appreciate the help, your links provide plenty of context 🍻

alexk20:01:02

Even though I’ve got write-concurrency=2 in my transactor’s properties, and allocated a write capacity of 400 (!) for DynamoDB, I’m still getting throttled writes
I’m a bit surprised. How have you dealt with the limited nature of DynamoDB when running a transactor on it, how do you judge how much to bump capacity during a bulk import, etc.?

marshall20:01:24

@alex438 We have customers running sustained AWS write capacity of 1500, with a setting of 4000 for bulk imports. I would say 400 is on the low end for an active production system and I’m not surprised you’re getting throttled during a bulk import

marshall21:01:17

you can either increase the capacity or add some throttling on your side

marshall21:01:27

i.e. reduce the rate at which you’re transacting
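
A rough sketch of that load-side throttling (all-tx-data is a placeholder for your import data): transact in modest batches and wait for each transaction to complete before sending the next, so writes don't outrun the provisioned Dynamo capacity.

(doseq [batch (partition-all 1000 all-tx-data)]
  @(d/transact conn batch))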

alexk21:01:17

Intriguing. Lots to review, thanks.

marshall21:01:53

@alex438 One value prop of Datomic Cloud is that it doesn’t use Dynamo the same way and a similar write load against the system can be handled with a much lower Dynamo throughput setting

marshall21:01:24

in many of our internal experiments the Dynamo autoscaling with Datomic Cloud rarely even hits 100 while running a large batch import

alexk21:01:47

Neat, and what about backing a transactor by Postgres?

marshall21:01:52

On-Prem can run with Postgres storage. It works well; we have a lot of customers doing so. You would need to provision a sufficiently beefy PG instance to handle the write load; it may or may not be a win vs. DDB on the cost front

alexk21:01:04

got it, thanks