#datomic
2016-02-10
timothypratley07:02:32

Are there any command line tools for importing TSV files into Datomic? (Assuming an existing schema, just want to transact in new facts, ideally with a low startup time cost)
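(No reply appears in the log; a minimal sketch of what such an import could look like with the peer library, assuming the schema is already in place. The attribute names, batch size, and URI are placeholders, and the TSV values, which arrive as strings, would still need coercing to the schema's value types.)

(require '[clojure.java.io :as io]
         '[clojure.string :as str]
         '[datomic.api :as d])

(defn row->tx-map
  "Zips one TSV line with the attribute keywords in header.
  Values arrive as strings; coerce them here if the schema needs it."
  [header line]
  (into {:db/id (d/tempid :db.part/user)}
        (map vector header (str/split line #"\t"))))

(defn import-tsv!
  "Streams the TSV at path into the db at uri, 1000 rows per transaction."
  [uri path header]
  (let [conn (d/connect uri)]
    (with-open [rdr (io/reader path)]
      (doseq [batch (partition-all 1000 (line-seq rdr))]
        @(d/transact conn (mapv #(row->tx-map header %) batch))))))

;; e.g. (import-tsv! db-uri "people.tsv" [:person/name :person/email])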

val_waeselynck07:02:22

@onetom happy to share more details about how we do testing by forking connections if the blog post is not enough 🙂

val_waeselynck07:02:43

I may release a sample application or Leiningen template at some point

onetom07:02:00

@val_waeselynck: that would be really great!

onetom07:02:32

i tried your mock connection and it works so far

onetom07:02:25

i was using this function to create an in-memory db with schema to serve as a starting point for forking in tests:

(defn new-conn
  ;; db-uri and schema are assumed to be defined elsewhere in the namespace
  ([] (new-conn db-uri schema))
  ([uri schema]
   (d/delete-database uri)
   (d/create-database uri)
   (let [conn (d/connect uri)]
     @(d/transact conn schema)
     conn)))
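(For context, a test can then fork that starting point with d/with instead of transacting on the shared connection; a minimal sketch, where :user/email is a placeholder attribute:)

(let [db (d/db (new-conn))
      {db' :db-after} (d/with db [{:db/id (d/tempid :db.part/user)
                                   :user/email "test@example.com"}])]
  ;; db' sees the new fact; the connection itself is untouched
  (d/q '[:find ?e . :where [?e :user/email "test@example.com"]] db'))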

onetom07:02:14

i guess your empty-db fn is doing something similar

onetom07:02:03

have you released this mock connection as a lib anywhere yet? if it served you well so far, it would make sense to create a lib, no?

onetom07:02:01

actually i would expect cognitect to supply such a solution out of the box if it is a really sound approach as @robert-stuttaford hinted above

val_waeselynck07:02:40

@onetom: yes I'll probably roll out a lib soon, just wanted to get some criticism first

onetom07:02:09

ok, here is my criticism: why is it not on clojars yet!? ;D

val_waeselynck07:02:13

My next blog post will be a guided tour of our architecture, so it'll probably cover this in more detail

onetom07:02:46

happy to hear!

val_waeselynck07:02:48

And I wouldn't be surprised if this was actually the implementation of Datomic in-memory connections 🙂

onetom07:02:42

yet it takes longer to just create/delete in-mem dbs

onetom07:02:58

do u think it's just the overhead of specifically transacting the schema?

val_waeselynck07:02:16

performance is not the biggest win IMO, being able to fork from anything is

val_waeselynck07:02:30

including your production database, I do it all the time
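(The mock connection itself isn't shown in the log; a simplified sketch of the principle, not the blog post's actual implementation: hold a db value in an atom and route "transact" through d/with, so writes never reach the source connection.)

(defn fork
  "Forks any db value, including one taken from a production connection."
  [db]
  (atom db))

(defn transact!
  "Applies tx-data to the fork in memory; not thread-safe as written."
  [forked tx-data]
  (let [tx-report (d/with @forked tx-data)]
    (reset! forked (:db-after tx-report))
    tx-report))

;; e.g. (def staging (fork (d/db (d/connect prod-uri))))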

onetom07:02:49

that's what i was missing from your article. you haven't established a baseline to compare your solution against, so im not sure what the alternative approach would be and how much faster the mock connection is

onetom07:02:19

working with the production fork sounds a bit risky, no? i always work on restored backups, but our db still takes only a few seconds to restore, so that's why it's viable atm

pesterhazy07:02:55

@val_waeselynck: your article is inspirational, will def try that for us as well

val_waeselynck07:02:57

why risky? once you fork, it's basically impossible to write to the production connection

val_waeselynck07:02:32

(well, granted, the risk is that you forget to fork :p)

onetom07:02:20

that's what i meant 🙂

val_waeselynck07:02:23

@pesterhazy: thank you 🙂 this encourages me to roll out a lib then

pesterhazy07:02:43

that would be very useful I think

pesterhazy07:02:05

just the mock connection itself would be great as a lib

val_waeselynck07:02:11

@onetom: anyway, I'm generally not too worried about accidental writes with Datomic, they're pretty easy to undo
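("Easy to undo" presumably means asserting the inverse of the offending transaction's datoms; a hedged sketch using the log API, where tx is the id of the transaction to undo:)

(defn rollback-tx!
  "Asserts the inverse of every datom of the given transaction."
  [conn tx]
  (let [{:keys [data]} (first (d/tx-range (d/log conn) tx (inc tx)))
        inverse (for [[e a v _ added?] data
                      :when (not= e tx)] ; skip the tx entity's own datoms
                  [(if added? :db/retract :db/add) e a v])]
    @(d/transact conn inverse)))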

onetom07:02:15

@val_waeselynck: your test example is the most heartwarming thing i've seen in a long time. that's how i always hoped to describe integration tests, and now you made it a reality by putting the dot on the I (where I = datomic 🙂)

pesterhazy08:02:22

now if someone could build a better deployment strategy for datomic on AWS with live logging, that'd be great too (I just had the prod transactor fail to come up twice, without a way to find out what the problem was; only to work the third time, for no apparent reason)

onetom08:02:40

@val_waeselynck: are you using any datomic wrapper framework, like http://docs.caudate.me/adi/ or something similar?

val_waeselynck08:02:23

@onetom: no, never heard of such a framework 😕

val_waeselynck08:02:59

quite happy with datomic's api (except for schema definitions)

onetom08:02:33

well, that's one of the obvious areas where some framework could help

onetom08:02:52

but then migrations become tricky if u have a source file representing your schema, since the DB itself is no longer the single source of truth

onetom08:02:06

but i read your article about conformity, so i will try that approach soon
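(For reference, the conformity approach keeps schema in source as named, idempotent norms that are ensured on startup; a sketch assuming the io.rkn/conformity library, with an illustrative norm name and attribute:)

(require '[io.rkn.conformity :as c])

(def norms
  {:my-app/schema-v1
   {:txes [[{:db/id (d/tempid :db.part/db)
             :db/ident :building/building-id
             :db/valueType :db.type/long
             :db/cardinality :db.cardinality/one
             :db/unique :db.unique/identity
             :db.install/_attribute :db.part/db}]]}})

;; safe to run on every startup; norms that already conform are skipped
(c/ensure-conforms conn norms)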

val_waeselynck08:02:11

@onetom @pesterhazy I gotta run but happy to discuss this further, actually it would be really nice if you could persist your main questions and criticisms as comments on the blog post, so others can benefit from them :)

robert-stuttaford08:02:38

@pesterhazy: that transactor logs rotate rather than stream is problematic for me too. it’s made the logs totally useless every time our transactors failed in some way

casperc08:02:34

So is it just me or does the Datomic client just never return when submitting a malformed query?

casperc08:02:22

Like this one:

(d/q '[:find (pull ?be [*])
       :where $ ?id
       :where 
       [?be :building/building-id ?id]]
     (d/db @conn)
     2370256)

casperc08:02:29

(with two :where clauses)

casperc08:02:27

currently the process is using a lot of CPU, so apparently it is doing something
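(For reference, the stray ":where $ ?id" was presumably meant to be ":in $ ?id"; the well-formed version of the query would be:)

(d/q '[:find (pull ?be [*])
       :in $ ?id
       :where
       [?be :building/building-id ?id]]
     (d/db @conn)
     2370256)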

onetom08:02:53

@casperc: this doesn't hang for me:

(defn idents [db]
  (q '[:find ?eid ?a
       :where $
       :where
       [?eid :db/ident ?a]] db))

(->> (new-conn) db idents pprint)

onetom08:02:20

but it doesn't have a 2nd param either; let me try that

onetom09:02:53

that still works and no cpu load

onetom09:02:27

im on [com.datomic/datomic-free "0.9.5344"]

pesterhazy09:02:10

@robert-stuttaford: exactly. you have logs, but only the next day and only if nothing goes wrong (which is precisely the case where you're not particularly interested in the logs)

pesterhazy09:02:43

it'd already be helpful to be able to specify a logback.xml so you can set up your own logging

robert-stuttaford09:02:19

we use http://papertrailapp.com and it’d be great to use logback’s syslog appender with that

pesterhazy09:02:20

I know that this is possible in principle by hacking the startup scripts, but that's way harder and more hit-and-miss than any admin would like

pesterhazy09:02:26

we use papertrail as well

pesterhazy09:02:23

the other thing the AMIs are missing is the ability to pull in your own libraries (which you need when you use them in transactor fns)

casperc09:02:25

@onetom: Weird. Thanks for testing it though. What are you getting as a return value?

onetom10:02:45

i was getting the exact same results

onetom10:02:29

or i was getting this error:

java.lang.Exception: processing rule: (q__23355 ?name ?ip ?cluster-name ?cluster-subdomain), message: processing clause: [$rrs ?subdomain ?ips], message: Cannot resolve key: $rrs, compiling:(ui/dns.clj:74:1)

casperc14:02:21

I am wondering a bit about lookup refs. It looks like they throw an exception when the external id being referenced is not present, which I think is fair. For my use case, I just want the ref to be nil (or not added). Any way to make the lookup ref optional?
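(No direct answer appears in the log; one workaround is to resolve the lookup ref yourself with d/entid, which returns nil when the unique value isn't present, and only include the ref when it resolves. A sketch with placeholder :order/* attributes:)

(let [db  (d/db conn)
      ref (d/entid db [:building/building-id 2370256])] ; nil if absent
  @(d/transact conn
     [(cond-> {:db/id (d/tempid :db.part/user)
               :order/number "A-1"}
        ref (assoc :order/building ref))]))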

Ben Kamphaus15:02:53

@pesterhazy: and @robert-stuttaford definitely understand the point around log rotation vs. streaming. Re: launch problems, we did add a section to the docs on troubleshooting common “fails with no info” issues under “Transactor fails to start” here: http://docs.datomic.com/deployment.html#troubleshooting — adding to lib/ and configuring different logging, though, definitely fall under the use case (at least at present) for configuring your own transactor machine vs. using the pre-baked AMI.

Ben Kamphaus15:02:05

We do hear and consider your feedback there, but nothing short term to promise on those topics.

pesterhazy15:02:00

@bkamphaus: I realize it's a larger undertaking, not blaming you

pesterhazy15:02:40

I'm probably going to build an amazon playbook to set up datomic on ec2+dynamo; that should make things a lot easier for folks

robert-stuttaford16:02:20

those docs are great, @bkamphaus! thanks for taking note 🙂

val_waeselynck17:02:35

@casperc: in what context? Query / transaction / entity ?

akiel20:02:30

Is there a defined order in which the pull api returns multiple results? I ask this because the entity api returns sets for cardinality-many attributes which can be compared without taking the order into consideration. But the pull api returns vectors where the order matters.

ljosa22:02:25

no, the order is undefined
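(So when comparing pull results in tests, cardinality-many values can be normalized to sets first; a sketch where :person/aliases is a placeholder cardinality-many attribute:)

(defn normalize-many
  "Converts the given cardinality-many attrs of a pull result to sets."
  [pulled many-attrs]
  (reduce #(update %1 %2 set) pulled many-attrs))

(= (normalize-many (d/pull db [:person/name :person/aliases] eid)
                   [:person/aliases])
   {:person/name "Ada" :person/aliases #{"al" "lovelace"}})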