#datomic
2016-11-30
ovan10:11:39

Does anyone have experience using Amazon Aurora as a storage backend with Datomic? They claim it's MySQL compatible, whatever that means, and from what I understand about how Datomic uses storage services, it doesn't need much (or anything) in terms of database-implementation-specific features.

karol.adamiec10:11:07

Out of pure interest, if on AWS then why not use DynamoDB? Can you share some rationale? Asking because, in a similar position, I have not even considered other storage, so I am curious about what I might have missed 🙂

ovan10:11:46

DynamoDB is definitely the other option we're considering (actually we're not sure we're going to go forward with Datomic at all, but right now it feels promising). Taking Datomic into use means a lot of new stuff to learn from an operational perspective. We have a lot of experience running RDS MySQL, but nobody in our team has tried DynamoDB yet. So basically, if Aurora would work nicely as a backend, that might be one less new thing to take into use right now. I would like to understand the options in general and how the storage affects things, so we can make at least somewhat informed decisions. I couldn't easily find much information about how to choose the storage backend for Datomic and what the tradeoffs are.

ovan11:11:41

Price is another consideration. My hunch is that Aurora might be a cheaper option to start with. That said, we haven't done any calculations yet so I might be totally off base here.

pesterhazy11:11:58

@ovan, Aurora is a modified version of MySQL, so in all probability it'll work just like MySQL (plus Datomic's SQL needs are likely not very sophisticated, as it uses the database as a K/V store)

ovan11:11:03

@pesterhazy, thanks. That matches with my understanding.

mitchelkuijpers13:11:05

@ovan From a pricing perspective I can recommend DynamoDB. Using it feels a bit like cheating, because Datomic caches almost everything

jonpither13:11:58

Any recommended JVM args for the transactor beyond the documented memory settings?

jonpither13:11:32

And the peer for that matter

jonpither14:11:24

Anyone have any helper code for converting entity maps into a sequence of datoms..?

robert-stuttaford14:11:26

not hard to write 🙂

jonpither15:11:41

submaps etc. can get slightly tricky, but I am doing it

dominicm15:11:22

Apparently there's a gist somewhere with the code already written

rauh15:11:33

@jonpither Datascript does this as well in its implementation. You could also just do a with on some temp db. Though it needs a schema

pesterhazy16:11:57

is map-form-tx->vec-form-txs a mechanical, pure fn, or does it require looking at the existing db?
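
One way to sidestep that question is rauh's with suggestion: let d/with expand the map form against a db value, since tempids, lookup refs, and component maps are resolved against the db and its schema anyway (so the expansion is not purely mechanical). A minimal sketch, assuming the peer library and some db value whose schema covers the attributes in tx-data; entity-map->datoms is a made-up helper name.

(require '[datomic.api :as d])

;; Speculatively applies tx-data to db (nothing is durably transacted)
;; and returns the datoms the transaction would produce, including the
;; transaction entity's own :db/txInstant datom.
(defn entity-map->datoms [db tx-data]
  (->> (d/with db tx-data)
       :tx-data
       (map (juxt :e :a :v :tx :added))))

;; e.g. (entity-map->datoms (d/db conn) [{:user/email "..."}])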

karol.adamiec16:11:17

How can I replace a ref of cardinality one that is a component? 1) find the id, 2) retract it (no loose datoms), 3) assert the new entity. I reeeaaallyy want something simpler… any ideas?

karol.adamiec16:11:38

I can easily update the main entity (I have a unique attribute on it). But then each update creates a new referenced entity, even though the thing is marked as a component… ;/

karol.adamiec16:11:11

;; User
{:db/id #db/id[:db.part/db]
 :db/ident :user/email
 :db/valueType :db.type/string
 :db/unique :db.unique/value
 :db/cardinality :db.cardinality/one
 :db/doc "Email"
 :db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
 :db/ident :user/shipping
 :db/valueType :db.type/ref
 :db/isComponent true
 :db/cardinality :db.cardinality/one
 :db/doc "Shipping address"
 :db.install/_attribute :db.part/db}

karol.adamiec16:11:45

{:db/id [:user/email ""]
 :user/shipping {:db/id #db/id[:db.part/user]
                 :address/line1 "66666one"}}

karol.adamiec16:11:31

So on a schema like the above, the transaction is creating a NEW address entity each time 😞

jcf16:11:09

@karol.adamiec you need to either merge the existing component entity with the new attributes, or retract it and add a new entity. It can't be done with the map form of a transaction. You need :db/add and :db/retract.

jcf16:11:14

Has anyone here hit the 10 billion datom limit recently? Wondering if the Datomic team are testing with larger databases these days. I can partition data into separate databases, and maintain multiple connections, but I'd like to avoid that complexity for a while.

zane16:11:57

Hey all. I'm trying to introduce some memoization to some functions I've written that take datomic database values as arguments. Is there any way to uniquely identify the connection a given database value came from?

jcf16:11:39

@zane: you can't use the URI you used to create the connection?

zane16:11:26

The functions in question don't take the URI.

zane16:11:07

I could have them take extra arguments and cache based on those, but I'd rather not if I can avoid it.

jcf16:11:10

You could memoize outside of the query functions, but I guess you don't want that.

marshall16:11:17

@jcf Just yesterday @stuarthalloway mentioned 100B in the Datomic Workshop at Conj.

marshall16:11:36

If you think you're building a system that will need 10-100B datoms, you should email me

marshall16:11:00

and we'll talk about the specific details/challenges of administering a database of that size

marshall16:11:08

but it's definitely not a "limit"

karol.adamiec16:11:18

@jcf How can I get the id of the :user/shipping entity, to wrap it all up in one transaction?

jcf16:11:31

@marshall my client has a paid support agreement in place. I can see if I can get them to add me to the ZenHub account (I'm assuming you guys are still using that?) and go through official channels if you want… we've already been sold - I just need to see if I can do this without Cassandra.

marshall16:11:26

Sure - Just send an email to support @ cognitect and let us know what client youā€™re working with

jcf16:11:25

@karol.adamiec something like this:

;; d/entity returns an entity map, not an id, so pull the ids out first
(let [user    (d/entity (d/db conn) [:user/email ""])
      user-id (:db/id user)
      shipping-id-to-remove (:db/id (:user/shipping user))]
  [[:db/retract user-id :user/shipping shipping-id-to-remove]
   {:user/email ""
    :user/shipping {:address/line1 "etc"}}])

jcf16:11:57

You might want/need to diff the components, however. That's beyond what I can type up in Slack.

karol.adamiec17:11:45

@jcf But the real issue for me is: how do I get shipping-id-to-remove?

karol.adamiec17:11:56

having only the email?

jcf17:11:01

Load the user, and you'll get the shipping ID back from Datomic. Since :user/shipping is cardinality one in your schema, (-> conn d/db (d/entity [:user/email "[email protected]"]) :user/shipping :db/id) will give you the shipping component's ID.

jcf17:11:17

That assumes email is unique. If not you'll need to query, and pick the right user.

karol.adamiec17:11:57

can i do that inside of a transaction?

karol.adamiec17:11:14

I am fighting an uneven battle trying to use the REST API 😞

jcf17:11:14

No. Do the work in the peer, and send the transaction to the transactor.

jcf17:11:32

Oh. I'm using Datomic from Clojure. I don't know anything about the REST API.

jcf17:11:53

Maybe the new client stuff will make your life easier. It was announced in the last couple of days.

karol.adamiec17:11:32

Oh yes, I am waiting. Ehh. Thanks. Will fire a couple of HTTP requests at the db then. Tried to avoid that 😞

karol.adamiec17:11:02

On a related note, do lookup refs nest?

jcf17:11:15

@karol.adamiec not sure I follow. A lookup ref is of the form [attribute value], and you can't do something like [attribute [attribute value]].

karol.adamiec17:11:46

Yeah, I tried and failed, but that is exactly what I would like to do 🙂

marshall17:11:50

Also, it was mentioned in the blog post, but we now have a Feature Request & Feedback portal available - if you log into my.datomic there is a link to "Suggest Features" in the top nav; go there and vote for/suggest improvements and/or clients in your language of choice

jcf17:11:05

@marshall open source. troll

marshall17:11:32

…don't feed the trolls 😉

karol.adamiec17:11:14

@marshall Is a nested lookup ref a technical possibility, or am I deeply misunderstanding how Datomic works?

jcf17:11:15

I'd love to see support for one of Google's Cloud storage engines.

jcf17:11:39

@karol.adamiec you more than likely should be using a query.

karol.adamiec17:11:54

Yeah! But I need to transact! 🙂

karol.adamiec17:11:04

that means query first, transact next

karol.adamiec17:11:11

In Clojure it is almost the same

jcf17:11:17

Do the query, and then transact. That's fundamental to how Datomic works.

karol.adamiec17:11:23

Over REST you feel the pain

jcf17:11:44

You almost always want to offload work to your peers, and only transact simple additions and retractions.

jcf17:11:56

Otherwise you hit timeouts doing full index scans etc.

karol.adamiec17:11:16

Yeah, I think I constantly try to abuse Datomic 😮

jcf17:11:24

It sounds like it! 😉

karol.adamiec17:11:50

It is the REST trap. However much I try to convince myself that firing off requests is fine… I always end up trying to minimize the amount of traffic, which is surely a Datomic antipattern.

jcf17:11:10

@karol.adamiec are you sending requests from the browser or some backend service?

karol.adamiec17:11:26

Node.js 🙂

jcf17:11:39

Can you not keep an HTTP connection open with the REST API?

jcf17:11:21

If you keep connections alive, then it doesn't matter so much. It's the cost of establishing a connection I'd worry about.

jcf17:11:32

Send the requests! 😄

karol.adamiec17:11:37

I think I am fine anyway

karol.adamiec17:11:45

It is a small e-commerce shop

marshall17:11:46

You can't nest lookup refs - in this case I think query is the right approach
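
A hedged sketch of that query-first step against the schema posted earlier, assuming a peer connection conn (email value left redacted as elsewhere in the thread; the . find specification returns a single scalar):

;; Entity id of the user's current shipping component, or nil if none.
(def shipping-id
  (d/q '[:find ?shipping .
         :in $ ?email
         :where
         [?u :user/email ?email]
         [?u :user/shipping ?shipping]]
       (d/db conn) ""))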

marshall17:11:14

if you need atomicity of the lookup and transact, you can either use a transaction function or use a more 'optimistic' concurrency strategy and use cas
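
A hedged sketch of the cas route, using the built-in :db.fn/cas transaction function; old-shipping-id and new-shipping-id are placeholder bindings (for example, the result of the query above and a freshly created address):

;; The transaction aborts if :user/shipping no longer points at
;; old-shipping-id, which gives optimistic concurrency without writing
;; a custom transaction function.
@(d/transact conn
   [[:db.fn/cas [:user/email ""] :user/shipping
     old-shipping-id new-shipping-id]])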

jcf17:11:12

CAS works really nicely. Transaction functions are a last resort for me because they can end up being slow (at least when I've abused them in the past).

jcf17:11:36

For efficient functions where you use a fast index it can be all good.

karol.adamiec17:11:05

I think query is the right thing to do: get the id; if it exists, retract; if not, do nothing. Then assert the full user entity again with the address.

karol.adamiec17:11:47

But navigating lookup refs like pulls would be nice 🙂

karol.adamiec17:11:55

I could abuse Datomic longer 😄

jcf17:11:07

If you're using ClojureScript you can use clojure.set to work out what you need to retract etc. From JS I guess you have to write it all yourself. 😉

jcf17:11:08

I haven't written vanilla JS in years. Before React was released at least.

karol.adamiec17:11:09

Well, anyway, the right thing to do is query->transact. I do not need CAS semantics per se. All I wanted to do was be lazy, fire off one (maybe a bit tricky) transaction, and have it do it for me 🙂

curtosis20:11:46

For the simple, "embedded" app case, the recommended best practice is still to use the Peer library, correct? Rather than starting up a transactor AND a peer-server AND the app+client?

robert-stuttaford20:11:48

i wonder if you can have a process be its own peer-server

robert-stuttaford20:11:05

allowing you to code against the client API but only have one JVM running

robert-stuttaford20:11:26

is that a possibility @jaret ?

curtosis20:11:41

I thought peer-server still needs to talk to a transactor (& storage)

robert-stuttaford20:11:45

at least keeps the code portable

robert-stuttaford20:11:50

yes, you always need a transactor

robert-stuttaford20:11:53

for durable storage

curtosis20:11:01

oh I see what you mean

curtosis20:11:52

jvm:(client + peer-server) + jvm:(transactor)

curtosis20:11:32

the benefit would be primarily sticking to the client API, right?

robert-stuttaford20:11:57

means you can make the decision to move to a separate peer-server later on, when need be

curtosis20:11:57

but I'm still learning the peer API! 😉

robert-stuttaford20:11:13

then use the peer! 🙂

curtosis20:11:49

if a client process can't be its own peer-server, then it's moot

robert-stuttaford20:11:36

it's verrrry early days yet. we'll figure it out 🙂

jaret20:11:04

@robert-stuttaford you cannot have a process be its own peer-server.

curtosis20:11:26

I think from my quick read, for most of the "embedded" use cases I can think of, the peer library is a much better fit.

curtosis20:11:43

just missing string tempids 😛

ovan20:11:07

@robert-stuttaford, just listened to a defn podcast where you talk about Datomic. In light of recent changes it was fun to hear the part about the problems with the peer-based licensing model. 🙂 Anyway, thanks for doing the podcast; really helpful information for our team as we're considering Datomic for our next project.

marshall20:11:18

String tempids are in the peer too

curtosis20:11:03

ah - that's not in the summary table: http://docs.datomic.com/clients-and-peers.html

curtosis20:11:36

but it is in the text later: "Peers continue to support tempid structures, and in addition they also support the new client tempid capabilities."

timgilbert20:11:43

@zane: better late than never, I hope, but database values have an :id attribute which generally points to the URL of the connection they came from
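
A minimal sketch of how zane's memoization could key off that, pairing the db's :id (per timgilbert's tip) with d/basis-t so each distinct database value gets its own cache entry; memoize-db-fn is a made-up name and the cache grows without bound:

(defn memoize-db-fn [f]
  (let [cache (atom {})]
    (fn [db & args]
      (let [k [(:id db) (d/basis-t db) args]]
        (if-let [entry (find @cache k)]
          (val entry)
          (let [result (apply f db args)]
            (swap! cache assoc k result)
            result))))))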

robert-stuttaford20:11:29

thought so. so, if you want to use peer-server and a durable db in dev, you're starting 3 processes now

robert-stuttaford20:11:54

@ovan, yeah 🙂 how quickly our discussion became legacy! So totally happy about the changes this week. If you have any questions in aid of your decision, let's have 'em. I love learning about other contexts

ovan20:11:43

@robert-stuttaford, thanks. I do have a couple of questions if you don't mind. You mentioned in the podcast that you ran the first year or so with Postgres as a storage backend and only later moved to DynamoDB. Would you do the same again, or just start directly with Dynamo? I'm mainly concerned about operational aspects like tuning the capacity. Also, what's your experience with operating the transactors? Any surprises that were hard to debug or fix?

ljosa23:11:33

Is there any reason not to use the dev storage in production in a situation where persisting to the local file system is okay? Is it named dev to discourage production use?

ljosa23:11:51

Also, is there any difference between the dev storage and the free storage that is in the free version of Datomic?