#datomic
2018-01-17
James Vickers00:01:15

Has anyone compared Datomic's basic write performance against other SQL databases? I wrote some tests that pit PostgreSQL against Datomic and in most cases Datomic is 2-5x slower - even with tests that always insert new records. Is that other people's general expectation also, or do I need to do some tuning (I run Datomic with defaults) to make write performance somewhat comparable with other SQL databases?

val_waeselynck06:01:19

@U7M6RA2KC the license forbids publishing such benchmarks (I know, it sucks).

val_waeselynck06:01:25

However, that's not necessarily a very interesting benchmark - for most uses of a SQL db like postgres, the reads will slow down the writes, so raw write capacity won't be the limiting factor for throughput

val_waeselynck06:01:18

That Datomic has lower raw write throughput is to be expected IMO, because of the differences in the index data structures

fmind13:01:44

@U06GS6P1N agree. Would you know the difference with other storage backends like Cassandra or DynamoDB?

James Vickers14:01:26

Thanks for the replies. Maybe I'll do a performance test that includes some Peers reading while writes are in progress. The difficulty is selling colleagues on a database with the initial premise that it might be slower than the more common one (though, like you're saying, in practice it might be faster due to the separation between Peers and Transactor)

val_waeselynck18:01:18

@U7M6RA2KC 2 aspects you should sell: low-latency, horizontally scalable reads + reads stay available even when writes are overwhelmed, which is a greeaaaaat situation to be in operationally

stuarthalloway14:01:20

refresh your cache

stuarthalloway14:01:09

@petrus where did that inbound link come from?

Petrus Theron14:01:55

Looks like the docs moved to sub-path /cloud

stuarthalloway14:01:02

thanks @petrus ! — investigating

Petrus Theron14:01:49

The pricing layout on the AWS Marketplace is confusing. It's not obvious that the "Infrastructure Pricing Details" is not a line-item total at the bottom, although it's laid out like an invoice. I didn't realise I could click between the "Datomic Cloud Bastion" and "Datomic Cloud" (do I need both? Is Bastion like a "lite" version?), so I went ahead and accepted thinking the total cost was only that of a t2.nano, but later on it shows I will be billed for both separately? Link: https://aws.amazon.com/marketplace/pp/prodview-otb76awcrb7aa

stuarthalloway14:01:37

@petrus we agree — that layout is compelled by AWS, and we are working with them to implement improvements

stuarthalloway14:01:27

@petrus happy to explicate here

stuarthalloway14:01:51

@petrus you want the bastion so you can connect to your system from the internet

Petrus Theron14:01:45

with Datomic Cloud, would I automatically get any updates, e.g. if a bug or vulnerability is discovered?

stuarthalloway14:01:12

@petrus depends. AWS can auto update you for e.g. meltdown

Petrus Theron14:01:42

cool - I mean Datomic versioning, not so much OS/VM/hardware level

stuarthalloway14:01:42

But for a Datomic bug you would need to update your CloudFormation stack after notification from us.

stuarthalloway14:01:43

You would not need to go back to the marketplace site — you could just grab an upgrade per https://docs.datomic.com/cloud/operation/upgrading.html

Petrus Theron15:01:01

Are "On Prem" upgrades also rolling? I manually set up a transactor recently. No way to "migrate" my existing system to Datomic Cloud?

sleepyfox15:01:21

For the production topology of Datomic Cloud, are writes scaled across the number of nodes in the tx group or is there still only a single transactor per database?

stuarthalloway15:01:24

@petrus you can do rolling upgrades to On-Prem. Migration is an ETL job, we will provide tools but have not done so yet

stuarthalloway15:01:04

@sleepyfox each db will have a preferred node for writes. Cloud will not make writing to a single db faster, but you can have many more dbs on a system as you scale horizontally. See https://docs.datomic.com/cloud/operation/scaling.html
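The "scale by adding databases, not by speeding up one database" point can be sketched with the Cloud client API. A minimal sketch, assuming `datomic.client.api`; the region and system/database names are placeholders, and the client map may need extra keys (e.g. `:endpoint`, `:creds-profile`) depending on your setup:

```clojure
(require '[datomic.client.api :as d])

;; One client per system; many databases per system.
(def client
  (d/client {:server-type :cloud
             :region      "us-east-1"    ; placeholder
             :system      "my-system"})) ; placeholder

;; Each database has its own serialized write path, so a system's
;; total write throughput grows by adding databases, not by making
;; any single database's writes faster.
(d/create-database client {:db-name "service-a"})
(d/create-database client {:db-name "service-b"})

(def conn-a (d/connect client {:db-name "service-a"}))
```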

stuarthalloway15:01:15

@petrus notes for anyone considering On-Prem to Cloud migration: https://docs.datomic.com/on-prem/moving-to-cloud.html

robert-stuttaford15:01:08

@stuarthalloway dude 🙂 btw “On-Pre” heading typo in “Other Differences” section in your moving-to-cloud doc

robert-stuttaford15:01:25

@stuarthalloway, what does “Symbol magic” mean? (nm found it)

stuarthalloway15:01:57

conversion of strings to symbols to help languages without a symbol type

stuarthalloway15:01:05

I am looking at you, Java

robert-stuttaford15:01:25

how much sleep have you had, Stu? 🙂

johnj15:01:45

does the transactor and peer server run in the same ec2 node for the solo version?

mitchelkuijpers15:01:33

Really looking forward to the cloudsearch integration, we would totally love to move over to the cloud offering

mitchelkuijpers15:01:22

In the docs it says "future" - would that mean months, or should I think it might take a year?

stuarthalloway15:01:06

@lockdown- the arch change is bigger than that — there are no transactors or peer servers, just cluster nodes. That said, on Solo there is only one node. 🙂 See https://docs.datomic.com/cloud/whatis/architecture.html for more.

robert-stuttaford15:01:47

@stuarthalloway does this mean that cloud could potentially handle more writes somehow?

stuarthalloway15:01:17

@robert-stuttaford more per system: yes, more per db: no

stuarthalloway15:01:00

1 transactor + 1 for HA <= N cluster nodes

robert-stuttaford15:01:28

ok - so the solo topology is actually a transactor and a peer-server rolled into one, and adding node 2 takes you to HA writes + load balancing and adding more nodes adds more read scale after that. right?

robert-stuttaford15:01:45

of course, this summary ignores all the changes on the storage layer

stuarthalloway15:01:51

there are no transactors, any cluster node can handle any write

stuarthalloway15:01:26

that won’t allow more writes per db because the underlying CAS in DDB is still the gatekeeper

stuarthalloway15:01:43

but you can have many more dbs

robert-stuttaford15:01:07

is there a different theoretical datom limit for cloud?

robert-stuttaford15:01:56

the on-prem limit is due to peer memory to hold the roots. i guess peer-server has a similar concern?

stuarthalloway15:01:57

dbs and query still use the same data structures, so nothing is really different there

bmaddy15:01:08

@stuarthalloway I (and some of the people I'm working with) think your comment about just having cluster nodes is pretty darn cool. It might be worth considering putting that on the main marketing page. I'm excited to try this thing out!

stuarthalloway15:01:26

@bmaddy thanks! Will consider.

robert-stuttaford15:01:28

@stuarthalloway can we now control read-only vs read/write at the connection level? (with client / cloud)

stuarthalloway15:01:53

@mitchelkuijpers TBD on search integration, user demand will drive

mitchelkuijpers15:01:52

That makes sense

eggsyntax15:01:54

@stuarthalloway re "there are no transactors or peer servers, just cluster nodes" -- the move away from a single-transactor model seems huge (and like it must have been extremely challenging to do while maintaining same guarantees).

stuarthalloway15:01:09

@robert-stuttaford IAM integration is pretty deep, so you can give a client IAM creds that are e.g. read only for a db https://docs.datomic.com/cloud/operation/access-control.html#sec-2

stuarthalloway15:01:35

@eggsyntax at the risk of making it sound less cool, that part was actually pretty easy

stuarthalloway15:01:33

transactors always had that guarantee — if we removed the code that manages HA, you could have N transactors. Semantics would be fine but perf would be terrible

eggsyntax15:01:01

Interesting, I never realized that.

shaun-mahood16:01:15

Congrats to the whole Datomic team - really glad you managed to get through all the AWS hoops!

eggsyntax16:01:39

Seconded! Exciting stuff 🙂

stuarthalloway16:01:04

thanks! it has been intense working through the process

chrismdp16:01:00

Hello! Just using your Cloud Formation template now - is there a way of setting up the template to use an existing VPC?

stuarthalloway16:01:53

@cp not at present. We did a bunch of testing and found too many variables to support

stuarthalloway16:01:59

@cp happy to discuss in more depth if that is a blocker

stuarthalloway16:01:15

Note that the bastion lets you have dev access from the internet if you want

chrismdp16:01:49

We’re just figuring out how we’d connect our running services from different VPCs into the nodes

stuarthalloway16:01:21

yes, that seems to be the marketplace preferred way

chrismdp16:01:07

I’m guessing we do that via the bastion

chrismdp16:01:55

^^ that makes sense

stuarthalloway16:01:08

use peering (not the bastion) for prod

stuarthalloway16:01:17

your security auditor will tell you the same thing 🙂

chrismdp16:01:54

cool - that makes sense thanks

chrismdp16:01:07

congrats on the launch!

chrismdp16:01:18

one more quick question: why is the API endpoint a datomic.net URL? (cf https://docs.datomic.com/cloud/getting-started/connecting.html#creating-database) - I’m wondering how the plumbing all fits together

stuarthalloway16:01:28

@cp that name lives only locally inside Datomic Cloud’s VPC, allowing clients to have a stable name to connect to
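For reference, that in-VPC name is what goes in the client config. A sketch only, assuming the client API of that era; the region, system, profile, and port values are placeholders taken from the connecting doc's pattern:

```clojure
(require '[datomic.client.api :as d])

;; The :endpoint DNS name resolves only inside the Datomic Cloud VPC
;; (or via the bastion's SOCKS proxy during dev), giving clients a
;; stable name regardless of which cluster node answers.
(def client
  (d/client
    {:server-type   :cloud
     :region        "us-east-1"                                           ; placeholder
     :system        "my-system"                                           ; placeholder
     :endpoint      "http://entry.my-system.us-east-1.datomic.net:8182/"  ; in-VPC name
     :creds-profile "my-profile"                                          ; placeholder
     :proxy-port    8182}))
```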

chrismdp16:01:22

cool - thanks

souenzzo16:01:51

[aws/client - off] I'm on peer API. as-of db's can't be with-ed? I'm trying to do that, but it's not working

(let [db-as-of (d/as-of db t)
      ;; :my-attr is "isCompomnent true"
      {:keys [db-after]} (d/with db-as-of [[:db/retract :my-attr :db/isComponent true]])]
  (:db/isComponent (d/entity db-after :my-attr))
  ;; => true
  )

stuarthalloway16:01:37

@souenzzo that is correct, you cannot with an as-of db

stuarthalloway16:01:15

intuition: as-of is not time travel, it is a filter
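A sketch of the point being made, using the peer API; `conn`, `t`, and `:my-attr` are placeholders from the example above. `d/with` needs an unfiltered basis, so speculate against the current db rather than an as-of view:

```clojure
(require '[datomic.api :as d])

(let [db       (d/db conn)
      db-as-of (d/as-of db t) ; a *filtered* read-only view, not a writable point in time
      ;; speculate against the unfiltered db instead of db-as-of:
      {:keys [db-after]} (d/with db [[:db/retract :my-attr
                                      :db/isComponent true]])]
  (:db/isComponent (d/entity db-after :my-attr)))
```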

souenzzo16:01:15

Is that in the docs? It fails with no error. Even the tx-data in the return is "correct". Very difficult to debug.

stuarthalloway16:01:00

Hm, it should be, I know it has come up before.

eggsyntax16:01:31

Looks like that requires login to view, and no public signup that I can see.

eggsyntax16:01:25

(not a problem for me personally; I was just going to look out of curiosity)

favila16:01:39

huh. I got to it via https://www.datomic.com/support.html click "feature request"

favila16:01:09

now that I think about it you probably need a http://my.datomic.com account

eggsyntax16:01:47

I have one but I'm not signed in; I'll see if that changes it.

eggsyntax16:01:25

Nope, still doesn't work -- unless you're suggesting I should actually put my http://datomic.com name/pwd into http://receptive.io.

eggsyntax16:01:05

Nor can I get to it via the "feature request" link on http://datomic.com. But again, NBD for me, just letting you know.

favila16:01:36

That may have been what I did

eggsyntax16:01:38

Aha! Clicking a different "feature request" link did work. 😜

eggsyntax16:01:36

The one at the very top of the page.

eggsyntax16:01:53

Next to "log out"

val_waeselynck18:01:00

Please do upvote the request, we're really missing out on something great there 🙂

stuarthalloway16:01:49

@favila until then it should at least throw an exception. Logging it for a future release.

stuarthalloway16:01:28

@souenzzo sorry that threw you, we will make it more failfast

denik17:01:31

@stuarthalloway re: cloud, is there a transition path from solo to production? Also the pricing (since it only shows EC2) is the same. At equivalent usage (what solo can handle) is prod expected to be more expensive? If so, how much?

stuarthalloway17:01:43

@denik you can transition from Solo to Production with a CloudFormation upgrade https://docs.datomic.com/cloud/operation/upgrading.html

stuarthalloway17:01:18

@denik if you are looking at the marketplace pricing, their site is currently not capable of telling you which instance types go with which topologies.

stuarthalloway17:01:06

Solo runs on a single t2.small, and Production runs (typically at least 2) i3.larges

shaun-mahood17:01:33

I'm going through the setup right now, and for Oregon I got $21/month on solo and $233/month on production as the estimated costs. I think it said that was for 2 i3.large.

stuarthalloway17:01:39

But there is no fixed equation: pricing is per hour, and running >1 instance for availability is up to you. So Production cost will vary substantially with use

stuarthalloway17:01:05

@shaun-mahood that sounds reasonable — start with Solo until you need more

denik17:01:10

Of course. Thanks @stuarthalloway. I’m very excited!

shaun-mahood17:01:40

Oh yeah, solo will probably be overkill for what I'm doing - I was just curious since the estimator was kind of impenetrable.

stuarthalloway17:01:49

“kind of” is generous

stuarthalloway17:01:05

@shaun-mahood belay that, the sales dept informs me you should run Production for everything, and forget you turned it on 🙂

marshall17:01:37

with an ASG size of 12

shaun-mahood17:01:45

Still cheaper than an on-prem license 🙂 (I mean, without Marshall's "help")

stuarthalloway17:01:03

@shaun-mahood yes. That text is inside the CFT, so triggers a deeper review process. We will fix it the next time we navigate that process.

stuarthalloway17:01:56

OTOH, we can change http://docs.datomic.com with an S3 push, so if the text-about-the-text could be better that is an easy fix 🙂

shaun-mahood17:01:54

So I'm guessing you have something you can point people to when they complain that requiring a jira ticket to change a docstring is too onerous? 🙂

stuarthalloway17:01:45

@shaun-mahood btw did you watch the video walkthrough? It is short https://www.datomic.com/videos.html

cjsauer17:01:18

Got a solo stack launched and transacted some movies. This is awesome 🙂

shaun-mahood17:01:48

Not yet, the setup so far has been exceptionally straightforward and easy from the setup instructions. Just waiting for the stack to finish creating in the cloudwatch dashboard right now.

stuarthalloway17:01:34

@cjsauer hooray! Out of curiosity, are you running as AWS owner or did you do the “authorize a Datomic admin with IAM” step?

cjsauer17:01:59

@stuarthalloway I used the admin group setup route and tested it with my non-root account. Worked like a charm.

marshall17:01:12

that’s great to hear

stuarthalloway17:01:38

“easy” vs “securious” always a challenge

stuarthalloway17:01:02

btw I think I just made that word up, it means “serious about security”

marshall17:01:39

i thought it meant curious about security

cjsauer17:01:08

Short for supersecuriousticexpialadocious

sleepyfox17:01:35

Have moved past the movie example and moved to importing real data. We used existing IAM users authorised with the datomic policy.

sleepyfox17:01:36

Have found it to be pretty straightforward, although AWS's console UI is (as usual) awful.

ljosa17:01:48

Right now, it looks like the ~$30 estimate for production with t2 in the AWS console is real.

marshall18:01:33

@sleepyfox awesome! Glad to hear it

marshall18:01:13

pushed a fix for that typo

denik18:01:18

stuck trying to run the socks proxy. aws ec2 describe-instances... returns the system name, aws iam get-user returns the user in the iam group, yet running the script returns Datomic system <my-system-name> not found, make sure your system name and AWS creds are correct.

stuarthalloway18:01:28

Hi @denik. Triple check your spelling of the system name and AWS region wherever they appear

denik18:01:54

@stuarthalloway already did, before the final message it also prints

To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument command: Invalid choice, valid choices are:

acm                                      | apigateway
autoscaling                              | cloudformation....

stuarthalloway18:01:38

@denik that sounds like an argument is not getting to an invocation of the AWS CLI inside the script

marshall19:01:35

@denik can you check the version of your aws cli?

marshall19:01:11

aws --version

denik19:01:13

@marshall aws-cli/1.10.24 Python/2.7.10 Darwin/16.7.0 botocore/1.4.15

marshall19:01:22

i think your CLI might be too old

marshall19:01:31

aws-cli/1.11.170 Python/2.7.14 Darwin/17.3.0 botocore/1.7.28

marshall19:01:35

is what i’m using

marshall19:01:49

i’m wondering if the version you have doesnt have the required aws sub-command(s)

marshall19:01:00

err that should be $ pip install awscli --upgrade --user

Dustin Getz19:01:07

Is this channel logged anywhere

Dustin Getz19:01:36

Shoot, logs at clojurians-log are busted, they don’t index anymore

jaret19:01:42

but that only goes up to 11-16

shaun-mahood19:01:43

I got everything up and running - the getting started docs and videos are excellent, thanks for putting in the effort to make them so solid.

jaret19:01:29

@denik did upgrading the CLI work for you?

denik19:01:39

currently yak shaving over the update process. I’ll keep you posted

denik19:01:32

updating did the trick. it works!

jaret19:01:59

Thanks! We’re making a note of that and will update the docs to reflect the need to upgrade.

denik19:01:44

Great. Note: I spent most of the time trying to update the CLI using the official AWS guide, which didn’t work on OSX. That’s why I thought I had updated even though I had not.

denik19:01:19

brew install awscli did the trick

jaret19:01:41

ok, good to know

jaret20:01:43

Query groups are coming soon as a means for scaling reads https://docs.datomic.com/cloud/operation/scaling.html#sec-3

jaret20:01:57

But they are not yet fully implemented 🙂

jaret20:01:05

We’ll have more documentation when they are

denik20:01:45

thanks @jaret for two separate projects, is the idea to create two different databases in the same system or two different systems each with one db?

jaret20:01:28

No, the intention is more for an Auto Scaling Group (ASG) of nodes used to dedicate bandwidth, processing power, and caching to particular jobs. Unlike sharding, query groups never dictate who a client must talk to in order to store or retrieve information. Any node in any group can handle any request.

jaret20:01:15

Oh apologies, I misunderstood your question and thought we were still discussing query groups.

jaret20:01:57

@denik Generally, you should create two separate stacks for separate projects. But can you tell me more about the projects? will they share data?

jaret20:01:30

@denik important to note that two databases in the same system is totally doable in cloud. But its the sort of thing that I’d need more details on in order to provide a full recommendation.

denik20:01:28

@jaret I see, they wouldn’t share data. I’d just like to have a system to quickly spin up durable experiments. Most of them will be deleted eventually (after months), but I’d want to migrate the ones that get traction into a new system eventually. It’s a bit like a (into new-system (select-keys databases [proj-specific...]))

jaret20:01:16

Yeah, that use case would be fully supported in one system. And definitely recommended.

denik20:01:41

@jaret as well as migration to a new system?

marshall20:01:19

there is not currently an API to “move” a db from one system to another

stuarthalloway20:01:51

likely to be some interesting requirements there ^^ e.g. re-encrypting

ljosa20:01:57

Does Datomic Cloud include support equivalent to what we get with the $5k on-prem license?

viesti21:01:19

hmm, appearance of Datomic Cloud made me look again on this (announced at last reInvent so quite new): https://aws.amazon.com/about-aws/whats-new/2017/11/aws-privatelink-on-aws-marketplace-now-available/

marshall21:01:03

@ljosa Support is covered here: https://www.datomic.com/pricing.html (bottom of page)

marshall21:01:21

we will be enabling AWS Marketplace PSC (basically opt-in to sharing your support contact info) asap

marshall21:01:05

at which point all users who are subscribed will have access to submit tickets/etc 24x7, shorter SLAs, etc are available as a separate support contract

ljosa21:01:48

Yes, I saw that but I wasn't sure how to interpret it. There was a question internally whether support was a reason to stay with the on-prem edition. I wasn't sure whether the support included with Datomic Cloud was equivalent to what we have now with Datomic Pro or whether we'd have to add a separate support contract.

marshall21:01:34

the included support with On-Prem (nee Pro) carries a shorter SLA (2 day IIRC)

marshall21:01:14

if you’re already using on-prem, I’d highly suggest you look at: https://docs.datomic.com/on-prem/moving-to-cloud.html

marshall21:01:48

Cloud is a new product, with different underlying structures and requirements, so it’s not “plug and play” WRT moving from pro to Cloud

ljosa21:01:07

Thanks! Don't worry, it would take us some time to be ready to switch over … for one thing, we'd have to turn our peers into clients.

marshall21:01:43

cool. just wanted to make you (and everyone) aware of the differences

ljosa21:01:21

do you have plans to eventually release client libraries for other languages, such as javascript?

jaret21:01:20

@ljosa We’re tracking which languages users want on our “suggest a feature” page located by following the link at the top right on http://my.datomic.com. We’re prioritizing issues here and looking for input on which clients are most desired.

ljosa21:01:43

does it matter whether I'm logged in with the paid company account or my own unpaid account?

jaret21:01:48

All votes count, but we weigh larger organizations that represent teams of developers over individual users. I’d recommend voting from your company account or an email with the same domain.

ljosa21:01:40

ok, I voted for javascript and erlang from [email protected]!

shaun-mahood21:01:13

@jaret: Any idea on the possibility of a CLJS client? I couldn't find an issue on receptive but I have this feeling that someone talked about it at some point.

jaret22:01:19

Its on our radar, but we should log a receptive request so we can gather customer feedback. I am going to log that one right now.

timgilbert22:01:18

Re: clients, I'd personally favor the work being put into a documented wire protocol versus like three language-specific clients. But I guess once the first one is released the protocol will be de facto documented

jaret22:01:12

Yeah it’s always been the intention to create more language libraries for Client and enable our customers to create their own. So that approach is being considered.

shaun-mahood22:01:21

There are 2 questions that I can't find guidance on in the docs - I've got hunches for both but they may be totally out to lunch. - Is there a recommended way to connect to datomic cloud from an application not running on AWS? - Is there a story for how to backup and restore a DB?

stuarthalloway22:01:45

@shaun-mahood re q2, there will be N stories for different purposes

stuarthalloway22:01:56

e.g. disaster recovery vs. moving db somewhere else vs redundant copies for safety

stuarthalloway22:01:10

… but not done yet, stay tuned

stuarthalloway22:01:58

@shaun-mahood re q1, for dev you should connect through the bastion server. Is that what you mean?

shaun-mahood22:01:13

I'm thinking of running our app locally rather than on AWS (at least for a while).

shaun-mahood22:01:58

I'm planning to integrate it with existing locally hosted storage and systems and gradually replace all the non-datomic and non-clojure bits, then pop the app hosting out to AWS.

stuarthalloway22:01:20

@shaun-mahood we intend to make tooling to help with that, but it is not ready yet

shaun-mahood22:01:41

This is my first real foray into AWS (outside of really S3 and really simple things), but I've lost my socks proxy connection once already so I assume it's not a good production connection.

shaun-mahood22:01:41

I'm glad I'm at least asking questions that make sense and are on the radar 🙂

stuarthalloway22:01:01

@shaun-mahood correct, socks proxy is for dev