#datomic
2018-11-23
grzm08:11:35

I’m working on CI with ions, so I’m looking to use datomic.ion.dev in CodeBuild. What minimal permissions are required to access the datomic-cloud S3 repos? (Or perhaps I’m approaching this wrong?)

stijn10:11:10

@grzm: as far as I know, it's currently impossible to use CodeBuild unless you are in the same region as the datomic-releases S3 bucket. CodeBuild connects to the outside world through a specific AWS-managed VPC endpoint, and that doesn't allow cross-region S3 requests. I have filed a support ticket with Cognitect (@jaret). If you are in the same region (is it us-east-1? I don't know), I believe it should be possible. On the permissions, I have no answer; we were testing with admin permissions first and will narrow it down once everything works. It would be good to have some documentation, though.

grzm15:11:00

Thanks, @U0539NJF7. Can you add more to “impossible to use CodeBuild”? I’m now thinking of maybe creating a Docker image with the ion-dev jars and using that in the CodeBuild environment.

stijn16:11:03

@grzm: this is the feedback I got from AWS Support

stijn16:11:06

For security and performance reasons, the traffic from CodeBuild is configured to egress via a VPC endpoint only. The VPC endpoints used are the VPC endpoints of the AWS service. Therefore, even if you have not configured CodeBuild to use a VPC endpoint, the traffic gets routed via the VPC endpoint of the AWS service. And unfortunately, VPC endpoints do not support cross-region requests[1].

If we want to access an S3 bucket in a different region, then our best option will be using Cross-Region Replication with a destination bucket in the same region as our CodeBuild project. When an object is created in the source bucket, it will automatically be replicated to the destination bucket by S3, so there is no need to manually copy the object to the destination bucket. Although the operation is asynchronous, objects are typically replicated nearly instantly. Please see this documentation[2] for more information about cross-region S3 replication, and this documentation[3] for information on how to set it up.

grzm17:11:51

Thanks for that. Good stuff to think about. How are you currently doing CI?

mkvlr12:11:20

We’re starting to work on https://nextjournal.com/mk/datomic, a runnable article about Datomic. The goal is to enable others to learn Datomic without having to do any setup. I’ve included the Datomic Free license at the end of the article. If possible, I’d like to get confirmation from someone at Cognitect that it’s OK to do this. The way I read the license it should be, but it would be great to get confirmation.

👏 2
👌 1
mkvlr12:11:25

To be clear: Datomic Free is downloaded in this article and turned into a Docker image, which is later reused without having to download it again.

marshall14:11:51

@mkvlr Yes, Datomic Free is redistributable, so that is fine

mkvlr14:11:26

@marshall awesome, thanks!

Chris16:11:26

I'm considering an event sourcing application. Although Datomic is a good fit for most of the application's needs, the rate of events is likely to be high, maybe 1000/s and I understand this is not ideal given the transactor. Would it help to batch these events so there are fewer writes per second, even if they are larger writes and the overall data rate remains the same?

dustingetz16:11:38

It is my understanding that the transactor is not actually the bottleneck here; the limit is the total number of datoms, which drives the size of the db indexes. The rule of thumb per https://www.datomic.com/cloud-faq.html is 10 billion datoms. Spread over one year, 10,000,000,000 / 365 / 24 / 60 / 60 ≈ 317 datoms per second average throughput.
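
The division in the message above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the one-year horizon is implied by the division by 365, and the ten-year variant and the function name `sustained_rate` are illustrative assumptions, not something stated in the conversation.

```python
# Rule-of-thumb total number of datoms quoted in the message above.
DATOM_BUDGET = 10_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def sustained_rate(total_datoms: int, years: float) -> float:
    """Average datoms/s that would use up `total_datoms` in `years` years."""
    return total_datoms / (years * SECONDS_PER_YEAR)


print(round(sustained_rate(DATOM_BUDGET, 1)))   # 317
print(round(sustained_rate(DATOM_BUDGET, 10)))  # 32
```

The figure is an average over the chosen horizon, so it says nothing about short bursts of writes.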

Chris11:11:21

Thanks, that’s very helpful. It seems I probably can’t use Datomic for this application then, it is likely to exceed that average. A shame, it was my first choice.

samcgardner11:11:58

You can safely write well over 300 Datoms/s assuming that many of your events modify existing Datoms (which most use-cases primarily do). If you really do want to mostly append new keys C* is probably a closer fit

dustingetz14:11:33

@U8S4V8JE5 Do you have evidence of this? (I understand the reasoning – the index size for present-time queries should reflect the total number of datoms under consideration – but a comment from marshall suggested this may not be the case – http://tank.hyperfiddle.net/:dustingetz.storm!view/~entity('$',17592186047105) )

samcgardner14:11:00

So I think the issue represented there isn't the same thing - that's just saying that you have to perform very large commits to your backing storage if you have massive transactions or blobs, and that's generally problematic for most backing stores. I'm just making the point that there's an enormous difference between 300 inputs per second of any kind and 300 new writes per second - I know anecdotally of usage that's well above 300/s, so it depends on the eventual number of datoms in the DB, not the number of writes/updates

samcgardner14:11:45

I don't think I added anything to what you said, was just clarifying for Chris as he seemed to take your comment as a "no" for Datomic in his use-case, which wasn't clear to me from what he said

Chris16:11:45

Thanks for the further detail. I’m struggling a bit to keep up but it seems maybe Datomic could work. The use case is building a graph with edge weights incremented based on a stream of events, plus the occasional new node. The number of nodes would be a few thousand, and I’d expect them to average 100 edges each, so maybe only a million datoms total. But the rate of updates is pretty high - 1000 events/s, with each event updating 10-100 edges. I think these could be batched to an extent, but couldn’t say how much that would help.
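
One way to picture the batching Chris mentions is to coalesce increments per edge before writing, so that a batch produces at most one write per distinct edge. The sketch below is plain Python and does not touch Datomic. The event shape (a list of edges per event, each adding 1 to the weight) and the name `coalesce` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple

Edge = Tuple[str, str]  # hypothetical (from-node, to-node) pair


def coalesce(events: Iterable[Iterable[Edge]]) -> Counter:
    """Sum the increments for each edge across a batch of events."""
    totals: Counter = Counter()
    for edges in events:
        totals.update(edges)  # each occurrence of an edge adds 1
    return totals


# Three events touching overlapping edges.
batch = [
    [("a", "b"), ("a", "c")],
    [("a", "b"), ("b", "c")],
    [("a", "b")],
]
totals = coalesce(batch)
print(totals[("a", "b")])  # 3
print(sum(len(e) for e in batch), "increments ->", len(totals), "edge writes")
```

How much this helps depends on how often events in the same batch hit the same edges. The number of writes per batch can never exceed the number of distinct edges in the graph, however many events arrive.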

dustingetz22:11:02

You might be able to just test it at the repl

👍 1