onyx 2018-05-11 | Slack Archive

@lucasbradstreet thanks so much! That makes a lot of sense

👏 Congrats on releasing Pyrostore. Looks awesome! It's the exact tool to fit the architecture I've been working on for the last couple of years. Wish I had it when I started 🙂 Keep up the great work!

michaeldrogalis14:05:12

Thanks @manderson!

nrako16:05:37

For Pyrostore, I am trying to understand a bit more. I see this on the blog-

Pyrostore's consumer reads records directly out of cloud storage, and it's intelligent enough to cross its reads back into Kafka when records are not yet available in the archive.

I don’t think I follow what cross its reads back into Kafka means. Is Pyrostore intended to be a Kafka-history-stream alongside of Kafka to be used by select consumers? Or would you envision all consumers read from "cloud storage" as a scalable stream-with-cost-effective-scalable-storage around Kafka? Both/and?

michaeldrogalis16:05:37

@nrako Both, sort of. The archive in cloud storage will always be a little behind what's actually in Kafka because, physics. Consumers can choose the policy for which storage they read from when the records they want exist in both.

michaeldrogalis16:05:02

It lets you trade off read scalability (better against the cloud), latency (better against Kafka), availability (probably better against the cloud), etc.
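
A rough way to picture the policy described above is the sketch below. It is not Pyrostore's API: `Policy`, `Stores`, and `choose_source` are hypothetical names, and offsets are modeled as plain integers. The only idea taken from the conversation is that the archive trails Kafka, so reads past the end of the archive cross over into Kafka, and the policy only matters when a record exists in both.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    # Hypothetical policy names, chosen only for this illustration.
    PREFER_ARCHIVE = "prefer-archive"  # favors read scalability
    PREFER_KAFKA = "prefer-kafka"      # favors latency


@dataclass
class Stores:
    archive_end: int  # first offset NOT yet archived (exclusive bound)
    kafka_start: int  # oldest offset still retained in Kafka
    kafka_end: int    # next offset Kafka will assign (exclusive bound)


def choose_source(offset: int, stores: Stores, policy: Policy) -> str:
    """Pick which store serves a single offset."""
    in_archive = 0 <= offset < stores.archive_end
    in_kafka = stores.kafka_start <= offset < stores.kafka_end
    if in_archive and in_kafka:
        # The record exists in both places, so the consumer's policy decides.
        return "archive" if policy is Policy.PREFER_ARCHIVE else "kafka"
    if in_archive:
        return "archive"
    if in_kafka:
        # Not archived yet: the read crosses back into Kafka.
        return "kafka"
    raise LookupError(f"offset {offset} is in neither store")


stores = Stores(archive_end=100, kafka_start=80, kafka_end=120)
for offset in (10, 90, 110):
    print(offset, choose_source(offset, stores, Policy.PREFER_ARCHIVE))
```

With these made-up bounds, offset 10 is only in the archive, offset 90 is in both, and offset 110 has not been archived yet, so it is served from Kafka regardless of policy.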

nrako17:05:49

I see. Thanks for the note. So Pyrostore proposes to be an infinite, cost-effective replication of the Kafka stream, and where the Pyrostore consumers are subscribed (whether Kafka itself or archive) is a configurable implementation detail.

lucasbradstreet17:05:50

That’s a pretty good summary, yes.

nrako17:05:02

Sounds great. Just now reading Designing Data-Intensive Applications and thinking through what an implementation would look like. An infinite Kafka stream seems critical. Thanks for the feedback...

michaeldrogalis17:05:57

Great book 🙂

eoliphant18:05:07

Hi, I have a quick conceptual question. I’ve played around with Onyx for some simple use cases. I’m now looking into implementing something along the lines of Calderwood’s commander pattern, and just discovered there’s already an Onyx example 🙂 My question is more around ‘units of deployment’ with Onyx. Let’s say I take your commander example and it’s my ‘accounts’ processor, all good. Now I want to add a ‘customers’ processor to the mix, keeping its state in its own Datomic db. Is it simply a matter of a similar project that I jar up and point to the same ZooKeeper/Kafka/etc.?

lucasbradstreet18:05:07

@eoliphant Onyx is pretty flexible in this respect. The main thing is that the jar started for a given tenancy contains all of the code necessary to run the jobs for that tenancy.

lucasbradstreet18:05:53

@eoliphant so you could have two separate jars on two separate tenancies, each from a project that runs its own code. Or you could have a jar that is able to run code for both, on the same tenancy.

lucasbradstreet18:05:26

Or lastly you could have a jar that can run code for both on separate tenancies, which gives you some more scheduling / node isolation.
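
The three layouts above, and the one rule they all have to satisfy, can be modeled in a few lines. This is only a sketch and not Onyx code: the jar names, tenancy names, and the `missing_code` helper are hypothetical, and "code" is reduced to a set of processor names. It checks the rule stated in the thread, that a jar started on a tenancy must contain everything that tenancy's jobs need.

```python
def missing_code(deployments, jobs_by_tenancy):
    """Return {(jar, tenancy): processors the jar lacks} for broken deployments.

    deployments: iterable of (jar_name, tenancy, set_of_processors_in_jar)
    jobs_by_tenancy: dict of tenancy -> set of processors its jobs need
    """
    problems = {}
    for jar, tenancy, code in deployments:
        missing = jobs_by_tenancy.get(tenancy, set()) - code
        if missing:
            problems[(jar, tenancy)] = missing
    return problems


both = {"accounts", "customers"}

# 1. Two jars, two tenancies, each running only its own code.
separate = [("accounts.jar", "accounts-tenancy", {"accounts"}),
            ("customers.jar", "customers-tenancy", {"customers"})]
separate_jobs = {"accounts-tenancy": {"accounts"},
                 "customers-tenancy": {"customers"}}

# 2. One jar with code for both, on a single shared tenancy.
shared = [("combined.jar", "shared-tenancy", both)]
shared_jobs = {"shared-tenancy": both}

# 3. One jar with code for both, started on separate tenancies.
combined_split = [("combined.jar", "accounts-tenancy", both),
                  ("combined.jar", "customers-tenancy", both)]

for layout, jobs in ((separate, separate_jobs),
                     (shared, shared_jobs),
                     (combined_split, separate_jobs)):
    print(missing_code(layout, jobs))
```

All three layouts pass the check. A layout that would fail is a jar containing only the accounts code started on a tenancy that also runs customers jobs.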

eoliphant18:05:49

ok that helps. In my case these guys are basically microservice/command processors. so yeah they have all the code they need for the commands they handle and processing events they may be interested in. So conceptually they should be relatively independent. Based on what you’re saying, it sounds like, for me, each service/processor should be in its own tenancy

lucasbradstreet18:05:52

Sounds right. It’ll be easier to schedule as you can just add more nodes to a tenancy as you wanna scale up

eoliphant18:05:03

so beyond that, say these guys are dockerized, etc. I’d just run 1 to n copies for reliability, etc?

eoliphant18:05:11

gotcha

lucasbradstreet18:05:46

Yeah, you can add more peers than you need so the job will continue running as nodes fail
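
As back-of-the-envelope arithmetic for "more peers than you need", here is a minimal sketch. It assumes, as a simplification, that a job needs a fixed minimum number of peers and that any surviving peer can take over any task; the function names and the numbers are made up for illustration.

```python
def tolerable_failures(running_peers: int, required_peers: int) -> int:
    """How many peers can be lost before the job drops below its minimum."""
    if required_peers <= 0:
        raise ValueError("required_peers must be positive")
    return max(running_peers - required_peers, 0)


def job_keeps_running(running_peers: int, required_peers: int, failed: int) -> bool:
    """True if enough peers survive the given number of failures."""
    return running_peers - failed >= required_peers


# Hypothetical example: a job that needs 6 peers, deployed with 8.
print(tolerable_failures(8, 6))
print(job_keeps_running(8, 6, failed=3))
```

Under these assumptions the two spare peers cover two failures, and a third failure would leave the job short.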
