onyx 2016-03-22 | Slack Archive

ymilky00:03:47

@richiardiandrea: you can use the embedded libraries I have, but I prefer them more for unit testing personally. If you want something running closer to a production environment but on your dev machine, I'd go with a docker image

ymilky00:03:25

This one isn't the best image ever, but it certainly is quick to get started with - https://github.com/wurstmeister/kafka-docker

richiardiandrea00:03:18

ah yes I remember that one

ymilky00:03:37

if not those, then as I mentioned https://github.com/ymilky/franzy-embedded and https://github.com/ymilky/travel-zoo together work. I haven't pushed a real version to clojars yet of anything but travel-zoo, but I plan on doing so when I get more testing from users.

ymilky00:03:29

If you use franzy-embedded, be sure to use the startable versions unless you want to write some extra code. I might put out one or two additions to help with that, but honestly the startable versions are good enough for 99% of people

richiardiandrea00:03:41

yeah I might go for this

ymilky00:03:02

my flow is to develop against docker and unit test with the others

richiardiandrea00:03:09

I will be franzy all the way 😄

richiardiandrea00:03:52

I guess the first step is testing and having a good strong dev environment

ymilky00:03:08

nice...well if you run into problems, let me know. I'd say also make sure you grab https://github.com/ymilky/franzy-admin to make your life easier for dealing with creating topics, partitions, fixing stuck things, etc.

richiardiandrea00:03:38

yes I will, I actually did not notice you had the embedded package

ymilky00:03:41

It's not too hard to get something going, but it takes some more time to develop your data flows with consumers and producers. For example, you will probably want to spend some time writing some threads and go blocks with core.async. I usually do this by wrapping things in components and starting the threads inside there, much like you see in the onyx source as well. I've also used manifold in conjunction with all of that and the rest of the JVM concurrency options. I might put out some examples of this kind of stuff, but my feeling is that everyone's flow is so different that copying one isn't really smart anyway.

richiardiandrea00:03:55

yes you are right, I was actually thinking of trying quasar and pulsar, concurrency (with clojure) is usually fun for me so I don't mind that 😉

ymilky00:03:06

ok sounds good, just make sure you are careful with the shutdown. I've helped a few people with problems and usually they're doing something like tight looping on a consumer that can never get shut down properly because the loop can never be exited. In core.async, exiting the loop is very easy if you just close a channel or send a value on a dropping buffer or promise chan

richiardiandrea00:03:46

yeah, thanks for the advice, it is true that it might be tricky to remember

ymilky00:03:36

the consumer should be polling in its own thread, and be shut down with wakeup! from another; it's pretty similar for the producer. I'm not sure why so many people keep doing things like creating and destroying consumers/producers in loops. That's not specific to Franzy, just something I've seen in all languages and in real-life Kafka code. Most of the time you want something running a long time.

richiardiandrea00:03:29

I wonder what happens in core.async if you throw an InterruptedEvaluationException, just went to check for a sec

richiardiandrea00:03:56

I might need to try it out. no useful googling so far

ymilky00:03:11

Why do you need to throw that exception? Or do you mean catching it inside a thread or go block that is running something else?

richiardiandrea00:03:49

no just wondering as corner case, no need of course, just digressing 😉

ymilky00:03:50

In general, I've never heard that you can/are supposed to interrupt core.async threads or go blocks except by closing them yourself, i.e. you can't interrupt all threads on exit without doing it manually

ymilky00:03:08

in practice it's actually not a problem I've found, especially using component or mount

richiardiandrea00:03:29

yes it does not make sense as you can return nil and this signals closing it

ymilky00:03:38

usually you've done something super wrong if you need to interrupt all threads at a low-level and you're not writing an os or something

richiardiandrea00:03:43

yeah

ymilky00:03:03

just close the channels and use alts! on a kill chan or control chan

ymilky00:03:59

most of the time for a loop, I also/instead just wrap the put in a when: if the put returns false, the channel is closed, which means you should stop recurring/looping
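
The kill-channel suggestion above can be sketched roughly as follows. This is a Go analogue, not core.async code: Go's `select` stands in for `alts!`, and closing `kill` plays the role of closing the control chan. The names `runWorker`, `collect`, and `poll` are hypothetical, and `poll` is only a stand-in for one unit of work such as a single consumer poll. One difference to keep in mind: in core.async a put on a closed channel returns false, whereas in Go a send on a closed channel panics, so the Go version has to rely on the kill channel alone.

```go
package main

// runWorker sends the results of poll on out until kill is closed.
// poll is a hypothetical stand-in for one unit of work.
func runWorker(kill <-chan struct{}, poll func() int, out chan<- int) {
	defer close(out) // the worker is the only writer, so it closes out
	for {
		select {
		case <-kill:
			// Shutdown requested from another goroutine: leave the loop.
			return
		case out <- poll():
			// poll() is evaluated on entry to select; if kill wins,
			// the value is simply discarded.
		}
	}
}

// collect starts a worker, takes n values, then signals shutdown and
// waits until the worker has actually exited.
func collect(n int) []int {
	kill := make(chan struct{})
	out := make(chan int)
	next := 0
	poll := func() int { next++; return next } // only called by the worker

	go runWorker(kill, poll, out)

	got := make([]int, 0, n)
	for i := 0; i < n; i++ {
		got = append(got, <-out)
	}

	close(kill) // ask the loop to stop
	for range out {
		// Drain any in-flight values until the worker closes out.
	}
	return got
}
```

Calling `collect(3)` from a `main` function returns `[1 2 3]`. The point of the sketch is that the loop always has an exit path that another goroutine can trigger, which is what a tight loop on a consumer usually lacks.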

richiardiandrea00:03:07

yeah this protocol of nil-ing for closing the chan really makes everything super smooth

richiardiandrea00:03:32

ok good, this is very useful and thanks! I will try your franzy packaging, I am actually excited about it 😉

gardnervickers05:03:14

Careful with closing a channel from the consumer side. Channels are for multi-writer->multi-reader concurrency and there’s no guarantee that, between your call to closed? and your >!!, the channel is still open.

ymilky05:03:41

I actually avoid ever closing anything from the consumer side. I just let the loop break due to putting on to a closed chan, or close it from the creator/producer, often a component that owns the chan, in its stop method, perhaps with a take from the go block itself to block until it's done. Perhaps I'm explaining badly, but essentially I never use closed?, just closed in the sense of the standard when/if check on the op. From what I've seen in onyx, e.g. the lifecycle/peer files, this is the same approach.
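
A minimal sketch of that ownership rule, again as a Go analogue rather than core.async code. `Component`, `StartComponent`, `Put`, and `Stop` are hypothetical names, not part of onyx or Franzy. The component that creates the channel is the only one that closes it, and `Stop` blocks on a `done` channel, which here plays the role of taking from the go block's own result channel.

```go
package main

// Component is a hypothetical owner of a channel and of the loop reading it.
type Component struct {
	in   chan int
	done chan struct{}
	seen []int // written only by the loop goroutine
}

// StartComponent creates the channel and starts the consuming loop.
func StartComponent() *Component {
	c := &Component{in: make(chan int), done: make(chan struct{})}
	go func() {
		defer close(c.done) // signals that the loop has fully exited
		// The loop never closes c.in and never asks whether it is closed.
		// It just ends when the owner closes the channel.
		for v := range c.in {
			c.seen = append(c.seen, v)
		}
	}()
	return c
}

// Put hands a value to the loop. It must not be called after Stop,
// because in Go a send on a closed channel panics.
func (c *Component) Put(v int) { c.in <- v }

// Stop closes the channel from the owning side, then blocks until the
// loop has finished, so reading c.seen afterwards is safe.
func (c *Component) Stop() []int {
	close(c.in)
	<-c.done
	return c.seen
}
```

Usage would be along the lines of `c := StartComponent(); c.Put(10); c.Put(20); seen := c.Stop()`, after which `seen` holds `[10 20]`. Keeping the close in one place avoids the check-then-act race mentioned earlier, since nothing ever has to ask whether the channel is still open.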

lvh15:03:55

In the template, why is the Aeron driver separate from the peers?

lvh15:03:16

In order to improve test coverage and reduce the amount of code there, I’m looking at potentially merging them since it doesn’t seem like I ever want one without the other

michaeldrogalis15:03:16

@lvh: In production, you'll likely want to run Aeron in a separate JVM to avoid thread contention. We gave it its own namespace so that you can have a separate -main, which would ease the transition.

michaeldrogalis15:03:50

Aeron is pretty sensitive to context switching if there are too many active threads. Certainly for development you can put them together, though.

lvh15:03:36

aha, OK, gotcha

lucasbradstreet16:03:38

It's also so that if you get a big GC pause in your main JVM, you don't take out your media driver

avocade21:03:12

@michaeldrogalis: just a quick Q, is there a plan to move from zookeeper to etcd, and to other more nimble components like that for the other services, to get a lighter dependency stack?

avocade21:03:14

btw thanks for onyx, I loved storm but was blown away by the data pureness and comprehensiveness of your project

michaeldrogalis21:03:45

@avocade: Not at the moment, no. Etcd can't act as a substitute for ZooKeeper. It lacks a number of primitives - the most prominent being ephemeral nodes, which we in turn use as observation triggers.

michaeldrogalis21:03:52

Thank you.