This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-08-01
Channels
- # admin-announcements (8)
- # arachne (11)
- # beginners (17)
- # boot (64)
- # cider (26)
- # cljs-dev (7)
- # cljsrn (1)
- # clojure (115)
- # clojure-belgium (2)
- # clojure-dusseldorf (15)
- # clojure-poland (15)
- # clojure-russia (62)
- # clojure-spec (86)
- # clojure-uk (208)
- # clojurescript (36)
- # cursive (4)
- # datavis (11)
- # datomic (44)
- # editors (9)
- # hoplon (21)
- # jobs (4)
- # mount (21)
- # off-topic (3)
- # om (113)
- # onyx (65)
- # parinfer (2)
- # perun (3)
- # proton (6)
- # re-frame (29)
- # reagent (20)
- # yada (3)
What's the difference between onyx-starter and onyx-template?
When I try to run `lein test` inside my newly created template I get an exception involving BookKeeper. I'm 50% sure it's because I need to execute it inside the Docker container.
org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: Cookie [4
bookieHost: "10.6.6.178:3196"
journalDir: "/tmp/bookkeeper_journal/3196"
ledgerDirs: "1\t/tmp/bookkeeper_ledger/3196"
] is not matching with [4
bookieHost: "192.168.0.11:3196"
journalDir: "/tmp/bookkeeper_journal/3196"
ledgerDirs: "1\t/tmp/bookkeeper_ledger/3196"
]
^ 192.168.0.11 is my docker_host IP. I suppose I'm not sure what the development flow should look like. On another project that used Docker, I developed/tested the app inside the Docker container by using volumes to avoid re-building each time. Would something like that work here?
192.168.0.11 is not accessible from within a Docker container
Are you running bookkeeper externally?
^ I get the same exception with the leiningen template
git clone -> lein test -> that exception
err, lein new rather
@gardnervickers: I thought there was a local BookKeeper & ZooKeeper that Onyx used for testing. I'm just doing what codonnell laid out and expecting it to work. This is my first time stepping outside the onyx-learn session, so prepare for more newbie questions 🙂
I’m not seeing this on a new template. Try this? http://www.onyxplatform.org/docs/user-guide/latest/faq.html#cookie-exception
> Are you running bookkeeper externally?
I'm not personally doing anything BookKeeper-related at this point, which might be the problem.
Success! I'll make sure to check the FAQ next time.
@codonnell: ^ try removing the BookKeeper folders as per @gardnervickers' suggestion above. My `lein test` runs without issues now.
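For anyone hitting the same cookie exception: the FAQ fix boils down to deleting the stale local BookKeeper state so the cookie is regenerated on the next run. A minimal sketch in Clojure, assuming the `/tmp` paths shown in the exception above (adjust them if your BookKeeper config uses different directories):

```clojure
;; Sketch of the FAQ fix: delete stale local BookKeeper state so the
;; cookie is regenerated on the next run. Paths are the ones from the
;; exception above and are assumptions about your local setup.
(require '[clojure.java.io :as io])

(defn delete-recursively!
  "Deletes a directory tree, children before parents."
  [path]
  (let [f (io/file path)]
    (when (.exists f)
      (doseq [child (reverse (file-seq f))]
        (io/delete-file child)))))

(delete-recursively! "/tmp/bookkeeper_journal")
(delete-recursively! "/tmp/bookkeeper_ledger")
```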
@gardnervickers @drewverlee thanks for the suggestion; worked for me as well. I'm embarrassed I didn't see that earlier.
Does an Onyx peer have any type of health check that can be given to something like Marathon?
There’s not really a ‘good/bad’ health check, but you can build your own monitoring config that will give you a pretty good idea based on the metrics that you would generally care about. I just built a Prometheus endpoint for Onyx that might help here. I know Kubernetes can integrate with that, but I’m not sure about Marathon
@camechis: All I’m doing is `(spit "/opt/health" "ok")`
in Kubernetes
After I start up the peers
That's more a readiness thing though
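A minimal sketch of that pattern (the file path, config maps, and function name are illustrative assumptions, not the poster's actual code): start the peers, then write a marker file that a Kubernetes readiness probe, e.g. `exec: cat /opt/health`, can check.

```clojure
;; Sketch: once the peer group is up, drop a marker file for a
;; Kubernetes readiness probe (e.g. `exec: cat /opt/health`) to find.
;; The path and the surrounding wiring are illustrative assumptions.
(require '[onyx.api])

(defn start-peers-and-mark-ready! [peer-config n-peers]
  (let [peer-group (onyx.api/start-peer-group peer-config)
        v-peers    (onyx.api/start-peers n-peers peer-group)]
    ;; note the argument order: spit takes the file first, then the content
    (spit "/opt/health" "ok")
    {:peer-group peer-group :v-peers v-peers}))
```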
Yeah, just curious if there was something there or not so Marathon can do health check status for upgrades/rollbacks and such. @lucasbradstreet I am very interested in the Prometheus stuff. I was thinking of giving Prometheus a try for monitoring/metrics
@michaeldrogalis: some of the docs in user-guide are still in md format; I’m gonna convert them to adoc tonight
Hi, I had some questions with regards to Onyx, to see if our current set of problems would be a good use case for it. At the moment we have about 7 microservices, each implemented in Python, and they communicate with each other through Kafka queues, i.e. each microservice takes a message from a queue, does some work, and posts its result back on a different queue. At the end the result (fail/pass) gets posted to an Elasticsearch DB. Work enters the system through an HTTP POST call. We have several instances of some of the microservices, some a few more than others, as some tasks are longer running (downloads are needed in some and can thus take a long time).
- Would it be feasible to do something like this in Onyx?
- Would we still be able to upgrade each functional bit one at a time?
- Could we have more workers of a certain type?
A short answer would be sufficient, as the chances of me convincing my team to use Clojure/Onyx are pretty slim 😉 I was just wondering how I would solve a problem like this and whether Onyx would help here. Thanks in advance, Thomas
@vijaykiran: Cool, thanks. Let me know if there's anything I can do to aid.
@thomas: To the last two bullets - yes, Onyx would be good at this. It's less useful for processing very long running tasks. More specifically, if the time to process a single record in a task is high, the replay granularity is too coarse for your needs.
@michaeldrogalis: Kind of a fuzzy question but what would you consider a long running task, roughly?
@dignati: Anything where the cost of a full replay is too expensive. Probably anything higher than 2-3 minutes is prohibitive for most companies -- I'd suspect.
There's no functional problem with doing a replay for a very long running task - it's just a matter of asking if you're willing to pay the cost.
@michaeldrogalis: The replay overhead could be avoided by having a store somewhere that's used as a "did this already run?" check, right?
@dominicm: Using a centralized data store to track progress would make throughput really low. Onyx uses an in-memory algorithm to incrementally track progress with ~20 bytes of constant space per segment, and uses the input medium to handle restoring from a fault. http://www.onyxplatform.org/docs/user-guide/latest/architecture-low-level-design.html
See the "Messaging" section.
onyx-datomic depends on which input task it's using. You can read from the transaction log as a stream, or read from a partition of datoms at a particular basis-t. For the latter, we partition the full range into discrete zones, then propagate the zones downstream. Each zone's progress is tracked independently using ZooKeeper offsets.
thank you @michaeldrogalis and regarding time… what kind of time frames are we talking about… seconds, a few minutes? 5 or 10 minutes?
Sure thing.
@michaeldrogalis: Oh, I'm a fool. I thought that onyx-datomic was a write, not a read.
@dominicm: It does both 🙂
Oh, 😛. Then I really do not understand your answer about replay issues for datomic. I think my more general question is, how do I handle database writes with onyx, given that they might replay?
@dominicm: Is your question aimed at handling idempotent writes?
e.g. if I write record X twice, how do I make it show up once?
@michaeldrogalis: I think so, yes.
@dominicm: That's more dependent on your database and application than Onyx. It can happen with any distributed application. Writer tries to write, maybe fails, tries again.
@michaeldrogalis: Onyx is my first experience with distributed programming, so still trying to figure out where the lines are. I'll have a dig into some keywords now I know it's part of the larger problem 🙂 Thanks a bunch
@dominicm: Sure thing, good luck. Happy to answer any other questions.
See Leaf Tasks in the Functions section of the user guide.
If you’re fine not having message guarantees you can just write to the database in your leaf task. If you want to preserve at-least-once message processing you should make an output plugin.
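A hedged sketch of the leaf-task variant (the task name, namespace, and `insert-original!` helper are hypothetical, and connection management via lifecycles is omitted):

```clojure
;; Leaf task: an ordinary Onyx function at the end of the workflow that
;; writes each segment out as a side effect and returns it unchanged.
;; `insert-original!` is a hypothetical Cassandra write helper; in a
;; real job the session would come from a lifecycle, not a global.
(defn write-raw-segment [segment]
  (insert-original! segment)
  segment)

;; Catalog entry pointing at the function above; keys follow the
;; standard Onyx catalog shape, values are illustrative.
{:onyx/name :write-raw
 :onyx/fn   :my.app/write-raw-segment
 :onyx/type :function
 :onyx/batch-size 20
 :onyx/doc  "Stores the original segment before enrichment"}
```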
Ok, essentially what I am attempting here: we want to store the original segment (before processing) into Cassandra. We want to have the original data for any kind of future batch processing/debugging purposes
You'd still get at-least-once if you used a leaf task -- you'd just be missing out on better batching control and a few other things if you used a plugin.
Gotcha, we were thinking of having a little fork at the beginning of our pipeline that takes the original data out of Kafka and stores it in Cassandra on one side, and then processes/enriches on the other
Actually wondering what to use for a leaf with the window part, since the trigger is really the end for us?
:onyx/fn :clojure.core/identity
ok, so just use the identity function for our final leaf task if the window/trigger is our real final step in the pipeline
ok, cool! thnx @michaeldrogalis !
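For reference, a minimal catalog entry along those lines (the task name and batch size are illustrative, not from the thread):

```clojure
;; Final leaf task that just passes segments through; the windowed
;; aggregation and its trigger do the real output work.
{:onyx/name :final-leaf
 :onyx/fn   :clojure.core/identity
 :onyx/type :function
 :onyx/batch-size 20}
```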
I have a case where segments are processed in the context of some metadata which can change. Some external trigger will signal that updated metadata is available, and one way or another tasks need to be updated. Any "best practices" around this kind of scenario? The two main thoughts I've had are having some mechanism that just restarts the job when metadata changes, or perhaps having a plugin.
@dave.dixon: I'd recommend restarting the job since it's pretty quick, and it keeps mutability out of the picture. I'd probably be inclined to push a message onto Onyx's centralized log, and use a log subscriber to read along the log and look out for restart signals. Since log subscribers are stateless, you can run multiples of them to handle a fault.
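A rough sketch of that idea, assuming the log-subscriber API described in the Onyx docs of this era (`onyx.api/subscribe-to-log`, which pushes log entries onto a core.async channel); the `::metadata-updated` signal and `restart-job!` callback are hypothetical:

```clojure
;; Sketch: tail Onyx's centralized log and restart the job when a
;; custom restart signal shows up. Entry shape and signal name are
;; illustrative assumptions; check the Onyx docs for exact APIs.
(require '[clojure.core.async :as a]
         '[onyx.api])

(defn watch-for-restart-signals! [peer-config restart-job!]
  (let [ch (a/chan 100)]
    (onyx.api/subscribe-to-log peer-config ch)
    (a/go-loop []
      (when-let [entry (a/<! ch)]
        (when (= ::metadata-updated (:fn entry))
          (restart-job!))
        (recur)))))
```
Because subscribers are stateless, several copies of this loop can run at once for fault tolerance, as suggested above.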
@michaeldrogalis: Nice, thanks. With multiple subscribers, does only one subscriber receive a log event? Or should I just not worry about multiple job restarts?