This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
Hello, any reason the https://github.com/onyx-platform/onyx-kafka-0.8 is on 0.8 and in "maintenance mode" ?
Hi @nha. Because it’s the 0.8 plugin, using a different dependency which supports 0.8. That's why it’s in maintenance mode, while the mainstream plugin supports 0.9 (and possibly 0.8)
For what it's worth, I tested the 0.9 bindings with 0.8 - they're incompatible.
If you're trying to prevent an empty batch being processed by a lifecycle, it's probably better to just check whether the batch is empty and keep the batch timeout long
@lucasbradstreet: This is a symptom of the way we're using Onyx in this project. It may offend your sensibilities but we're basically only ever firing one segment
@acron: Unless the job completes very quickly, you should probably look at redesigning that somehow. You're getting the coarsest fault tolerance possible. If any step in that single segment's processing fails, it needs to go back to the root task that it came from.
For segments that process quickly, it's completely acceptable. But for one segment per job you're potentially paying a heavy price.
@michaeldrogalis: yeah, we realise we're in non-standard territory but there are still elements of a job that are asynchronous and the way we've designed peers is that they can participate in multiple jobs
Well, the peers have a bucket of fns - each job can be any arrangement of those fns - so one job might be A->B->C, another job might be X->Y->Z
Hard to say since I'm not looking at the code, but I think Onyx can already do what you're thinking of without any extra code. Every virtual peer can participate in any task, unless you used tags to specify otherwise. The only iron-clad guarantee right now is that every virtual peer will work on at most one task at a time.
Which is why typically all functions get deployed to all peers. Onyx will selectively use them.
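For reference, the "bucket of fns, any arrangement per job" idea maps onto Onyx workflows, which are plain vectors of `[from to]` task pairs. The following is a minimal sketch in plain Clojure with no Onyx dependency; `linear-workflow`, `fn-task`, and the `:my.app/...` function names are hypothetical helpers and names made up for illustration, and only the `:onyx/...` catalog keys are taken from Onyx itself.

```clojure
;; Hypothetical helper: turn an ordered list of task names into the
;; vector of [from to] edges that an Onyx workflow uses.
(defn linear-workflow
  [task-names]
  (mapv vec (partition 2 1 task-names)))

;; Hypothetical helper: a catalog entry for a function task.
;; The fn keywords passed in below are illustrative only.
(defn fn-task
  [task-name fn-kw]
  {:onyx/name task-name
   :onyx/fn fn-kw
   :onyx/type :function
   :onyx/batch-size 1})

;; Two jobs arranged from the same shared bucket of functions.
(def job-1-workflow (linear-workflow [:in :a :b :c :out]))
(def job-2-workflow (linear-workflow [:in :x :y :z :out]))

(def job-1-fn-tasks
  [(fn-task :a :my.app/a)
   (fn-task :b :my.app/b)
   (fn-task :c :my.app/c)])
```

Since every peer has all the functions on its classpath, each job only needs its own workflow and catalog; no per-job peer setup is implied by this sketch.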
And as we're in the unique circumstance where we know there's only one segment,...hence my question about the timeout
Right. Yeah, we don't support indefinitely blocking. It's too far outside of what it was designed for. You can jack up the timeout super high, but that's just a bandaid. What's the harm in processing an empty batch?
We've written some plugins to introduce state into the job - this allows us to merge tasks in a job and also introduce loops... we need to add empty batch handling into those plugins, that's all
Empty batch checking is the way to go, though. I can't think of another way to handle it without introducing new primitives into the streaming engine.
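A rough sketch of what that empty batch check could look like in a lifecycle function. It assumes the batch is available under `:onyx.core/batch` in the event map (as in Onyx 0.9.x) and that lifecycle functions take the event and the lifecycle entry and return a map. `seen`, `handle-batch!`, the `:stateful-task` name, and the `:my.app/calls` keyword are hypothetical stand-ins for whatever the custom plugin actually does.

```clojure
;; Hypothetical state, standing in for the plugin's own state.
(def seen (atom []))

(defn handle-batch!
  "Hypothetical stand-in for the plugin's real work on a batch."
  [batch]
  (swap! seen into batch))

(defn after-batch
  "Lifecycle-style function: does nothing when the batch is empty,
  otherwise hands the batch to handle-batch!. Always returns a map."
  [event lifecycle]
  (let [batch (:onyx.core/batch event)]
    (when (seq batch)
      (handle-batch! batch))
    {}))

;; Calls map that a lifecycle entry would point at.
(def calls {:lifecycle/after-batch after-batch})

;; Lifecycle entry for the job; the task name and calls keyword are
;; illustrative only.
(def lifecycle-entry
  {:lifecycle/task :stateful-task
   :lifecycle/calls :my.app/calls})
```

With a check like this in place, a longer `:onyx/batch-timeout` on the task's catalog entry only reduces how often empty batches show up; it does not need to eliminate them.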
Onyx 0.9.7 is officially out. The plugins and documentation are still building, but you can get core from Clojars right now. The rest of the build should finish in the next 2 hours.
With Onyx that would be a quick way to get multiple sequential aggregations while preserving fault tolerance.
@michaeldrogalis: it was from this article: http://hortonworks.com/blog/storm-kafka-together-real-time-data-refinery. The summary was:
* Incrementally add more topologies/use cases
* Tap into raw or refined data streams at any stage of the processing
* Modularize your key cluster resources to most intense processing phase of the pipeline
I understand the first two reasons, but not
> Modularize your key cluster resources to most intense processing phase of the pipeline
I suppose I don’t understand what extra modularity is achieved. I'll have to research a bit and see if anything clicks. I recall seeing a talk by another company that did something very similar.
> With Onyx that would be a quick way to get multiple sequential aggregations while preserving fault tolerance.
@gardnervickers: How does interviewing kafka introduce more fault tolerance? Thanks!!!!