This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-11-25
Channels
- # bangalore-clj (5)
- # beginners (225)
- # boot (36)
- # cider (1)
- # clara (2)
- # cljsjs (1)
- # clojure (76)
- # clojure-belgium (1)
- # clojure-conj (1)
- # clojure-india (4)
- # clojure-italy (5)
- # clojure-korea (1)
- # clojure-russia (22)
- # clojure-spec (35)
- # clojure-uk (52)
- # clojurescript (67)
- # community-development (17)
- # core-logic (2)
- # cursive (2)
- # datascript (28)
- # datomic (44)
- # emacs (1)
- # funcool (3)
- # hoplon (14)
- # lein-figwheel (2)
- # leiningen (2)
- # luminus (3)
- # midje (3)
- # mount (2)
- # nyc (2)
- # om (54)
- # om-next (1)
- # onyx (30)
- # re-frame (57)
- # reagent (19)
- # ring-swagger (23)
- # slack-help (10)
- # spacemacs (2)
- # specter (1)
- # vim (23)
Please review and merge https://github.com/onyx-platform/onyx-dashboard/pull/74
I’m presenting an Introduction to Stream Processing with Kafka and Onyx at ClojureX next week.
@jasonbell Will it be recorded?
What is the best way to just discard segments in output? In my workflow, I’m only interested in aggregations which I sync with a trigger. Currently I have an output function.
You can return an empty vector from your output onyx/fn, or you can use flow conditions to avoid them being sent downstream at all
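The two approaches mentioned above could look roughly like this — a sketch, not code from the conversation; the task name `:my-task` and the predicate `always-drop?` are made-up placeholders:

```clojure
;; Option 1: an :onyx/fn's return value is the seq of outgoing segments,
;; so returning [] emits nothing downstream.
(defn drop-all [segment]
  [])

;; Option 2: a flow condition routes matching segments to :none, so they
;; are never sent downstream in the first place. Onyx flow predicates
;; take [event old-segment new-segment all-new-segments].
(defn always-drop? [event old-segment segment all-new]
  true)

(def flow-conditions
  [{:flow/from :my-task           ;; hypothetical task name
    :flow/to :none                ;; route matching segments nowhere
    :flow/predicate ::always-drop?}])
```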
And the segments will be ack’ed if they go into the output function? I ask because I have a problem where my job doesn’t finish in some situations.
My problem was the default :onyx.messaging/inbound-buffer-size of 20,000. I need more like 1 million, otherwise the job just stalls. It would be good to add this to the FAQ.
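The setting in question lives in the peer config. A minimal sketch, using the key named in the conversation (the right value is workload-dependent, and the other peer-config entries are elided):

```clojure
;; Peer configuration fragment raising the inbound buffer from its
;; default (~20,000) to 1 million, per the discussion above.
(def peer-config
  {:onyx.messaging/inbound-buffer-size 1000000
   ;; ... other peer-config entries ...
   })
```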
@akiel Can you explain how you went about crafting your job? Usually that knob doesn’t need to be touched.
My workflow is a bit unusual. I start with a seq input of 7 segments. The first task extracts about 100 segments from each input segment. The second task is 1:1 on segments. Then the third task blows each segment up by a factor of up to 1 million. After that I group by a function and aggregate over a global window. A trigger writes out the aggregates.
@akiel You’re using almost all the features. 🙂 Cool.
Yeah, very large fan outs are rare, but it makes sense for your situation there. What domain are you working in?
Nice, that’s pretty cool.
@michaeldrogalis My window aggregation is idempotent, so I don’t need exactly-once aggregation updates. Can I configure my job so that I don’t need BookKeeper? I already use :onyx/deduplicate? false and have no :onyx/uniqueness-key set.
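For reference, those two deduplication settings sit on the input task's catalog entry. A sketch with a hypothetical task name and all unrelated keys elided:

```clojure
;; Catalog entry fragment: deduplication disabled, no uniqueness key.
{:onyx/name :my-input          ;; hypothetical task name
 :onyx/deduplicate? false
 ;; note: no :onyx/uniqueness-key entry
 }
```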
@akiel It can tolerate a replay?
BK can’t be turned off when using aggregations. Most aggregations can’t gain exactly once semantics without the extra help.
How do you prevent aggregating a duplicate segment?
Got’cha, that would do it. Is it feasible for you to write a lifecycle that maintains an atom to write your state into? You can periodically put that on storage.
Since assoc-in is friendly to replays, you can actually get away without aggregations-proper in Onyx altogether and not use BK.
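The atom-plus-assoc-in idea above could be sketched like this — hypothetical names throughout, and the persistence hook is only one of several lifecycle options Onyx offers:

```clojure
;; State lives in an atom instead of an Onyx windowed aggregation.
(def state (atom {}))

(defn aggregate! [{:keys [id] :as segment}]
  ;; assoc-in keyed by a stable id is idempotent under replay:
  ;; processing the same segment twice yields the same state.
  (swap! state assoc-in [id] segment)
  ;; Return no outgoing segments.
  [])

;; Lifecycle calls map: persist the accumulated state when the task
;; stops (it could also be flushed periodically from the task itself).
(def state-calls
  {:lifecycle/after-task-stop
   (fn [event lifecycle]
     ;; write @state to durable storage here
     {})})
```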
I had the atom solution before, but I hit problems and switched to the window aggregation. In the end my problems were likely solved by the inbound-buffer-size config, so I may go back to the atom solution.
What were the issues that you came across?
Ah, right’o. Seems like you’re on the right track then.