2018-03-13
# onyx
What's the best way to aggregate a list of segments using group-by-key without knowing the size of the list? Basically I have an incoming set of segments that are related through their ids, and I need them grouped. I'm having a hard time wrapping my head around using state as a way to group segments.
@dbernal Windowing to the rescue! Why do you think you need to know the size of the collection?
Ok, that's what I was starting to look into. I was mostly looking at the group-by-key test code and seeing how it was merging maps. I thought I needed to know the size of the collection so that the task would know when to emit the full aggregated collection. For a batch job, would I use the global window type and use the state map to handle the grouping of segments?
@dbernal Correct, yeah. You definitely want windows and triggers there.
group-by-key will segment data across windows, but the windows themselves will maintain the state, and triggers will emit aggregations
You can also access window contents without triggering via HTTP
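For illustration only, a minimal sketch of the kind of setup being discussed here, not code from the conversation: the task name :group-segments, the grouping key :id, the batch size, and the window id are all assumptions.

```clojure
;; Catalog entry: a task that groups incoming segments by :id.
;; :onyx/group-by-key routes segments with the same :id to the same peer,
;; so the window state for each id accumulates in one place.
{:onyx/name :group-segments
 :onyx/type :function
 :onyx/fn :clojure.core/identity
 :onyx/group-by-key :id
 :onyx/flux-policy :recover
 :onyx/n-peers 1
 :onyx/batch-size 20}

;; Window entry: a global window (the whole batch is a single extent)
;; whose aggregation simply conj's every segment into the window state.
{:window/id :segments-by-id
 :window/task :group-segments
 :window/type :global
 :window/aggregation :onyx.windowing.aggregation/conj}
```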
@michaeldrogalis is there a particular way to trigger at the end of the batch? How would I be able to trigger after all segments have been processed by the task?
@dbernal What input storage are you reading out of?
@dbernal The job will complete after all segments have been read out of a SQL table, and you'll receive a trigger event for completion. You can flush them then
@michaeldrogalis is there a particular trigger type I would use?
It shouldn't particularly matter if all you're doing is flushing at the end - otherwise just make it a no-op based on non-matching event types.
Almost all of the triggers count job-completed as a reason to trigger, e.g. https://github.com/onyx-platform/onyx/blob/0.12.x/src/onyx/triggers.cljc#L84
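As a hedged sketch of the "flush at the end" idea: a segment trigger normally fires on element counts, but, per the linked source, it also fires when the job completes, so something like the following would flush the accumulated window state at the end of the batch. The trigger id, threshold, and emit fn name are made up for the example.

```clojure
;; Trigger entry: fires every 1000 segments and, like most Onyx triggers,
;; also fires on job completion, which is the flush point in this case.
{:trigger/window-id :segments-by-id
 :trigger/id :flush-at-end
 :trigger/refinement :onyx.refinements/accumulating
 :trigger/on :onyx.triggers/segment
 :trigger/threshold [1000 :elements]
 :trigger/emit :my-app.tasks/emit-grouped}
```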
ok cool. Got it! Thanks for all the help @michaeldrogalis @lucasbradstreet
Anytime!
@dbernal yep, use :trigger/emit
See http://onyxplatform.org/docs/cheat-sheet/latest/
generally you will want to use :onyx/type :reduce
on the task that does the sending downstream, because you typically don’t want to pass down the segments that are being reduced
@lucasbradstreet ok thank you!
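Putting the last two suggestions together, one possible sketch (the function name, namespace, and output key are assumptions): the :trigger/emit fn inspects the state event so it only sends the grouped contents downstream at job completion, and the task is declared :onyx/type :reduce so the raw, pre-reduction segments are not passed along.

```clojure
(ns my-app.tasks)

;; Hypothetical :trigger/emit fn. Whatever segment it returns is sent to
;; the downstream tasks; here it only emits once the job has completed.
(defn emit-grouped
  [event window trigger state-event extent-state]
  (when (= :job-completed (:event-type state-event))
    {:grouped-segments extent-state}))

;; Catalog entry revisited as a reduce task: it only maintains window state
;; and sends data downstream via the trigger, not via its own output.
{:onyx/name :group-segments
 :onyx/type :reduce
 :onyx/group-by-key :id
 :onyx/flux-policy :recover
 :onyx/n-peers 1
 :onyx/batch-size 20}
```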