Fork me on GitHub
#onyx
<
2018-07-11
>
sparkofreason22:07:26

I have a batch job driven by a custom input task which generates segments. Those segments are aggregated in time windows, the aggregated values emitted, and further aggregated in a larger window, So, for example, first aggregation is in one hour windows, next aggregates those in a one day window, ultimately feeding a dashboard. The last one or two hour bins always seem to come up short, but only when running in the production environment, not locally. I've tried various triggers, including doing it every segment, not sure where else to look. Any suggestions appreciated.

lucasbradstreet22:07:10

What kind of trigger are you using? With batch jobs it’s possible it’s not triggeri g on final job completion.

lucasbradstreet22:07:22

Are you trigger/emit or using trigger/sync?

sparkofreason22:07:19

I've tried everything I could think of, from my own hybrid segment-timer trigger, to timer, to triggering on every segment. Using trigger/emit. Batch size was 100, if that matters.

lucasbradstreet22:07:06

I believe you’re hitting this https://github.com/onyx-platform/onyx/issues/779, assuming you’re getting a job-completed signal. I apologise if it’s missing from the trigger/emit docs. I’ll fix the docs if so. I really want to fix this but it will require an extra signalling phase to safely complete a job.

sparkofreason22:07:33

Thanks, not sure about a job-completed signal per se, but I assume that's occurring since the job status shows as completed. Lifecycles to the rescue again :the_horns:

lucasbradstreet22:07:06

Right. This might not apply in your case but one option is to emit a sentinel record which triggers the final emit before the job finishes. If you’re using kafka the plugin supports emitting one when it hits end offsets that you supply

sparkofreason22:07:20

Input is my own, so I presume I emit the sentinel. Is that just :done?

lucasbradstreet22:07:07

Yep. That’ll work except it gets a little complicated if you end up partitioning the work somehow, eg. groups

lucasbradstreet22:07:47

It’d probably be best if I fixed the multiple phase completion

lucasbradstreet22:07:06

If you could confirm that’s what’s going on it would help push me over the edge. It can get complicated to fix it otherwise

lucasbradstreet22:07:47

Since you need to signal to each group that it’s finished and to flush

sparkofreason22:07:51

No groups. Let me give it a shot.

lucasbradstreet22:07:05

That should make things easier then

lucasbradstreet23:07:02

Hmm. Did you change anything about the code that implements the window protocols that you’re using? That’s a weird one as the window record should never be nil

lucasbradstreet23:07:32

Oh it might just be on the coerce

lucasbradstreet23:07:53

Maybe it’s passing a nil value for the time unit in? I’m on my phone so it’s hard to check the code that’s calling it

lucasbradstreet23:07:58

Ah. Done / sentinel support was removed and that is not a map, so the done keyword is probably getting passed through. If you return a valid segment and then check for that segment in the trigger you should be good