2018-09-21 onyx | Clojure Slack Archive

onyx 2018-09-21

2018-09-21T14:56:48.000100Z

I inherited an Onyx application a while back (0.9.x; we are upgrading, but that will take a while, so we have to make do with 0.9.x for now) and I have some basic questions on vpeer / task allocation. We are using the job percentage scheduler. If we add more replicas that spin up more vpeers after the job has been submitted, do tasks get dynamically allocated to the vpeers on those replicas? Or do the peers have to be up at the time the job is submitted for them to get allocated?

2018-09-21T15:05:20.000100Z

at the moment the task allocation is static; if a job has, say, 4 vpeers for a single task, that will remain "fixed"

2018-09-21T15:05:45.000100Z

dynamic scaling is a bit tricky in the context of aggregates / window functions, but not completely undoable

2018-09-21T15:08:16.000100Z

ok, so if I understand correctly, if I add vpeers via another replica after the jobs have been submitted and started, those new vpeers will not pick up any tasks?

2018-09-21T15:24:22.000100Z

@lmergen am I getting that right, or am I misunderstanding what you mean there?

2018-09-21T19:38:09.000100Z

@djjolicoeur yes this is correct

2018-09-21T19:38:41.000100Z

it will recover from crashed servers, among other things, but it will not be able to dynamically scale up or down

2018-09-21T19:39:30.000100Z

thanks, just validated. we were submitting jobs, then spinning up replicas, which meant we had a ton of unused vpeers

2018-09-21T19:40:06.000100Z

yeah that's not the correct approach -- first spin up the peers, then submit jobs
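
For reference, a rough sketch of where the percentage scheduler discussed above is configured. These are fragments only; the values are placeholders and not taken from the application in question.

```clojure
;; Fragment of the peer config: selects the percentage job scheduler.
;; (Other required peer-config keys are omitted.)
{:onyx.peer/job-scheduler :onyx.job-scheduler/percentage}

;; Fragment of a job map: the share of the cluster's peers this job asks for.
;; Both values below are placeholders for illustration.
{:percentage 50
 :task-scheduler :onyx.task-scheduler/balanced}
```

Going by the answers above, the peers need to be running before the job is submitted for them to be counted in the allocation.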

Lutz 2018-09-21T06:34:59.000100Z

Then how can one implement this? After all, the documentation for the watermark trigger states: "This is a shortcut function for a punctuation trigger that fires when any piece of data has a time-based window key that is above another extent, effectively declaring that no more data for earlier windows will be arriving." And do I understand correctly that if you introduce more peers, your window would be fragmented? Is there a way to combine those windows in the case of many peers, to get the actual window of all corresponding data?

lucasbradstreet 2018-09-21T06:37:53.000100Z

So you can do it if you know the group key for the window, by sending a segment that will be routed to the right set of windows. However, that is not possible with many peers and no routing.

Lutz 2018-09-21T06:40:51.000100Z

Since you are online, here is my follow-up noob question: A setting of :onyx/max-peers 1 for the task that is referred to in the defined windowing will prevent this, correct?

lucasbradstreet 2018-09-21T06:47:11.000100Z

That’s correct

lucasbradstreet 2018-09-21T06:47:48.000100Z

(Assuming that no grouping is used. If grouping is used you will need a segment for each group)
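
A minimal sketch of the setup being described, assuming no grouping: the windowed task is pinned to a single peer so its window state is not split across peers. The task, function, and window names are hypothetical, and the key names are written as they appear in the 0.9.x-era Onyx docs, so they may need adjusting for other versions.

```clojure
;; Catalog entry for the windowed task (names are hypothetical).
{:onyx/name :collect-events
 :onyx/fn :my.app/identity-segment   ; hypothetical function
 :onyx/type :function
 :onyx/max-peers 1                   ; one peer, so one copy of each window
 :onyx/batch-size 20}

;; Window attached to that task.
{:window/id :events-per-window
 :window/task :collect-events
 :window/type :fixed
 :window/aggregation :onyx.windowing.aggregation/conj
 :window/window-key :event-time      ; assumed timestamp key on each segment
 :window/range [5 :minutes]}

;; Watermark trigger on the window; the sync function is hypothetical.
{:trigger/window-id :events-per-window
 :trigger/refinement :onyx.refinements/accumulating
 :trigger/on :onyx.triggers/watermark
 :trigger/sync :my.app/write-window!}
```

If the task used grouping, a segment would be needed for each group to move its windows forward, as noted above.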

Lutz 2018-09-21T06:54:36.000100Z

Okay, thanks for the clarification. In my application I do use grouping, but per group it is ensured that at least one segment is emitted in each window.

Clojurians Log v2
