#onyx
2016-06-23
Travis15:06:00

Just saw this blog post

manderson15:06:11

I was looking over the Flink documentation and saw an interesting feature: their Workers, which seem to be equivalent to Onyx Peers, allow multiple tasks to run in a single task slot, thereby allowing resources to be reused for resource-light tasks. I'm wondering if something like this has been considered for Onyx (or is even feasible)? One concern I'm running into right now is that for a long-running workflow (say, connected to Kafka), a single peer could always be tied up on a very light task (e.g. a simple data transform), which may be a waste of resources as a whole. Or am I misunderstanding?

lucasbradstreet15:06:11

@manderson: this will be possible after the current refactor - i.e. tasks sharing a thread. The best approach at the moment is to give those sorts of tasks high batch-timeouts to ensure they’re not doing work most of the time, which frees up some resources for other tasks

lucasbradstreet15:06:37

The latest refactor gives us a lot of flexibility about how we interleave computation

manderson16:06:13

so, the peer will still be assigned exclusively to the task, but if the batch-timeout is high it will be idle and allow for other peers to leverage the freed up resources, correct?

lucasbradstreet16:06:37

Yeah, the thread will be blocked, so at least it won’t be burning CPU

lucasbradstreet16:06:52

It’ll be up to the OS to schedule the other threads though

manderson16:06:26

gotcha. i'm very interested in the refactor. any docs anywhere yet?

lucasbradstreet16:06:28

Not yet, it’s in a pretty heavy state of flux atm. Hopefully we’ll have something a bit more alpha in a month or so

manderson16:06:51

very cool. i think that will be a big win.

lucasbradstreet16:06:33

Yeah, I agree. It wasn’t the reason for the refactor, but I’ve made sure that the way we’ve architected things will allow for it

lucasbradstreet16:06:43

We’re currently property testing peers interleaving actions with a single thread, which allows us to find some pretty complex bugs

lucasbradstreet16:06:48

So this kinda fell out of that too

manderson16:06:12

nice. yeah, that's a hard problem.

lucasbradstreet16:06:31

Can’t do that if you are starting threads all over the place 😄

manderson16:06:56

have you looked into Pulsar/Fibers at all for that?

lucasbradstreet16:06:11

I’ve had a look at pulsar. It’s interesting, but I really don’t want something else in the middle that might slow things down

manderson16:06:20

yep, makes sense

lucasbradstreet16:06:24

It’s important enough that I want to own it

manderson16:06:29

totally agree

manderson16:06:22

just FWIW, a couple other Flink features that struck me:
- the Kafka listener allows consuming from multiple topics into a single workflow
- resource scheduling. this one is a bit more vague, they're using YARN, which is less than ideal IMO, but if there was some similar mechanism in Onyx perhaps leveraging Mesos or something, that could be really powerful.

manderson16:06:27

random thoughts 🙂

lucasbradstreet16:06:44

For the Kafka listener, I think expanding out to multiple tasks programmatically isn’t so bad, especially once we can have threads that share tasks

lucasbradstreet16:06:12

For the second part, we’re going to come up with some Kubernetes tutorials for this sort of thing. Mesos should be similar

lucasbradstreet16:06:30

We’ll get there

manderson16:06:34

awesome. yep, great stuff. appreciate all the hard work!

manderson16:06:03

i'll keep my ears open for updates on the refactor

andrewhr16:06:38

+1 on this refactor 😄