This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-06-23
I was looking over the Flink documentation and saw an interesting feature... their Workers, which seem to be equivalent to Onyx Peers, allow multiple tasks to run in a single task slot, so resources can be reused across resource-light tasks. Wondering if something like this has been considered for Onyx (or is even feasible)? One concern I'm running into right now is that for a long-running workflow (say, connected to Kafka), a single peer could always be tied up on a very light task (e.g. a simple data transform), which may be a waste of resources as a whole. Or am I misunderstanding?
@manderson: this will be possible after the current refactor - i.e. tasks sharing a thread. The best approach at the moment is to give those sorts of tasks high batch-timeouts to ensure they’re not doing work most of the time, which frees up some resources for other tasks
The latest refactor gives us a lot of flexibility about how we interleave computation
so, the peer will still be assigned exclusively to the task, but if the batch-timeout is high it will be idle and allow for other peers to leverage the freed up resources, correct?
Yeah, the thread will be blocked, so at least it won’t be burning CPU
It’ll be up to the OS to schedule the other threads though
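The batch-timeout workaround described above could look roughly like the catalog entry below. This is a minimal sketch: the task name, the function keyword, and the numbers are illustrative assumptions, not values from this conversation. Only the `:onyx/batch-size` and `:onyx/batch-timeout` keys carry the point.

```clojure
;; Hypothetical catalog entry for a resource-light transform task.
;; :light-transform and :my.app/transform are made-up names for illustration.
{:onyx/name :light-transform
 :onyx/fn :my.app/transform
 :onyx/type :function
 :onyx/batch-size 100
 ;; Milliseconds to wait for a batch to fill before processing what has arrived.
 ;; A high value (illustrative here) keeps the peer's thread blocked, not spinning.
 :onyx/batch-timeout 1000}
```

The trade-off is latency: segments on a quiet task can wait up to the full timeout before being processed.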
Not yet, it’s in a pretty heavy state of flux atm. Hopefully we’ll have something a bit more alpha in a month or so
Yeah, I agree. It wasn’t the reason for the refactor, but I’ve made sure that the way we’ve architected things will allow for it
We’re currently property testing peers interleaving actions with a single thread, which allows us to find some pretty complex bugs
So this kinda fell out of that too
Can’t do that if you are starting threads all over the place 😄
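To illustrate why single-threaded interleaving helps with testing, here is a minimal sketch of the general idea, not Onyx's actual test harness. Peers are modeled as hypothetical pure step functions, and a seeded random generator picks which one acts next, so any schedule that exposes a bug can be replayed exactly. The names `make-peer` and `run-interleaved` are assumptions made up for this example.

```clojure
(ns interleave-sketch
  (:import (java.util Random)))

;; Hypothetical peer: a pure step function from shared state to new state.
;; Here each step only records which peer acted.
(defn make-peer [id]
  (fn [state] (update state :log conj id)))

(defn run-interleaved
  "Runs n-steps peer actions on the calling thread. A seeded RNG chooses
  which peer acts at each step, so the same seed replays the same schedule."
  [peers seed n-steps]
  (let [rng (Random. seed)]
    (reduce (fn [state _]
              (let [peer (nth peers (.nextInt rng (count peers)))]
                (peer state)))
            {:log []}
            (range n-steps))))

(def peers (mapv make-peer [:a :b :c]))

;; Same seed, same interleaving: a failing run is reproducible.
(assert (= (run-interleaved peers 42 20)
           (run-interleaved peers 42 20)))
```

A property test would generate many seeds and check an invariant on the final state. If the peers started their own threads, the schedule would depend on the OS and a failing seed could not be replayed.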
I’ve had a look at pulsar. It’s interesting, but I really don’t want something else in the middle that might slow things down
It’s important enough that I want to own it
just FWIW, a couple other Flink features that struck me:
- the Kafka listener allows consuming from multiple topics into a single workflow
- resource scheduling. this one is a bit more vague, they're using YARN, which is less than ideal IMO, but if there was some similar mechanism in Onyx perhaps leveraging Mesos or something, that could be really powerful.
For the Kafka listener, I think expanding out to multiple tasks programmatically isn’t so bad, especially once we can have threads that share tasks
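Expanding out to multiple tasks programmatically might look something like this: one input task per topic, all feeding the same downstream task. This is a sketch under assumptions. The helper names `kafka-input-task` and `expand-topics`, the topic names, and the batch size are invented for illustration, and a real Kafka input entry needs further connection and deserialization settings that are omitted here.

```clojure
(defn kafka-input-task
  "Builds one (incomplete, illustrative) catalog entry that reads a single topic.
  The task name is derived from the topic name."
  [topic]
  {:onyx/name (keyword (str "read-" topic))
   :onyx/plugin :onyx.plugin.kafka/read-messages
   :onyx/type :input
   :onyx/medium :kafka
   :kafka/topic topic
   :onyx/batch-size 100})

(defn expand-topics
  "Returns catalog entries and workflow edges that connect one input task
  per topic to the same downstream task."
  [topics downstream]
  (let [entries (mapv kafka-input-task topics)]
    {:catalog entries
     :workflow (mapv (fn [entry] [(:onyx/name entry) downstream]) entries)}))

;; Two hypothetical topics feeding a hypothetical :transform task.
(expand-topics ["orders" "payments"] :transform)
```

Each generated input task still occupies its own peer today, which is why this gets more attractive once tasks can share a thread.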
For the second part, we’re going to come up with some Kubernetes tutorials for this sort of thing. Mesos should be similar
Lots to do 🙂
We’ll get there