@lucasbradstreet that's actually interesting, you would think that lots of small tasks are easier to scale / manage than a few big tasks. is this an area that could be improved upon within Onyx? i recall that each virtual peer had a dedicated worker thread, is this correct?


@lmergen It does, yeah. The design is set up to one day allow peers to share the same thread, but there's work to be done to bring it home.


I've written an optimizer for a few customers that takes an Onyx job, compresses small tasks together, and spits out a new Onyx job. Never did get any of them over the line to open source though 😕




That's the beauty in the data driven approach. You can write your tasks at a granularity level that's suitable to your domain, and let an optimizer fuse tasks together that are better off running as a single unit.
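To make the fusion idea concrete, here is a minimal sketch in Python rather than Onyx's own Clojure. It is not the optimizer mentioned above, which was never released. It assumes a workflow is a list of `(src, dst)` edges and that each function task is a plain one-argument callable; `fuse_linear_chains` and the task names are hypothetical. It only fuses strictly linear pairs of function tasks and ignores everything else a real job would need to preserve (batching, lifecycles, flow conditions, windows).

```python
from collections import defaultdict


def fuse_linear_chains(edges, fns):
    """Fuse adjacent function tasks that form a strictly linear chain.

    edges: list of (src, dst) task names (an assumed workflow shape).
    fns:   dict of task name -> one-argument callable, for function tasks
           only. Input and output tasks are absent from fns and never fused.
    Returns a new (edges, fns) pair; the arguments are not mutated.
    """
    edges = list(edges)
    fns = dict(fns)
    changed = True
    while changed:
        changed = False
        out_edges = defaultdict(list)
        in_edges = defaultdict(list)
        for src, dst in edges:
            out_edges[src].append(dst)
            in_edges[dst].append(src)
        for a, b in edges:
            # Safe to fuse only if a feeds nothing but b, and b is fed by nothing but a.
            if a in fns and b in fns and out_edges[a] == [b] and in_edges[b] == [a]:
                fused = f"{a}+{b}"
                f, g = fns.pop(a), fns.pop(b)
                fns[fused] = lambda x, f=f, g=g: g(f(x))
                rewritten = []
                for src, dst in edges:
                    if (src, dst) == (a, b):
                        continue  # this hop becomes a plain function call
                    rewritten.append((fused if src in (a, b) else src,
                                      fused if dst in (a, b) else dst))
                edges = rewritten
                changed = True
                break  # recompute the edge indexes before looking for more pairs
    return edges, fns


workflow = [("in", "inc"), ("inc", "double"), ("double", "out")]
tasks = {"inc": lambda x: x + 1, "double": lambda x: x * 2}
fused_workflow, fused_tasks = fuse_linear_chains(workflow, tasks)
```

In this toy job, `inc` and `double` collapse into a single `inc+double` task, so one messaging hop between peers turns into an ordinary function call.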


This is definitely an area that would be great for the community to work on if anyone's interested.


@lmergen there are two main problems with it. One is that the overhead from all of the messaging and upkeep can be far greater than the cost of the function call. The second is that when you run up a lot of peers on a single machine (unscaled), all those costs bite more and you need to tune Onyx.
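As a rough analogy for the first problem, the sketch below times a trivial function called directly against the same function fed through an in-process queue. Python's `queue.Queue` merely stands in for a messaging layer here; it says nothing about Onyx's actual transport or its real costs, and `inc`, `run_direct` and `run_via_queue` are hypothetical names.

```python
import queue
import time


def inc(segment):
    """A deliberately cheap task body."""
    return segment + 1


def run_direct(n):
    """Call the task function directly n times."""
    start = time.perf_counter()
    out = [inc(i) for i in range(n)]
    return out, time.perf_counter() - start


def run_via_queue(n):
    """Pass every input through a queue before calling the task function."""
    q = queue.Queue()
    start = time.perf_counter()
    out = []
    for i in range(n):
        q.put(i)                   # stand-in for sending a message
        out.append(inc(q.get()))   # stand-in for receiving it and doing the work
    return out, time.perf_counter() - start


direct_out, direct_secs = run_direct(10_000)
queued_out, queued_secs = run_via_queue(10_000)
```

Comparing `direct_secs` with `queued_secs` on your own machine gives a feel for how per-message bookkeeping can outweigh a tiny function body. The absolute numbers are not meaningful for Onyx.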


I think the one place where that may not be true is that we may not be evicting the groups themselves.


right, i was thinking indeed about the overhead of maintaining a large number of channels


i think it would be relatively easy to make one thread handle multiple virtual peers, but it's a completely different problem altogether to fix the messaging channel issue


at which point you realise that it's probably the wrong place to fix things 🙂


so in the end, design your workloads for fewer tasks with high CPU usage and you will be fine


Yeah, it’d still be good to allow peers to share threads for low use tasks. As of 0.10+, it wouldn’t be too hard to build since the peer is built to be asynchronous and yield when it has nothing to do
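A minimal sketch of what "asynchronous and yield when it has nothing to do" could look like when several peers share one thread. It uses Python generators as stand-ins for peers and is not how Onyx implements them: `peer`, `run_shared` and the inbox model are assumptions made for illustration.

```python
from collections import deque


def peer(name, inbox, fn, results):
    """A hypothetical virtual peer: handle at most one message per step, then yield."""
    while True:
        if inbox:
            results.append((name, fn(inbox.popleft())))
            yield True    # did some work this step
        else:
            yield False   # nothing to do, hand the thread back


def run_shared(peers, max_idle_rounds=1):
    """Round-robin every peer on the calling thread.

    Stops once all peers have been idle for max_idle_rounds consecutive rounds.
    """
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        worked = [next(p) for p in peers]
        idle_rounds = 0 if any(worked) else idle_rounds + 1


results = []
busy_inbox = deque([1, 2, 3])
quiet_inbox = deque([10])
run_shared([
    peer("busy", busy_inbox, lambda x: x + 1, results),
    peer("quiet", quiet_inbox, lambda x: x * 2, results),
])
```

The low-use `quiet` peer costs one cheap `next` call per round while idle, instead of holding a dedicated thread. A real version would still need to decide how such shared peers are weighted against dedicated ones.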


The main reason I haven’t done it yet is that there are additional concerns around scheduling, as suddenly you have some peers sharing threads and maybe some that are not, and you would probably want to rate them differently in the scheduler.