
@michaeldrogalis I saw the mention of Onyx Helm charts the other night. I'm just getting started with Kubernetes; what does this provide?


@gardnervickers can answer that better than me


@camechis We have a few cloned/modified charts we use internally here that might be useful. We're just now getting to the point where our internal Onyx peer charts are stable enough to generalize and extract into a separate, parameterizable chart. That's probably a ways off, though. If you hit any problems setting up your Onyx peers, let me know. There are some tricky bits, like setting the BIND_ADDR for Onyx, running Aeron as a sidecar container, and setting up the memory volume for /dev/shm. Kubernetes manages it all extremely well, though.
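To make those tricky bits concrete, here's a minimal pod spec sketch of what that setup could look like. The image names are placeholders, and the exact env var wiring is an assumption based on the points mentioned above (BIND_ADDR from the pod IP via the downward API, Aeron as a sidecar, and a shared in-memory /dev/shm volume), not a reference manifest:

```yaml
# Hypothetical sketch: image names are placeholders, not real artifacts.
apiVersion: v1
kind: Pod
metadata:
  name: onyx-peer
spec:
  containers:
    - name: peer
      image: my-org/onyx-peer:latest          # placeholder Onyx peer image
      env:
        - name: BIND_ADDR                     # bind Onyx messaging to the pod's IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
      volumeMounts:
        - name: dev-shm
          mountPath: /dev/shm
    - name: aeron                             # Aeron media driver as a sidecar
      image: my-org/aeron-media-driver:latest # placeholder media-driver image
      volumeMounts:
        - name: dev-shm
          mountPath: /dev/shm
  volumes:
    - name: dev-shm                           # in-memory volume shared by both
      emptyDir:                               # containers; Aeron uses /dev/shm
        medium: Memory                        # for its shared-memory buffers
```

The `medium: Memory` emptyDir is the key piece: it gives both containers a tmpfs-backed /dev/shm so the peer and the Aeron media driver can communicate over shared memory.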


@jsonmurphy Maybe it was just me using DQ wrong, but I did a naive port to 0.10 and encountered difficulties running multiple jobs on the same DQ. DQ seems to keep part of the queue in memory, and if you open the queue from a different job, it doesn't realize there are already items in it: if you complete the last item that a particular job thinks exists in the queue, it was erasing the entire queue for me. Might have been my implementation, but I believe I saw someone else mention this issue in relation to having multiple threads/processes.


Ah, if that’s true that would be problematic. I didn’t know DQ worked that way.


Is there anyone here who has been unable to run Onyx because of its operational requirements? We're looking at adding a third runtime environment. Currently, we have fully distributed Onyx, which can horizontally scale compute over granular units of storage (e.g. a single Kafka partition). We also have onyx-local-rt, a pure, in-memory version of Onyx that trades off distribution/fault tolerance for portability to JavaScript runtimes. We're sizing up adding another runtime that sits in the middle, roughly akin to what Kafka Streams offers. KS helps scale horizontally across many Kafka partitions, but uses only single-threaded execution of a single partition - so the ability to scale and reshard data is limited, but the operational requirements are lower.


This idea would take a normal Onyx job - capable of running in distributed mode, local mode, or this new in-between runtime - and fuse it down to a single process, which could be deployed as a container. All activity would be completely isolated to that container/process/whatever, which means scheduling is fully offloaded to something like a container orchestration manager.


Would love to hear thoughts if this is appealing to anyone while we develop the idea more.


Heh, nobody’s answering that. Zach is long gone from Factual.


@michaeldrogalis RE: 3rd runtime, would this new runtime basically address those of us who are using the dev environment in production, running on one node?


Yes, and it would still achieve all the fault-tolerance capabilities distributed mode has to offer, at the expense of scalability (all tasks run on a single node, with no messaging/parallelism between tasks).


That would be awesome. Because then you could always scale up to the distributed model and save that work for later?


You’d also be able to host hundreds of smaller jobs with limited infrastructure, and leave all the scheduling work to Kubernetes.