
When I look at the documentation of trigger/emit, it says that the segment(s) it returns are emitted downstream. What exactly is "downstream" in the context of a trigger?


Downstream means any task that comes after the trigger's task in the workflow graph


So if you have A -> B -> C -> D, C will receive the segments from the emit


Subject to flow conditions
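To make the A -> B -> C -> D example concrete, here's a rough Clojure sketch of what that shape looks like in an Onyx job. The task names, window id, trigger id, and threshold values are all hypothetical, and the exact arity of the emit fn is an assumption; treat this as an illustrative fragment rather than a tested program:

```clojure
;; Workflow: A -> B -> C -> D
(def workflow
  [[:A :B]
   [:B :C]
   [:C :D]])

;; Hypothetical emit fn: whatever it returns (a segment, or a seq of
;; segments) is emitted downstream of :B, i.e. on to :C and then :D,
;; subject to any flow conditions on those edges.
(defn emit-summary-segment
  [event window trigger state-event extent-state]
  {:summary extent-state})

;; A trigger attached to task :B that uses :trigger/emit.
(def triggers
  [{:trigger/window-id :window-on-b            ; hypothetical window id
    :trigger/id        :emit-summary           ; hypothetical trigger id
    :trigger/on        :onyx.triggers/segment
    :trigger/threshold [5 :elements]
    :trigger/emit      ::emit-summary-segment}])
```

So in this sketch, :C receives the summary segments produced by the trigger on :B, exactly as described above.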


right, that makes sense


i already kind of assumed that, but it wasn’t obvious from the documentation


Will C also still receive the 'regular' segments from B, as is the case when you use sync?


@michaeldrogalis hey hi hey. There was a time where you had mentioned that with some of the upcoming changes in Onyx (possibly in 0.10) it’d be possible to implement certain machine learning algorithms in Onyx. Do you remember what that was exactly? I was hoping to start doing some research around that.


It might have been related to ABS, but I don’t recall exactly


@stephenmhopper Yeah, you’re remembering correctly. ABS is complete. We’re just waiting to get a little breathing room with everything else that’s going on to release 0.10.0 final. We still have validation in place to disallow cyclic workflows, but that can safely be removed now. The only hitch is that we need to do a tiny bit more work to offer exactly-once aggregations with iteration.


It’s not a priority for us right now, but we could outline the work remaining if anyone wanted to take a shot at it. It’s well within reach. That would allow for ML programs to run on Onyx.


Yeah, I’d be interested in hearing more about that. Right now, I’m working on finding a convenient way to use Tensorflow in an idiomatic way from Clojure. The problem is that the DSL / contract / design that I’m building out is very similar to the way Onyx handles data. My design doesn’t map easily to underlying Tensorflow constructs, but would map well into Onyx (potentially)


Cool, yeah we’re close enough that it’s worth looking at. @lucasbradstreet IIRC, the last step is ensuring consistent checkpoints under iteration, right?


@stephenmhopper Out of curiosity, which algorithms are you implementing?


Well, my goal is to build a useful set of neural net abstractions, similar to the way people use Keras on top of Tensorflow or Theano. Right now, I’m targeting Tensorflow as the underlying runtime, but I’d be interested in exploring what this would look like on top of Onyx


@michaeldrogalis that’s right, we mostly need to make sure the checkpointing handles iteration


Because I’m 95% certain that Onyx handles distributed computations better than Tensorflow


Do you have some recommended reading for me on using ABS to write ML algorithms?


Cool. Yeah, we ought to get moving on this front. It's one of those big ticket items that we've wanted for a while, and now that the majority of the work is finished, we should bring iteration across the finish line.


ABS is the underlying algorithm for doing fault tolerant message passing, so you wouldn't be working with it directly. On the internal side of things, we need to make sure Onyx can properly recover windowing contents during an iterative cycle. On the API front, we need to decide what that will look like in an Onyx program.


I imagined extending flow conditions, but we should research what Flink is doing to get a baseline.


okay, well I have to run, so if there’s anything you think I should read up on to get a head start on things, just PM it over to me


Sounds good. Thanks!