#onyx
2016-04-26
Drew Verlee00:04:40

is Apache BookKeeper used to coordinate state between nodes in the topology, or between peers in a node?

gardnervickers00:04:22

The peers use it to persist incremental state updates when using aggregations.

Drew Verlee01:04:42

so to be clear, it's not for communicating between nodes in a workflow?

(def workflow
  [[:read-segments :does-some-aggregation]
   [:does-some-aggregation :write-segments]])

rather it would be for making sure peers inside :does-some-aggregation were on the same page?

gardnervickers01:04:17

Yea, Aeron handles the message transport from :read-segments to :does-some-aggregation

gardnervickers01:04:55

BookKeeper is used to make sure that in the event one of your :does-some-aggregation tasks crashes, its state can be recovered
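
For a sense of what that state is: a windowed task declares an aggregation along these lines (a sketch against the 0.8/0.9-era windowing API; the window id here is made up), and the incremental updates to that aggregation are what BookKeeper journals:

;; Sketch only: :segment-count is a hypothetical window id.
(def windows
  [{:window/id :segment-count
    :window/task :does-some-aggregation   ;; the task whose state is journaled
    :window/type :global
    :window/aggregation :onyx.windowing.aggregation/count}])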

Drew Verlee01:04:05

awesome, thanks for clearing that up!

acron13:04:13

Is there a way to merge segments in a workflow? I have a diamond pattern and I want the last task to wait until it has a segment from each of its two parent tasks

gardnervickers14:04:35

Hey, currently Onyx has framework level support for doing one join per leaf task

gardnervickers14:04:06

This is planned to change soon with the implementation of the ABS streaming engine.

gardnervickers14:04:55

But until then, you can get away with breaking your "diamond" into two workflows and persisting to disk in the gap between the two workflows.
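
Roughly like this, with a hypothetical Kafka topic sitting in the gap (all task names here are made up):

;; Sketch: the first workflow fans out and both branches persist to
;; one topic; the second reads it back and does the join/aggregation.
(def workflow-a
  [[:in :branch-a]
   [:in :branch-b]
   [:branch-a :write-to-topic]
   [:branch-b :write-to-topic]])

(def workflow-b
  [[:read-from-topic :join]
   [:join :out]])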

acron14:04:19

Is that literally the only option? I'm struggling with this window aggregation

gardnervickers14:04:45

@acron: Unfortunately at the moment yes, that's the only way we can ensure message delivery guarantees and provide aggregation; this is a symptom of our current messaging model.

lucasbradstreet14:04:28

If you want to output the results to another task it's kind of tough

acron14:04:49

Ok, we might well look into a custom plugin to do this then, hacky as that may be

lucasbradstreet14:04:01

We realise that it's a bad experience and we want to fix it

acron14:04:55

We're already using redis to facilitate loops, so why not some more circus tricks :parrot:

lucasbradstreet14:04:01

I was thinking the other day that it would actually be possible to do joins like that in a fault-tolerant way by manually acking only after the join is made and a segment is output

lucasbradstreet14:04:02

Hopefully you can take these hacks out after 0.10.0 anyway

acron14:04:15

Got an ETA? 😉

lucasbradstreet14:04:56

It's a bit hard to estimate at the moment. Hope is in the next couple of months, but it could take longer

lucasbradstreet14:04:26

I could see it taking less

acron14:04:24

When a workflow creates two segments from a single segment as a result of a fork, do they share an ID? How does the input ack these segments?

gardnervickers14:04:31

@acron: Onyx tracks the entire tree of messages, so it is aware of what message created what message(s).

acron15:04:34

@gardnervickers: Say we did use an intermediate persistence to aggregate segments, assuming we had an 'input' task reading the aggregation back in as a segment, would it matter what ID that new segment had? Would it need to relate to the old segments?

gardnervickers15:04:08

After you successfully persist to (say) Kafka, the root segment would be fully acked for that part of the workflow and no longer be part of Onyx's message guarantees, instead being part of Kafka's. When you read from that Kafka topic again, it would essentially be a new segment with no ties to the old one.

gardnervickers15:04:33

However, if the segment is lost in the second workflow it will be replayed from Kafka, so you still won't lose messages.

acron15:04:09

Right...via the retry mechanism in onyx-kafka? (for example)

gardnervickers15:04:34

To onyx, they will appear as two separate sources

acron15:04:58

Do retries occur on the same peer in which a segment originated?

lucasbradstreet15:04:01

@acron generally, although not if that peer fails and another peer restores from a checkpoint

acron15:04:15

Ok, good to know

acron15:04:29

I've seen some plugins retry from messages stored in an atom

zamaterian16:04:20

Have any of you experienced problems with Aeron segfaulting the JVM when running 2 or more peers, scaling up using docker-compose?

zamaterian16:04:29

so far it only happens with peer #2 in a 2-peer cluster.

michaeldrogalis16:04:06

@zamaterian: My guess is that the amount of shared memory for the containers is getting depleted.

zamaterian16:04:30

thx ok, will look into it. btw I have a simple workflow: a seq input with 10000 entries (all unique, no duplicates), and an out which is a Kafka writer. When peer 2 crashes, the Kafka topic ends up containing duplicate entries. Shouldn't Onyx prevent that from happening? Or have I misunderstood something?

michaeldrogalis16:04:56

@zamaterian: Is the task that does the writing windowed? Only windowed tasks shield you from duplicates.

michaeldrogalis16:04:17

Deduplication is relatively expensive, so it's not provided unless you turn it on.
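
Turning it on looks roughly like giving the windowed task a uniqueness key in its catalog entry, something like this (a sketch; the task name, :my.app/process, and the :id key are assumptions about your job):

;; Sketch: dedup is keyed off :onyx/uniqueness-key on a windowed task.
{:onyx/name :process-entries        ;; hypothetical task name
 :onyx/fn :my.app/process           ;; hypothetical function
 :onyx/type :function
 :onyx/uniqueness-key :id           ;; segment key assumed to be unique
 :onyx/batch-size 20}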

zamaterian16:04:28

And that is done using windowing, and thereby BookKeeper?

zamaterian16:04:41

thx, will look into that :simple_smile:

michaeldrogalis16:04:35

@zamaterian: Note that there's nothing that can stop any task, windowed or not, from writing to Kafka, then crashing, then recovering and writing again. Kafka doesn't have a transactional writer, so either way you need to deal with duplicates there.

dg20:04:45

Is there a recommended way to do fan-in from multiple tasks? (i.e. beyond just aggregating output segments from one task)

gardnervickers20:04:30

Fan-in (otherwise known as a stream join) is supported by Onyx's aggregations
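
e.g. a custom aggregation that holds onto the latest segment from each branch until both have arrived, sketched against the custom-aggregation map shape (the :stream key and all names here are assumptions):

;; Sketch: state maps branch -> latest segment; the update entries
;; returned by create-state-update are what get journaled.
(defn join-init [window] {})

(defn join-create-state-update [window state segment]
  [:set (:stream segment) segment])

(defn join-apply-state-update [window state [_ stream segment]]
  (assoc state stream segment))

(def join
  {:aggregation/init join-init
   :aggregation/create-state-update join-create-state-update
   :aggregation/apply-state-update join-apply-state-update})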

gardnervickers20:04:14

We don't have another platform-level way to do that currently

dg20:04:14

Okay, thanks, I'll dig in to that

michaeldrogalis20:04:16

@dg: Incidentally, I'm working on a streaming joins application in Onyx for a client right now. There are enough application-level obscurities that any solution we put directly into Onyx wouldn't be enough.

dg21:04:38

Understandable. My app just has tasks that output 1 segment, so the join is conceptually simple, but I understand that's a long way from the general case of streaming joins