#onyx
2016-03-18
lucasbradstreet07:03:51

Onyx users may be interested in this post, since we build our aggregations in Onyx in a somewhat similar way, using BookKeeper (which Manhattan also uses) https://blog.twitter.com/2016/strong-consistency-in-manhattan

lucasbradstreet13:03:28

How do we feel about this? https://github.com/onyx-platform/onyx/commit/ab9c24e201524dba3926d67ddc4d370d72e87b26 Basically this gives the task component a way to tell why it’s being shut down, so that things like triggers know whether the task is just being re-scheduled or whether the job has been completed (maybe you only want to write the final value to a database when the job completes, not just because the peer lost its connection to ZK or a new job was submitted). I’d get rid of the varargs on start-new-lifecycle and always supply an event. I’m mostly looking for advice about whether you like the strategy. In a full change I’d add the field to the record.

lucasbradstreet13:03:15

The only other way I can see to do this is to communicate the cause via channels, but Onyx has several channels that can stop the task, and it seems way better to do this through the component
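
A rough sketch of the idea discussed above, in Python rather than Onyx's Clojure: the stop call on the task component carries the reason, and the trigger only writes on job completion. `StopReason`, `TaskLifecycle`, and the `written` list (standing in for a database) are hypothetical names for illustration, not Onyx's API or the contents of the linked commit.

```python
from dataclasses import dataclass, field
from enum import Enum


class StopReason(Enum):
    """Hypothetical reasons a task can be stopped (not Onyx's actual names)."""
    JOB_COMPLETED = "job-completed"
    RESCHEDULED = "rescheduled"
    PEER_LOST_COORDINATOR = "peer-lost-coordinator"


@dataclass
class TaskLifecycle:
    """Toy stand-in for a task component that owns some aggregated state."""
    state: int = 0
    written: list = field(default_factory=list)  # stand-in for a database

    def process(self, segment: int) -> None:
        self.state += segment

    def stop(self, reason: StopReason) -> None:
        # The reason travels with the stop call itself, so the trigger does
        # not need a separate channel to learn why the task is stopping.
        if reason is StopReason.JOB_COMPLETED:
            self.written.append(self.state)
        # For any other reason the state is expected to be replayed on
        # another peer, so nothing is written here.


task = TaskLifecycle()
for segment in (1, 2, 3):
    task.process(segment)

task.stop(StopReason.RESCHEDULED)    # no write: the task is only being moved
task.stop(StopReason.JOB_COMPLETED)  # final value is written
print(task.written)
```

The design point is that one stop path with an explicit reason is easier to reason about than several stop channels that each imply a cause.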

lucasbradstreet13:03:23

This is currently something that is a problem with jepsening Onyx. Currently I just use one peer for the task with a trigger on it, and then write out the timestamp with the trigger call. We only look at the trigger call with the latest timestamp when checking whether the state is correct. This wouldn’t be a problem if we knew that we were calling the trigger during job completion.
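
The workaround described above (keep only the trigger write with the latest timestamp when checking state) could look roughly like this. `TriggerWrite`, `latest_trigger_write`, and `state_is_correct` are made-up names for illustration and are not taken from onyx-jepsen.

```python
from typing import List, NamedTuple, Optional


class TriggerWrite(NamedTuple):
    """One value written out by a trigger call, tagged with a timestamp."""
    timestamp_ms: int
    value: int


def latest_trigger_write(writes: List[TriggerWrite]) -> Optional[TriggerWrite]:
    """Return the write with the greatest timestamp, or None if there are none."""
    if not writes:
        return None
    return max(writes, key=lambda w: w.timestamp_ms)


def state_is_correct(writes: List[TriggerWrite], expected: int) -> bool:
    """Check only the most recent trigger write against the expected state."""
    latest = latest_trigger_write(writes)
    return latest is not None and latest.value == expected


# Earlier writes come from reschedules; only the last one reflects the full job.
writes = [
    TriggerWrite(timestamp_ms=1000, value=6),
    TriggerWrite(timestamp_ms=3000, value=18),
    TriggerWrite(timestamp_ms=2000, value=10),
]
print(state_is_correct(writes, expected=18))
```

This relies on a single peer producing the timestamps, as in the setup described above; with several peers, clock skew would make "latest" unreliable.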

lucasbradstreet13:03:12

The main issue that I can see is that we don’t have a two phase job completion, so we can’t be sure that the job completion trigger gets made successfully. For this to happen, I think we would need to have a signal that is sent out after seal-output, to signify that all the triggers have been successfully called.

lucasbradstreet13:03:28

You can imagine that happening where the job is completed, but the peer on the task writing the trigger out is then partitioned from where it’s writing out to. The job is still “complete” but the trigger never successfully completed.
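
To make the failure mode above concrete, here is a toy sketch of a two phase completion, assuming a coordinator that only marks the job complete once every peer has acknowledged its completion trigger. `Peer`, `fire_completion_trigger`, `complete_job`, and the status strings are hypothetical; this is not how Onyx's scheduler currently works.

```python
from typing import Iterable, Set, Tuple


class Peer:
    """Toy peer whose completion trigger succeeds only if its sink is reachable."""

    def __init__(self, name: str, sink_reachable: bool = True) -> None:
        self.name = name
        self.sink_reachable = sink_reachable

    def fire_completion_trigger(self) -> bool:
        # Returns True as an acknowledgement that the final value was written.
        return self.sink_reachable


def complete_job(peers: Iterable[Peer]) -> Tuple[str, Set[str]]:
    """Phase 1: ask every peer to fire its trigger and collect acks.
    Phase 2: report the job as completed only if no ack is missing."""
    peers = list(peers)
    acked = {p.name for p in peers if p.fire_completion_trigger()}
    pending = {p.name for p in peers} - acked
    status = "completed" if not pending else "completing"
    return status, pending


# A peer partitioned from its sink keeps the job out of the completed state.
print(complete_job([Peer("p1"), Peer("p2", sink_reachable=False)]))
print(complete_job([Peer("p1"), Peer("p2")]))
```

With single phase completion there is no "completing" state, so a partitioned peer's missing write goes unnoticed.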

gardnervickers13:03:51

Does this become a non-problem with ABS where every peer failure results in replaying the state from upstream?

gardnervickers13:03:12

In the current way we do things, I was under the assumption that if a peer is partitioned during its :sync fn then those segments are retried

lucasbradstreet13:03:17

Maybe the two phase part becomes a non-issue. How we signal job completion is still a problem.

lucasbradstreet13:03:44

Ok, so the problem is if you only want to write out your final state value on job completion

lucasbradstreet13:03:07

i.e. you’ve processed all your segments, and now the overall scheduler has figured out that your job has completed, now you want to write the value out

lucasbradstreet13:03:08

I think this will be a problem whether you’re using ABS or not. Imagine batches of segments come in where | is barrier. | 1 2 3 | 3 4 5 | | | | | | :decide-to-complete-job

gardnervickers13:03:51

Ok I see what you mean

lucasbradstreet13:03:20

It’s just a safe implementation for a job-complete trigger where the end is unknown

gardnervickers13:03:23

During :decide-to-complete we have a failure, the rest of the peers have already completed and cleaned up

lucasbradstreet13:03:56

Our scheduler can’t handle that problem because it does the job completion in a single phase

gardnervickers13:03:18

It’s like a 2-phase commit problem

lucasbradstreet13:03:35

Anyway, the commit I posted doesn’t solve this issue

lucasbradstreet13:03:56

It just solves how to signal to the task/peer that it’s being shut down via a job completion

lucasbradstreet13:03:08

Rather than a re-scheduling

gardnervickers13:03:16

Yea the root of the problem would require coordination among all the peers

lucasbradstreet13:03:10

I haven’t really thought enough about how to solve that problem. First I just want to know whether the peer is being stopped because the job is complete or not

lucasbradstreet13:03:20

That will be a lot better than what we currently have

lucasbradstreet13:03:40

But yes, I think it’ll require some coordination to solve the greater issue

gardnervickers13:03:12

I mean having that info available (why a job is moved/canceled) is going to be useful regardless

lucasbradstreet13:03:47

Yes, it’s definitely closer to the ideal. I think it’s achievable before 0.9.0 too, which is why I’m pushing it now.

bridget13:03:47

And it's immediately useful to you, lucasbradstreet? You can use it in the jepsen testing?

lucasbradstreet13:03:20

Well, it would save me from having to read back all the trigger writes and grab the one with the greatest timestamp.

lucasbradstreet13:03:02

Mostly I like that I could test it with jepsen 😛

lucasbradstreet13:03:49

Personally, I think users generally shouldn’t care to write out if a vpeer was rescheduled, only if it meets their normal criteria or if the job was completed.

lucasbradstreet14:03:31

We should generally try to minimise rescheduling of those peers anyway, but writing something out when the state is going to be replayed on another peer and processing will continue is generally not the main concern

gardnervickers14:03:48

In your example above, where | is barrier. | 1 2 3 | 3 4 5 | | | | | | :decide-to-complete-job, wouldn’t the last peer not clear the upstream buffers if it’s partitioned before :decide-to-complete-job?

gardnervickers14:03:14

Sorry I know this doesn’t help our current problem but just thinking about it

lucasbradstreet14:03:14

It would, but say you only want to actually call the trigger at the end of the job

lucasbradstreet14:03:25

Not, say, at the end of every barrier

lucasbradstreet14:03:54

I realise that could be impossible to do exactly once. I’m just aiming for at least once
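
A small sketch of why at least once is achievable where exactly once may not be: if the sink applies the write but the acknowledgement is lost, the only safe move is to retry, which can duplicate the write. `FlakySink` and `fire_at_least_once` are invented for illustration, and 18 is simply the sum of the segments in the barrier example earlier in the conversation.

```python
class FlakySink:
    """Toy sink that applies every write but loses the first few acks."""

    def __init__(self, lost_acks: int) -> None:
        self.lost_acks = lost_acks
        self.rows = []

    def write(self, value: int) -> None:
        self.rows.append(value)  # the write itself always lands
        if self.lost_acks > 0:
            self.lost_acks -= 1
            raise ConnectionError("ack lost after the write was applied")


def fire_at_least_once(sink: FlakySink, value: int, max_attempts: int = 5) -> int:
    """Retry the trigger write until it is acknowledged; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        try:
            sink.write(value)
            return attempt
        except ConnectionError:
            continue  # no ack, so we cannot know whether the write landed
    raise RuntimeError("trigger write was never acknowledged")


sink = FlakySink(lost_acks=1)
attempts = fire_at_least_once(sink, 18)
print(attempts, sink.rows)  # the value may appear more than once in the sink
```

A sink that can deduplicate, for example by keying writes on the job id, would turn these retries into an effectively once result.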

lucasbradstreet14:03:04

Sorry about the onyx-jepsen walk through. I smashed my head really hard into a wall yesterday 😕

lucasbradstreet14:03:26

Heh, there were some clothes hanging in the doorway and my depth perception decided to screw with me 😛

michaeldrogalis15:03:30

I adjusted the font on the website to make the documentation more readable: http://www.onyxplatform.org/docs/user-guide/latest/testing-onyx-jobs.html

michaeldrogalis15:03:39

Let me know what y'all think.