#onyx
2017-05-22
devth14:05:34

it doesn't look like http://www.onyxplatform.org/docs/user-guide/latest/ covers Checkpointing (aside from a few references). should it?

michaeldrogalis15:05:38

@devth We cover it a little bit under the ABS section, but yep, it should have a dedicated section. One of the last tasks is to document that before release 0.10. http://www.onyxplatform.org/docs/user-guide/latest/#_asynchronous_barrier_snapshotting

lmergen16:05:16

btw i understand that ABS syncs checkpoints to S3, is this correct? and if so, how do i configure it?

lucasbradstreet16:05:54

@lmergen that’s right. Configuration via these keys

lucasbradstreet16:05:43

hmm linking to searches isn’t working, but search for s3 in http://www.onyxplatform.org/docs/cheat-sheet/latest/#/peer-config

lucasbradstreet16:05:50

via the search bar at the top

lmergen18:05:52

thanks! couldn't find this in the user guide

lmergen18:05:49

ah i see zookeeper is used by default. i assume it's contained within the tenancy id?

lmergen18:05:44

cool. i’m trying to figure out how i can use this to “snapshot” my aggregates as well (use this epoch?), so that i can rebuild aggregates based on these snapshots

lmergen18:05:07

(aggregates that are based on incremental window triggers)

lmergen18:05:14

i’m not exactly sure about the guarantees onyx gives me — the checkpointed! call is probably not atomic for all inputs and outputs at the same time eh (meaning i cannot use the epoch there)

lmergen18:05:36

ah wait! i just noticed that this is, in fact, consistent: while the ABS is snapshotting, input is temporarily stopped, correct?

lucasbradstreet18:05:58

@lmergen yes, each task will stop, take a snapshot of the state, then resume processing while pushing the snapshot to S3 asynchronously
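
The stop, snapshot, resume sequence described here can be illustrated with a minimal Python sketch. This is not Onyx code: the `Task` class, its method names, and the plain dict standing in for S3 are all hypothetical. It only shows that the pause covers the in-memory copy, while the slow write runs in the background.

```python
import copy
import threading


class Task:
    """Hypothetical task holding in-memory state (not an Onyx API)."""

    def __init__(self, store):
        self.state = {}
        self.store = store      # assumption: a dict standing in for S3
        self.uploads = []

    def process(self, key, value):
        # Normal processing: fold the value into the running state.
        self.state[key] = self.state.get(key, 0) + value

    def on_barrier(self, epoch):
        # Processing is paused only for this in-memory copy.
        snapshot = copy.deepcopy(self.state)
        # The slow write happens in the background while processing resumes.
        upload = threading.Thread(
            target=self.store.__setitem__, args=(epoch, snapshot)
        )
        upload.start()
        self.uploads.append(upload)

    def wait_for_uploads(self):
        for upload in self.uploads:
            upload.join()


store = {}
task = Task(store)
task.process("a", 1)
task.on_barrier(epoch=1)
task.process("a", 5)    # arrives after the barrier, so not in the epoch-1 snapshot
task.wait_for_uploads()
```

Because the snapshot is a deep copy taken at the barrier, later updates to the live state cannot leak into the data being uploaded.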

lmergen18:05:12

that’s pretty rad

lucasbradstreet18:05:53

Yeah, the model makes it really easy to take consistent snapshots without slowing things down too much.

lmergen18:05:55

so all i have to do is persist the epoch in my aggregate store, and i can recover aggregate state as well

lmergen18:05:30

or rather, bring it back in the state as it was at the moment the last barrier was injected

lucasbradstreet18:05:54

Right, yes, when it recovers it will tell you the epoch that it’s recovering to, so you just have to load the corresponding epoch

lucasbradstreet18:05:03

note that it’s an epoch and replica version combination

lucasbradstreet18:05:12

since the epoch resets each time the cluster changes.
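
A minimal sketch of what keying external aggregate snapshots by the replica version and epoch pair could look like. The `AggregateStore` class and its methods are hypothetical, not part of Onyx; the only idea taken from the conversation is that the epoch alone is ambiguous because it resets when the cluster changes.

```python
class AggregateStore:
    """Hypothetical external store for aggregate snapshots (not an Onyx API)."""

    def __init__(self):
        self._snapshots = {}

    def save(self, replica_version, epoch, aggregates):
        # Key by the pair: the epoch alone repeats after a cluster change.
        self._snapshots[(replica_version, epoch)] = dict(aggregates)

    def load(self, replica_version, epoch):
        # On recovery, load exactly the snapshot for the reported pair.
        return dict(self._snapshots[(replica_version, epoch)])

    def latest_key(self):
        # Tuples compare element by element, so (4, 1) sorts after (3, 8).
        return max(self._snapshots)


agg_store = AggregateStore()
agg_store.save(3, 7, {"user-1": 10})
agg_store.save(3, 8, {"user-1": 15})
agg_store.save(4, 1, {"user-1": 18})   # epoch restarted under a new replica version

recovered = agg_store.load(3, 8)
```

Comparing the pairs as tuples gives a consistent ordering without merging them into one number.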

lmergen18:05:29

makes sense

lmergen18:05:34

well this is pretty cool

lucasbradstreet18:05:50

ideally it’d be a single number which is the combination of each, but I was worried about overflowing a 64 bit long if I combined them

lmergen18:05:30

right, well, that’s an implementation detail i guess

lucasbradstreet18:05:56

Yeah, it would just be easier on users if they only had to ever worry about one increasing number.
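
The overflow concern can be made concrete with a small sketch that packs both numbers into one signed 64-bit value. The 32-bit split below is an assumption chosen for illustration, not something Onyx does. It shows that any fixed split caps both counters, which is the risk mentioned above.

```python
EPOCH_BITS = 32                      # assumption: bits reserved for the epoch
VERSION_BITS = 63 - EPOCH_BITS       # remaining bits of a signed 64-bit long


def pack(replica_version, epoch):
    """Combine the pair into one increasing integer, or fail on overflow."""
    if not 0 <= epoch < (1 << EPOCH_BITS):
        raise OverflowError("epoch does not fit in the reserved bits")
    if not 0 <= replica_version < (1 << VERSION_BITS):
        raise OverflowError("replica version does not fit in the reserved bits")
    return (replica_version << EPOCH_BITS) | epoch


def unpack(packed):
    """Recover the (replica_version, epoch) pair from a packed value."""
    return packed >> EPOCH_BITS, packed & ((1 << EPOCH_BITS) - 1)


packed_old = pack(3, 8)
packed_new = pack(4, 1)   # later replica version, earlier epoch
```

The packed value keeps the ordering of the pair, but only as long as neither counter exceeds its reserved range.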

rads20:05:03

is there anything equivalent in Onyx to a stream join like in Kafka Streams?

rads20:05:51

e.g. A is a stream that I want to enrich with stream B and they both share a primary key

lucasbradstreet20:05:06

You currently need to build your own aggregation to do this.

rads20:05:21

ok that's what I thought, thanks!
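
A rough sketch of what "build your own aggregation" could mean for enriching stream A with stream B on a shared key. This is plain Python, not Onyx's window or aggregation API. The `EnrichmentJoin` class, the `id` field name, and the choice to buffer unmatched A records are all assumptions.

```python
class EnrichmentJoin:
    """Hypothetical keyed join state: latest B per key, plus A records waiting for a B."""

    def __init__(self, key_field="id"):
        self.key_field = key_field
        self.latest_b = {}
        self.pending_a = {}

    def on_b(self, record):
        key = record[self.key_field]
        self.latest_b[key] = record
        # Release any A records that arrived before their B counterpart.
        waiting = self.pending_a.pop(key, [])
        return [{**record, **a} for a in waiting]

    def on_a(self, record):
        key = record[self.key_field]
        b = self.latest_b.get(key)
        if b is None:
            # No B seen yet for this key: hold the A record.
            self.pending_a.setdefault(key, []).append(record)
            return []
        # A's own fields win on conflict; B only adds what is missing.
        return [{**b, **record}]


join = EnrichmentJoin()
out_early = join.on_a({"id": 1, "amount": 5})      # buffered, nothing emitted
out_flush = join.on_b({"id": 1, "name": "alice"})  # emits the buffered A, enriched
out_late = join.on_a({"id": 1, "amount": 7})       # enriched immediately
```

In a real job this state would have to live inside whatever state the checkpointing mechanism snapshots, so that the join buffers are recovered along with everything else.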