#onyx
2016-10-03
robert-stuttaford09:10:22

@lucasbradstreet just confirming that for :onyx.messaging/bind-addr localhost is fine in single-node mode?

robert-stuttaford09:10:27

@lucasbradstreet -- would love to know your thoughts on the terraform stuff i shared on saturday, particularly around highstorm ansible stuff

lucasbradstreet09:10:28

I'm interested, but haven't had a look yet

robert-stuttaford09:10:02

cool 🙂 of particular note is the use of systemd to handle aeron + peers + jobs processes

Drew Verlee14:10:56

If i wanted onyx to operate over an unbounded stream of data and output the largest number/max it had seen in the stream, what would the recovery process look like if the process crashed? Would it need to replay all the data? Is there a mechanism to persist the current max somewhere? Example input and output: 1, 2, 1, 3, 2 -> onyx -> 1, 2, 2, 3, 3. Naively i could see sandwiching onyx between two kafka streams and re-reading the last written max in case of a failure, but i’m assuming there is a native mechanism for this.

michaeldrogalis14:10:49

@drewverlee Onyx recovers from the last successfully acknowledged message and plays the stream forward from that point. It's able to determine what the max, or whatever aggregate you're using, is by replaying another log that it uses specifically for incremental state updates. See http://www.onyxplatform.org/docs/user-guide/0.9.11/#aggregation-state-management for an explanation.

dominicm14:10:21

I didn't get the point of windows too much, until I read https://www.oreilly.com/ideas/why-local-state-is-a-fundamental-primitive-in-stream-processing now my mind is pretty blown.

Drew Verlee14:10:02

@michaeldrogalis, thanks. I read over the docs but i’m still a bit unsure of the details. Can you give me an example of a “changelog update”? I have never been sure if this changelog contains the messages that were processed or meta information about the job itself. For instance, if i’m reading in values: 1, 2, 1*crashes, 3, 2 -> onyx peer -> 1, 2, 2, 3, 3, and the onyx peer crashes after reading the second 1, does the changelog contain [1, 2, 2]? Or maybe just [2] (the current max)?

Drew Verlee14:10:24

@dominicm that looks like a great source, i’ll read it over right now.

michaeldrogalis14:10:46

@dominicm We had a very fun time implementing them.

dominicm15:10:37

@michaeldrogalis I remember reading your blog post on them. I was impressed, but unsure how I could use them for my purposes. Then I read the oreilly thing.

michaeldrogalis15:10:52

@drewverlee It contains details to apply a function to advance the state from one entry to the next. This file contains the state transitions for the built-in aggregates: https://github.com/onyx-platform/onyx/blob/0.9.x/src/onyx/windowing/aggregation.cljc

gardnervickers15:10:20

@drewverlee It might help to think of the aggregations as a reduction over the sequence of segments flowing into the window, where the “accumulator” piece of the reduction is checkpointed to durable storage every time an extent is triggered.
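
The reduction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Onyx code: the function names, the `trigger_every` parameter, and the in-memory `store` list standing in for durable storage are all assumptions made for the example.

```python
from itertools import accumulate


def running_max(segments):
    """Emit the largest value seen so far after each segment."""
    return list(accumulate(segments, max))


def reduce_with_checkpoints(segments, trigger_every, store):
    """Fold segments into a max accumulator.

    The accumulator is saved whenever the (hypothetical) trigger fires,
    which here means after every `trigger_every` segments.
    """
    acc = None
    for count, value in enumerate(segments, start=1):
        acc = value if acc is None else max(acc, value)
        if count % trigger_every == 0:
            store.append(acc)  # stand-in for a write to durable storage
    return acc


checkpoints = []
final = reduce_with_checkpoints([1, 2, 1, 3, 2], trigger_every=2, store=checkpoints)
print(running_max([1, 2, 1, 3, 2]), final, checkpoints)
```

With the input from the question, `running_max` gives the 1, 2, 2, 3, 3 output stream, and only the accumulator, never the full input, is written at each trigger.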

michaeldrogalis15:10:51

@drewverlee It contains [1 2], per your example.
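
One way to read the [1 2] answer is that the log records state transitions rather than every segment: the second 1 does not change the max, so nothing is appended. The Python sketch below works under that assumption. It is not Onyx's implementation, and `apply_entry`, `process`, and `recover` are hypothetical names chosen for the example.

```python
def apply_entry(state, entry):
    """Advance the state by one changelog entry (here: 'set max to entry')."""
    return entry


def process(segments, changelog, state=None):
    """Consume segments, appending an entry only when the max changes."""
    for value in segments:
        if state is None or value > state:
            changelog.append(value)
            state = apply_entry(state, value)
    return state


def recover(changelog):
    """Rebuild the state by replaying the changelog from the start."""
    state = None
    for entry in changelog:
        state = apply_entry(state, entry)
    return state


changelog = []
process([1, 2, 1], changelog)      # the peer crashes after the second 1
log_at_crash = list(changelog)     # copy of the log as it stood at the crash
restored = recover(changelog)      # replay the log instead of the input
final = process([3, 2], changelog, state=restored)
```

After the crash the input stream resumes from the last acknowledged message, while the aggregate itself comes back from the much shorter changelog.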

michaeldrogalis18:10:54

Correct. We're going to support another kind of state recording in the future too, but incremental snapshots are the existing mechanism for recording state.

robert-stuttaford18:10:09

hey @michaeldrogalis 🙂 hope you and the gang are doing well. how’s startup life?

robert-stuttaford18:10:06

hah, yes, paperwork. ain’t that fun

michaeldrogalis18:10:21

Joking aside, pretty awesome. I'm itching to show our new product on top of Onyx. It's a game changer.

michaeldrogalis18:10:27

How about you? How's stuff?

robert-stuttaford18:10:29

itching to see it 🙂

robert-stuttaford18:10:56

doing super awesome, thanks. i’ve been getting my mend on. i’m sure you saw mention of the terraform stuff i shared over the weekend

robert-stuttaford18:10:33

i’ve been rebuilding our infrastructure layer from scratch. redoing all the builds, run scripts, environment vars, etc etc. truly cathartic

michaeldrogalis18:10:56

It feels like giving the car a wash when you do that. I agree, cathartic is the word 🙂

michaeldrogalis18:10:00

@colinhicks This is incredibly good work, wow.

robert-stuttaford18:10:15

what does this do?

colinhicks18:10:16

Thanks! It was fun. The commit history is pretty messed up thanks to a botched rebase, but I managed to get github reviewability

colinhicks18:10:56

If anyone wants to review the PR, lmk and I'll add you as a collaborator

robert-stuttaford18:10:26

that’s fantastic!!

colinhicks18:10:32

@robert-stuttaford, you can also compare onyx-gen-doc's own README to its template: https://raw.githubusercontent.com/colinhicks/onyx-gen-doc/master/README.template.md for meta-documentation goodness

robert-stuttaford18:10:53

well done colin. i’m sure this is going to make life a lot easier for the onyx team to manage an ever growing list of projects

colinhicks18:10:39

thanks - hopefully, indeed!

michaeldrogalis19:10:14

Looks awesome. I'll do a bigger review tonight and coordinate getting this into the release process for each repo 🙂

michaeldrogalis19:10:18

Huuuuge thank you. ^^

colinhicks19:10:32

sounds good. you're welcome 🙂

Drew Verlee19:10:48

> I'm itching to show our new product on top of Onyx. It's a game changer.
Release soon so i can include it in my onyx vs spark vs flink comparison 🙂

michaeldrogalis19:10:48

@drewverlee It's looking like December will be the time that we get our first few customers on board and put out a public technical preview. All I can say now is that we have made some truly novel advancements in distributed processing. I'm expecting the community to grow substantially after going public with it.