#onyx
2016-08-11
aengelberg17:08:10

Is it possible to set a window on an output task?

aengelberg17:08:24

My goal is to keep track of how many segments have fully propagated from the input to the output, then fire a trigger if that reaches a certain number of segments.

lucasbradstreet17:08:40

Yes, that is possible

lucasbradstreet17:08:55

You can also set onyx/fns on input and outputs
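For reference, a window plus a segment-count trigger on an output task might look something like the following sketch. The task name `:out`, the window id, the 1000-segment threshold, and the sync fn are all hypothetical; the keys follow Onyx's windowing/trigger data API:

```clojure
;; Hedged sketch: a global count window attached to an output task,
;; with a :segment trigger that fires after a threshold of segments.
;; :out, :seg-count-window, the threshold, and ::fire-fn are made up.
(def windows
  [{:window/id          :seg-count-window
    :window/task        :out            ; windows may be set on output tasks
    :window/type        :global
    :window/aggregation :onyx.windowing.aggregation/count}])

(def triggers
  [{:trigger/window-id  :seg-count-window
    :trigger/refinement :onyx.refinements/accumulating
    :trigger/on         :onyx.triggers/segment
    :trigger/threshold  [1000 :elements] ; fire once 1000 segments are seen
    :trigger/sync       ::fire-fn}])     ; user-supplied sync function
```

The `:trigger/sync` var would point at a function in your own namespace that reacts to the window state when the trigger fires.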

Travis17:08:57

This might be a dumb question, or hard to answer: we set up metrics/monitoring with the Grafana/Riemann/InfluxDB stack and set up queries pretty much the same as in the onyx-benchmark project. The question is, I have no idea what we're really looking at, i.e. what is good vs. bad?

lucasbradstreet18:08:15

The two I monitor most are complete latency (the time it takes for a segment to travel through the entire DAG and have all its child segments acked), and retry count

lucasbradstreet18:08:28

If the complete latency is about what I expect, and the retry count is zero, then things look good

lucasbradstreet18:08:34

If not, then you need to start drilling into things

Travis18:08:03

thanks, will give that a shot. Hard part is, I don’t really know what to expect, lol

lucasbradstreet18:08:32

Oh, the other thing to look out for is the pending count.

Travis18:08:44

yeah, what exactly does that mean?

lucasbradstreet18:08:54

This will only be measured on your input tasks. Basically it’ll give you an idea of how many segments from the input task are in flight at any time

Travis18:08:00

in my case i am reading from kafka

lucasbradstreet18:08:29

Right, in the case of Kafka, it'll probably get refilled pretty quickly anyway, so it may just sit around 10000 (the default pending size), and that is OK
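For context, the pending ceiling comes from the input task's catalog entry. A hedged sketch of a Kafka input entry follows; the task name, topic, and deserializer fn are hypothetical, and the 10000 value matches the default mentioned above:

```clojure
;; Hedged sketch of an onyx-kafka input catalog entry.
;; :read-from-kafka, "events", and :my.app/deserialize are made up.
{:onyx/name             :read-from-kafka
 :onyx/plugin           :onyx.plugin.kafka/read-messages
 :onyx/type             :input
 :onyx/medium           :kafka
 :kafka/topic           "events"               ; hypothetical topic
 :kafka/zookeeper       "127.0.0.1:2181"
 :kafka/deserializer-fn :my.app/deserialize    ; hypothetical fn
 :onyx/max-pending      10000  ; cap on in-flight (pending) segments;
                               ; 10000 is the default
 :onyx/batch-size       100}
```

With a fast producer like Kafka, the in-flight count tends to sit at this cap, which is why the metric hovering at 10K is expected rather than alarming.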

Travis18:08:30

ah ok, i was thinking it might mean what is left to be pulled from kafka

Travis18:08:43

10K is the number i am seeing

lucasbradstreet18:08:14

It’s hard to advise what numbers to look out for, because many of them will depend a lot on your problem

lucasbradstreet18:08:44

The only unequivocally bad one is message retries

Travis18:08:49

yeah, i totally get that

Travis18:08:01

that's why i wasn’t sure how much help i could get on this, lol

lucasbradstreet18:08:24

If your retry count is 0 then it becomes more of a question of whether the complete latency / throughput is good enough for your needs. If it isn’t then you should start drilling down into what you can optimise

Travis18:08:44

ok, that definitely helps out

Travis18:08:37

I really hope to be able to bring you guys on at some point for some kind of consultation

Travis18:08:20

is it max_complete_latency? or one of the percentiles that's better to look at?

lucasbradstreet18:08:10

I usually just look at max (basically 100th percentile), because it’s calculated over X time period (10s I think?). It depends on what kind of numbers you’re trying to hit though.

lucasbradstreet18:08:24

RE: consultation, I think performance tuning / going to production is a good time to get us to help a bit. Let us know if you’re ever close to pulling the trigger 🙂

Travis18:08:54

I will, believe me!

Travis18:08:09

thanks for everything you guys do as it is

lucasbradstreet18:08:45

You’re welcome 🙂

aengelberg19:08:07

@lucasbradstreet: If I set a window on an output task and hook a trigger up to it (say, with a :segment trigger type), is that trigger guaranteed to sync after the segment has completed its operation in the output task?

aengelberg20:08:10

A different question: Do I not need to worry about using :onyx/uniqueness-key if I know that I'll have unique records coming in through the input? I want to protect against the window counting twice a record that was input once (due to a network failure and retry), but I'm not sure if I need the uniqueness key to guarantee that.
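For reference, setting a uniqueness key on the windowed task is done in its catalog entry. A minimal sketch, assuming the incoming segments carry an `:id` field (the task name and key are hypothetical):

```clojure
;; Hedged sketch: with :onyx/uniqueness-key set on a windowed task,
;; Onyx can use that field to deduplicate a retried segment so the
;; window doesn't count the same record twice.
;; :count-window-task, :my.app/identity-fn, and :id are made up.
{:onyx/name           :count-window-task
 :onyx/fn             :my.app/identity-fn
 :onyx/type           :function
 :onyx/uniqueness-key :id    ; per-record unique field used for dedup
 :onyx/batch-size     100}
```

The key only helps if each record genuinely carries a distinct value for that field; unique input alone doesn't protect against the same segment being replayed into the window after a retry.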