#onyx
2016-08-18
Travis00:08:58

For a task that has a window (collect-by-key), how does this affect the batch latency metric on that task? Does it count what the trigger does at all?

michaeldrogalis04:08:14

@camechis: It doesn't, no. It doesn't really affect the latency other than the time to put the segment in the window.

jeroenvandijk08:08:42

Congrats guys!

jeroenvandijk08:08:49

Btw, good point about not going serverless. We recently discovered AWS Lambda is kind of a joke

otfrom08:08:59

jeroenvandijk: in what particular way? AFAICT it is probably reasonable for some domains (esp if you want to do event driven stuff on AWS bits of infra like s3/kinesis/etc)

jeroenvandijk08:08:24

they have a global queue for all the functions per account

jeroenvandijk08:08:33

we ran into this limitation

jeroenvandijk08:08:34

i hope they will remove it in the future, but we were logging cloudwatch data via AWS Lambda and had a (too) large queue. This affected the scheduling of other, unrelated Lambda functions

jeroenvandijk08:08:11

So for nothing serious it sounds perfect 🙂

robert-stuttaford09:08:06

i'm using lambda for load-testing right now, with clojider. working great for that 🙂

jeroenvandijk09:08:18

That’s great 🙂

jeroenvandijk09:08:35

I don’t trust it anymore though

robert-stuttaford09:08:16

it's all about tradeoffs, as always

lucasbradstreet09:08:47

Thanks. I mean, a lot of the way it will work is in line with "serverless", but there are real resources underneath and users need enough control and insight to get the best experience. As @robert-stuttaford says, it's a trade-off, and we want to pick the right ones for Onyx and the kinds of use cases that we'll pick up

lucasbradstreet09:08:59

I wanted to say something in the blog post about it being an incredible journey this far, but that has a specific meaning in startup land :p https://ourincrediblejourney.tumblr.com

jeroenvandijk09:08:38

Yeah, it is mostly the lack of insight that turns me off. Apart from that, it really is about trade-offs, and it's probably a good tool for many use cases. The marketing of AWS is a bit off for our particular case IMO

lucasbradstreet09:08:27

Yeah, one big selling point for us moving into a more managed product is being able to set up all of the monitoring, metrics, etc. right, which will mean that most users get more insight than they would doing it all themselves (which will always be possible, but is obviously more work)

jeroenvandijk09:08:28

Sounds like a great product 🙂 Looking forward to it

jasonbell10:08:03

Great news on the funding, excellent progress. Not looked at the channel properly, been busy.

drewverlee12:08:47

Congrats! Im glad to see all your hard work is paying off!

Travis13:08:51

can someone reiterate what exactly Max Batch Latency means?

robert-stuttaford13:08:01

time taken to process 100% of the segments in the batch

robert-stuttaford13:08:14

as opposed to e.g. 99% batch latency

Travis13:08:36

ok, that's what i thought. Just wanted to make sure. Struggling to figure out why my latency is so high for one of my tasks that has a window on it. The task itself doesn't do very much at all. I am guessing it's mainly due to writing the segment into the BookKeeper journal

lucasbradstreet14:08:36

Almost definitely right. There is some latency required to ensure that everything is safely written to bookkeeper, as it has to be written to enough of the ensemble to be safely recovered

lucasbradstreet14:08:40

what kinda numbers are you getting?

Travis14:08:05

i am seeing an average of between 5-10 secs and sometimes more than 10

Travis14:08:20

current batch size of 1 @lucasbradstreet

lucasbradstreet14:08:09

wow that’s pretty long

lucasbradstreet14:08:26

I was thinking it would be closer to 300ms

Travis14:08:19

i would kill for 300ms right now

Travis14:08:03

it's taking us almost 40 minutes to do 700K records

lucasbradstreet14:08:43

Batch size 1 could be hurting you, because it hurts Onyx's ability to amortise costs. The other thing you should look at is what you're journalling, since you're using conj. If the segments are big it's gonna hurt.
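(Editor's note: batch size is set per task in the catalog. A minimal sketch, with a hypothetical task name and function — the `:onyx/batch-size` and `:onyx/batch-timeout` keys are standard Onyx catalog options:)

```clojure
;; Illustrative catalog entry: a larger :onyx/batch-size lets Onyx amortise
;; per-batch overhead (journalling, coordination) across more segments.
{:onyx/name :collect-by-key        ; hypothetical task name
 :onyx/fn :my.app/process-segment  ; hypothetical task function
 :onyx/type :function
 :onyx/batch-size 20               ; vs. the batch size of 1 discussed here
 :onyx/batch-timeout 50}           ; ms to wait before flushing a partial batch
```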

Travis14:08:14

they are somewhat big, we are going to try and trim what we can

lucasbradstreet14:08:23

5-10s is ridiculous though

Travis14:08:36

yeah, we are doing a collect by a key that we generate, which is what the window is on. After that we bucket for a bit so we can collapse the like segments down into one and write them out to ES.

Travis14:08:11

Also what would you consider a big segment just to make sure we are on the same page of “what is big”?

lucasbradstreet14:08:01

So what is most important is what you are journalling. What you return in create-state-update is the important thing

lucasbradstreet14:08:53

For example, if you receive a new segment, and return all the segments you've received as part of create-state-update, you're journalling far more than necessary

lucasbradstreet14:08:15

Because that will be journaled, but all apply-state-update needs to know about is the new segment
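(Editor's note: to make the distinction concrete, here is a sketch of a conj-style aggregation. The `:aggregation/*` map keys follow the Onyx aggregation contract being described here; the exact function signatures are assumptions based on the 0.9-era API:)

```clojure
;; Sketch of a conj-style window aggregation.
(defn collect-init [window]
  [])

(defn collect-create-state-update
  [window state segment]
  ;; This return value is what gets journalled to BookKeeper. Return only
  ;; the new segment (the delta) -- returning the whole accumulated state
  ;; would journal far more than necessary.
  segment)

(defn collect-apply-state-update
  [window state entry]
  ;; Rebuild the window state by applying the journalled delta.
  (conj state entry))

(def collect-all
  {:aggregation/init collect-init
   :aggregation/create-state-update collect-create-state-update
   :aggregation/apply-state-update collect-apply-state-update})
```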

Travis14:08:17

hmm, i don’t think we are defining any of those

Travis14:08:28

or i am not understanding

lucasbradstreet14:08:40

Ok, it sounds like you're just using conj, in which case the only thing that matters is the size of the segment when serialised

Travis14:08:13

[{:window/id window-id
  :window/task task-name
  :window/type :fixed
  :window/aggregation [:onyx.windowing.aggregation/collect-by-key :collapse-key]
  :window/window-key :rt
  :window/range [60 :seconds]
  :window/session-key :collapse-key}]

Travis14:08:20

just so you can see our window def

Travis14:08:53

- that session key in there

lucasbradstreet14:08:55

hmm. As far as I can tell, that looks fine. It should only be journalling a segment. Maybe it’s possible things are going bad when it merges windows. I don’t have time to check right now

lucasbradstreet15:08:00

maybe you can look at what it’s doing

Travis15:08:12

no worries, will do what i can

shaunxcode19:08:36

any chance of making the db for event windows pluggable?

gardnervickers19:08:29

It would not be terribly difficult

Travis20:08:12

what is the proper way to set the task scheduler? Also what is the default?

michaeldrogalis20:08:58

@camechis It is set through onyx.api/submit-job as a key. There is no default.

Travis21:08:17

i just don’t see it being passed in

Travis21:08:42

(onyx.api/submit-job peer-config
                     (onyx.job/register-job job-name config))

michaeldrogalis21:08:48

It's gotta be somewhere; it's a required, schema-checked key.

michaeldrogalis21:08:10

Print out whatever you're passing to submit-job.
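(Editor's note: `:task-scheduler` is a key of the job map passed to `onyx.api/submit-job`. A minimal sketch — the workflow, catalog, and lifecycles below are placeholders, while `:onyx.task-scheduler/balanced` is a real scheduler value:)

```clojure
;; :task-scheduler is a required key of the job map itself.
(onyx.api/submit-job
 peer-config
 {:workflow [[:in :collect-by-key] [:collect-by-key :out]]
  :catalog catalog
  :lifecycles lifecycles
  :task-scheduler :onyx.task-scheduler/balanced})
```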

Travis21:08:25

im basically using the new template

Travis21:08:38

ah, i see it

Travis21:08:03

sorry about that

Travis21:08:30

in my mind i was thinking it would be elsewhere

Travis21:08:40

If you are using the balanced task scheduler and you do not define a min and max number of peers, will a task grab as many vpeers as possible (in the balanced fashion, of course)?
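(Editor's note: Onyx's catalog supports `:onyx/min-peers` and `:onyx/max-peers` (and `:onyx/n-peers` for a fixed count) to bound how many peers a task claims. A sketch with a hypothetical task:)

```clojure
;; Hypothetical catalog entry bounding peer allocation for one task
;; under the balanced task scheduler.
{:onyx/name :collect-by-key
 :onyx/type :function
 :onyx/batch-size 20
 :onyx/min-peers 1   ; never schedule the task with fewer than 1 peer
 :onyx/max-peers 4}  ; never give it more than 4 vpeers
```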

Travis21:08:22

Another question: if you have a task with a window on it, does it get locked to a single peer? We have a task with a window and it only seems to run on a single peer. Well, only on a single host, but I guess it's possible that it might always be on the same host but a different vpeer? It just seems very different from a few of the other tasks

Travis21:08:26

Well only runs on one

michaeldrogalis21:08:00

There's nothing about it having a window that makes it only bound to a single peer.

Travis21:08:54

Ok, interesting. I thought that was the case but wasn't sure after what we see in grafana