#onyx
2015-12-04
mccraigmccraig00:12:33

@michaeldrogalis: nice - is that focusing more on the EL or is there going to be something for the T bit too?

michaeldrogalis05:12:13

@mccraigmccraig: Will add more T as needed. Need to be careful not to try to rebuild Onyx into a tool, so erring on the side of less transformational power.

robert-stuttaford06:12:06

I see questions about this sql->datomic all the time

greywolve13:12:23

@lucasbradstreet, sorry to bother, but i'm trying to debug some metrics issues we're having. the `:complete-latency` metrics don't seem to be sending at all, all the others are though. this seems to be the case for the timbre appender too (not just the custom writer i'm working on). these never seem to happen: https://github.com/onyx-platform/onyx-metrics/blob/0.8.x/src/onyx/lifecycle/metrics/metrics.clj#L104-L136 , but i'm baffled because i'm doing a (info ...) in there to check, and they are outputting metrics; it's like they are being dropped by the channel before they can be taken off the other side. I tried limiting the metrics to just one task, because I figured maybe the dropping buffer is the culprit, but nothing comes through still. i'm quite baffled. 😐
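
The dropping-buffer suspicion above can be sketched with core.async directly (a hypothetical standalone example, not the onyx-metrics code itself): once a dropping buffer is full, further puts still "succeed" but the values are silently discarded, which is exactly how metrics can appear to vanish between producer and consumer.

```clojure
(require '[clojure.core.async :as a])

(defn overflow-demo []
  (let [ch (a/chan (a/dropping-buffer 2))]
    (a/>!! ch :metric-1)
    (a/>!! ch :metric-2)
    ;; Buffer is now full: this put still returns true, but the
    ;; value is silently dropped on the floor.
    (a/>!! ch :metric-3)
    ;; Only the first two values ever reach the consumer.
    [(a/<!! ch) (a/<!! ch)]))

(overflow-demo) ; => [:metric-1 :metric-2]
```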

lucasbradstreet13:12:59

Are you using the latest version?

lucasbradstreet13:12:03

It was throughput

lucasbradstreet13:12:48

Are the metrics on an input task?

greywolve13:12:01

yeah i have it set to just the one input task now

lucasbradstreet13:12:11

And only that metric?

lucasbradstreet13:12:21

Do you get retries?

greywolve13:12:33

yup, batch, retries, and throughput seem fine

greywolve13:12:36

it's like it's not being put on the channel

lucasbradstreet13:12:47

That is super weird

greywolve13:12:44

haha i know, i've been at this all day

lucasbradstreet13:12:47

I assume your test is long enough to run for 10 seconds?

greywolve13:12:50

finally decided to ask for help

greywolve13:12:05

yeah, i ran it for a minute at a time

lucasbradstreet13:12:23

I'm out grabbing some food but can help when I get back

lucasbradstreet13:12:27

It seems to pass the event

lucasbradstreet13:12:53

That checks whether events with the complete latency tags were sent

greywolve13:12:11

thanks! hmm interesting

lucasbradstreet13:12:28

Hmm. Are you getting non zero retries?

lucasbradstreet13:12:43

Maybe it's not completing any message so it's always nil and never puts the event

lucasbradstreet13:12:49

Ah yes, the log entries you posted have :value nil

lucasbradstreet13:12:53

Hmm. Although we do still seem to put them on the channel

greywolve13:12:17

oh yeah, but there are entries that aren't nil, and they still don't get sent. also there doesn't appear to be anything stopping it from putting if the value is nil? actually the read-log (an input task) doesn't seem to be working at all: it outputs throughput and batch once, and after that it doesn't do anything

lucasbradstreet13:12:48

Yeah, that was a false lead

lucasbradstreet13:12:18

Try running the send test with your logging still in there

greywolve14:12:51

going to try the send-test in a sec

greywolve14:12:39

(the above values are being logged from the send-fn thread, but notice there's still no complete latency)

greywolve14:12:53

(that's for our input task)

lucasbradstreet14:12:29

Yeah it should be with the retry rate

lucasbradstreet14:12:05

Actually retry rate should be within the when that checks if it's an input task

lucasbradstreet14:12:23

Though it should still log every second

greywolve14:12:34

doesn't seem to print anything when i run the test

greywolve14:12:39

but it passes

lucasbradstreet14:12:45

Checking the logs with grep, not eyeballing right?

lucasbradstreet14:12:05

It's a real puzzler

lucasbradstreet14:12:52

Nothing in onyx.log?

greywolve14:12:53

oh right, silly me, lol, let me try that again

greywolve14:12:16

but i guess that's enough for the test

lucasbradstreet14:12:17

That's ok because the test only lasts around 10s

lucasbradstreet14:12:36

Retry is sampled every second. Complete latency every 10s

lucasbradstreet14:12:45

My current suspicion is that it's outputting it in your main code, but only 1/10th as often as you see it with a retry, so you may be missing it

greywolve15:12:47

you're right 🙂 the problem seems to be in my code. it looks like a silent assertion error, which doesn't pop up anywhere, in one of my fns which transforms the metric to a statsd-compatible one

greywolve15:12:28

a super simple sender fn loop works fine

greywolve15:12:49

pretty annoying that the assertions are just ignored

greywolve15:12:56

that's the whole point of having them there

lucasbradstreet15:12:01

Ok, I was going to say you may want to be careful in your future thread to catch exceptions

lucasbradstreet15:12:10

But you said it didn't happen in timbre which confused matters :)

greywolve15:12:48

yeah i missed it because it only happens 1/10th of the time like you said. lesson learned: always grep 😛 or log so that only that metric can appear

greywolve15:12:49

i guess i need to learn more about using futures 😛
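
The "silent assertion error" behaviour above follows from how Clojure futures work: any throwable raised inside a `future` (including the `AssertionError` from an `assert` or a `:pre` condition) is captured and only rethrown when the future is dereferenced. A minimal sketch, where `->statsd-metric` is a hypothetical stand-in for the real transform fn, not the actual code discussed here:

```clojure
;; Hypothetical transform fn: :pre throws AssertionError when :value is nil.
(defn ->statsd-metric [m]
  {:pre [(number? (:value m))]}
  (str (:tag m) ":" (:value m)))

;; Inside a future the AssertionError is captured, not printed.
;; Nothing appears in any log unless the future is dereferenced:
(def f (future (->statsd-metric {:tag "complete-latency" :value nil})))
;; @f  ; only here does the error surface (wrapped in an ExecutionException)

;; Catching throwables explicitly in the sender thread avoids the silence:
(defn safe-send [metric]
  (future
    (try
      (->statsd-metric metric)
      (catch Throwable t
        (println "sender thread failed:" (.getMessage t))))))
```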

greywolve15:12:28

ahh, the docs

greywolve15:12:00

oh well, at least i know my ability to narrow things down is decent, haha

greywolve15:12:10

sorry for wasting your time lucas 🙂

lucasbradstreet15:12:24

Glad it’s working, and looking forward to the PR :D

greywolve15:12:41

happy to PR if you want to add built-in metrics for datadog-flavoured statsd? not sure if you guys want that 😛 if you don't we'd probably release it as an onyx metrics plugin on its own

greywolve16:12:51

what should one send when the value of the metric is nil?

michaeldrogalis16:12:54

@greywolve: I think the thing to do is omit sending the metric. Most of these dashboards have configuration that lets you pick what to display when a metric is missing for a time interval. Correct me if I'm wrong though.
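
The suggestion above (omit the datapoint rather than sending 0) can be sketched in a few lines; the names here (`maybe-send!`, the metric map shape, the `send!` callback) are illustrative, not the onyx-metrics API:

```clojure
;; Skip metrics whose value is nil instead of reporting a misleading 0;
;; the dashboard's "missing data" handling then does the right thing.
(defn maybe-send! [send! {:keys [value] :as metric}]
  (when (some? value)
    (send! metric)))
```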

greywolve16:12:35

ta, i'll do that then 🙂

lucasbradstreet16:12:06

I would say “no value” for completion latency, not 0

lucasbradstreet16:12:14

it really depends on the metrics you’re calculating