#onyx
2016-10-04
michaeldrogalis00:10:46

@colinhicks I took a look through the code. Looks fantastic. Let me know when I should try to integrate it. 🙂

colinhicks00:10:51

Sweet. Go for it – I just pushed my last tweak.

michaeldrogalis00:10:20

Excellent. 😄

robert-stuttaford06:10:18

@lucasbradstreet quick metrics question: the latency number for a task - from where to where does that measure?

robert-stuttaford06:10:47

also, is it possible for us to disambiguate onyx time from task time? that is, have a measure for just the segment transfer and a time for the task invocation?

lucasbradstreet06:10:51

batch latency measures the time for onyx/fn to be mapped over a batch

lucasbradstreet06:10:16

complete latency measures how long it takes from when a segment is read from an input source until it has flowed through the whole job and been acked

robert-stuttaford06:10:57

so if we have variable batch sizes, the latency numbers can vary quite a bit

lucasbradstreet06:10:35

Yes, definitely. Batch latency is intended to give you an idea of how much latency each individual segment incurs because of the whole batch. I think another useful metric would be batch latency / batch size

robert-stuttaford06:10:39

so does batch latency include any measure of inter-task segment transport, then?

lucasbradstreet06:10:47

@robert-stuttaford it’s kind of hard to measure individual segments, though the metric I described above might help

lucasbradstreet07:10:22

No, it’d be kinda hard to track, since we’d probably have to compare the clock time from one node with the clock time from another

lucasbradstreet07:10:40

complete latency is probably the best measure of the overall latency that you see for a single input segment

robert-stuttaford07:10:50

great, this helps a lot. will definitely give that doc a read

robert-stuttaford07:10:29

batch latency / batch size — how hard would this be to add? i think this would be tremendously useful to monitor

robert-stuttaford07:10:59

@greywolve is going to submit a PR 🙂

lucasbradstreet07:10:32

Fantastic 🙂

lucasbradstreet07:10:34

It won’t be hard, I just haven't gotten around to it.

robert-stuttaford07:10:49

clojure.lang.ExceptionInfo: Unfreezable type: class clojure.lang.Delay clojure.lang.Delay@5201a90
    as-str: "#object[clojure.lang.Delay 0x5201a90 {:status :pending, :val nil}]"
      type: clojure.lang.Delay
clojure.lang.ExceptionInfo: Caught exception inside task lifecycle. Rebooting the task. -> Exception type: clojure.lang.ExceptionInfo. Exception message: Unfreezable type: class clojure.lang.Delay clojure.lang.Delay@5201a90
       as-str: "#object[clojure.lang.Delay 0x5201a90 {:status :pending, :val nil}]"
       job-id: #uuid "c332f7e4-40b8-4ba1-9bfe-d5e182745b1e"
     metadata: {:name "highstorm", :job-id #uuid "c332f7e4-40b8-4ba1-9bfe-d5e182745b1e", :job-hash "9a6a1b4369dc672dc9f41fb09fad4c6c51decfeac6fae387c03bf07dd35f"}
      peer-id: #uuid "4094fa21-4464-4def-baee-73b68c2a031f"
    task-name: :read-log
         type: clojure.lang.Delay

robert-stuttaford07:10:19

onyx and onyx-datomic "0.9.9"/"0.9.9.0"

lucasbradstreet08:10:46

@robert-stuttaford: no onyx/fn on the read-log task I assume? What version of the datomic dependency are you using? I think you've seen this before...

robert-stuttaford08:10:55

even if we’re doing something silly, i presume that it shouldn’t do this

robert-stuttaford08:10:23

{:onyx/name :read-log
 :onyx/plugin :onyx.plugin.datomic/read-log
 :onyx/type :input
 :onyx/medium :datomic
 :datomic/uri db-uri
 :onyx/batch-size 200
 :onyx/max-pending 2000
 :onyx/pending-timeout 60000
 :onyx/max-peers 1
 :checkpoint/key "highstorm-streaming"
 :checkpoint/force-reset? true}

lucasbradstreet08:10:26

Well, some values aren't serializable. If a task returns a value that isn't, we can either drop it or reboot and hope the problem is transient

robert-stuttaford08:10:11

Datomic "0.9.5344"

robert-stuttaford08:10:29

i remember we did have this before, a long time ago, and it was something you fixed in the read-log plugin

lucasbradstreet08:10:30

Ok, the datomic log API must be returning a delay from somewhere. Is it happening all of the time?

lucasbradstreet08:10:46

I don't think we ever tracked it down

robert-stuttaford08:10:49

seen it on 1/3 instances so far. checking the others

robert-stuttaford08:10:31

yeah it happened on all 3 instances

robert-stuttaford08:10:07

perhaps read-log can detect a Delay and return an alternate suitable value?

lucasbradstreet08:10:32

Possibly, yes, but I will want to know why the log API is returning it first. If you'd like, you can put an onyx fn on it now to filter it out

robert-stuttaford08:10:29

it’s this line, right?

robert-stuttaford08:10:57

i’ll ask in #datomic

lucasbradstreet08:10:27

Looks ok to me, but maybe the way it implements seq is leaking delays out

robert-stuttaford08:10:28

we’ll find out in ~6 hours 🙂

lucasbradstreet08:10:18

No worries. If datomic was open source I'm sure we'd figure it out pretty quickly...

robert-stuttaford14:10:44

@lucasbradstreet just pasting this here for you because it might scroll past before you’re back at your desk

robert-stuttaford14:10:22

from @marshall in #datomic:

marshall [15:39]  
<@U0509NKGK> the use of seq would be my guess as well

marshall [15:56]  
<@U0509NKGK> looking again - there’s also a `take read-size` which could be doing it

dominicm14:10:13

Am I understanding this correctly - you've found a bug in datomic?

robert-stuttaford14:10:30

that’s debatable 🙂

dominicm14:10:40

tx-range returns a sequence of transactions, and you're doing a take from them, and that could cause a delay to fall out...? Seems like a datomic bug to me. Or does it not return a collection like a vector or list, and the problem is a leaky conversion in the way you treat that log?

dominicm14:10:01

It does promise that each element is a map

robert-stuttaford14:10:37

my guess is it’s a leaky conversion

robert-stuttaford14:10:50

it may be due to the fact that Datomic is JVM tech, not just Clojure tech

dominicm14:10:18

It does technically say a "range" - I have no idea what a range would be in terms of "types" though.

lucasbradstreet15:10:22

Thinking about it, it seems like one of the elements that it returns must be a delay, which suggests to me that the sequencing isn't the problem; it's the fact that a delay is leaking out rather than the value. It probably doesn't have any elements and isn't getting derefed somewhere?

lucasbradstreet15:10:18

I'm pretty sure it's datomic's fault, whatever the case, because we don't use delay in onyx anywhere

robert-stuttaford15:10:25

yep. Marshall is elevating to the Datomic team

michaeldrogalis15:10:16

That's a first for us, heh.

robert-stuttaford16:10:17

more in #datomic 🙂

robert-stuttaford16:10:28

really just playing messenger now 🤓

robert-stuttaford16:10:48

probably simpler for Lucas to talk to them directly?

michaeldrogalis22:10:49

@colinhicks I had a play around with onyx-gen-doc and tried it on onyx-kafka. Fantastic stuff. I think next steps will be to move the repo into onyx-platform, if you're cool with that, then get it on our automatic release process so we can put a jar out to Clojars.

michaeldrogalis22:10:57

Then one at a time, we can convert the plugins over.

colinhicks23:10:38

Great. Just sent the transfer req to you

colinhicks23:10:48

Seems like the plugin doc conversion process is a great place for community contributions, too...

michaeldrogalis23:10:54

Great, I moved onyx-gen-doc into onyx-platform and invited you as a contributor with write access. 🙂