#onyx
2016-09-08
aspra07:09:10

hey @vladclj I'm not sure I understand what you mean by combining those params

aspra08:09:35

what do you want to do with different headers as part of the segment params?

vladclj08:09:20

@aspra Headers were just an example; what about the url? segment: {:args {:headers {"content-type" "application/xml"} :body xml}} and task-map: {:http-output/url "http://localhost:41300/"}

aspra08:09:20

@vladclj are you using

:onyx/plugin :onyx.plugin.http-output/output
or
:onyx/plugin :onyx.plugin.http-output/batch-output
?

aspra08:09:05

if you are using the batch one

As the segments can no longer be used for request parameters, these are instead supplied via the task-map.

aspra08:09:29

is that what you mean?

vladclj08:09:02

@aspra :onyx/plugin :onyx.plugin.http-output/output

aspra08:09:50

then :url should be a segment parameter
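
To make that concrete, a sketch of the non-batch setup being described, with request parameters carried by each segment rather than the task-map (key names taken from the snippets above and the 0.9-era onyx-http plugin; check the plugin README for your version):

```clojure
;; Task map: no :http-output/url here -- each segment carries its own URL.
{:onyx/name :write-http
 :onyx/plugin :onyx.plugin.http-output/output
 :onyx/type :output
 :onyx/medium :http
 :onyx/batch-size 10}

;; Segment: the URL and request args travel with the data.
{:url "http://localhost:41300/"
 :args {:headers {"content-type" "application/xml"}
        :body "<xml>...</xml>"}}
```

With the batch-output variant it is the other way around: the segments can no longer carry request parameters, so they move into the task-map.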

zamaterian12:09:49

I can't find onyx-metrics 0.9.10.0-beta1 on Clojars: https://clojars.org/org.onyxplatform/onyx-metrics/versions

lucasbradstreet12:09:31

@zamaterian thanks for the heads up. It didn’t release properly

lucasbradstreet12:09:26

@zamaterian it’s been released. Thanks!

zamaterian12:09:40

Anytime 🙂

lucasbradstreet12:09:24

Our new relic testing account expired so it broke our tests

Travis14:09:47

That sucks

zamaterian14:09:46

btw after upgrading to metrics 0.9.10.0-beta1 from 0.9.7.0 we are seeing metrics statements with value nil eg : {:tags ["complete_latency_90th" "onyx" ":partition-keys" :bbr-kode "b7762834-3971-42a6-b9ab-a6d256255541"], :service "[:partition-keys] 90_0th_percentile_complete_latency", :task-id :partition-keys, :job-name :bbr-kode, :job-id #uuid "43dcd22d-9e5e-4f78-bf54-74fc7b9252ec", :value nil, :window "10s", :task-name :partition-keys, :label "10s 90th percentile complete latency", :quantile 0.9, :period 10, :metric :complete-latency, :peer-id #uuid "b7762834-3971-42a6-b9ab-a6d256255541"}

michaeldrogalis15:09:19

Sorry, I should've caught the failed metrics build -- not sure how that one escaped me.

michaeldrogalis15:09:42

@zamaterian I seem to recall that there were some changes to the key names that get emitted in the last few versions.

michaeldrogalis15:09:04

Not sure though, need to go have a look at the changelog

zamaterian15:09:23

Ok, no hurry 🙂 We are working around it.

michaeldrogalis15:09:32

@zamaterian What're you using to view the metrics, btw? New Relic?

zamaterian15:09:23

aws cloudwatch

zamaterian15:09:54

Trying to find the right throughput our Datomic transactor can handle (importing about 100 million records in total; Datomic is on top of DataStax Cassandra)

michaeldrogalis15:09:46

Interesting. What have you been able to top out at with Onyx writing to it so far?

zamaterian15:09:03

quite a low figure, but that was because our private cloud was provisioning the Datomic transactor too low. To complicate things, all our data is in a bitemporal format, so each record is transacted in its own Datomic transaction with attributes added to the transaction. But I'm ironing out the last few issues we have before trying to maximize throughput

michaeldrogalis15:09:50

There are definitely a lot of knobs to turn on both systems, plus the underlying hardware, to get right.

zamaterian15:09:00

The reason for adding metrics to CloudWatch is to compare them with Datomic's

michaeldrogalis16:09:15

Yeah -- quite sensible.

codonnell18:09:19

Could someone explain what is meant by the extents of a window, as used in http://www.onyxplatform.org/docs/user-guide/0.9.10-beta1/#__code_watermark_code?

gardnervickers18:09:09

A window divides time into chunks, an extent is one of those chunks.

codonnell18:09:51

So if you had a sliding window with a range of 15 minutes that slides every 5 minutes, you'd have three extents?

gardnervickers18:09:12

Per 15 minute period, yes

codonnell18:09:30

Alright, thanks for the explanation.

codonnell18:09:54

And just found it in the user guide. 😳

gardnervickers18:09:14

Each extent would accept segments from a time range, the first 0-15, the second 5-20, the third 10-25
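
The sliding window being described could be sketched as a catalog entry along these lines (a hypothetical example using the 0.9 window schema; names like :event-counts and :count-events are made up):

```clojure
;; Sliding window: a new 15-minute extent is created every 5 minutes,
;; so a segment whose :event-time falls at minute 12 lands in the
;; 0-15, 5-20, and 10-25 extents -- three extents per segment.
{:window/id :event-counts
 :window/task :count-events
 :window/type :sliding
 :window/aggregation :onyx.windowing.aggregation/count
 :window/window-key :event-time
 :window/range [15 :minutes]
 :window/slide [5 :minutes]}
```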

michaeldrogalis18:09:46

Should help illuminate how extents are used for bounding.

aaelony18:09:17

I'm now able to get correct counts with a window aggregation, but only when I use :trigger/threshold [5 :elements], which gives me the running counts toward the correct totals in 5 element increments. The last number is the correct one. How can I specify a reporting time period for the final counts within a window extent? If I replace :trigger/threshold [5 :elements] with :trigger/period [5 :seconds] I get an error... Error in process state loop.

michaeldrogalis18:09:20

There's not really a notion of finality in the Dataflow/Beam model. The incremental views of the window are meant to give visibility of state over time. To figure out what the "final" state is, you'd just wait until the job is complete, then check the state. Triggers will fire on task completion to flush any partial state out.

aaelony18:09:03

got it, in that case I must be printing the window state too often.

aaelony18:09:43

setting the integer really high works too 🙂

michaeldrogalis18:09:04

It's a trade-off though. The less often you sync your trigger state, the more data you accrete in memory and the longer it'll take to write to outside storage -- presumably stdout won't be the long-term target of your trigger.

michaeldrogalis18:09:21

So I'd be wary of just jacking up the trigger period there.

aaelony18:09:45

makes sense. I'm using :window/window-key :event-time :window/range [2 :days] in the window spec though, and I have a sense of how many segments would arrive in that time frame. Ideally, I'd like it to output the counts it knows about for the 2-day range every N minutes (and conj in a label that clearly states these are counts after running for N minutes)
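
A sketch of that setup with a timer trigger instead of an element-count threshold (0.9-era schema; :daily-counts, :count-task, and ::write-counts! are hypothetical names). One possible cause of the earlier error is pairing :trigger/period with :trigger/on :segment -- the period belongs with :trigger/on :timer:

```clojure
;; Window over event time with a 2-day range, as described above.
{:window/id :daily-counts
 :window/task :count-task
 :window/type :fixed
 :window/aggregation :onyx.windowing.aggregation/count
 :window/window-key :event-time
 :window/range [2 :days]}

;; Timer trigger: sync the running counts every 5 minutes instead of
;; every 5 elements. :trigger/sync names the fn that receives the state.
{:trigger/window-id :daily-counts
 :trigger/refinement :accumulating
 :trigger/on :timer
 :trigger/period [5 :minutes]
 :trigger/sync ::write-counts!}
```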

codonnell19:09:15

I'm fixing some typos in the user guide. Does anyone know how many columns these .adoc files are supposed to wrap after? I don't want to mess up the formatting.

michaeldrogalis19:09:49

Thanks @codonnell. AsciiDoc automatically wraps based on the HTML target, so don't worry -- your changes will be fine

michaeldrogalis19:09:03

It's a pretty handy feature.

codonnell19:09:57

@michaeldrogalis alright, I won't worry too much about it. Is = cores = = virtual peers + = subscribers supposed to be cores = virtual peers + subscribers in the performance tuning section?

michaeldrogalis19:09:28

Yeah, looks like that expr got mangled in the conversion from Markdown. Nice catch.

codonnell19:09:55

@michaeldrogalis PR with fixes submitted.

codonnell19:09:22

On a side note, this new single page user guide is seriously awesome. It's much easier to go through and learn sequentially, as opposed to visiting isolated pages from earlier.

michaeldrogalis19:09:17

@codonnell Thanks for the PR. @vijaykiran is the brave soldier who did the new user guide ^^

michaeldrogalis19:09:30

I agree, blows the old version out of the water.

michaeldrogalis20:09:37

By the way, only thing you need to do to build the docs is run asciidoctor index.adoc. asciidoctor is a gem. Nice and simple 🙂

codonnell20:09:54

wow that's super easy

michaeldrogalis20:09:34

Easy choice to switch away from Markdown with all the other features Adoc has, too.

michaeldrogalis20:09:49

I just built and confirmed all the changes. Nice work. I'll get beta2 out soon so these docs will be on the site.

aengelberg21:09:26

Would it make sense to run a bunch of Onyx peers in the cloud but submit jobs from a local repl?

aengelberg21:09:09

I'm not sure what the Onyx peer config needs to be for a situation like that.

aengelberg21:09:38

can :onyx.messaging/bind-addr be set to localhost if all of the worker peers are on the same machine?

gardnervickers21:09:42

You just need to connect to Zookeeper under the same tenancy

gardnervickers21:09:47

bind-addr just needs to be routable from all peers under your tenancy.
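
Putting that together, a sketch of the worker peer-config for this scenario (0.9.x keys; the tenancy key was :onyx/id at this version, and the addresses here are made-up examples). The local repl that submits jobs only needs to share :zookeeper/address and the same :onyx/id:

```clojure
{:zookeeper/address "zk.internal:2181"
 :onyx/id "my-tenancy"                  ; shared by workers and the submitting repl
 :onyx.messaging/impl :aeron
 ;; Must be reachable from the other worker peers. "localhost" is fine
 ;; only when every worker peer runs on the same machine.
 :onyx.messaging/bind-addr "10.0.0.12"
 :onyx.messaging/peer-port 40200}
```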

aengelberg21:09:02

But does that include the "peer" created when I submit jobs?

gardnervickers21:09:31

Nope, submitting the job is just a write to zookeeper.

gardnervickers21:09:32

Your machine that submits the job won't go through the joining algorithm, and won't be a "peer" in the cluster

aengelberg21:09:01

Awesome, that's what I needed to know. Thanks!

Travis21:09:36

@gardnervickers One thing I've had in mind that I haven't really thought about yet, but maybe you have some thoughts on the subject: what's the best way to submit jobs on Mesos, and how do you upgrade jobs without losing data?

gardnervickers21:09:02

What do you mean by losing data?

Travis21:09:31

probably a bad way to say it

Travis21:09:47

How would you update a job running in production

Travis22:09:56

in general

gardnervickers22:09:16

It's highly dependent on your job and what you're updating. You need to think about if you need to recompute data or not under the new job. In general I would kill the old job and start a new one

Travis22:09:50

so when you kill a job will data in flight finish before the job actually stops?

Travis22:09:25

Also, when you restart the job, would you use the same job id so it knows where to pick up? Kafka in my case

gardnervickers22:09:39

Kafka has its concept of consumer groups, I'm pretty certain you just have to make your new input task share the consumer group with your old input task. @michaeldrogalis would know more about that.

gardnervickers22:09:04

When you kill a job things in-flight just won't be acked

Travis22:09:28

yeah, it just occurred to me that they won't be acked, so those would get replayed on restart

Travis22:09:37

assuming it knows where to start

michaeldrogalis22:09:04

@camechis In-flight data that is dropped due to a killed job will be replayed for the next job that resumes consumption from the same inputs, assuming they're sharing checkpoints.

Travis22:09:48

That's what I figured, just need to figure out and understand how to make that happen

michaeldrogalis22:09:41

How to make what happen? Kill/resubmit jobs?

Travis22:09:26

To kill it I assume you use the kill-job API, but mainly I want to restart and have it pick up where it left off

michaeldrogalis22:09:46

Depends on the plugin, but it is generally the default behavior, and is automatically done for you as long as you share some identifier between jobs. For the Kafka plugin, it's the consumer group parameter.
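
For the Kafka case, a sketch of the input task-map where the consumer group is the shared identifier (key names from the 0.9-era onyx-kafka plugin; the topic, addresses, and ::deserialize-edn are hypothetical -- an EDN deserializer also answers the question below about what folks use):

```clojure
;; Resubmitting a job whose input task shares :kafka/group-id with the
;; old job lets it resume from the old job's committed offsets.
{:onyx/name :read-events
 :onyx/plugin :onyx.plugin.kafka/read-messages
 :onyx/type :input
 :onyx/medium :kafka
 :kafka/topic "events"
 :kafka/group-id "my-job-consumer-group"   ; shared across job restarts
 :kafka/zookeeper "zk.internal:2181"
 :kafka/deserializer-fn ::deserialize-edn
 :onyx/batch-size 100}

;; A typical EDN deserializer referenced above:
(defn deserialize-edn [bytes]
  (clojure.edn/read-string (String. bytes "UTF-8")))
```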

Travis23:09:41

Gotcha, I will take a look at the Kafka settings

aaelony23:09:52

curious, what deserializer are folks using with Kafka?

Travis23:09:19

I'm currently using edn

aaelony23:09:36

I have strings...