#onyx
2016-08-24
mariusz_jachimowicz08:08:12

I am thinking about improving onyx-dashboard, for instance by showing a ZK connection failure immediately. Currently the connection is based on the BoundedExponentialBackoffRetry policy. Wouldn't it be better to use, for instance, the RetryOneTime policy, so I could catch connection problems quickly and show them in the UI?

lucasbradstreet09:08:45

Hi @mariusz_jachimowicz. For #54, yes, maybe RetryOneTime would be better, and then we display it in the UI as you say
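
For reference, a minimal sketch of what the fail-fast connection might look like from Clojure, using Curator directly (the function names here are hypothetical, not the dashboard's actual code):

(ns dashboard.zookeeper
  (:import [org.apache.curator.framework CuratorFrameworkFactory]
           [org.apache.curator.retry RetryOneTime BoundedExponentialBackoffRetry]
           [java.util.concurrent TimeUnit]))

;; Hypothetical sketch. RetryOneTime retries once after 1000ms and then
;; gives up, so a bad ZooKeeper address surfaces almost immediately,
;; whereas the current BoundedExponentialBackoffRetry policy keeps
;; backing off, e.g. (BoundedExponentialBackoffRetry. 1000 30000 5).
(defn connect [zk-addr]
  (doto (CuratorFrameworkFactory/newClient zk-addr (RetryOneTime. 1000))
    (.start)))

(defn connected?
  "True if the client reaches ZooKeeper within 5s; false means the UI
   can show a connection-failure message right away."
  [client]
  (.blockUntilConnected client 5 TimeUnit/SECONDS))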

mariusz_jachimowicz09:08:04

Ok, so I will try to do it

lucasbradstreet10:08:37

That would be fantastic. Thanks, and let me know if you need any pointers

mccraigmccraig10:08:18

@gardnervickers dyu want a PR for onyx-template and maybe lib-onyx to add repl dev features back, along the lines of these - https://www.refheap.com/122498 https://www.refheap.com/122499
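
The refheap pastes above are the source of truth for what was proposed; purely as an illustration, a reloaded-style dev namespace for Onyx might look something like this (config abbreviated, all names hypothetical):

(ns dev
  (:require [clojure.tools.namespace.repl :refer [refresh]]
            [onyx.api]))

;; Abbreviated peer config; a real project would also set the messaging
;; and job-scheduler keys, usually read from a config file.
(def peer-config
  {:zookeeper/address "127.0.0.1:2188"
   :onyx/id "dev-tenancy"})

(defonce peer-group (atom nil))
(defonce peers (atom nil))

(defn start
  "Start a peer group and three virtual peers for REPL development."
  []
  (reset! peer-group (onyx.api/start-peer-group peer-config))
  (reset! peers (onyx.api/start-peers 3 @peer-group)))

(defn stop
  "Shut down the peers and peer group so namespaces can be reloaded."
  []
  (when @peers (run! onyx.api/shutdown-peer @peers))
  (when @peer-group (onyx.api/shutdown-peer-group @peer-group))
  (reset! peers nil)
  (reset! peer-group nil))

(defn reset
  "Stop everything, reload changed namespaces, then start again."
  []
  (stop)
  (refresh :after 'dev/start))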

lucasbradstreet10:08:08

Good question. I guess it depends on whether we foresee many changes to it going forward. It'll be easier for users to keep up to date if it's in the lib

gardnervickers12:08:37

@mccraigmccraig: sorry for the lack of response. PRs are always welcome, but I need to think through this a bit more. I'm not sure I see why you would want this over the with-test-env macro.

lucasbradstreet12:08:56

@gardnervickers: I feel the same, though @jeroenvandijk had something similar which they were using. My guess is it's a bit more interactive. You can stick some values on Kafka as it's going and monitor what happens
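
For contrast, the test-oriented approach being referred to: onyx.test-helper/with-test-env stands up a throwaway environment around the body. A sketch, with configs abbreviated and the job itself elided:

(ns my.app.job-test
  (:require [clojure.test :refer [deftest is]]
            [onyx.test-helper :refer [with-test-env]]
            [onyx.api]))

;; Abbreviated configs; real ones carry more messaging settings.
(def id (java.util.UUID/randomUUID))
(def env-config {:zookeeper/address "127.0.0.1:2188"
                 :zookeeper/server? true
                 :zookeeper.server/port 2188
                 :onyx/id id})
(def peer-config {:zookeeper/address "127.0.0.1:2188"
                  :onyx/id id
                  :onyx.peer/job-scheduler :onyx.job-scheduler/greedy
                  :onyx.messaging/impl :aeron
                  :onyx.messaging/bind-addr "localhost"
                  :onyx.messaging/peer-port 40200})

(declare my-job) ; workflow/catalog/lifecycles elided for brevity

(deftest roundtrip-test
  ;; Starts an in-memory ZooKeeper, an env, and 3 peers, and tears it
  ;; all down when the body exits, pass or fail.
  (with-test-env [test-env [3 env-config peer-config]]
    (let [{:keys [job-id]} (onyx.api/submit-job peer-config my-job)]
      (is (onyx.api/await-job-completion peer-config job-id)))))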

lucasbradstreet12:08:32

Also, the feedback is a bit quicker if you just want to redef a function and stick a new value on

mccraigmccraig12:08:09

i have an onyx job always running, along with an api and an app ... i develop interactively - try things out in the app, messages get sent to kafka, through onyx and out again... change something in onyx, reload, try again
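
A hypothetical helper for that loop: after re-evaluating a task function, kill the running job and resubmit the new definition (nothing like this ships in lib-onyx as of this discussion):

(require '[onyx.api])

(defonce running-job (atom nil))

(defn resubmit!
  "Kill the currently running job, if any, then submit the new job map.
   Returns the submit-job result, which includes the new :job-id."
  [peer-config job]
  (when-let [{:keys [job-id]} @running-job]
    (onyx.api/kill-job peer-config job-id))
  (reset! running-job (onyx.api/submit-job peer-config job)))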

gardnervickers12:08:18

It might make more sense when developing applications that are Onyx+other-services. I would be happy supporting a +reloaded switch or something for the template.

mccraigmccraig12:08:45

yes @gardnervickers - i have an app, api and onyx with messages flowing around a cycle

gardnervickers12:08:46

Ok gotcha, makes sense.

lucasbradstreet12:08:24

Yeah. To be honest, the only real downside I see is that more template code needs to be kept up to date as Onyx is upgraded

lucasbradstreet12:08:46

as well as general template bloat

lucasbradstreet12:08:01

+reloaded is a good compromise

mccraigmccraig12:08:04

i could shift most of it to lib-onyx i think

mccraigmccraig12:08:07

@jeroenvandijk yeah, that looks closer to what i was using before i rebuilt from a newer onyx-template

mccraigmccraig12:08:02

i didn't use component at all this time, 'cos i'm not a fan, and it's not really necessary here

mccraigmccraig12:08:22

are you one of the distributed masonry team @gardnervickers ?

gardnervickers12:08:21

Yes, I'm usually a bit more active around here but there's been a lot of product work keeping me busy!

mccraigmccraig12:08:38

product work is good 🙂

mccraigmccraig12:08:19

if i do a lib-onyx PR which preserves the current api and adds a couple of peer and job functions, and an onyx-template PR which adds a +reloaded switch referencing those fns, would that work?

gardnervickers12:08:22

That would be most appreciated!

mccraigmccraig12:08:37

on another onyx-template note... is there a reason the run_media_driver.sh docker script doesn't include mount -t tmpfs -o remount,rw,nosuid,nodev,noexec,relatime,size=1024M tmpfs /dev/shm ?

gardnervickers12:08:59

Yup, the shm size is set through docker now

gardnervickers12:08:12

You can't do both, it'll break

mccraigmccraig12:08:01

ah, ok - my mesos install isn't setting it... maybe i have some tuning to do

gardnervickers12:08:32

If you have the ability to set a memory volume, that will work too

lucasbradstreet12:08:33

Which is somewhat unfortunate, since it's more likely to get configured incorrectly (though it is better because of priv mode). I wonder if we could check the shm size in the script so we can throw a better error
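
A sketch of what that check could look like, done at peer start-up from Clojure rather than in the shell script (hypothetical, not existing Onyx behaviour):

;; Hypothetical fail-fast check: Aeron's media driver backs its buffers
;; with /dev/shm, so refuse to start if the container was launched
;; without a large enough --shm-size.
(defn check-shm-size!
  [min-bytes]
  (let [shm (java.io.File. "/dev/shm")]
    (when (and (.exists shm) (< (.getTotalSpace shm) min-bytes))
      (throw (ex-info "/dev/shm is too small for the Aeron media driver; set --shm-size on docker run"
                      {:total-bytes (.getTotalSpace shm)
                       :required-bytes min-bytes})))))

;; e.g. require at least 512MB
(check-shm-size! (* 512 1024 1024))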

gardnervickers12:08:46

You can mount your hosts /dev/shm inside your container too

gardnervickers12:08:00

But yea we should probably check

jasonbell12:08:12

Some more detailed documentation about the Docker way of doing things would be helpful. I was really fumbling in the dark last night, which got a little frustrating - mainly down to my lack of Docker knowledge: how to pass in configuration properly and so on.

jasonbell12:08:46

I don’t mind doing the blog posts still, don’t get me wrong.

Travis12:08:31

You can set the shm_size through parameters in marathon

mccraigmccraig12:08:01

ah, yeah, just found that @camechis

jasonbell13:08:18

@mccraigmccraig - McCraigMcCraig of Clan McCraig, I should introduce myself as a member of the Hugged By Bruce club 🙂

mccraigmccraig13:08:58

that's getting to be a big club!

otfrom13:08:02

mccraigmccraig I am terribly sorry about all the capital letters that jasonbell used there

otfrom13:08:08

he doesn't know about your phobia ;-)

jasonbell13:08:33

I was just being adult and polite…. I’ll learn from that mistake.

mccraigmccraig13:08:05

i need to start using erc @otfrom so i can mod it to downcase everything

jasonbell13:08:42

i promise never to use uppercase again….

lucasbradstreet13:08:50

Oh no. The channel has been struck by a case of capitallessness. Do I cauterise the wound?

lucasbradstreet13:08:59

For the record I am ok with English spelling

lucasbradstreet13:08:00

I'd be ok if that caught on

mccraigmccraig13:08:00

@lucasbradstreet i figure there are far more americans around these days, so american spelling is really normative english 😬

lucasbradstreet13:08:59

Then really we should all be communicating in emojis and text shorthand

lucasbradstreet13:08:45

In which case capitalisation is back, now that it happens automatically :p

jasonbell13:08:04

capitalization surely @lucasbradstreet 😉

lucasbradstreet13:08:02

Only if capitalization is a public facing API name

Travis16:08:21

@lucasbradstreet Just to give an update on our ongoing battle. One thing we noticed is that we had way too much logging going on, causing the Mesos log roller to go crazy. Turning logging way back really helped things. With no windowing at all we achieved almost 2 million in about 5 minutes; enabling windowing with hardly any data in it pushed that out to about 15-20 minutes. Then adding back the writing to elastic in the trigger killed it. So the Kafka throughput numbers were roughly: 500K with no window, 150K with a window (small data), 50K with a window and the full data set, and 10K with a window, the full dataset, and the trigger

lucasbradstreet16:08:46

Oh, great, that’s more like it.

lucasbradstreet16:08:57

Is the logging anything to do with Onyx that we could improve?

Travis16:08:03

no, that was our own fault

lucasbradstreet16:08:54

K, those numbers are more like what I’d expect.

Travis16:08:04

but I have come to the conclusion that writing to elastic in our trigger, combined with windowing, is really killing the performance, given that it degraded with windowing enabled even when only doing something as simple as a count. I think our bookies just really suck

lucasbradstreet16:08:04

It sounds like the trigger is mostly to blame?

Travis16:08:12

it's definitely a huge part

lucasbradstreet16:08:42

windowing is definitely going to reduce performance because it’ll add commit latency, plus it’ll be writing to disk (unlike without windowing, which is purely in memory)

Travis16:08:50

the combo of that and the window makes it dog slow. Doing the windowing with a trigger that does everything except the write to elastic is still not great, but much better

lucasbradstreet16:08:04

Faster bookie servers could help tho

Travis16:08:34

yeah, right now our bookies are on 5-year-old servers with old disks, and we can't split the journal and ledger onto separate disks

Travis16:08:05

we are soon moving to AWS, so we are going to set up separate bookies with 2 EBS volumes for the ledger and journal

lucasbradstreet16:08:53

That should help. You can also reduce the number of volumes that you need to write to via the configured quorum size, depending on your fault tolerance requirements
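
The knobs in question should be the BookKeeper keys in the peer config; with a quorum of 2 out of an ensemble of 3, each window-state write only has to reach 2 bookies' disks. An illustrative fragment (check the Onyx cheat sheet for the exact keys and defaults in your version):

{:onyx.bookkeeper/ledger-ensemble-size 3
 :onyx.bookkeeper/ledger-quorum-size 2}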

lucasbradstreet16:08:17

Trigger performance will primarily be a function of how fast whatever you’re doing in the trigger is

Travis16:08:59

agreed, I think what we are going to do is start sending the triggered data back into Kafka so we can do a little more enrichment, and maybe use the elastic output plugin

Travis16:08:22

at least until the feature of being able to start new tasks from a trigger is available

lucasbradstreet16:08:34

Sounds reasonable

Travis16:08:55

cool, so I think we are done measuring on the infrastructure we have, and we will see what it looks like once we get stood up on AWS

Travis16:08:09

see if things look better or worse

aaelony18:08:47

general question: so far, I've been running tests using large files, and now I feel ready to tackle reading from Kafka. I'd first like to set up a mock Kafka and populate it from a file. What is the recommended way of going about this? Should I closely follow this (https://github.com/onyx-platform/onyx-kafka/blob/0.9.x/test/onyx/plugin/input_test.clj) or is there another resource or template that might help me get started?

aaelony18:08:47

need I bring in Franzy for the Kafka work first, or can that be handled via the onyx-kafka plugin?

aaelony19:08:48

retract that last statement... playing first with franzy 😉

Travis19:08:53

@aaelony What I have done is set up Kafka via Docker Compose, with a small util to load some data into it, then hook Onyx up to it via the Kafka plugin
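
For the onyx-kafka side, the catalog entry is the main wiring. A sketch along the lines of the 0.9.x plugin docs (values illustrative; :my.app/deserialize-message is a hypothetical fn that turns the raw bytes into a segment map):

{:onyx/name :read-messages
 :onyx/plugin :onyx.plugin.kafka/read-messages
 :onyx/type :input
 :onyx/medium :kafka
 :kafka/topic "my-topic"
 :kafka/group-id "onyx-consumer"
 :kafka/zookeeper "127.0.0.1:2181"
 :kafka/offset-reset :smallest
 :kafka/deserializer-fn :my.app/deserialize-message
 :onyx/batch-size 100
 :onyx/max-peers 1
 :onyx/doc "Reads segments from a Kafka topic"}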

aaelony20:08:01

@camechis: thanks for sharing. I'll either do the same or see how far I get with franzy

Travis20:08:48

yep, we are using franzy as well