#onyx
2016-02-12
robert-stuttaford09:02:37

hey lucas :) another question about lifecycle error handling. we submit to yeller in a task that receives stuff from exception? true flow conditions. should we submit to yeller from the lifecycle one too?

lucasbradstreet09:02:58

@robert-stuttaford: I think in the lifecycle case it at least logs to timbre error, so should be picked up by yeller

lucasbradstreet09:02:06

But you may want to just in case?

robert-stuttaford09:02:50

i never got the yeller timbre appender working

robert-stuttaford09:02:56

so, i’m going to add it myself

robert-stuttaford09:02:24

the fn in lifecycle can side effect and then return one of those 3 kws right? :restart :kill :and-the-other-one

lucasbradstreet09:02:02

Hmm, if you never got the yeller timbre appender working you could be missing out on a lot of onyx exceptions that you want to know about

lucasbradstreet09:02:34

I think you may want to just fix that instead

robert-stuttaford09:02:02

event in the handler is the segment, right?

lucasbradstreet09:02:10

it’s the current event context (basically has everything the whole task needs)
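To make the shape of such a handler concrete, here is a minimal sketch in Python (Onyx itself is Clojure, so every name below is a hypothetical stand-in, not the Onyx API). It assumes the handler receives the event context rather than a segment, may side effect, and then returns one of three actions. Strings stand in for the keywords; to the best of my knowledge the third one, not named above, is :defer.

```python
# Stand-ins for the three keywords a handler may return (names assumed).
RESTART, KILL, DEFER = "restart", "kill", "defer"

reported = []  # stands in for an error service such as Yeller


def handle_exception(event, lifecycle, phase, exc):
    """Side effect first, then tell the runtime what to do with the job."""
    # `event` is the whole task context, not a single segment.
    reported.append((event["task"], phase, type(exc).__name__))
    # A dropped connection is usually transient, so ask for a restart.
    if isinstance(exc, ConnectionError):
        return RESTART
    return KILL


event = {"task": "read-log"}
action = handle_exception(event, {}, "before-batch", ConnectionError("db down"))
other = handle_exception(event, {}, "before-batch", ValueError("bad input"))
```

The point of the sketch is only the contract: the exception may come from anywhere in the task, so the handler decides from the context and the exception, not from a segment.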

robert-stuttaford09:02:50

ok. our segments each have the full system config in it, which i need to reach into to know whether to yell or not - how would i get to it?

lucasbradstreet09:02:21

why do you need the segments? the exception could’ve been caused by anything - e.g. loss of connection in the input plugin

robert-stuttaford09:02:05

to get the system config to read the yeller on/off flag

robert-stuttaford09:02:16

ah heck let me try the timbre thing again

lucasbradstreet09:02:04

Ah seems a bit weird for that to be in the segments. I think you should inject that into the event map

lucasbradstreet09:02:41

Happy to help with the timbre appender

robert-stuttaford09:02:14

how would individual task fns read the event map?

robert-stuttaford09:02:31

they only receive the segment

robert-stuttaford10:02:33

i thought about putting it in the event map as well

lucasbradstreet10:02:01

Yeah, that’s the way to go
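A rough Python sketch of that idea, under the assumption that a hook runs once before the task starts and that the runtime can pass injected values to the task function ahead of the segment. The names `inject_config`, `enrich` and the `system-config` key are hypothetical and are not Onyx identifiers.

```python
from functools import partial


def inject_config(event, system_config):
    """Hook run once before the task starts; returns an enriched event map."""
    return {**event, "system-config": system_config}


def enrich(config, segment):
    """Task function: the injected config comes first, then the segment."""
    return {**segment, "env": config["env"]}


event = inject_config({"task": "enrich"}, {"env": "prod", "yeller-enabled?": True})
# The runtime would bind the injected value; partial() imitates that here.
task_fn = partial(enrich, event["system-config"])
out = task_fn({"id": 1})
```

This keeps the config out of every segment while task functions still see it as an ordinary extra parameter.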

robert-stuttaford10:02:34

pity. means we can’t use defnk for task fns :)

robert-stuttaford10:02:47

oh well. faster code is better than easier to read code, at scale

lucasbradstreet10:02:19

why not? defnk supports multiple params no?

robert-stuttaford10:02:03

hmm. maybe they added it since i last checked

robert-stuttaford10:02:14

just confirming that your fix is in 0.8.9?

lucasbradstreet10:02:01

fix is in forthcoming 0.8.10

lucasbradstreet10:02:14

I’m just running a couple more jepsen runs and will release it

robert-stuttaford10:02:49

ah. that will explain why i’m still struggling :)

lucasbradstreet10:02:19

Oh, which fix do you mean?

robert-stuttaford10:02:01

the one you fixed in onyx-datomic/read-log

lucasbradstreet10:02:44

Ah yes. Will be releasing that with the “can’t rejoin onyx/id” issue

lucasbradstreet10:02:16

btw, I just tested out custom logging here

robert-stuttaford10:02:17

ok. phew. then i don’t have to worry

lucasbradstreet10:02:35

See the standard-out appender which has its own standard-out-logger output fn

lucasbradstreet10:02:53

If you don’t want to use yeller-timbre-appender, you can just put the timbre call in your custom fn

robert-stuttaford10:02:03

into standard-out-logger ?

robert-stuttaford10:02:41

let me try the yeller-timbre-appender again

robert-stuttaford10:02:47

because it’ll only yell on errors, right?

robert-stuttaford10:02:16

i’ve just searched through the likely onyx projects to use yeller and couldn’t find any examples - sorry, but could you link me to one, please?

lucasbradstreet10:02:29

I don’t have any examples

lucasbradstreet10:02:47

But yes, just rename standard-out-logger to something else

lucasbradstreet10:02:05

and then modify the function it calls to send to yeller

lucasbradstreet10:02:15

and also change the min-level to whatever you want it to yell on
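The same pattern, an appender with its own output function and a minimum level, can be sketched with Python's standard `logging` module. This is an analogy only, not timbre or Yeller code; `ReportingHandler` and the `sent` list are made-up names standing in for a call to the error service.

```python
import logging


class ReportingHandler(logging.Handler):
    """Collects formatted records; a real one would send them to an error service."""

    def __init__(self, level=logging.ERROR):
        super().__init__(level=level)
        self.sent = []

    def emit(self, record):
        # Replace this append with the call to the error service.
        self.sent.append(self.format(record))


log = logging.getLogger("onyx-sketch")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the sketch quiet on the root logger
handler = ReportingHandler(level=logging.ERROR)  # the "min-level"
log.addHandler(handler)

log.info("peer started")       # below the handler's level, ignored
log.error("lost connection")   # reaches the handler
```

Raising or lowering the handler level changes what gets reported without touching the code that logs.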

robert-stuttaford10:02:41

cool. i’m going to try both approaches. thank you Lucas!

lucasbradstreet10:02:57

Will have 0.8.10 for you in 30 minutes or so

robert-stuttaford10:02:28

wonderful. take your time. thank you!

lucasbradstreet10:02:22

the jepsen test I created randomly kill -9s a random subset of nodes’ onyx jars

lucasbradstreet10:02:25

and starts them again

lucasbradstreet10:02:34

it managed to find your issue pretty quickly

lucasbradstreet10:02:48

our initial fix was bad too

lucasbradstreet10:02:57

Just waiting on a couple more runs and we’ll be good :)

robert-stuttaford10:02:36

i’m about to test the error handling by killing datomic mid-run :)

robert-stuttaford14:02:26

@lucasbradstreet: should i try with 0.8.10-alpha2?

lucasbradstreet14:02:03

Yep, that should be identical to 0.8.10

lucasbradstreet14:02:09

I’m just waiting for CI to finish on alpha2

robert-stuttaford15:02:31

ok. i’m out of brains. we’ll do this on monday :)

lucasbradstreet15:02:02

Alrighty. 0.8.10 is releasing now and will be ready for you on Monday :)

joshg15:02:48

I’m looking into using Onyx to do notification digests and trying to understand windowing better. Would it be feasible to have 10–20k concurrent session windows? Would each require a virtual peer?

michaeldrogalis15:02:46

@joshg: I think it depends on your cluster topology. Windows are durably persisted to BookKeeper for fault tolerance, but they are maintained in memory between sync calls. The number of windows doesn't map to the number of peers. Peers can have arbitrarily many windows depending on which task they're working on.

michaeldrogalis15:02:38

You could have 1 peer working on, say, 1 task that does notification handling, which has 20,000 windows. That has to fit in memory for the machine. Does that help? :)
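As a rough illustration of why many windows can live on one peer, here is a small Python sketch of session windowing. It assumes a session ends after a fixed inactivity gap and that windows are plain in-memory lists per key; `sessionize` and the sample events are hypothetical, and this is not how Onyx implements it.

```python
def sessionize(events, gap):
    """Group (key, timestamp) events into per-key sessions split by inactivity gaps."""
    sessions = {}
    for key, ts in sorted(events, key=lambda e: e[1]):
        windows = sessions.setdefault(key, [])
        # Extend the latest window if the event falls within the gap.
        if windows and ts - windows[-1][-1] <= gap:
            windows[-1].append(ts)
        else:
            windows.append([ts])
    return sessions


events = [("ann", 0), ("bob", 5), ("ann", 20), ("ann", 100), ("bob", 300)]
result = sessionize(events, gap=60)
open_windows = sum(len(w) for w in result.values())
```

All windows sit in one dictionary owned by one worker, which is why the total has to fit in that machine's memory.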

joshg15:02:59

Yes, that’s exactly what I needed to know. Thanks! So if they didn’t fit in memory on a single machine, I would have to partition them into separate tasks, which could get allocated to other cluster members.

michaeldrogalis15:02:46

Stick with the number of tasks that makes sense for your domain. Add more peers. You can have multiple peers working on the same task. Onyx knows how to partition work across peers sensibly.

michaeldrogalis15:02:37

e.g. If you had 3 tasks and 9 peers, Onyx's Balanced task scheduler (the one most people use) will allocate 3 peers to task A, 3 to task B, and 3 to task C. You can use a Percentage-based task scheduler, too.
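The arithmetic behind that example can be sketched in a few lines of Python. This is an even split with leftovers going to the first tasks, which is an assumption for illustration; Onyx's actual scheduler may break ties and honour per-task limits differently.

```python
def balanced_allocation(n_peers, tasks):
    """Spread peers as evenly as possible; earlier tasks absorb any remainder."""
    base, extra = divmod(n_peers, len(tasks))
    return {task: base + (1 if i < extra else 0) for i, task in enumerate(tasks)}


nine = balanced_allocation(9, ["A", "B", "C"])
ten = balanced_allocation(10, ["A", "B", "C"])
```

With 9 peers each task gets 3; with 10, one task gets the extra peer.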

joshg15:02:11

Oh, neat. And there’s no separate deployment requirement for BookKeeper?

michaeldrogalis15:02:21

We'd recommend running BookKeeper in standalone mode in production for performance reasons, but for development/general playing around you can run it in embedded mode - which is what you're probably doing now. Check your peer config - there's a line in there enabling it to run directly on the peer.

michaeldrogalis15:02:32

It's a nice convenience.

joshg16:02:18

Thanks for clarifying how windowing works. It’s a really useful feature!

michaeldrogalis16:02:20

@joshg: Sure! Yeah it's a wonderful idea. The DataFlow paper came out at just the right time. A lot of our API comes from there.

joshg16:02:53

Out of curiosity, what are the advantages of using BookKeeper over Kafka for a replicated log?

michaeldrogalis16:02:57

@joshg: That's a good question. BookKeeper behaves a little differently than Kafka, even though they look very similar. BK's abstraction only lets a single process write to a ledger, what Kafka calls a topic (sort of). When that process stops writing, no one else is allowed to write to the ledger. That helps with a lot of potential concurrent writer problems. Kafka also doesn't have a transactional writer, which is what is holding Samza back from having exactly-once semantics.
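A toy Python model of the single-writer rule described here: one owner may append, and once the ledger is closed nobody can. The `Ledger` class is invented for illustration and is not the BookKeeper client API.

```python
class Ledger:
    """Append-only log with exactly one writer; sealed for good once closed."""

    def __init__(self, writer):
        self.writer = writer
        self.entries = []
        self.closed = False

    def append(self, writer, entry):
        if self.closed:
            raise PermissionError("ledger is closed")
        if writer != self.writer:
            raise PermissionError("only the owning writer may append")
        self.entries.append(entry)
        return len(self.entries) - 1  # entry id

    def close(self):
        self.closed = True


ledger = Ledger(writer="peer-1")
first_id = ledger.append("peer-1", {"window": 42})
```

Because a second writer is rejected outright, readers never have to reconcile interleaved writes from competing processes.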

joshg16:02:12

That makes sense. I know that they’re working on transactional messaging for Kafka, but I’m not sure how close it is to being done.

michaeldrogalis16:02:53

I'm pretty excited for it to come out. Lots of great use cases for it :)

lucasbradstreet17:02:35

@joshg: we’d like a Kafka state log implementation, but until they support transactional messaging it’s not really a priority for us

joshg17:02:18

The reason I asked is because we’re already running Kafka, but I understand the transactional requirement.