2017-03-02
Channels
- # aws-lambda (1)
- # beginners (28)
- # boot (54)
- # cider (11)
- # clara (28)
- # cljs-dev (74)
- # cljsrn (13)
- # clojure (342)
- # clojure-austin (3)
- # clojure-dusseldorf (4)
- # clojure-france (2)
- # clojure-greece (11)
- # clojure-italy (42)
- # clojure-poland (7)
- # clojure-russia (11)
- # clojure-spec (44)
- # clojure-uk (156)
- # clojure-ukraine (4)
- # clojurescript (102)
- # cursive (17)
- # datascript (19)
- # datomic (17)
- # dirac (39)
- # emacs (22)
- # funcool (56)
- # hoplon (25)
- # jobs (3)
- # jobs-discuss (31)
- # leiningen (2)
- # luminus (4)
- # lumo (3)
- # off-topic (47)
- # om (51)
- # onyx (57)
- # re-frame (13)
- # reagent (57)
- # remote-jobs (15)
- # ring (9)
- # ring-swagger (7)
- # robots (2)
- # rum (6)
- # specter (16)
- # sql (7)
- # test-check (37)
- # untangled (7)
- # yada (5)
@lucasbradstreet I'm using onyx-datomic 0.10.0-beta5 log-reader ... the ABS "no backoff in log reader" seems to be crashing my topology ... is this something that could have a quick solution, or do I need to stay on 0.9.15? I have a bifurcated onyx app right now and I'd really like to move completely onto 0.10.0-x
@michaeldrogalis also if you have any input on this ^ ?
@hunter Sorry, could I get a little more context?
so I have several onyx topologies which use onyx.plugin.datomic/read-log to read my Datomic transaction log ... I have 0.9.15 in production and it works great ...
I am trying to move to 0.10.0-beta5 to use the newest onyx-kafka and some windowing features
but currently 0.10.0-beta5 onyx-datomic has this warning https://github.com/onyx-platform/onyx-datomic#abs-issues
specifically "checkpointing for log reader can't be global / savepoints" seems to be causing the checkpointing to lose track of its place in the tx-log immediately
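(For context, the kind of read-log catalog entry under discussion looks roughly like the sketch below. The key names follow the onyx-datomic README as I recall it, and the task name, URI, and checkpoint key are placeholder assumptions, not values from this conversation.)

```clojure
;; Rough sketch of an onyx.plugin.datomic/read-log catalog entry.
;; The Datomic URI, task name, and :checkpoint/key value are made up
;; for illustration; check the onyx-datomic README for your version.
{:onyx/name :read-tx-log
 :onyx/plugin :onyx.plugin.datomic/read-log
 :onyx/type :input
 :onyx/medium :datomic
 :datomic/uri "datomic:dev://localhost:4334/my-db" ; placeholder
 :checkpoint/key "read-tx-log"                     ; placeholder
 :checkpoint/force-reset? false
 :onyx/max-peers 1
 :onyx/batch-size 20
 :onyx/doc "Reads the Datomic transaction log"}
```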
The work to fix the latter of the issues should be done. I can’t patch it today.
I’ll see if we can get another beta of this out tomorrow.
@hunter No prob.
@hunter are you saying that you can’t use :checkpoint/key any more as it’s crashing the job?
oh, sorry, I mixed the two discussions up.
Thought they were both about onyx-datomic
Could you pastebin the exception that you’re seeing? I would think that the lack of backoff wouldn’t crash the job.
it's that the tx-log stops being tracked by the read-log plugin after a couple of segments
Ok thanks. I'll look into it. That's not a known issue
@lucasbradstreet thanks, I'll generate fresh logs in a little while and get them to you.
Loving the Async Barrier Snapshotting, just one question. When designing a job am I to think of $NPEERS + 1 additional peer for the ABS task?
you mean to cover the coordinator? If so, there’s no need. The coordinator is very lightweight and piggybacks on a regular peer (albeit in an extra thread)
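(A rough way to reason about it, then: the peer count a job needs is just the sum of what you allocate per task, with nothing extra for the coordinator. A minimal sketch, assuming every catalog entry sets :onyx/n-peers explicitly; required-peers is a made-up helper, not an Onyx API:)

```clojure
;; Minimal sketch: the peers a job needs is the sum over its tasks.
;; Assumes each catalog entry declares :onyx/n-peers explicitly; the
;; coordinator piggybacks on one of these peers, so no +1 is needed.
(defn required-peers [catalog]
  (reduce + (map :onyx/n-peers catalog)))

(required-peers
 [{:onyx/name :in        :onyx/n-peers 1}
  {:onyx/name :transform :onyx/n-peers 2}
  {:onyx/name :out       :onyx/n-peers 1}])
;; => 4
```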
Where are the checkpoint messages being generated from?
Checkpointed input #uuid "6ae670af-0e7b-1f0b-271d-d6ad8f1ed878" 6 29 :in 0 :input
Checkpointed output #uuid "6ae670af-0e7b-1f0b-271d-d6ad8f1ed878" 6 28 :out 0 :input
what do the numbers mean past the uuid?
cluster replica version for the allocation = 6 (this ensures that all the peers think they’re doing the same thing), barrier epoch = 28 (resets to 1 on a new job replica version, increases on each snapshot)
0 = the slot the peer is on. If you have 10 peers on that task there will be 10 slots. Slots are used to ensure consistent hashing in the group bys
the combination of the replica version and the epoch forms a vector clock of sorts. When you restore from a resume point and want to restore the latest checkpoint, you would find the checkpoint with the largest replica-version, and find the checkpoint with the largest epoch for that replica-version.
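(A minimal sketch of that selection rule, treating checkpoints as plain maps of :replica-version and :epoch; the map shape here is an assumption for illustration, not the actual checkpoint storage format:)

```clojure
;; Latest-checkpoint selection: highest replica-version wins, with ties
;; broken by highest epoch. Checkpoints are modelled as plain maps here.
(defn latest-checkpoint [checkpoints]
  (last (sort-by (juxt :replica-version :epoch) checkpoints)))

(latest-checkpoint
 [{:replica-version 5 :epoch 40}
  {:replica-version 6 :epoch 28}
  {:replica-version 6 :epoch 29}])
;; => {:replica-version 6, :epoch 29}
```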
@hunter I think I’ve narrowed down the issue. what batch size are you using?
Also thinking, with large volumes within Docker, is it worth mapping /dev/shm to a point on the instance outside of the container?
If I understand you correctly, you mean some shared memory space that could be shared by the containers on that node?
I believe we're doing something like that, with the media driver being in its own container
If you're using SSDs and you don't have much memory, you can also choose to put your log buffers on disk, though there will be a performance hit
@lucasbradstreet batch-size 1
In prod we’re running separate containers for the media driver and peer and mapping a memory volume to /dev/shm
On both containers
@hunter OK, I’m pretty certain I know what’s going on. Previously we would try to read the log without setting an end-tx, but now we’re setting the log to read from (last-tx, last-tx+batch-size). Unfortunately datomic doesn’t always increase the tx we can read by 1 (I’ve always wondered what was going on there), so now it is trying to read from (tx, tx+1) but there is never a tx+1 and it doesn’t advance.
@hunter can you try increasing the batch-size to 20 and tell me if it starts working?
Oh, I think that is because transaction-ids are like any other eid
Damn. That makes it hard
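(To illustrate why batch-size 1 gets stuck: Datomic's d/tx-range end is exclusive, and successive transaction t values aren't guaranteed to be contiguous, so a (tx, tx+1) window can be empty even though later transactions exist. A hedged sketch, with the connection URI and checkpointed t value as placeholders:)

```clojure
;; Why a one-transaction window can come back empty: tx-range's end is
;; exclusive and t values can skip numbers. URI and last-t are made up.
(require '[datomic.api :as d])

(def conn (d/connect "datomic:dev://localhost:4334/my-db")) ; placeholder

(let [log    (d/log conn)
      last-t 1000 ; pretend this is the checkpointed t
      batch  1]
  ;; With batch 1 this asks for [1001, 1002); if the next transaction
  ;; landed at t = 1003, this is nil and the reader never advances.
  (seq (d/tx-range log (inc last-t) (+ (inc last-t) batch))))
```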
@jasonbell I don’t think it’s written down anywhere but it’s quite simple. Build lib-onyx and then create a container that uses this namespace as the jar entrypoint.
https://github.com/onyx-platform/lib-onyx/blob/master/src/lib_onyx/media_driver.clj
Then mount /dev/shm on both containers.
I’m not sure how that’s done in Mesos but I can help if you’re running on Kubernetes
@hunter actually, depending on how many entities you’re transacting in each tx, you may need an even bigger number
@gardnervickers thanks for the offer, not sure which way I'm going to turn yet. It depends on a few factors.
@jasonbell please let me know what you end up deciding with shm (especially how big it ended up being). It’s good feedback.
@lucasbradstreet will do, to be honest by moving up to 0.10 we removed the need for an external heartbeat server with Yada wrapped in Component, so we've freed up overhead there. I'll have a better idea tomorrow. Oh, and I put tighter control on the peer/partition so the throughput is more controlled. Once I get some data I'll let you know.
Great. There’s no rush 🙂