#onyx
2016-05-02
michaeldrogalis14:05:01

@drewverlee: That diagram represents a virtual peer, technically - or the lowest level worker.

michaeldrogalis14:05:34

I often use peer and virtual peer interchangeably because they should be fairly transparent, but more correctly, 1 peer *has many* virtual peers.

michaeldrogalis14:05:58

It's supposed to emulate the idea of virtual sharding, but how well I actually did that is certainly up for debate. :simple_smile:
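
(For illustration, a minimal sketch of that relationship, assuming a peer-config map already exists; the count of 8 is arbitrary:)

(require '[onyx.api])

;; One physical peer process hosts a peer group...
(def peer-group (onyx.api/start-peer-group peer-config))

;; ...which in turn runs many virtual peers, e.g. one per core
;; available for task execution.
(def v-peers (onyx.api/start-peers 8 peer-group))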

Drew Verlee15:05:07

I probably have several fundamental things wrong in my mental model. Is it ever the case that a job (what I think of as a function, like (do-something segment)) is shared across multiple machines? Or can you only scale up to handle a larger job?

lucasbradstreet15:05:35

@drewverlee: any task, aside from some plugin exceptions, can run on more than one peer at a time. A peer has one task thread that it'll run an :onyx/fn on, allowing you to scale out as you add cores (whether on one machine or more than one machine)
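
(For instance, a catalog entry can bound that parallelism — a hedged sketch; the task name and :my.app/do-something function are placeholders:)

{:onyx/name :process-segment
 :onyx/fn :my.app/do-something  ; hypothetical function var
 :onyx/type :function
 :onyx/max-peers 4              ; at most 4 virtual peers run this task
 :onyx/batch-size 20}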

Drew Verlee18:05:50

@gardnervickers: would it be fair to think of a job as a topology? I’m forming this intuition that 1 job = 1 topology

(let [job {:workflow c/workflow
           :catalog catalog
           :lifecycles lifecycles
           :windows c/windows
           :triggers c/triggers
           :task-scheduler :onyx.task-scheduler/balanced}
      job-id (:job-id (onyx.api/submit-job peer-config job))]
  ...)
However, this quote — "Within that ring each peer tries to claim and run jobs" — indicates a ring of peers could be working on one or more separate topologies, which would seem unnecessary.

gardnervickers18:05:39

You may have 30 peers running, and 2 jobs that only require 5 peers each.

gardnervickers18:05:03

Then it’s up to the scheduler you choose as to how those jobs are distributed
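
(That choice lives in the peer config — a minimal sketch; the ZooKeeper address and onyx-id are placeholders:)

(def peer-config
  {:zookeeper/address "127.0.0.1:2188"  ; placeholder address
   :onyx/id onyx-id                     ; placeholder cluster id
   :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
   :onyx.messaging/impl :aeron
   :onyx.messaging/bind-addr "localhost"})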

gardnervickers18:05:57

Peers are like “workers”: all of the workers at an organization need to coordinate, clock in and clock out, and communicate with each other, even though they might be working on separate tasks.

gardnervickers18:05:52

In the ring example, you can think of it like each worker making sure another worker is still at their table doing what they are supposed to do.

gardnervickers18:05:29

Not sure what you mean by topology, I’d have to see where you’re getting that word from.

ckarlsen20:05:09

michaeldrogalis: Hi, I'm testing out the Kafka 0.9 plugin. Stumbled upon one issue so far: it's not possible to resume from the latest offset between job instances, because the task-id is used as part of the checkpoint key

michaeldrogalis20:05:03

Ahhh yuck. Thanks for finding that one @ckarlsen.

michaeldrogalis20:05:23

I'll get that patched up tonight/tomorrow. It still needs formal review from others. Did you run into anything else?

lucasbradstreet20:05:34

Oh right, I’ll review it tomorrow, that was the other thing I needed to get on to

lucasbradstreet20:05:10

@michaeldrogalis: what I did with Datomic’s read-log there was to allow a custom checkpoint key to be supplied. I assume that’d be your fix anyway
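
(A hedged sketch of that approach, based on onyx-datomic's read-log task; the URI and key are placeholders. Supplying a stable :checkpoint/key lets successive job instances share one checkpoint instead of keying it by task-id:)

{:onyx/name :read-log
 :onyx/plugin :onyx.plugin.datomic/read-log
 :onyx/type :input
 :onyx/medium :datomic
 :datomic/uri db-uri                   ; placeholder connection URI
 :checkpoint/key "my-job-checkpoint"   ; assumed stable, user-chosen key
 :onyx/batch-size 20}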

ckarlsen20:05:50

I'll let you know if anything else comes up

michaeldrogalis20:05:08

Cool, yup. I must have backed over that change.

bridget22:05:32

I was able to remember where I was with the onyx-ruby project and make all the outstanding changes I had left on it in one sitting. Leaving a note for myself FTW.

bridget22:05:50

"... and that's why You Always Leave a Note."

bridget22:05:31

I'll take one more look over it tomorrow with fresh eyes, then get it moved to the onyx-platform organization