#onyx
2016-04-20
mccraigmccraig10:04:55

just upgrading to 0.9... i see that :onyx/restart-pred-fn is deprecated in favour of :lifecycle/handle-exception - one question - i had a (Thread/sleep 5000) in my old :onyx/restart-pred-fn to prevent the thrashing that otherwise happens when kafka brokers get restarted - is that still a reasonable thing to do ?

acron14:04:57

Another curiosity, thinking more about peers in a cluster - what exactly is pushed between peers? What do peers need to share in order to confidently participate in a job?

acron14:04:29

I've had an idea about dynamically creating workflows and catalogs and submitting them as jobs - would these be able to scale?

acron14:04:02

Or does each peer need to have a catalog or workflow itself before participating?

gardnervickers14:04:08

Peers communicate through the zookeeper log, not directly. They race to try and participate in a job using zookeeper. The entire job datastructure is stored in zookeeper and downloaded by the peers.

gardnervickers14:04:27

onyx.api/submit-job just writes to zookeeper, it does not talk to the peers directly.

tcoupland14:04:53

but, in order to run a 'novel' workflow you'd need to deploy another jar?

gardnervickers14:04:39

Any time you want to change code you need to redeploy.

tcoupland14:04:53

does code include the workflow in that sentence?

gardnervickers14:04:38

That’s just pure data, you can manipulate and re-submit the job/workflow

acron14:04:04

So we could create novel workflows and catalogs so long as the catalogs reference functions that can be resolved by the peer?

gardnervickers14:04:33

This leads to a desire to keep your functions somewhat general and parameterize through the catalog.

tcoupland14:04:03

it was just getting a little confusing when reading around the adding peer docs

gardnervickers14:04:12

i.e. instead of a catalog entry that’s :onyx/name :inc-5 you would have

{:onyx/name :adder
 :onyx/params [5]}
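
For example, a fuller parameterized entry and its function might look something like the sketch below (following the Onyx parameterization docs, where :onyx/params lists keys of the catalog entry whose values are passed to the task function before the segment; the :my/n key, namespace, and adder function here are illustrative):

;; Catalog entry: the value of :my/n is prepended to the fn's arguments
{:onyx/name :adder
 :onyx/fn :my.app.functions/adder
 :onyx/type :function
 :onyx/batch-size 20
 :my/n 5
 :onyx/params [:my/n]}

;; Task function: parameters arrive first, then the segment
(defn adder [n segment]
  (update segment :value + n))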

tcoupland14:04:50

cool, that's how i thought it worked before i confused myself 🙂

gardnervickers14:04:20

This snippet might be helpful, this is how onyx turns a keyword reference to a function into a function https://github.com/onyx-platform/onyx/blob/686ca2a8bb1fed8cd3a6dbe63d75923460f46888/src/onyx/peer/operation.clj#L27-L34

gardnervickers14:04:43

That applies not only to catalog functions, but lifecycle functions, trigger :sync functions, etc…
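
Roughly, the idea of that helper is to split the namespaced keyword into a namespace symbol and a function symbol, then resolve the var. A simplified sketch along those lines (a paraphrase of the idea; the linked source is the real implementation):

;; Simplified sketch: resolve a namespaced keyword like :my.ns/my-fn to a var
(defn kw->fn [kw]
  (let [user-ns (symbol (namespace kw))
        user-fn (symbol (name kw))]
    (or (ns-resolve user-ns user-fn)
        (throw (ex-info "Could not resolve function on the classpath" {:fn kw})))))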

tcoupland14:04:48

the old ns-resolve 🙂

gardnervickers14:04:59

as long as it can be resolved by that, you don't need to redeploy your jar

tcoupland14:04:13

so, the peers essentially just contain a bunch of functions

tcoupland14:04:26

then you submit a 'job' explaining them and how to wire them together

tcoupland14:04:30

for a given purpose

tcoupland14:04:25

i think it got weird when i started seeing that the catalog was resubmitted each time

tcoupland14:04:34

seemed like that info would just live on the peer

tcoupland14:04:49

but, the parametrization is the clue

acron14:04:08

Slight tangent, is there a preferable way to submit a job to a cluster? Just contact one peer and submit-job?

gardnervickers14:04:31

submit-job just talks to zookeeper, you don't need to run it on a peer.

gardnervickers14:04:48

It just needs the job datastructure

acron14:04:48

Of course

gardnervickers14:04:04

and uses the env-config to resolve zookeeper
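
That is, a sketch of all the submitting process needs might look like this (keys roughly as in the 0.9 template projects; the cluster id and addresses are illustrative):

;; Illustrative peer-config: it only has to point at the same zookeeper
;; and cluster id that the peers were started with.
(def peer-config
  {:onyx/id "my-cluster-id"
   :zookeeper/address "zk1.example.com:2181"
   :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
   :onyx.messaging/impl :aeron
   :onyx.messaging/bind-addr "localhost"})

(onyx.api/submit-job peer-config job)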

gardnervickers14:04:48

@tcoupland: I’m not sure what you’re talking about, where were you seeing the catalog being resubmitted each time?

tcoupland14:04:12

in the little starter project, the catalog is submitted with the workflow:

tcoupland14:04:26

(let [job {:workflow looped-flow
           :catalog dev-catalog
           :lifecycles dev-lifecycles
           :flow-conditions flow-conditions
           :task-scheduler :onyx.task-scheduler/balanced}]
  (onyx.api/submit-job peer-config job))

lucasbradstreet14:04:40

@mccraigmccraig: yeah, it should work pretty much identically, so you can do what you did before
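
For reference, the sleep-then-restart approach might look something like this with the new hook (a sketch: the namespace, task name, and interval are illustrative; per the 0.9 lifecycle docs the handler receives the event map, the lifecycle, the lifecycle name, and the exception, and returning :restart restarts the task):

(ns my.app.lifecycles)   ; illustrative namespace

;; Same idea as the old :onyx/restart-pred-fn + sleep: back off briefly
;; so a restarting kafka broker isn't hammered with reconnect attempts.
(defn handle-kafka-exception [event lifecycle lifecycle-name e]
  (Thread/sleep 5000)
  :restart)

(def calls
  {:lifecycle/handle-exception handle-kafka-exception})

;; Lifecycle entry added to the job's :lifecycles vector
{:lifecycle/task :read-messages               ; illustrative task name
 :lifecycle/calls :my.app.lifecycles/calls}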

gardnervickers14:04:11

@tcoupland: Oh I see, that’s a common pattern among Onyx users. You have two entrypoints to your Uberjar, one that starts up the peer and one that submits the job. That makes it so you don’t have to do anything extra during deployment except start your jar with a different entrypoint.
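
A rough sketch of that two-entrypoint pattern (namespace, config path, command names, and job contents are illustrative placeholders):

(ns my.app.core            ; illustrative namespace
  (:require [clojure.edn :as edn]
            [onyx.api])
  (:gen-class))

(def peer-config
  (edn/read-string (slurp "config/peer-config.edn")))   ; illustrative path

(def job
  {:workflow []        ; filled in as in the starter project
   :catalog []
   :lifecycles []
   :task-scheduler :onyx.task-scheduler/balanced})

(defn -main [cmd & args]
  (case cmd
    ;; e.g. `java -jar app.jar peers 8` on each node
    "peers"  (let [n          (Integer/parseInt (first args))
                   peer-group (onyx.api/start-peer-group peer-config)]
               (onyx.api/start-peers n peer-group))
    ;; e.g. `java -jar app.jar submit` from anywhere that reaches zookeeper
    "submit" (onyx.api/submit-job peer-config job)))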

tcoupland14:04:27

@gardnervickers: i think we're sorted now. Thanks, it was a silly question really, there's no way it would have been made the other way, but sometimes you've just got to know 🙂

acron14:04:41

thanks for your help @gardnervickers

gardnervickers14:04:17

No problem! It’s totally not a silly question, Onyx has a different architecture than most are used to.

michaeldrogalis18:04:30

"so, the peers essentially just contain a bunch of functions .. then you submit 'job' explaining them and how to wired them together .. for a given purpose" - @tcoupland This is a great simple explanation.

tcoupland19:04:03

@michaeldrogalis: thanks! feel free to use it 🙂