#onyx
2017-04-21
georgek00:04:42

Thanks! All of your collective encouragement and assistance has been pivotal in getting this to happen - not to mention the elegant work y’all have done! 🙂

michaeldrogalis01:04:39

Go team Clojure 😄

lmergen10:04:12

when decoupling your virtual peers from job submission, what are the best practices for where to put my job’s code ?

lmergen10:04:40

as in, i know that in onyx jobs are technically data, but you do need to provide an :fn, that’s in the classpath of the peers

lmergen10:04:47

so the peers cannot be designed in a job-agnostic way (or otherwise, you would need to upload a .jar and load that on the fly)

lmergen10:04:50

am i correct ?

mccraigmccraig10:04:36

we just build an uberjar with all jobs' code in and run peers from that
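
Editor's sketch of the point above: a job is data, but the function it names has to be loadable on the JVM that runs the peers. The `:onyx/...` keys follow the catalog entry format in the Onyx documentation; the namespace `example.tasks`, the function `my-inc`, and the helper `resolve-task-fn` are hypothetical, and the lookup only illustrates the idea rather than reproducing how Onyx resolves functions internally.

```clojure
(ns example.tasks)

;; Hypothetical task function. It has to be compiled into the jar the peers run.
(defn my-inc [segment]
  (update segment :n inc))

;; The catalog entry is plain data: it names the function by keyword only.
(def catalog-entry
  {:onyx/name :inc
   :onyx/fn :example.tasks/my-inc
   :onyx/type :function
   :onyx/batch-size 10})

;; Illustrative lookup (an assumption, not Onyx's implementation):
;; turn the keyword into a var, or nil if the namespace is not loaded here.
(defn resolve-task-fn [kw]
  (some-> (find-ns (symbol (namespace kw)))
          (ns-resolve (symbol (name kw)))))
```

On a JVM where `example.tasks` is loaded, `(resolve-task-fn (:onyx/fn catalog-entry))` yields the var for `my-inc`; on a peer built without that namespace the same keyword resolves to nil, which is why the job code ships in the peers' jar.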

lmergen10:04:04

but that does require all the peers to be restarted when you change your jobs, right ?

lmergen10:04:37

that doesn’t seem like a very clojure-esque way of coding

lmergen10:04:55

then again, it’s probably the most pragmatic

lmergen10:04:58

and the simplest 🙂

mccraigmccraig10:04:26

yes, all peers get restarted when i change job code... but that's not much different from all my api instances getting restarted when i change code there

mccraigmccraig10:04:50

i guess if i was running jobs from multiple projects on one onyx cluster it would be different

lmergen10:04:44

i’m now thoroughly considering keeping this whole thing self-contained

lmergen10:04:04

as in, make one uberjar that contains both the job code and the ability to run the peers

lmergen10:04:12

is that what you’re doing as well @mccraigmccraig ?

mccraigmccraig10:04:46

yes @lmergen - i have one uberjar with different entrypoints to run peers and manage a job
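
Editor's sketch of the "one uberjar, different entrypoints" setup described above, assuming a single `-main` that dispatches on its first argument. The namespace `example.main`, the command names, and the placeholder functions `start-peers!` and `submit-job!` are all hypothetical; in a real project they would call into Onyx (for example `onyx.api/submit-job`).

```clojure
(ns example.main
  (:gen-class))

;; Placeholders standing in for the real Onyx calls (assumption for illustration).
(defn start-peers! [n] {:action :start-peers :n-peers n})
(defn submit-job! [job-name] {:action :submit-job :job job-name})

;; Pick the behaviour from the first command-line argument.
(defn dispatch [[cmd arg]]
  (case cmd
    "start-peers" (start-peers! (Long/parseLong arg))
    "submit-job"  (submit-job! (keyword arg))
    {:action :usage
     :message "usage: start-peers <n> | submit-job <name>"}))

(defn -main [& args]
  (println (dispatch args)))
```

The same jar can then be started once per role, for instance with the arguments `start-peers 4` on the peer machines and `submit-job my-job` from wherever jobs are managed.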

lmergen10:04:53

ok, that makes the most sense indeed

lmergen10:04:16

i guess i’ve been tainted by too much hadoop to think that this decoupling was necessary

mccraigmccraig10:04:42

perhaps onyx will do job-jar distribution too at some point in the future... you'll have to ask mike or lucas about that tho

michaeldrogalis14:04:15

@lmergen If your jobs aren’t being built up dynamically, what is the advantage you see in putting the job data and the code in different jars?

michaeldrogalis14:04:34

Sounds like you changed your mind already, but was just curious.

lmergen14:04:41

yeah, i realise i had this completely wrong

lmergen14:04:08

i think i was expecting the ability to submit new jobs (as in, newly written) in an already-running cluster

michaeldrogalis14:04:11

Okiedoke. I think you’re on the right track now.

lmergen14:04:30

i now realise that this is an anti-pattern, since it makes reasoning about the peer processors much more difficult

michaeldrogalis14:04:57

You can call onyx.api/submit-job when the cluster is running as many times as needed. Is that what you meant?

lmergen14:04:38

nah, i meant new code — editing existing job structure on the fly

lmergen14:04:49

but this sacrifices immutability

michaeldrogalis14:04:03

Oh - yep. Jobs, once submitted, are immutable. You are correct.
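
Editor's sketch of what immutability means in practice: since a submitted job cannot be edited, a change is expressed by deriving a new job map and submitting that as a separate job. The job layout (`:workflow`, `:catalog`, `:task-scheduler`) follows the Onyx documentation; the helper `with-batch-size` and the task names are hypothetical, and the input and output catalog entries are left out for brevity.

```clojure
;; First version of the job, as plain data.
(def job-v1
  {:workflow [[:in :inc] [:inc :out]]
   :catalog [{:onyx/name :inc
              :onyx/fn :example.tasks/my-inc
              :onyx/type :function
              :onyx/batch-size 10}]
   :task-scheduler :onyx.task-scheduler/balanced})

;; Hypothetical helper: return a new job map with one task's batch size changed.
(defn with-batch-size [job task-name n]
  (update job :catalog
          (fn [entries]
            (mapv #(if (= task-name (:onyx/name %))
                     (assoc % :onyx/batch-size n)
                     %)
                  entries))))

;; job-v1 is untouched; job-v2 is a new value to pass to onyx.api/submit-job.
(def job-v2 (with-batch-size job-v1 :inc 50))
```

Each call to `onyx.api/submit-job` with one of these maps creates its own job on the running cluster, which matches the "as many times as needed" remark above.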

michaeldrogalis14:04:29

There is an Apache project whose flagship feature is the ability to edit similar streaming structures on the fly.

michaeldrogalis14:04:35

That seems rather insane to reason about.

michaeldrogalis14:04:26

The name escapes me at the moment. 😕 But anyway, now you’ve got the idea. 🙂

lmergen14:04:04

is it Flink by any chance ?

lmergen14:04:14

so many data processing products for apache…

lmergen14:04:44

that’s the project of the people from https://data-artisans.com

lmergen14:04:04

which is all about “your data doesn’t stop flowing”, and “real time data applications” 🙂

gardnervickers15:04:07

@lmergen Onyx has an adaptation of their streaming engine and is suitable for the same type of work, DataFlow-style unified stream/batch processing.

lmergen15:04:58

yeah i’m aware of that, i was referring to @michaeldrogalis’s comment about some apache project that allows you to edit streaming structures on the fly

michaeldrogalis15:04:09

@lmergen I think it was Gearpump, though I haven’t checked. Flink’s submissions are also immutable, same as Onyx’s.

michaeldrogalis15:04:37

It’s true though, Apache has a zillion of these.