#onyx
2017-03-09
yonatanel07:03:02

@drewverlee It's also best to make the change on a new branch and submit the PR onto upstream's "dev" branch I think.

jeremy17:03:05

I have a question regarding deployment. I'm coming from Storm, where you could bundle an uberjar for each topology you submit. Can I do the same thing with Onyx, that is, deploy an uberjar for each job I want to submit? Or does all the code for the jobs I want to submit have to be in the same uberjar? If the latter, how does this affect streaming jobs that are in progress when a new deploy occurs?

michaeldrogalis17:03:26

@jeremy With Onyx, you create an uberjar that contains all of the functions that will be pointed to by keywords in your job. The data structure that represents the job doesn’t go in the uberjar — think of the uberjar as a collection of functions. The job is the instruction set for how to assemble those functions at runtime.
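A minimal sketch of the split being described — the function lives in the uberjar, while the job is plain data that points at it by keyword. All names here (`my.app.tasks/enrich`, the task names) are hypothetical, invented for illustration:

```clojure
;; Lives in the uberjar: an ordinary Clojure function.
(ns my.app.tasks)

(defn enrich
  "Takes an Onyx segment (a map) and returns the transformed segment."
  [segment]
  (assoc segment :processed? true))

;; Lives outside the uberjar: the job is just data. The :onyx/fn
;; keyword is resolved to the var above at runtime on the peers.
(def job
  {:workflow [[:in :enrich] [:enrich :out]]
   :catalog  [{:onyx/name       :enrich
               :onyx/fn         :my.app.tasks/enrich
               :onyx/type       :function
               :onyx/batch-size 20}]})
```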

michaeldrogalis17:03:28

When you redeploy, you’re only upgrading the functions themselves, so if all of your function signatures still resolve correctly, you can redeploy without any issues to a streaming job.
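Continuing with a hypothetical task function, a signature-compatible redeploy might look like this — the var keeps its name and arity, so job data that references it by keyword keeps resolving:

```clojure
;; Before redeploy, in the old uberjar:
(defn enrich [segment]
  (assoc segment :processed? true))

;; After redeploy: new behavior, same name and same arity, so a job
;; whose catalog says :onyx/fn :my.app.tasks/enrich resolves cleanly
;; against the new jar and keeps running.
(defn enrich [segment]
  (assoc segment
         :processed? true
         :processed-at (System/currentTimeMillis)))
```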

jeremy17:03:03

And if not, I'd need to restart all the affected jobs?

michaeldrogalis17:03:32

Or, more likely, modify the jobs to fit the changes you made to your functions.

jeremy17:03:15

So then it's possible to do a deploy while keeping existing jobs running, as long as the api stays the same?

jeremy17:03:35

or do they all restart during a deploy?

michaeldrogalis17:03:53

There’s no centralized coordinator doing restarts - if there are enough peers online, they’ll schedule themselves to work on available jobs.

michaeldrogalis17:03:33

@jeremy Also relevant - job data is stored in ZooKeeper, not in the jars themselves. That’s the piece that lets jobs survive restarts of peers.

jeremy18:03:16

Ah, ok. (Sorry if I'm just repeating what you've said, but I want to make sure I understand.) Deploying an uberjar to the peers means restarting them with the new code (there's no hotswapping), and the job picks up where it left off based on the state in ZooKeeper. Is that correct?

michaeldrogalis18:03:55

No problem, happy to clarify. Onyx’s deployment model is pretty different.

michaeldrogalis18:03:09

@jeremy Yes, that’s an accurate description.

michaeldrogalis18:03:41

Interestingly enough, I’ve wanted to try hot-reloading code for a while just for fun. It’s achievable with the usual Clojure nREPL toolchain.

michaeldrogalis18:03:10

It’d be neat to be able to control an Onyx cluster’s behavior from the repl. Not sure if it’s a good idea - but definitely interesting.

gardnervickers18:03:12

The way Onyx confines external state to lifecycles and keyword->var lookups might actually make it pretty simple to get right, too.
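A hedged sketch of what a keyword->var lookup can look like in plain Clojure — this is not Onyx’s actual implementation, just an illustration of why such a lookup plays well with REPL redefinition:

```clojure
;; Sketch only -- not Onyx internals. Because resolution happens by
;; name at call time and returns the var, redefining the function at
;; the REPL changes what subsequent lookups (and var calls) see.
(defn kw->fn
  "Resolve a namespaced keyword like :my.app.tasks/enrich to the var
  it names, requiring the namespace if it isn't loaded yet."
  [kw]
  (requiring-resolve (symbol (namespace kw) (name kw))))
```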

jeremy18:03:06

yeah, it would be cool to see if hot-reloading works

michaeldrogalis18:03:52

Onyx isn’t doing anything fancy there, so we should get the standard Clojure guarantees around how it usually works.

jeremy18:03:06

so no multi-methods, got it. haha

yonatanel20:03:01

@michaeldrogalis I've just watched the powderkeg talk today. Have you seen it? (regarding controlling onyx from repl)

yonatanel20:03:08

they know when a new var was defined in repl and send the new jar to the spark cluster, or something like that. I don't know if that's considered hotswapping. https://www.youtube.com/watch?v=OxUHgP4Ox5Q

michaeldrogalis21:03:29

Spinning up a new jar isn’t quite what I had in mind. Think more what happens in Emacs when you do eval-sexpr.