onyx 2017-11-17 | Slack Archive

lucasbradstreet02:11:50

@lmergen onyx.api/job-ids has been converted to onyx.api/job-ids-history, so you can get a full history of a job-by-name. I’ve also modified job-snapshot-coordinates, and it’ll now walk back through the history until it finds a job with a snapshot.

lucasbradstreet04:11:40

New onyx.api function that will be in 0.12:

lucasbradstreet04:11:31

(onyx.api/job-state zk-addr tenancy-id job-id): plays back the log for that tenancy-id and returns a map describing the job state.

lucasbradstreet04:11:38

eg. {:cluster-alive? true, :peer-allocations {:inc [#uuid "18592939-6597-719d-9cc3-0981e992657b"], :out [#uuid "e271deab-0e82-05b4-f02e-916d74c54a4d"], :in [#uuid "1491fbe3-bc39-549a-86fa-2580e5e3a420"]}, :peer-sites {#uuid "18592939-6597-719d-9cc3-0981e992657b" {:address localhost, :port 40199}, #uuid "e271deab-0e82-05b4-f02e-916d74c54a4d" {:address localhost, :port 40199}, #uuid "1491fbe3-bc39-549a-86fa-2580e5e3a420" {:address localhost, :port 40199}}, :allocation-version 4, :job-status :running}
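A minimal sketch of how the map returned by job-state might be inspected, using only clojure.core. The sample value is copied from the message above, except that :address is written as a string so the form can be read back in. The helper names (peers-per-task, task-sites, healthy?) are hypothetical and not part of onyx.api.

```clojure
;; Sample job state, copied from the example message.
;; Assumption: :address is a string; the chat output printed it unquoted.
(def job-state
  {:cluster-alive? true
   :peer-allocations {:inc [#uuid "18592939-6597-719d-9cc3-0981e992657b"]
                      :out [#uuid "e271deab-0e82-05b4-f02e-916d74c54a4d"]
                      :in  [#uuid "1491fbe3-bc39-549a-86fa-2580e5e3a420"]}
   :peer-sites {#uuid "18592939-6597-719d-9cc3-0981e992657b" {:address "localhost" :port 40199}
                #uuid "e271deab-0e82-05b4-f02e-916d74c54a4d" {:address "localhost" :port 40199}
                #uuid "1491fbe3-bc39-549a-86fa-2580e5e3a420" {:address "localhost" :port 40199}}
   :allocation-version 4
   :job-status :running})

(defn peers-per-task
  "Hypothetical helper: number of peers allocated to each task."
  [state]
  (into {}
        (map (fn [[task peers]] [task (count peers)]))
        (:peer-allocations state)))

(defn task-sites
  "Hypothetical helper: for each task, the sites of its allocated peers."
  [state]
  (into {}
        (map (fn [[task peers]] [task (mapv (:peer-sites state) peers)]))
        (:peer-allocations state)))

(defn healthy?
  "Hypothetical helper: true when the cluster is alive and the job is running."
  [state]
  (and (:cluster-alive? state)
       (= :running (:job-status state))))
```

In a real deployment the map would come from (onyx.api/job-state zk-addr tenancy-id job-id) rather than a literal.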

lmergen06:11:48

oh wow apparently i was implementing exactly this functionality at the same time haha

lucasbradstreet06:11:14

Which part, the job-state? or onyx.api/job-ids?

lmergen06:11:21

job state

lmergen06:11:49

well both actually, but using datomic to query

lucasbradstreet06:11:49

Ah, cool. Give it a go and let me know what you think. I’m happy to make changes if there’s anything else you want in it before 0.12 is released.

lmergen06:11:07

i will take a look

lucasbradstreet06:11:35

Right, yeah. I realised with just a little bit more data in ZK we could do the job-history better. We’re going to ditch most of our custom datomic reconciliation and just keep the job-status detection in datomic (mostly for quick lookup because otherwise we have to play back the log).

lmergen06:11:58

yes exactly

lmergen06:11:41

well, cool, i'll give it a go shortly

lucasbradstreet06:11:30

🙂

lucasbradstreet06:11:34

You can try the snapshot at: 0.12.0-20171117.063935-35. New fns https://github.com/onyx-platform/onyx/blob/master/src/onyx/api.clj#L279 and https://github.com/onyx-platform/onyx/blob/master/src/onyx/api.clj#L377

lucasbradstreet06:11:42

I need to create an onyx-example for these features now.

jetmind09:11:20

Hey folks, I’m trying to figure out how onyx-kafka 0.11 works. Does it commit offsets at any point in time? I see the consumer’s auto commit is off by default and .commit isn’t called anywhere.

jetmind10:11:34

Asking because if I kill a job and then re-submit it with a new job-id, it will re-process messages the killed job has already processed.

jetmind10:11:13

Restarting peers without killing a job works as expected (job proceeds from latest processed offset)

jetmind10:11:02

I wonder if there’s any way to transfer last processed offset to a re-submitted job

lmergen10:11:07

@jetmind take a look at checkpoints

lmergen10:11:31

since onyx 0.10, it allows you to 'checkpoint' jobs, and (re-)start jobs from a certain checkpoint

lmergen10:11:38

in the case of kafka, this will also checkpoint consumer offsets

jetmind10:11:56

@lmergen do you mean these? http://www.onyxplatform.org/docs/user-guide/0.12.x/#resume-point

lmergen10:11:02

yep

jetmind11:11:29

okay… it looks like this requires more manual work when re-submitting jobs (e.g. figuring out the previous job id)

lmergen11:11:18

yes, the benefit however is that you'll have end-to-end consistency

lmergen11:11:26

which is not easy to achieve

jetmind11:11:01

I see

lmergen11:11:02

so you'll not only resume the kafka consumer at a certain point, you'll also know exactly where your output was left at that same moment

lmergen11:11:15

so this also works in more tricky situations, e.g. aggregates

jetmind11:11:31

it makes it harder to migrate from 0.9, though 🙂

lmergen11:11:36

right

lmergen11:11:49

perhaps you should wait a bit until @lucasbradstreet or @michaeldrogalis come online

lmergen11:11:12

they can provide you with better information than i can

jetmind11:11:30

ok, thanks!

michaeldrogalis15:11:24

@jetmind @lmergen's description is accurate. The changes discussed slightly above by @lucasbradstreet automate looking up the resume point. It's a little manual effort at the moment, but it's really not more than a few lines.
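A rough sketch of what those "few lines" might look like when re-submitting a job from a previous job's checkpoint. It assumes onyx.api/job-snapshot-coordinates and onyx.api/build-resume-point behave as described in the resume point section of the user guide linked above; the argument order is an assumption and should be checked against the Onyx version in use. The function name resubmit-from-snapshot is hypothetical, and the code needs Onyx on the classpath and a running cluster.

```clojure
(require '[onyx.api])

(defn resubmit-from-snapshot
  "Hypothetical helper: submits new-job so that it resumes from the latest
   snapshot of the job identified by old-job-id.
   peer-config, tenancy-id, old-job-id and new-job are supplied by the caller."
  [peer-config tenancy-id old-job-id new-job]
  (let [;; Look up where the old job last checkpointed (assumed signature).
        coordinates  (onyx.api/job-snapshot-coordinates peer-config tenancy-id old-job-id)
        ;; Map the old job's checkpointed state onto the new job's tasks.
        resume-point (onyx.api/build-resume-point new-job coordinates)]
    ;; The resume point travels with the job map under :resume-point.
    (onyx.api/submit-job peer-config
                         (assoc new-job :resume-point resume-point))))
```

For a Kafka input task, the checkpointed state includes the consumer offsets, which is how the re-submitted job can continue from where the old job left off. If the old job never completed a checkpoint there may be no coordinates to resume from, so a caller would want to handle that case.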

michaeldrogalis15:11:36

And yes, a lot has changed off 0.9.

lellis18:11:38

Hi all! I'm trying to upgrade from 0.10 to 0.12, but I get the runtime exception below when I run my tests or try to load a namespace with onyx-local-rt.api:

CompilerException java.lang.RuntimeException: Unable to resolve symbol: pos-int? in this context, compiling:(onyx/spec.cljc:18:1)

lellis18:11:02

Any tips?

lucasbradstreet18:11:02

local rt requires Clojure 1.9 because it leans on spec.
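The symbol in the exception, pos-int?, is part of clojure.core from Clojure 1.9 onward, so a quick REPL check can show whether the project is resolving an older Clojure. This is a small diagnostic sketch; the helper name at-least-1-9? is hypothetical.

```clojure
(defn at-least-1-9?
  "Hypothetical helper: true when the running Clojure is 1.9 or newer."
  []
  (let [{:keys [major minor]} *clojure-version*]
    (or (> major 1)
        (and (= major 1) (>= minor 9)))))

;; Print the version actually on the classpath and whether pos-int? resolves.
(println "Clojure version:" (clojure-version))
(println "pos-int? available:" (some? (resolve 'clojure.core/pos-int?)))
```

If the printed version is below 1.9, the Clojure dependency in the project (or one pulled in transitively) is the likely place to look.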

eriktjacobsen18:11:56

How do resume points interact with grouping and flux policy? Can I change the number of virtual peers assigned to a grouped task between two jobs? It seems the :slot-migration key might be involved, but can't find documentation and it seems :direct is the only valid value? Does that mean we must always have the same # of virtual peers unless :continue flux policy is set through resume points?

lucasbradstreet19:11:31

That is to be determined. At the moment we rely on there being the same number each time, because doing anything else would require repartitioning state.

lucasbradstreet22:11:37

Would it be enough to repartition it on a migration? That’s going to be a lot easier than repartitioning it on auto-scaleup

eriktjacobsen22:11:50

It's not a required feature for us at this point, but for the use-cases I can think of, migration would be fine.

lucasbradstreet23:11:12

Great. It won’t be all that technically hard to implement, but we don’t have the resources to do it right now unless it’s sponsored.
