onyx 2016-10-11 | Slack Archive

zamaterian09:10:06

@lucasbradstreet, could you give me some hints on how to start a new job from the checkpointed entries of a previously killed job? For example, I have a long-running data import job (SQL -> Datomic), and at some point during the import either the SQL DB connection or the transactor times out. I would like to be able to run a new job (identical catalog) from where the old one died, since it's not possible to resume a killed job. I'm thinking of reading all the checkpointed entries from ZooKeeper into the new job.

lucasbradstreet10:10:54

It's possible, but it will require making a change to onyx-sql to allow checkpoints to be written to a custom key. A similar ability was added to onyx-kafka and onyx-datomic.

lucasbradstreet10:10:00

@zamaterian you can see this at work in onyx-datomic here: https://github.com/onyx-platform/onyx-datomic/blob/0.9.x/src/onyx/plugin/datomic.clj#L289
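
A minimal sketch of the custom-key idea discussed above, not the onyx-sql or onyx-datomic implementation. It assumes an in-memory atom standing in for ZooKeeper, and the names `checkpoint-store`, `write-checkpoint!`, and `resume-offset` are hypothetical. The point is only that a checkpoint stored under a caller-chosen key, rather than under a job id, can be found again by a new job.

```clojure
;; Hypothetical in-memory stand-in for ZooKeeper; not a real Onyx API.
(def checkpoint-store (atom {}))

(defn write-checkpoint!
  "Record the last processed offset under a caller-chosen key."
  [checkpoint-key offset]
  (swap! checkpoint-store assoc checkpoint-key offset))

(defn resume-offset
  "Offset a new job should start from, or 0 when nothing was checkpointed."
  [checkpoint-key]
  (get @checkpoint-store checkpoint-key 0))

;; The first job checkpoints as it goes, then dies.
(write-checkpoint! "sql-import" 500)
(write-checkpoint! "sql-import" 1000)

;; A new job with the same catalog reuses the key and resumes from there.
(resume-offset "sql-import")
```

Because the key does not depend on the job id, killing and resubmitting the job does not orphan the checkpoint.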

zamaterian10:10:34

Sounds great, will look into this 🙂 Thnx for the pointer, will probably submit a pr.

lucasbradstreet10:10:10

Great! We have been meaning to do this for all the plugins, so I will definitely merge it

michaeldrogalis15:10:57

Haven't forgotten about updating the docs with the auto-gen tool, still on my list for this week.

aaelony18:10:48

We're thinking of reading the job data-structure from a file (as json) at run-time to submit an onyx job. A thought that occurred to me is that perhaps it is desirable to version the information model of the job, so that if the information model changes in new versions things would still work... e.g. :onyx-version "0.9.11" in the job spec... Any thoughts on this?

michaeldrogalis18:10:45

@aaelony The information model is (generally) designed to be backwards-compatible. We deprecate things once in a while, but it's become relatively rare to make a change to the info model that breaks backwards compatibility.

aaelony19:10:46

At least for the kafka-plugin, I think there have been a few breaking changes though right?

michaeldrogalis19:10:13

Recently, yes. We found some things that changed for Kafka 0.9 from 0.8 that we missed.

aaelony19:10:37

Versioning might help with that, perhaps allowing changes to go forward. What happens if the onyx job is specified as json outside of the program that includes the library? This is a cool idea, but I guess it raises the question of how to ensure the json job still works over time if the job needs to be re-submitted...
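
One way the versioning idea could look, as a sketch only. `:onyx-version` is the key proposed in this conversation, not an existing Onyx job key, and the rule "same major.minor means compatible" is an assumption made for illustration. The job map is written as a Clojure literal here; parsing it from JSON is left out.

```clojure
(require '[clojure.string :as str])

(defn major-minor
  "First two numeric segments of a version string such as \"0.9.11\"."
  [version]
  (mapv #(Long/parseLong %) (take 2 (str/split version #"\."))))

(defn compatible?
  "True when the job was written for the same major.minor as the runtime.
   The compatibility rule itself is an assumption, not Onyx policy."
  [job runtime-version]
  (= (major-minor (:onyx-version job))
     (major-minor runtime-version)))

;; Hypothetical job map, as it might look after being read from a file.
(def job {:onyx-version "0.9.11"
          :workflow [[:in :out]]})

(compatible? job "0.9.12")
(compatible? job "0.10.0")
```

A submitter could refuse, or migrate, any job for which `compatible?` returns false before handing it to the cluster.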

michaeldrogalis19:10:48

Most of them were more like bug fixes than feature alterations, though.

aaelony19:10:42

also, what are your thoughts concerning clojure.spec for the job-spec itself?

aaelony19:10:14

might be a lot of work to write it out...

michaeldrogalis19:10:35

We have most of it done for internal use - we'll get it out once Clojure 1.9 is out I think.

aaelony19:10:48

wow!

michaeldrogalis19:10:02

You can get 95% of it by reading the information model and using a code generator. It's < 50 LOC
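
A rough sketch of deriving validation from a model, under loud assumptions: the `model` map below is a tiny hypothetical stand-in, not the real information model, and plain predicates are used in place of clojure.spec so the example runs without Clojure 1.9. It is not the internal generator mentioned above.

```clojure
;; Hypothetical, heavily trimmed stand-in for an information model.
(def model
  {:onyx/name       {:type :keyword}
   :onyx/batch-size {:type :integer}
   :onyx/doc        {:type :string}})

;; Map each declared type to a predicate.
(def type->pred
  {:keyword keyword?
   :integer integer?
   :string  string?})

(defn invalid-keys
  "Keys of a catalog entry whose values fail the type declared in the model.
   Keys the model does not know about are ignored."
  [entry]
  (for [[k v] entry
        :let [pred (some-> (get model k) :type type->pred)]
        :when (and pred (not (pred v)))]
    k))

(invalid-keys {:onyx/name :in :onyx/batch-size "20"})
```

The same walk over the model could emit spec definitions instead of calling predicates directly.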

aaelony19:10:20

what do folks typically do currently when they deploy an uberjar? Do they bundle the job specs in the uberjar or do they allow jobs to be read in?

Travis19:10:48

what do you mean by specs?

aaelony19:10:08

the job map with keys for :workflow, :catalog, etc..

Travis19:10:18

i bundle mine in the uberjar

Travis19:10:32

and based on main arguments it picks the job i want

Travis19:10:37

all based off the lein template
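
A small sketch of picking a bundled job from the main arguments, as described above. The `jobs` registry, the job names, and the placeholder job maps are all hypothetical, and this is not the code the lein template generates.

```clojure
;; Hypothetical registry of jobs bundled in the uberjar.
(def jobs
  {"import"  {:workflow [[:read-sql :write-datomic]]}
   "cleanup" {:workflow [[:read-log :delete]]}})

(defn select-job
  "Look up a bundled job by name, failing loudly on an unknown name."
  [job-name]
  (or (get jobs job-name)
      (throw (ex-info "Unknown job" {:job job-name :known (keys jobs)}))))

(defn -main [& [job-name]]
  (let [job (select-job job-name)]
    ;; A real entry point would submit the job to the cluster here.
    (println "Selected job with workflow" (:workflow job))))
```

Running the jar with `import` as its first argument would select the first job in this sketch.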

aaelony19:10:19

@camechis, suppose you have been running the current uberjar (and job) for a while, and you want to release new changes to the existing job(s). How do you handle that?

aaelony19:10:40

suppose you are consuming kafka topic X and you now want to consume kafka topic Y ?

Travis19:10:29

haven’t had the chance to experience that yet, but I think you essentially stop the job, deploy the new version, and start a new one. If they share the same group id, then it will pick up where the last one finished

Travis19:10:41

with kafka that is

aaelony19:10:24

So, you never have more than one job running on a cluster?

Travis19:10:51

i will eventually have a ton of jobs. Just haven’t made it that far in the process yet

Travis19:10:07

my goal would be to automate the deployment of those

Travis19:10:22

essentially the same job but different configs per customer

Travis19:10:30

so different kafka topics
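
The "same job, different config per customer" pattern might be sketched as a function from a config map to a job map. The function name, the config keys, and the group-id naming scheme are hypothetical, and the catalog entries are deliberately incomplete: only the topic and group id are shown, so this is not a submittable job.

```clojure
(defn customer-job
  "Build a job map for one customer. Illustrative only: the catalog
   entries omit most of the keys a real task needs."
  [{:keys [customer topic]}]
  {:workflow [[:read-messages :process]]
   :catalog  [{:onyx/name      :read-messages
               :kafka/topic    topic
               ;; Assumed naming scheme; a stable group id per customer
               ;; lets a redeployed job continue where the last one stopped.
               :kafka/group-id (str "onyx-" customer)}
              {:onyx/name :process}]})

(def customer-jobs
  (map customer-job [{:customer "acme"   :topic "acme-events"}
                     {:customer "globex" :topic "globex-events"}]))
```

Each element of `customer-jobs` shares the workflow and differs only in its Kafka settings.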

aaelony19:10:08

it is nice to consider the job-map to be standalone as a json file (that can be written by anyone/thing/language), slurped at job-submission time, and run (if it is well-formed, etc..)

aaelony19:10:24

at very least, I can imagine a "prod" job from a topic, and a "testing" job from the same topic. Or even another "testing job" from a different topic

aaelony19:10:50

just want to think it all through before it happens 😉

aaelony19:10:08

I also find myself writing helper functions (poorly) like this:

(require 'clojure.set)

;; Assumes `workflow`, `catalog`, and `lifecycles` are already bound to the job's data.

(defn find-in-workflow
  "Workflow task names whose name matches the regex string s."
  [s]
  (->> workflow
       flatten
       (into #{})
       (filter #(re-find (re-pattern s) (name %)))))

(defn find-in-catalog
  "Catalog :onyx/name values whose name matches the regex string s."
  [s]
  (->> catalog
       (mapv :onyx/name)
       (filter #(re-find (re-pattern s) (name %)))))

(defn find-in-lifecycles
  "Lifecycle :lifecycle/task values whose name matches the regex string s."
  [s]
  (->> lifecycles
       (mapv :lifecycle/task)
       (filter #(re-find (re-pattern s) (name %)))))

(defn job-audit
  "Cross-check which matching task names appear in the workflow, catalog, and lifecycles."
  [s]
  (let [hits-workflow   (into #{} (find-in-workflow s))
        hits-lifecycles (into #{} (find-in-lifecycles s))
        hits-catalog    (into #{} (find-in-catalog s))]
    {;; :found {:workflow-hits hits-workflow
     ;;         :lifecycle-hits hits-lifecycles
     ;;         :catalog-hits hits-catalog}
     :in-workflow-but-not-catalog    (clojure.set/difference hits-workflow hits-catalog)
     :in-catalog-but-not-workflow    (clojure.set/difference hits-catalog hits-workflow)
     :lifecycle-hits                 hits-lifecycles
     :lifecycle-catalog-intersection (clojure.set/intersection hits-lifecycles hits-catalog)}))

does anyone do this as well?

Travis19:10:49

I am pretty much following the best practices of using task bundles to compose things

Travis19:10:24

usually what differs for me is configuration

aaelony19:10:03

awesome, thanks @camechis

aaelony19:10:18

was unaware of http://www.onyxplatform.org/jekyll/update/2016/06/13/Task-Bundles.html
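
A simplified sketch of the task bundle idea, assuming a bundle is a map that carries a task's catalog entry together with its lifecycles. The `kafka-input` bundle, the `:my.app/reader-calls` keyword, and this local `add-task` are hypothetical illustrations, not the library's own functions.

```clojure
(defn kafka-input
  "Hypothetical task bundle: everything one task needs, in one map."
  [task-name topic]
  {:task {:task-map   {:onyx/name   task-name
                       :onyx/type   :input
                       :kafka/topic topic}
          :lifecycles [{:lifecycle/task  task-name
                        :lifecycle/calls :my.app/reader-calls}]}})

(defn add-task
  "Simplified stand-in: merge a bundle's catalog entry and lifecycles into a job."
  [job {{:keys [task-map lifecycles]} :task}]
  (-> job
      (update :catalog (fnil conj []) task-map)
      (update :lifecycles (fnil into []) lifecycles)))

(def job
  (add-task {:workflow [[:in :out]]}
            (kafka-input :in "topic-x")))
```

Since the catalog entry and lifecycles for a task are added together, they cannot drift apart by name, which is the kind of mismatch an audit helper would otherwise have to hunt for.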

michaeldrogalis19:10:51

@aaelony We have some quite good facilities for parameterized jobs in the new product. It was a lot easier to do well in an integrated PaaS.

michaeldrogalis19:10:45

It's been an interesting pattern. We can do a number of things better because we can hide more. It's harder with open source templates because the more you add, the more complexity you stack on for someone using the template.

yonatanel20:10:58

For event sourcing, can I guarantee message ordering without using a window, maybe by relying on the underlying messaging platform?

michaeldrogalis20:10:41

@yonatanel Not yet. In the future we'll have some guarantees for tasks that have exactly one peer assigned, but currently no. It ends up being hard or impossible to offer both good performance and in-order processing because of the mechanisms needed for failover, and for parallelization in general.

yonatanel20:10:26

@michaeldrogalis Would you recommend not using onyx for cqrs/es, or is it actually fine when using a window? I like the way onyx is built but I wonder if I should just use some actors framework or something similar.

michaeldrogalis20:10:20

@yonatanel Onyx is pretty good at those use cases. Ordering is always going to be hard in a distributed setting. Is there something about windows that's problematic for your scenario?

yonatanel20:10:27

Not sure. I can't miss a single event and they must be processed in order. I'm guessing I need a way to check the window has all the messages and I'm not sure how to do that. The docs say messages may be dropped when buffers are full, or retried. That was a concern.

michaeldrogalis20:10:51

@yonatanel How do you know when you've achieved completeness in terms of having "seen" all the events?

yonatanel20:10:55

@michaeldrogalis Exactly, I don't. When I read directly from Kafka I have at-least-once in-order guarantees.

michaeldrogalis20:10:41

@yonatanel Right, you have that with Onyx also. I'm talking semantically though. What piece of data indicates that you've seen "everything" you need to see?

yonatanel20:10:49

@michaeldrogalis I would maybe have to implement my own sequencing per aggregate. But if Onyx has at-least-once and in-order I won't need that.
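
A sketch of what "my own sequencing per aggregate" could mean, assuming each event carries an `:aggregate-id` and a `:seq-no` that starts at 1 and increases by one. Both field names and the function are hypothetical. Duplicates, which at-least-once delivery can produce, are skipped, and a gap is reported instead of being applied out of order.

```clojure
(defn apply-event
  "state maps aggregate id -> last applied sequence number.
   Applies the event only when it is the next one for its aggregate."
  [state {:keys [aggregate-id seq-no]}]
  (let [last-seq (get state aggregate-id 0)]
    (cond
      ;; Already seen: a redelivered duplicate, safe to ignore.
      (<= seq-no last-seq) state
      ;; Exactly the next event: apply it.
      (= seq-no (inc last-seq)) (assoc state aggregate-id seq-no)
      ;; Anything else means an earlier event is missing.
      :else (throw (ex-info "Gap in sequence"
                            {:aggregate-id aggregate-id
                             :expected (inc last-seq)
                             :got seq-no})))))

(def events
  [{:aggregate-id :a :seq-no 1}
   {:aggregate-id :a :seq-no 2}
   {:aggregate-id :a :seq-no 2}   ; duplicate delivery
   {:aggregate-id :b :seq-no 1}])

(reduce apply-event {} events)
```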

michaeldrogalis20:10:55

@yonatanel I would recommend using a window with :onyx/group-by-key (or :onyx/group-by-fn) to force all "like" data to go to the same local process, even in a distributed setting, then use a predicate trigger to flush the window when it "has" all the data.

michaeldrogalis20:10:26

A predicate trigger is just a function that gets the window contents and returns true or false to say whether the window should be flushed, or more precisely whether to do whatever trigger/sync says to do with it.
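
A sketch of the completeness check such a predicate might perform. It assumes events carry a hypothetical `:seq-no` field and that the expected count is known from somewhere. The arguments Onyx actually passes to a trigger predicate are not reproduced here, so this is the decision logic only, not a drop-in trigger function.

```clojure
(defn complete?
  "True when the window holds every sequence number from 1 to expected-count.
   Duplicates do not matter, because the comparison is on sets."
  [window-contents expected-count]
  (= (set (map :seq-no window-contents))
     (set (range 1 (inc expected-count)))))

(def window-contents
  [{:seq-no 1} {:seq-no 3} {:seq-no 2} {:seq-no 2}])

(complete? window-contents 3)
(complete? window-contents 4)
```

The first call returns true, since 1 through 3 are all present despite the duplicate and the arrival order. The second returns false, because event 4 has not arrived.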

yonatanel20:10:13

@michaeldrogalis Thanks! I'll experiment further until I have better questions.

michaeldrogalis20:10:23

@yonatanel Sure, happy to help. 🙂
