2016-04-18
Great talk @michaeldrogalis
michaeldrogalis: Very curious to see how things are going to move in a Flink-like direction
Is there an example or test somewhere that demonstrates message retrying when a message is not fully acked?
@aspra: we really need a better way to test for this. The best example I’ve got uses a lifecycle to crash the peer and restart it; the message will then retry. This example is in the onyx-datomic plugin https://github.com/onyx-platform/onyx-datomic/blob/0.9.x/test/onyx/plugin/input_fault_tolerance_test.clj
Note in this example we use lifecycle/handle-exception to tell the peer to restart and continue processing the job, rather than killing the job
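A minimal sketch of that pattern, assuming the 0.9.x lifecycle API; the map and keyword names below are illustrative, not taken from the linked test:

```clojure
;; Sketch: restart the peer when a task throws, instead of killing the job.
;; :lifecycle/handle-exception returns :restart, :defer, or :kill to decide
;; the job's fate; here we always restart.
(def restart-calls
  {:lifecycle/handle-exception (constantly :restart)})

(def lifecycles
  [{:lifecycle/task :all              ;; apply to every task in the job
    :lifecycle/calls ::restart-calls}])
```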
Right. I was thinking more of a case with no peer or job restarting, for example an error response to an HTTP request
If I understand correctly, it has to do with the retry-segment implementation of the input plugin used?
@otfrom: Should be a pretty transparent change for users. One big win beyond a performance boost will be fully automatic backpressure. Very little tuning to do.
@aspra: in that case it should automatically occur after the pending-timeout is hit on the input task
They’re different. http://www.onyxplatform.org/docs/cheat-sheet/latest/#catalog-entry/:onyx/batch-timeout vs http://www.onyxplatform.org/docs/cheat-sheet/latest/#catalog-entry/:onyx/pending-timeout
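To make the distinction concrete, here is a hypothetical input catalog entry carrying both settings; the plugin name and values are made up for illustration, not defaults:

```clojure
;; :onyx/batch-timeout caps how long (ms) a peer waits to assemble a batch;
;; :onyx/pending-timeout is how long (ms) an input segment may stay un-acked
;; before the input plugin's retry-segment logic kicks in.
{:onyx/name :read-segments
 :onyx/plugin :my.app.plugin/input   ;; hypothetical input plugin
 :onyx/type :input
 :onyx/medium :my-medium
 :onyx/batch-size 20
 :onyx/batch-timeout 50              ;; wait at most 50 ms to fill a batch
 :onyx/pending-timeout 10000         ;; retry segments un-acked after 10 s
 :onyx/max-pending 10000}
```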
Thinking about adding peers to a cluster: let's say I package my-code-0.1.0 in a JAR, throw it on a node in the cluster, have it boot up and connect to ZK, and then repeat the process but with my-code-0.1.1. Other than this being totally avoidable with good ops, is there anything Onyx does to warn about or prevent this?
@acron: There's nothing preventing you from doing that. I'd call that a rolling upgrade tbh 😛
Thanks
Next release of Onyx is getting an upgraded static analyzer to detect job-level errors before you submit them. Here's a little preview...
@michaeldrogalis: I know at one point you were looking for help on a “scheduler”. At the time I didn’t really understand what this meant. I have come to understand this refers to something analogous to what YARN or Mesos does. Is that correct? Did you finally settle on a solution? I looked around for material on this and came up short. Thanks!
@drewverlee: Onyx runs compatibly with Mesos and other tools that offer a "Level 1 scheduler", e.g. machine-level allocation across different contending resources. We have our own "Level 2 scheduler", which has application-specific knowledge about tasks and such - things that Mesos-like tools couldn't anticipate. We went with a library called BtrPlace.
Crazy strong library, highly recommended. http://www.btrplace.org/