#onyx
2018-03-29
jgerman13:03:24

I feel like I should be able to answer this question from the documentation, but I am not seeing it: in the case of output tasks, how do you recover when a segment can’t be written to storage?

jgerman13:03:11

for example an output task that writes to a db has a conflict on a particular segment

jgerman13:03:30

we’re on onyx 0.9.15 currently (upgrading isn’t a viable option right now)

michaeldrogalis14:03:39

@jgerman If the task on that job is configured to reboot when an exception is thrown, it'll retry the exact same operation. I'm not sure how you would recover from that specific scenario, Onyx aside.

jgerman14:03:10

my concern is if I have a stream of incoming documents to persist, and one has a conflict, I’d like to catch that message and handle it

jgerman14:03:20

I’ll keep playing around and see if I can find a solution

jgerman14:03:53

I shouldn’t use the word stream there. What I mean is: if I have a batch of segments coming into a plugin and only one has a conflict

michaeldrogalis16:03:57

@jgerman You can use lifecycles to retry on failure. The other thing you can do is supply an :onyx/fn to any task, input and output tasks included, and modify the outgoing segments
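[Editor's sketch] The lifecycle approach mentioned here might look roughly like the following. This is a minimal sketch, not the setup used in the conversation: the namespace `my.app.lifecycles`, the task name `:write-documents`, and the handler name are hypothetical, and the `:lifecycle/handle-exception` hook with its `:restart` / `:kill` / `:defer` return values should be checked against the lifecycle documentation for the Onyx version in use.

```clojure
(ns my.app.lifecycles)

;; Hypothetical handler. Onyx calls it with the event map, the matching
;; lifecycle map, the lifecycle phase in which the exception was thrown,
;; and the exception itself.
(defn handle-write-exception
  [event lifecycle lifecycle-phase e]
  ;; :restart reboots the task so the operation is retried,
  ;; :kill stops the job, :defer hands the decision to the next lifecycle.
  :restart)

;; Calls map referenced from the lifecycle entry below.
(def retry-calls
  {:lifecycle/handle-exception handle-write-exception})

;; Lifecycle entry for the job; :write-documents is an assumed task name.
(def lifecycles
  [{:lifecycle/task :write-documents
    :lifecycle/calls :my.app.lifecycles/retry-calls}])
```

Note that a blanket `:restart` retries the same operation, so a segment that always conflicts would keep failing. A handler could inspect `e` and return `:kill` or `:defer` for errors it considers unrecoverable.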

michaeldrogalis16:03:11

Perhaps some pre-emptive handling in there could help

michaeldrogalis16:03:31

Hard to help more since this seems very application-specific, but happy to keep bouncing ideas around.

jgerman16:03:39

you can supply :onyx/fn to any task? I was under the impression that once the segments went into an output task that’s where we basically lost our view into them

jgerman16:03:12

it’s definitely application specific, thanks for the ideas!

lucasbradstreet16:03:41

Yes, any task can perform transforms with onyx/fn

jgerman16:03:48

I’ll look at the docs again. I’m slightly confused about when that fn would be applied in an output task: the output task doesn’t pass any segments on, because it has to be a leaf node in the job, right?

lucasbradstreet17:03:39

The onyx/fn is applied to the segments that are sent to it from upstream, so it’s a way to apply some transformations immediately prior to being placed in storage.
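[Editor's sketch] As an illustration of the point above, an output task's catalog entry can carry an `:onyx/fn` that runs on each incoming segment before the plugin writes it. Everything named here is an assumption for the example: the plugin keyword `:my.plugin/output`, the medium, the task name, the function `prepare-document`, and the `:body` key in the segment.

```clojure
(ns my.app.tasks
  (:require [clojure.string :as str]))

;; Hypothetical pre-write transform: takes one segment and returns the
;; segment that the output plugin will actually write to storage.
(defn prepare-document
  [segment]
  (if (string? (:body segment))
    ;; Normalize the assumed :body field before it reaches storage.
    (update segment :body str/trim)
    segment))

;; Catalog entry for the output task. The plugin and medium are
;; placeholders for whatever output plugin the job uses.
(def write-documents-task
  {:onyx/name :write-documents
   :onyx/plugin :my.plugin/output
   :onyx/type :output
   :onyx/medium :my-db
   :onyx/fn :my.app.tasks/prepare-document
   :onyx/batch-size 20
   :onyx/doc "Writes documents to storage after a pre-write transform"})
```

Any pre-emptive conflict handling, such as rewriting or tagging a segment that is known to clash, would go in the function referenced by `:onyx/fn`.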

lucasbradstreet17:03:40

You can see the phases in the process-batch section of the state machine: it’ll read a batch from the upstream tasks, then apply-fn