#onyx
2018-07-09
sparkofreason04:07:30

Whittling down the production issues. Recently got a "host not found" exception from S3, which I assume is just S3 being flaky. It looks like Onyx tried several times to connect, eventually gave up, and shut everything down. Is there a setting to avoid a full shutdown in this case (or some pattern for automatically restarting the jobs)?

lucasbradstreet04:07:58

If you return :restart from your :lifecycle/handle-exception lifecycle function, it should just keep rebooting the peer until S3 comes back up.
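
A minimal sketch of what that could look like, assuming the failure surfaces as a java.net.UnknownHostException somewhere in the exception's cause chain (the actual exception type was not stated in this thread). The namespace, the helper `transient-network-error?`, and the task name `:write-to-s3` are hypothetical; `:lifecycle/handle-exception` and its return values `:restart`, `:kill`, and `:defer` are part of Onyx's lifecycle API.

```clojure
(ns my.app.lifecycles
  (:import [java.net UnknownHostException]))

(defn transient-network-error?
  "True if e, or any exception in its cause chain, is an
  UnknownHostException. Hypothetical helper; adjust the check to
  whatever exception your S3 client actually throws."
  [^Throwable e]
  (boolean
    (some #(instance? UnknownHostException %)
          (take-while some? (iterate #(.getCause ^Throwable %) e)))))

(defn handle-exception
  "Onyx calls this with the event map, the matching lifecycle map,
  the keyword name of the lifecycle phase that threw, and the
  exception. It must return :restart, :kill, or :defer."
  [event lifecycle lifecycle-name e]
  (if (transient-network-error? e)
    :restart   ; restart the task instead of killing the job
    :kill))    ; anything else still stops the job

(def restart-calls
  {:lifecycle/handle-exception handle-exception})

;; Lifecycle entry to add to the job's :lifecycles vector.
;; :write-to-s3 is a placeholder task name.
(def lifecycles
  [{:lifecycle/task :write-to-s3
    :lifecycle/calls ::restart-calls}])
```

Returning :restart unconditionally, as suggested above, is the simplest option. Narrowing it to known transient errors is a design choice that avoids restarting forever on a genuine bug.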

lucasbradstreet04:07:53

I’ve seen some of those transient host not found issues and I was never sure if it was S3 or whether it was some DNS issues within the container.

sparkofreason04:07:11

Thanks, completely blanked on the handle-exception lifecycle.

jasonbell09:07:09

@dave.dixon I found that S3 becomes unreachable quite a lot. Lucas has already answered, but yes, the lifecycle restart on handle-exception is a life saver 🙂

👍 4