
Hi, there. Can anybody help me with one thing in Onyx? It’s about :onyx/pending-timeout and the sentence below: “Asynchronous Barrier Snapshotting fault tolerance technique does not depend on retrying individual segments on a timeout.” Ok, it does not depend on retrying, but is there still a timeout somewhere in the code after which segments are resubmitted for retry? If yes, is it possible or advisable to disable or increase this timeout somewhere? Thanks, Luis


the keyword here is "individual" -- it does not depend on retrying individual segments


however, periodically, it will send a control signal, a barrier, which is then stored on both input and output storage


this essentially makes sure that input and output both agree on what data has been processed


this happens periodically, e.g. every 15 seconds


i think, however, that :onyx/pending-timeout might be more of an artifact of the pre-ABS days


Yes, it is from the pre-ABS days. My concern is: eventually, if a block of segments takes longer than 60 seconds, it will time out and then be retried, right? Is that defined somewhere?


yes, if the block takes longer than 60s, a timeout occurs and an exception is thrown for the task


Onyx then attempts to restart the task


and recover from the last checkpoint


the code to handle this is fairly deep inside Onyx, but you can see it surface in e.g. all the plugins that have to implement these checkpointing and recovery functions

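To make the behaviour described above concrete, here is a toy Python sketch of barrier-based checkpointing with whole-task restart. It is not Onyx code: `Task`, `barrier_every`, `run_with_restart`, and the count-based barrier (standing in for the periodic, time-based one) are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Toy task that checkpoints at barriers (all names are hypothetical)."""
    segments: list
    barrier_every: int = 3   # assumption: barrier after N segments, not every N seconds
    checkpoint: int = 0      # input offset recorded at the last barrier
    committed: list = field(default_factory=list)  # output made durable at barriers

    def run(self, fail_at=None):
        """Process from the last checkpoint; simulate a timeout at offset fail_at."""
        pending = []  # output produced since the last barrier, not yet durable
        for offset in range(self.checkpoint, len(self.segments)):
            if offset == fail_at:
                # The whole task fails; work since the last barrier is discarded.
                raise TimeoutError(f"task timed out at offset {offset}")
            pending.append(self.segments[offset] * 10)
            if (offset + 1) % self.barrier_every == 0:
                # Barrier: input offset and output are recorded together,
                # so both sides agree on what has been processed.
                self.committed.extend(pending)
                pending = []
                self.checkpoint = offset + 1
        # Final barrier at the end of the input.
        self.committed.extend(pending)
        self.checkpoint = len(self.segments)


def run_with_restart(task, fail_at):
    """Run once with a simulated timeout, then restart from the last checkpoint."""
    restarts = 0
    try:
        task.run(fail_at=fail_at)
    except TimeoutError:
        restarts += 1
        task.run()  # recovery resumes at task.checkpoint, not at the failed segment
    return restarts


task = Task(segments=[1, 2, 3, 4, 5, 6, 7])
restarts = run_with_restart(task, fail_at=4)
print(restarts, task.committed)
```

In this sketch the segment at offset 3 is processed twice (once before the failure and once after the restart), but its output is committed only once, because nothing becomes durable until a barrier passes. No individual segment is ever retried on its own.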