#onyx
2017-07-14
eriktjacobsen22:07:47

What are some favorite patterns for managing state with retries? Let's say I have a map inside an atom, and as events / segments come into the Onyx job, I want to apply commutative functions that modify that map, but in a distributed way, and ideally with something that filters out repeated calls in a retry situation. I see avout, though it doesn't help on the "exactly once" side. I've read the docs and blog posts for both the 0.8 aggregation state management and the new ABS, though those seem aimed at maintaining "Exactly Once Aggregation" for a given window rather than the "global state of the system" I'm looking for. I'm curious what the current best practices are for this.

lucasbradstreet23:07:52

You can use aggregations for a global state. If you write them as an Onyx aggregation with a global window, it'll work fine under retry situations, though you will need to shard it over a number of peers in an appropriate way if the amount of state you are accepting is large.
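
To make the idea above concrete, here is a minimal plain-Python sketch of commutative updates that tolerate retries and are sharded by key. It is not Onyx code and does not show how Onyx handles retries internally. The names `Shard`, `shard_for`, `run`, and `merged` are hypothetical, and it assumes every event carries a unique id that stays the same when the event is retried.

```python
from dataclasses import dataclass, field
from zlib import crc32


@dataclass
class Shard:
    """One peer's slice of the global state (hypothetical, not an Onyx type)."""
    nodes: dict = field(default_factory=dict)  # node_id -> accumulated value
    seen: set = field(default_factory=set)     # event ids already applied

    def apply(self, event_id, node_id, delta):
        # Assumption: a retried event carries the same id, so it is skipped.
        if event_id in self.seen:
            return False
        self.seen.add(event_id)
        # Addition is commutative, so arrival order does not matter.
        self.nodes[node_id] = self.nodes.get(node_id, 0) + delta
        return True


def shard_for(node_id, n_shards):
    # Stable hash, so a given node always lands on the same shard.
    return crc32(node_id.encode("utf-8")) % n_shards


def run(events, n_shards=4):
    """Route (event_id, node_id, delta) tuples to shards and apply them."""
    shards = [Shard() for _ in range(n_shards)]
    for event_id, node_id, delta in events:
        shards[shard_for(node_id, n_shards)].apply(event_id, node_id, delta)
    return shards


def merged(shards):
    """Combine the per-shard maps; keys never overlap across shards."""
    out = {}
    for shard in shards:
        out.update(shard.nodes)
    return out
```

Because an event is always routed by its node id, a retry of that event reaches the same shard, so each shard only needs to remember the ids it has seen itself.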

lucasbradstreet23:07:12

Unless my definition of global state differs from yours

eriktjacobsen23:07:44

I can be more specific: I'm tracking a graph structure represented as a map of nodes, and as events stream into the system, they change the status of many nodes throughout the system. I would estimate a max of 50,000 nodes at 2 KB per node, so 100 MB.

eriktjacobsen23:07:58

Are window states only visible within that window's execution? One aspect of the system is providing a "recent history", and I was planning on using 5-minute sliding windows. Would those windows be able to query the global window state? Will look into this.

lucasbradstreet23:07:20

Window states are only visible within that window’s execution, but you can use triggers to stick the window states somewhere else, e.g. put them in an atom
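
The trigger approach described above can be sketched in plain Python as follows. This is only an illustration of the pattern, not Onyx's trigger API: `SharedState` is a hypothetical stand-in for an atom, `GlobalWindow` is a toy window whose state is private, and firing every N updates is an assumed trigger policy.

```python
import copy
import threading


class SharedState:
    """Hypothetical stand-in for an atom: holds the latest published snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = {}

    def reset(self, value):
        # Store a copy so later window mutations do not leak into readers.
        with self._lock:
            self._value = copy.deepcopy(value)

    def deref(self):
        with self._lock:
            return copy.deepcopy(self._value)


class GlobalWindow:
    """Toy window: its state is private, and a trigger publishes snapshots."""

    def __init__(self, on_trigger, every=2):
        self.state = {}
        self.on_trigger = on_trigger  # callback invoked when the trigger fires
        self.every = every            # assumed policy: fire every N updates
        self.count = 0

    def add(self, node_id, status):
        self.state[node_id] = status
        self.count += 1
        if self.count % self.every == 0:
            self.on_trigger(self.state)
```

A reader elsewhere in the process (for example, the code handling a sliding window) would call `deref()` on the shared holder and see the state as of the last trigger firing, not the live window state.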

lucasbradstreet23:07:38

We also have a feature called queryable state coming soon that will allow you to directly access the window states from other applications