
What are some favorite patterns for managing state with retries? Let's say I have a map inside an atom, and as events / segments come into the Onyx job, I want to apply commutative functions that modify that map, only in a distributed way, and ideally with something that filters out repeated calls in a retry situation. I see avout, though it doesn't help on the "exactly once" side. I've read the docs / blog posts for both the 0.8 aggregation state management and the new ABS, though it seems that is more for maintaining "Exactly Once Aggregation" for a given window, rather than the "global state of system" I'm looking for. I'm curious what the current best practices are for this.


You can use aggregations for a global state. If you write them as an Onyx aggregation with a global window, it'll work fine under retry situations, though you will need to shard it over a number of peers in an appropriate way if the amount of state you are accepting is large.
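
To make the retry point concrete, here is a minimal sketch of the general idea, in Python rather than Clojure and not using Onyx's API. It assumes every event carries a unique ID (the names `GlobalState`, `event_id`, `node_id`, and `delta` are made up for illustration), and it is not necessarily how Onyx implements this internally.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalState:
    """Hypothetical global aggregation that ignores replayed events."""
    nodes: dict = field(default_factory=dict)    # node_id -> accumulated value
    seen_ids: set = field(default_factory=set)   # event IDs already applied

    def apply(self, event_id, node_id, delta):
        """Apply a commutative update once; return False for a replay."""
        if event_id in self.seen_ids:
            return False  # retry of an event we already applied, skip it
        self.seen_ids.add(event_id)
        # Addition is commutative, so arrival order does not matter.
        self.nodes[node_id] = self.nodes.get(node_id, 0) + delta
        return True


state = GlobalState()
state.apply("e1", "node-a", 1)
state.apply("e2", "node-a", 2)
state.apply("e1", "node-a", 1)  # simulated retry of e1, has no effect
```

In this sketch the set of seen IDs grows without bound, so a real system would need to expire or compact it somehow.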


Unless my definition of global state differs from yours


I can be more specific: I'm tracking a graph structure represented as a map of nodes, and as events stream into the system, they change the status of many nodes throughout the system. I would estimate a max of 50,000 nodes at 2 KB per node, so about 100 MB.


Are window states only visible within that window's execution? One aspect of the system is providing a "recent history", and I was planning on using 5 min sliding windows. Would those windows be able to query the global window state? Will look into this.


Window states are only visible within that window’s execution, but you can use triggers to stick the window states somewhere else, e.g. put them in an atom
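
A rough sketch of that trigger idea, again in Python and not Onyx's API: the window keeps its state private, and a callback fired every few segments copies the state into a shared holder that other code can read. `SharedSnapshot`, `Window`, and the `every` threshold are hypothetical names for illustration; in Clojure the holder would be an atom.

```python
import copy
import threading


class SharedSnapshot:
    """Hypothetical stand-in for an atom: holds the last published state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = {}

    def publish(self, window_state):
        snapshot = copy.deepcopy(window_state)  # decouple from the live state
        with self._lock:
            self._value = snapshot

    def read(self):
        with self._lock:
            return self._value


class Window:
    """Toy window that fires a trigger callback every `every` segments."""

    def __init__(self, on_trigger, every):
        self.state = {}
        self.count = 0
        self.on_trigger = on_trigger
        self.every = every

    def add(self, node_id, status):
        self.state[node_id] = status
        self.count += 1
        if self.count % self.every == 0:
            self.on_trigger(self.state)  # push the state outside the window


shared = SharedSnapshot()
window = Window(on_trigger=shared.publish, every=2)
window.add("n1", "up")
window.add("n2", "down")   # second segment, trigger fires here
window.add("n3", "up")     # not yet published
```

Readers of `shared` only see what the last trigger published, so the snapshot can lag behind the live window state by up to `every - 1` segments.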


We also have a feature called queryable state coming soon that will allow you to directly access the window states from other applications