
How could I do a "stateful transducer" at the cloud infrastructure level? I'm thinking a Lambda stores the new state in some state DB, then looks up the state value on each transformation step, given some id that identifies the record's collection/source. I would need to use SQS or Kinesis Data Streams to ensure the records are given to the Lambda in the correct order. The Lambda would be transforming data from many sources and collections, and would need to use the current state associated with the collection of the record being transformed. The streams would run indefinitely, so the state would never be deleted.

So I guess my questions are:
1. Is this a good way of solving this problem? If not, what are some alternatives?
2. What datastore should I use for storing the state? Everything seems overkill for a single state value per collection, on what is likely to be a few hundred collections.
3. If a record is ever lost, I couldn't simply resend it, as all later records would already have been transformed with an out-of-sync state. Given the nature of the problem I'm solving, this would have very little effect on the transformed data, but it seems like a red flag to me. Perhaps I'm approaching this the wrong way?

Any advice is appreciated.
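To make the idea concrete, here is a minimal Python sketch of the per-record step described above: look up the state for the record's collection, apply the transformation, write the new state back. Everything in it is an assumption for illustration: the `collection_id` and `value` fields, the `running_total_step` function, and the plain dict standing in for the state DB are hypothetical, not part of any AWS API.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-in for the state DB: one state value per collection id.
StateStore = Dict[str, Any]

# A step takes (current_state, record) and returns (new_state, output).
Step = Callable[[Any, Dict[str, Any]], Tuple[Any, Dict[str, Any]]]


def running_total_step(state: Any, record: Dict[str, Any]) -> Tuple[Any, Dict[str, Any]]:
    """Example step (an assumption): the state is a running total per collection."""
    total = (state or 0) + record["value"]
    return total, {"value": record["value"], "running_total": total}


def handle_record(store: StateStore, step: Step, record: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the collection's state, transform the record, store the new state."""
    collection_id = record["collection_id"]
    state = store.get(collection_id)  # None for a collection seen for the first time
    new_state, output = step(state, record)
    store[collection_id] = new_state
    return output


def handle_batch(store: StateStore, step: Step, records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Process records strictly in the order they were delivered."""
    return [handle_record(store, step, r) for r in records]
```

Because each collection only reads and writes its own state, records from different collections can be interleaved freely; only the order within a collection matters.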


@caleb.macdonaldblack Just FYI: you can also use a DynamoDB stream as a mechanism to process the records in order.
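A rough sketch of what that could look like on the Lambda side, in Python. It only unpacks the event a DynamoDB stream hands to a Lambda (`Records`, `eventName`, `dynamodb.NewImage`, `dynamodb.SequenceNumber`) into plain dicts, in delivery order. The attribute names `collection_id` and `value` are assumptions about the table, and as far as I understand the ordering guarantee is per item, so it is worth checking the docs on whether that is enough for per-collection ordering.

```python
from typing import Any, Dict, List


def records_from_stream_event(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Unpack a DynamoDB stream event into plain records, keeping delivery order.

    Assumes (hypothetically) that each item has a string attribute
    `collection_id` and a numeric attribute `value`.
    """
    out = []
    for rec in event.get("Records", []):
        if rec.get("eventName") == "REMOVE":
            continue  # deletions carry no new image to transform
        image = rec["dynamodb"].get("NewImage")
        if image is None:
            continue  # the stream view type may not include new images
        out.append({
            # Attribute values arrive in DynamoDB's typed format, e.g. {"S": "a"}.
            "collection_id": image["collection_id"]["S"],
            "value": int(image["value"]["N"]),
            "sequence_number": rec["dynamodb"]["SequenceNumber"],
        })
    return out
```

The resulting list can then be fed, in order, to whatever per-collection transformation step the Lambda applies.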