#clara
2022-12-11
cddr 17:12:09

Hey folks. I’m trying to figure out whether clara can help solve a data enrichment problem. Basically I need to provide an API that accepts a sequence of events and returns an enriched version of those events. The enrichment process applies a bunch of domain-specific logic that analyses the events. Examples of the things we need to derive from the input sequence:

- derive new fields in the original events based on a simple formula applied to fields in the event
- identify sub-sequences of interest and add a field to the original events that enumerates them
- insert new events when we can detect from a pattern within the events that something important has happened

I’m wondering if I can use clara to define the biz logic. In particular, after reading the docs, I’m wondering how you might use a rule to “edit” a fact that has been added from the input sequence. Would one need to model events as a collection of triples, and then when you need to “modify” an event, just insert a new EAV fact?

mikerod 17:12:36

I think there has been some EAV work with clara before. I also think that it’s generally better to insert a new fact instead of trying to actually modify the state of a previous fact, but I think you are already thinking that way.
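
To make the "insert a new fact instead of modifying" idea concrete, here is a minimal, library-neutral sketch in Python. It is not Clara's API: the `Fact` triple, the `derive_total` rule, and the `price`/`quantity` attributes are all hypothetical names chosen for illustration. Events are modelled as entity-attribute-value triples, and enrichment only ever adds triples.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One entity-attribute-value triple (hypothetical shape)."""
    entity: str      # event id
    attribute: str
    value: object


def derive_total(facts):
    """Hypothetical rule: total = price * quantity, emitted as a NEW triple."""
    by_entity = {}
    for f in facts:
        by_entity.setdefault(f.entity, {})[f.attribute] = f.value
    return {
        Fact(entity, "total", attrs["price"] * attrs["quantity"])
        for entity, attrs in by_entity.items()
        if "price" in attrs and "quantity" in attrs
    }


def enrich(facts):
    """Return the input facts plus derived ones; inputs are never mutated."""
    facts = frozenset(facts)
    return facts | derive_total(facts)


events = {
    Fact("e1", "price", 10),
    Fact("e1", "quantity", 3),
    Fact("e2", "price", 5),   # no quantity, so no total is derived for e2
}
enriched = enrich(events)
```

Because the original triples are left untouched, the "edited" view of an event is simply the union of its input and derived triples, and a derived triple can be dropped later without having to undo a mutation.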

mikerod 17:12:01

I don’t believe this is maintained, but it was at least interesting to me when I first saw it https://github.com/clyfe/clara-eav

mikerod 17:12:37

In terms of the rest of your flow - I’m not sure I understand it well enough conceptually to say anything more concrete. It seems like something that could be modeled reasonably with forward chaining rules. I think the main thing to consider is typically whether you can do it with rules that are order-independent and that let clara’s truth maintenance system (TMS) work as intended by default (it tries to find a logically consistent state of the rules given the final state of the facts in working memory).

mikerod 17:12:52

If you end up having to fight that flow too much, then I typically suspect something’s off.

👍 1
Linus Ericsson 13:12:07

I think you should look into pathom or similar tools for this problem.

cddr 13:12:52

Cool. Thanks Linus and Mike. I had found the clara-eav project, which seems like it might be useful. During my research I’d come across eql, so it seems like pathom is worth investigating too. I’m also considering the datalog dbs like datalevin and xtdb, but I think we’d find value in the truth-maintenance feature (sometimes events will be cancelled). Maybe we can use clara to derive the novel info and a db to persist facts between invocations. The request caching in pathom seems like it would be useful, especially if I can use it in conjunction with clara (some of our inferences take a while to compute/fetch, others are immediate).

👍 1
danielglauser 00:12:57

At Gateless we just rerun the rules whenever we get new data in, and that works well for us.

👍 3
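
The "rerun the rules on new data" approach, together with the earlier point about order-independent rules and cancelled events, can be sketched roughly as follows. This is a language-neutral illustration in Python, not Clara's API: the fact shapes (`order`, `cancelled`, `live`, `alert`), the threshold of 100, and the `run_rules` helper are all assumptions made up for the example. A real truth maintenance system would update conclusions incrementally; this sketch simply recomputes everything from the base facts, which yields the same consistent end state for rules like these.

```python
def live_orders(facts):
    """Hypothetical rule: an order is live unless a cancellation exists for it."""
    cancelled = {f[1] for f in facts if f[0] == "cancelled"}
    return {("live", f[1], f[2]) for f in facts
            if f[0] == "order" and f[1] not in cancelled}


def large_order_alerts(facts):
    """Hypothetical rule: raise an alert for every live order of 100 or more."""
    return {("alert", f[1]) for f in facts if f[0] == "live" and f[2] >= 100}


RULES = [live_orders, large_order_alerts]


def run_rules(base_facts, rules):
    """Naive forward chaining: apply every rule until no new facts appear."""
    facts = set(base_facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts)
        if new <= facts:          # fixed point reached
            return facts
        facts |= new


base = {("order", "o1", 150), ("order", "o2", 20)}
first = run_rules(base, RULES)

# New data arrives: o1 is cancelled. Rerun from the base facts instead of
# patching the previous result, so stale conclusions never survive.
second = run_rules(base | {("cancelled", "o1")}, RULES)
```

In the first run `o1` produces an alert; after the cancellation arrives, the rerun no longer contains it, without any rule having to delete anything explicitly. Reversing the order of `RULES` gives the same result, which is the order-independence property mentioned earlier in the thread.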