#clara
2023-01-18
jherrlin08:01:08

Hey, I have only a little knowledge of Clara and rule engines in general. I'm researching a rule engine to use at work and have a question. My team has decided to do I/O from conditions so that we don't fetch data unless the condition will be evaluated. This is an optimization attempt on our side. I don't see any problem with doing this from a Clara condition. But will the values returned by the I/O function be visible in the inspection? We have a requirement to be able to inspect all of the values in a session. Is it possible to see the value a condition got from an I/O function in the inspection? Are we thinking about this optimization wrong? Thx!

wparker12:01:54

I wouldn’t expect Clara to treat code with an IO call differently than any other arbitrary code in the constraints of a condition. Is that what you’re doing - maybe you have an example of the kind of code you’re writing (redacted as necessary)? However, Clara does expect that data will be immutable and constraints should have consistent results, that is they should use pure functions. Something like

[YourFactType (fn-call-that-changes-over-time-with-database-contents this)]
would result in undefined behaviour.

ethanc15:01:57

> that is they should use pure functions Strong emphasis here on what wparker was saying. Otherwise it will break truth maintenance: http://www.clara-rules.org/docs/truthmaint/

enn15:01:57

I believe truth maintenance can also cause conditions to be evaluated more than once.

👍 2
ethanc17:01:48

enn is also correct here. Clara’s truth maintenance does not guarantee single executions of either the LHS or the RHS, as it might take several iterations to reach a stable state within the session.

jherrlin08:01:51

Thank you very much for the input! I think I understand the problem my proposed solution creates. Do you have any solutions to our base problem, which is that we don't want to fetch everything before we run a rule engine session?

enn14:01:00

Maybe memoize the IO?

jherrlin14:01:27

That's a very good idea! So I call a function that does I/O from the condition, and that function memoizes the returned value?
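
A minimal sketch of the memoization idea, assuming plain `clojure.core/memoize` is acceptable. `fetch-credit-score` and `fetch-count` are hypothetical names invented for illustration; the fake fetch only counts calls so the caching is visible. Repeated evaluations of a condition with the same argument then see the same value, which is what truth maintenance needs.

```clojure
;; Hypothetical stand-in for a slow lookup. A real version would call a
;; database or service; this one only counts how often it actually runs.
(def fetch-count (atom 0))

(defn fetch-credit-score
  "Pretend I/O: returns a value derived from the id and records the call."
  [customer-id]
  (swap! fetch-count inc)
  (* 10 customer-id))

;; clojure.core/memoize caches results by argument list, so every later
;; call with the same customer-id returns the first value that was fetched.
(def fetch-credit-score-memo (memoize fetch-credit-score))

(fetch-credit-score-memo 7) ; runs the "I/O"
(fetch-credit-score-memo 7) ; served from the cache, no second fetch
```

Note that this cache is unbounded and lives for the whole process, so values fetched for one session would also be reused by later sessions. Creating a fresh memoized function per session is one way to scope it, but that is a design choice, not something Clara prescribes.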

ethanc19:01:10

In our use case, we attempt to ascertain all of the relevant data prior to execution of the session, modeling most/all of the data that would be required and then referencing the data within the rules. It should be noted, though, that this is more of a necessity due to the scale and distributed nature of our system, where even memoization of I/O might not be enough to rule out DDoSing our internal services.

mikerod20:01:14

Another option would be to compose different rule networks for different phases of your operations

mikerod20:01:34

Query the first session for data regarding if or what needs to be fetched for the next set of rules to evaluate

mikerod20:01:01

At no point in this does any IO happen within actual rule evaluation. The rules are just complex and pure control flow in a bigger processing pipeline
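
A rough sketch of the phased approach, assuming Clara's `clara.rules` API (`defrule`, `defquery`, `mk-session`, `insert-all`, `fire-rules`, `query`). The fact types, rule names, thresholds, and `fetch-report` are all hypothetical and exist only for illustration. Phase 1 decides what needs fetching, the I/O happens between sessions, and phase 2 evaluates against the fetched facts.

```clojure
(ns example.phases
  (:require [clara.rules :refer [defrule defquery insert! mk-session
                                 insert-all fire-rules query]]))

;; Hypothetical domain facts, invented for this sketch.
(defrecord Order [id customer-id total])
(defrecord NeedsCreditReport [customer-id])
(defrecord CreditReport [customer-id score])
(defrecord Rejected [order-id])

;; Phase 1: pure rules that only decide which data is required.
(defrule large-order-needs-report
  [Order (> total 1000) (= ?cid customer-id)]
  =>
  (insert! (->NeedsCreditReport ?cid)))

(defquery needed-reports []
  [NeedsCreditReport (= ?cid customer-id)])

;; Phase 2: pure rules over facts that were fetched outside the engine.
(defrule reject-low-score
  [Order (> total 1000) (= ?oid id) (= ?cid customer-id)]
  [CreditReport (= ?cid customer-id) (< score 500)]
  =>
  (insert! (->Rejected ?oid)))

(defquery rejected-orders []
  [Rejected (= ?oid order-id)])

(defn fetch-report
  "Stand-in for real I/O; always returns a fixed low score."
  [customer-id]
  (->CreditReport customer-id 420))

(defn run-pipeline [orders]
  (let [phase-1 (-> (mk-session [large-order-needs-report needed-reports])
                    (insert-all orders)
                    (fire-rules))
        ids     (distinct (map :?cid (query phase-1 needed-reports)))
        reports (map fetch-report ids) ; all I/O happens here, between sessions
        phase-2 (-> (mk-session [reject-low-score rejected-orders])
                    (insert-all (concat orders reports))
                    (fire-rules))]
    (set (map :?oid (query phase-2 rejected-orders)))))
```

With these made-up rules, `(run-pipeline [(->Order 1 7 2000) (->Order 2 8 50)])` fetches a report only for customer 7, since only order 1 exceeds the threshold. Because the fetch sits in one place, it could also be batched or memoized there.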

❤️ 4
enn20:01:33

+1 to that idea. Having separate phases with their own rulesets and clear inputs and outputs for each phase will also keep your system easier to understand as it grows.

👍 2
jherrlin08:01:43

Thank you for all the input!

Joel15:02:52

@jherrlin I also went through this same transition. Originally, we had rules that figured out what was needed, did the fetches, and then ran the second set of rules (also caching/memoizing so facts didn’t change). However, we found that the rules (data fetch/logic) “bled” into each other. Our general solution was to insert facts on what data is needed, and use other facts to fetch the data. This allowed us to do batch fetching as well (using lower salience rules that would collect all the data request facts). I also found that Jess has a trick in this regard using a form of backtracking. It’ll automatically insert “need-” facts. I think you could possibly get Clara to do this by generating more rules, but I didn’t get that far.

jherrlin16:02:42

@Joel thank you! Do I understand correctly that you have facts that state that more facts are needed, fetch those facts, and run a new session? I will look into the backtracking stuff in Jess.

Joel13:08:30

@jherrlin TBH, this was implemented in Drools. The RHS collected the (explicitly inserted) “needs-” facts, multithreaded the data requests, and inserted the results into working memory, with no new session. I’m curious if you progressed this on the Clara side; it’s something I might have time to do. I’m wondering if I’m going to be able to read the LHS and figure out how to generate the new rules.

jherrlin14:08:40

Thank you! That sounds very interesting. I did a pre-study at work but unfortunately the task was handed over to another team. I have a feeling that Clara is very good for this though, as it’s very data oriented. It feels to me that it’s fairly simple to have a domain model that is isomorphic to Clara’s data structures.

Joel14:08:19

Referring to isomorphic, I’m looking at using the {:type …} https://github.com/cerner/clara-rules/issues/6#issuecomment-25760526 as I’m thinking that will make the scanning easier. I haven’t worked out the logic for generating the backchaining rules, not sure how complex that might get.

jherrlin14:08:51

Please post if you make any breakthrough! I don’t work or think actively on this atm but I still find this very interesting.

✔️ 2