clara 2018-10-24 | Slack Archive

eraserhd13:10:22

I found a paper about the LEAPS algorithm, which apparently outperforms RETE by using (gasp) laziness. I wonder if any LEAPS stuff was incorporated into Clara?

eraserhd13:10:22

This one: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.96.5371&rep=rep1&type=pdf

mikerod14:10:18

@eraserhd I don’t see the date on this for some reason

eraserhd14:10:46

It's old, like 94 or something.

mikerod14:10:49

but I know I’ve read about LEAPS before.

mikerod14:10:07

It is comparing against the traditional Rete used in OPS5 and perhaps a few others at the time

mikerod14:10:14

things were quite a bit different in those I believe

mikerod14:10:25

Clara takes advantage of batch-oriented fact propagation

mikerod14:10:51

I am of the opinion that that is the biggest perf win

mikerod14:10:20

over any sort of laziness. However, Drools (a popular JVM/Java-based rules engine) went to a custom algo they decided was sufficiently different to get a new name

mikerod14:10:23

“PHREAK”

mikerod14:10:31

in Drools 6, they wrote some good stuff on it

mikerod14:10:48

but it was meant to be lazier and to do things like cut parts of the rete tree off when they aren’t needed

mikerod14:10:16

the unfortunate part of that upgrade was that Drools went from eager and single fact propagation to this lazier and batched propagation

mikerod14:10:22

and I think the batching is the bigger win

mikerod14:10:45

So the topic of being lazier in Clara has come up before

mikerod14:10:12

but hasn’t been done since it isn’t clear how much you really gain from that over the batched propagations.

mikerod14:10:25

that is assuming you have queries that you intend to use

mikerod14:10:51

if you had like 10 queries and were only going to want to perform 1 of them a lot of the time or something like that, then there may be a bigger win to delaying things
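
Editor's sketch: the batching point above can be illustrated with a toy join written in plain Clojure. This is not Clara's implementation. `right-memory`, `incoming`, `propagate-single`, and `propagate-batched` are hypothetical names invented for the illustration, and counting memory lookups is only one plausible proxy for the per-fact overhead that batching can amortize.

```clojure
;; Toy model only: NOT Clara's implementation.
;; `right-memory` is a hypothetical join memory indexed by join key.
(def right-memory {1 [:alice] 2 [:bob]})

;; Hypothetical incoming facts; two of them share the join key 1.
(def incoming [{:id 1 :n 1} {:id 1 :n 2} {:id 2 :n 3}])

(defn propagate-single
  "Looks up the join memory once per incoming fact."
  [memory facts]
  (let [lookups (atom 0)
        matches (vec (for [fact facts
                           :let [candidates (do (swap! lookups inc)
                                                (get memory (:id fact)))]
                           match candidates]
                       [fact match]))]
    {:lookups @lookups :matches (set matches)}))

(defn propagate-batched
  "Groups the batch by join key first, so each distinct key is looked up once."
  [memory facts]
  (let [lookups (atom 0)
        matches (vec (for [[k group] (group-by :id facts)
                           :let [candidates (do (swap! lookups inc)
                                                (get memory k))]
                           fact group
                           match candidates]
                       [fact match]))]
    {:lookups @lookups :matches (set matches)}))

(println (:lookups (propagate-single right-memory incoming)))
(println (:lookups (propagate-batched right-memory incoming)))
```

Both functions produce the same set of matches; the batched version just does its memory lookup once per distinct key rather than once per fact. Whether that kind of saving matters in a real engine depends on the workload.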

eraserhd14:10:51

I'm only a little way in, and I haven't seen the laziness part yet (even though it's claimed in the abstract). It so far has claimed that the biggest win is not needing to materialize facts in memory. I don't understand it yet.

mikerod14:10:06

also, keep in mind, these older papers

mikerod14:10:17

they have some good material for sure, but you have to be aware of the environments they were dealing with

mikerod14:10:57

e.g. sometimes they are really emphasizing using minimal memory (was more constrained then), or higher allocation costs etc

mikerod14:10:21

it’s just something to be aware of, still good material out there and most of it is pretty old

mikerod14:10:59

also, sometimes things are explaining a situation that is most helpful when dealing with a large number of rules, other times it's for dealing with a large number of facts, and occasionally perhaps a large number of both is discussed

eraserhd14:10:21

yup. But I read things like this as a hobby, honestly.

eraserhd14:10:34

I'm not suddenly suggesting that we must implement this 😄

mikerod14:10:56

no, it’s good to discuss and to think about

eraserhd14:10:58

In fact, it would be neat if there was a bibliography for Clara Rules.

mikerod14:10:24

quite a bit of perf-related work has been done in Clara already

mikerod14:10:45

some of that was just impl details and other things were tweaks to propagation or often accumulators

eraserhd14:10:25

Neat... I got that impression. It performs super well for us.

👍 4

wparker16:10:18

@eraserhd Agreed with @U0LK1552A’s previous comments. Another interesting way that this played out (both Mike and I spent a while doing perf optimizations on Clara) is that in practice, even for large cases (hundreds of thousands of facts, tens of thousands of rules), the constant factors seemed to be most important. Hashing turned out to be a large percentage of the work performed, for example. A lot of these optimizations (on the Clojure side) are in the memory.cljc namespace, with lots of Java interop etc. That said, this was in use cases where we didn’t really have lots of data that we were just going to end up discarding, and there are definitely cases where more laziness could be useful.

mikerod14:10:00

Clara makes some small mention of its background here https://github.com/cerner/clara-rules/wiki/Introduction#the-rules-engine

mikerod14:10:14

The paper referenced there is http://reports-archive.adm.cs.cmu.edu/anon/1995/CMU-CS-95-113.pdf

mikerod14:10:51

It is concerned with the ops system I think. It’s long and not all that relevant to current stuff, but there are some nice fundamental chapters in it

mikerod14:10:13

Starts with a very basic description of a simple rete impl, and then discusses some interesting ways to improve it, such as left and right unlinking

mikerod14:10:02

and Drools has a lot of extra stuff going on that is left out of Clara, but they do have some good docs on approaches https://docs.jboss.org/drools/release/6.2.0.CR2/drools-docs/html/HybridReasoningChapter.html#ReteOO

mikerod14:10:13

That’s their newer one. Not the same as Clara, but there are commonalities in some of the ideas. It also may explain some of the deficiencies in the overly simple approach traditionally taken.

eraserhd14:10:56

Nice, thank you! I've queued all that up for bedtime reading.

mikerod14:10:51

sure, sorry it isn’t super organized - reference dump

jvtrigueros22:10:51

I'm trying out Clara, but I'm hitting an issue, perhaps I'm using it wrong, here's the simplest snippet that shows my issue:

(defrecord Request [resource])

(defn condition
  [x y]
  (= x y))

(defrule some-rule
  [Request (condition ?resource resource)]
  =>
  (println ?resource))

(mk-session)

I'm getting an error about ?resource not being bound, however if I replace condition with = this works. I didn't see anything in the Boolean Expressions documentation about not being able to use other functions, however if that's the case, how does one go about doing this?

souenzzo22:10:01

[Request (= ?resource resource)]
[:test (condition? ?resource)]
=>
(prn ?resource)

ethanc13:10:46

Probably not relevant and more of an FYI, but if the condition? is simply equality it could be done with a third argument to =.

(r/defrule some-rule
  [Request (= ?resource resource "GET")]
  =>
  (println ?resource))

souenzzo13:10:59

[Request (= resource "GET")]
=>
(prn "GET")

Since it will only match if resource is "GET" 😅

souenzzo13:10:51

you can also do

[?request <- Request (= resource "GET")]
=>
(prn ?request)

Will print the "full record"

jvtrigueros13:10:30

Thanks @U3KC48GHW ! This is actually something that I did need (for a different rule)

souenzzo22:10:17

@jvtrigueros

jvtrigueros22:10:28

How does this work? In this contrived example condition takes two arguments but here it's just one. 🤔 Or perhaps you could point me to the literature for this

souenzzo22:10:17

(= ?resource resource) is not a function call. It is just DSL syntax that binds the value of resource (from the record) to ?resource (the symbol)

👍 8

jvtrigueros22:10:46

Ah gotcha, thank you! This points me in the right direction, I'll continue to play with Clara 😃

souenzzo23:10:08

some functions you can use like (contains? #{:foo} resource)
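
Editor's sketch: putting the answers in this thread together, a complete version of the original snippet might look like the following. It assumes clara-rules is on the classpath. The namespace `example.requests`, the predicate `allowed?` (standing in for the thread's `condition`), and the `matched` atom are hypothetical names added for illustration; only the `(= ?resource resource)` binding and the `:test` condition come from the replies above.

```clojure
(ns example.requests
  (:require [clara.rules :refer [defrule fire-rules insert mk-session]]))

;; Fact type from the thread.
(defrecord Request [resource])

;; Hypothetical predicate standing in for the thread's `condition`.
(defn allowed?
  [resource]
  (contains? #{"GET" "HEAD"} resource))

;; Collects matched resources so the result can be inspected
;; (illustration only; a query would be the more usual choice).
(def matched (atom []))

(defrule allowed-request
  ;; The top-level (= ?resource resource) binds the field to ?resource.
  [Request (= ?resource resource)]
  ;; The :test condition then calls an ordinary function on the bound value.
  [:test (allowed? ?resource)]
  =>
  (swap! matched conj ?resource))

;; Only the "GET" request should satisfy the rule.
(-> (mk-session 'example.requests)
    (insert (->Request "GET") (->Request "POST"))
    (fire-rules))

(println @matched)
```

The original rule failed because `(condition ?resource resource)` tries to use `?resource` before anything has bound it; in this sketch the binding comes first and the custom function only filters. As the `contains?` example above suggests, the predicate could also be applied to the field directly inside the fact condition.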
