clara 2016-06-13 | Slack Archive

mikerod02:06:27

devn: thanks for the heads up. Yeah, my original demo is likely to not be all that close to what I end up going with. However, it is good to know where the hiccups are right now with the default impls of print-dup, and I had forgotten about this existing issue there.

devn02:06:32

@mikerod: it actually prompted me to do a general jira survey for "dup" issues. I was unable to figure out how to get JIRA to search for #= despite several attempts at various quoting and escaping schemes

devn02:06:57

actually, i wound up doing a walk over most of JIRA yesterday. I don't know that it was "enjoyable", but focusing on the critical and high priority issues and reading them through was worth the time I think

mikerod02:06:05

devn: I think that is often a good idea

mikerod02:06:20

I try to keep up with it somewhat, but it'd be good to get a refresher over the bigger existing issues

mikerod02:06:34

I agree that searching for print-dup stuff is certainly a non-trivial one 🙂

devn02:06:44

it moves slow enough that if you give it a good look post-release, you'll know what's up for awhile

mikerod02:06:05

I've done a reasonable amount with print-dup before, but it is a lesser documented area and as you've shown here, there are caveats

mikerod02:06:15

yeah, I'd think so

devn02:06:51

heh, speaking of print-dup: I wrote a thing a while back that took all of the sexps from the #clojure IRC channel going back to 2009

devn02:06:05

and ran them in a sandbox, capturing output (if any), and value

devn02:06:19

I used print-dup to serialize the entire collection of results out to a file at some point, and i remember finding some tricky bits, but i don't remember details anymore

devn02:06:25

despite the "first rule of hash equals club is don't talk about hash equals", i remember it being quite solid back in 1.2, 1.3

devn02:06:13

there were at least 30k random examples serialized to a file, and read successfully, including Swing and all sorts of other nonsense

devn02:06:44

though it's one of those things that Rich and Co. have basically said: use at your own risk, i have a hard time imagining it going away

devn02:06:07

Anyway, as long as I mentioned that project: http://getclojure.org/

devn02:06:03

It's nothing to write home about, and since it was all sandboxed, there are plenty of things which didn't pass the sniff test and were subsequently not run in the sandbox, but there's some interesting stuff in there, usually further into a search. A search for comp has results for the first few pages that are like (comp comp comp comp) for example, which is not very interesting.

mikerod02:06:21

devn: sounds interesting

mikerod02:06:23

I'll have to look at that

devn02:06:32

it's not really, i assure you 🙂

mikerod02:06:07

yeah, print-dup is weird. It seems to be, in some ways, a recommended approach to serialization of Clojure structures, but it is still not really fully "official"

devn02:06:08

i had dreams of building a sort of hoogle for clojure. codeq + clojure.tools.analyzer + query over sexps, something like that

mikerod02:06:20

it is primitive though. to use it successfully, you have to implement your own back references etc.

mikerod02:06:34

if you have a lot of pointers to the same objects in memory or something like that.
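A minimal sketch of the print-dup round trip and the shared-reference caveat described above, using only clojure.core. The helper name `dup-round-trip` and the sample data are made up for illustration. `#=` forms need `*read-eval*` enabled, so this is only reasonable for trusted input.

```clojure
(defn dup-round-trip
  "Serializes `x` with print-dup enabled and reads the string back."
  [x]
  (let [s (binding [*print-dup* true] (pr-str x))]
    ;; #= reader forms are evaluated at read time, so only read trusted input.
    (binding [*read-eval* true]
      (read-string s))))

(def shared [1 2 3])
(def data {:a shared :b shared})
(def restored (dup-round-trip data))

(= data restored)                         ; the values survive the round trip
(identical? (:a data) (:b data))          ; one vector, referenced twice
(identical? (:a restored) (:b restored))  ; false: the reader built two vectors
```

Keeping the sharing intact would take a custom scheme on top of this, for example writing each shared object once under an id and resolving the ids on read, which is the "back references" work mentioned in the conversation.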

devn02:06:57

@mikerod: do you work with ryan, out of curiosity?

mikerod02:06:33

this website is interesting GetClojure

mikerod02:06:40

I think I remember looking at this a long while back perhaps?

mikerod02:06:46

maybe you posted it somewhere before and I was curious

devn02:06:48

yeah, it's my first clojure project from the olden days

mikerod02:06:52

ah I see

mikerod02:06:06

Yeah I work at the same company as Ryan. Also Will on here

mikerod02:06:18

w parker

devn02:06:57

i'd like to understand clara internals better, but i think i need to start smaller and implement a dumb version of rete before I'll be of much use

devn02:06:10

(mostly for the purposes of contributing)

mikerod02:06:17

devn: yeah, I actually find the codebase to be approachable, but it is good to have some background

mikerod02:06:46

clara.rules.engine is almost something you could look at stand-alone to reason about the way the Rete network propagates data around

mikerod02:06:57

clara.rules.compiler is more compilation complexity

mikerod02:06:20

builds the rete graph etc, but I think you can sort of gloss over that and just look at clara.rules.engine to get a good feel for how data moves around

devn02:06:37

is it crazy to think about building some instrumented visualization of the graph construction?

mikerod02:06:38

writing a toy version of your own could be a good learning experience though I'd imagine

mikerod02:06:49

I think that could be a cool and useful thing

devn02:06:51

sort of a stepping debugger or something

mikerod02:06:01

I know that Ryan played some with some visualizations here and there

mikerod02:06:35

however, nothing too serious yet. just demos or trials of things I think

mikerod02:06:41

https://github.com/rbrush/clara-tools

mikerod02:06:03

and also, it may not visualize the graph like you are saying. Visualization has different sorts of levels you may be interested in.

devn02:06:17

yes, exactly

devn02:06:33

there are layers to the explanation/visualization

mikerod02:06:53

There is an older paper referred to in the Clara wiki too

mikerod02:06:53

https://github.com/rbrush/clara-rules/wiki/Introduction#the-rules-engine

devn02:06:54

some of them useful at a domain level so people using our stuff can understand why a decision was rendered

mikerod02:06:00

http://reports-archive.adm.cs.cmu.edu/anon/1995/CMU-CS-95-113.pdf this paper

devn02:06:26

and then there's the "i'm a programmer, show me the hairier view, with data"

mikerod02:06:30

I found it to have a really good overview of Rete. It is certainly dated and lacking many of the optimizations and extensions to Rete present in Clara, but it is a good base.

devn02:06:37

and then there's the "i know kung fu, show me the internals"

mikerod02:06:57

Yes. Visualization does seem to be a complex subject.

devn02:06:11

@mikerod: this is the paper clara is based off of yes?

mikerod02:06:36

Yeah, I'd say so. However, it is really just an overview of Rete more than anything.

devn02:06:38

the question i had was: is it better to start simpler, with rete minus the optimizations?

mikerod03:06:10

there are a few key chapters to it. it is long, but some of it isn't really all that relevant unless you are curious about the test system they were writing back in the 90s or whenever it was. 😛

devn03:06:11

i have some books i picked up on CLIPS and OPS5. Some interesting stuff!

mikerod03:06:26

yeah, clips has good material

devn03:06:40

i mean, it's cool to see something lispy

mikerod03:06:40

and OPS5. I've read at least most of the Forgy paper

devn03:06:51

even if the material is quite dated

mikerod03:06:04

I won't speak completely for Ryan, but I think he had some inspiration from http://herzberg.ca.sandia.gov/

mikerod03:06:06

as well

mikerod03:06:08

Jess

mikerod03:06:19

At least in the fact that it used a Lisp DSL, ran on the JVM

devn03:06:27

oh, nice! this is great! thanks!

mikerod03:06:31

Clojure is just here now, and makes it more awesome (in my opinion)

mikerod03:06:50

Jess has good documentation here though. Including on some Rete extensions like accumulators.

mikerod03:06:06

I've read most, if not all, of the Jess wiki docs accessible via this link.

devn03:06:41

i keep tinkering with the idea (yet another project that is on hold due to the amount of time i have) of a datomic-driven rules engine

mikerod03:06:16

Yeah, that has seemed like an interesting idea to me before as well

devn03:06:38

reified transactions a la Tim Ewald (sagas) here: http://www.datomic.com/videos.html

mikerod03:06:40

I haven't done much with Datomic other than listen to a bunch about it and read up on it. I've liked a lot of the ideas etc there and hope to work with it at some point

mikerod03:06:18

I don't think I've watched these ones yet. I will. Looks like a good set of videos

devn03:06:29

i know that there have been at least two rules engines built on datomic which are not open source (sad face)

devn03:06:48

one of them, as i understand it, had all the fixins (truth maintenance, etc.)

mikerod03:06:54

Oh, also on Rete. I follow Drools stuff a fair amount too. Drools has pushed the bar with optimizations and has a lot of good extensions and documentation as well.

mikerod03:06:18

Oh. Unfortunate neither is open source then

devn03:06:48

yeah, though it just makes my curiosity a bit more intense

devn03:06:26

query in datomic is basically the LHS.

devn03:06:33

the datomic thing is most interesting to me i think because most rules engines (as far as i understand it) do not address temporal concerns

devn03:06:12

this is partly why durability is interesting to me: "show me what would have happened as of a point in time", or "show me what it would look like if the rule base contained this rule, without committing it"

mikerod03:06:30

devn: yeah, temporal concerns can be a wrench

mikerod03:06:33

in things*

mikerod03:06:40

I think Drools has tried to work with temporal concerns some

mikerod03:06:44

I haven't read up on it much at this point

mikerod03:06:48

"events" i think they call it

devn03:06:00

interesting, googles

devn03:06:35

so, i've only read a couple of sentences while skimming the docs, but very interesting

mikerod03:06:47

http://docs.jboss.org/drools/release/6.4.0.Final/drools-docs/html_single/#DroolsComplexEventProcessingChapter

devn03:06:48

"Complex Event Processing, or CEP, is primarily an event processing concept that deals with the task of processing multiple events with the goal of identifying the meaningful events within the event cloud. CEP employs techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, and relationships between events such as causality, membership, and timing, and event-driven processes."

mikerod03:06:17

yeah, that may be the sort of thought you're thinking about

mikerod03:06:39

I know Ryan has said a little on the topic of dealing with temporal "events" as well, but I'm not sure how much it's been thought out yet

mikerod03:06:21

Clara durability of session state (working memory) is going to be important to the work I'm doing right now, so that should keep progressing. We haven't needed anything more complex as of yet.

mikerod03:06:39

For us, we've just retracted facts if things change in the future

mikerod03:06:58

and inserted a new fact if there was a "change"

mikerod03:06:07

I'm sure this can be limiting if you have more complex use-cases

devn03:06:09

we deal with a lot of sparse data, and so a fair amount of work is done to validate concepts, those concepts are then turned into facts. i'd sort of like to extend rules to the problem of validation and creation of concepts

mikerod03:06:00

interesting

devn03:06:51

we of course build up concepts inside the engine, too

devn03:06:41

but there's a sort of ETL-like process that varies a lot. i imagine a sort of rule-based ETL

mikerod03:06:53

sounds doable

devn03:06:38

yeah, certainly, just haven't had the stomach for it, because i think there's tooling i'd want to build to manage this process and have decent visibility into it

mikerod03:06:43

well if you come up with good tooling ideas that'd help on the Clara end, I'm sure we'd all be happy to hear about them

devn03:06:05

I mean, I'm as much interested in providing ideas as I am hearing about them.

mikerod03:06:29

haha

mikerod03:06:44

yeah, visibility is something we should strive for

mikerod03:06:07

our current push this release round is probably going to center on performance and then this durability thing

devn03:06:39

As I said, I don't feel like I have a good enough handle on internals to say: "here's a great idea!" because I worry that the performance overhead will be insane, or that i'll be unable to capture some key piece of information necessary to the idea without reworking some significant piece of Clara

mikerod03:06:55

but thinking more on what can be done for visualization, "explainability", etc should stay important

devn03:06:22

the explanation piece is one of many reasons why we chose a rules engine

devn03:06:26

but it's a really, really big one

mikerod03:06:45

https://github.com/rbrush/clara-rules/blob/master/src/main/clojure/clara/rules/listener.cljc

mikerod03:06:57

the listener stuff here has some leverage

devn03:06:06

it's extremely important for us to say: "We recommend these actions based on the following information related to the patient, provider, etc."

mikerod03:06:22

mostly just keeping tracing sort of information in data structures that can be hooked into by whatever tooling

devn03:06:22

with a sort of decision tree that people with a clinical background can use to validate those recommendations

mikerod03:06:39

devn: yeah, we've had similar sort of use-cases with rules

mikerod03:06:58

so far, what we've mostly done - which has been sufficient - was to just show the support "chain"

mikerod03:06:10

we either manually write rules to do so, or we built some automatic helpers around inserts

mikerod03:06:59

(defrule my-rule [?s <- SomeFact] => (insert! (map->AnotherFact {:something blah :support [?s]})))

devn03:06:01

I committed some code as part of an early spike on clara which added a :contributing-factors {} k/v in the RHS

devn03:06:06

ha! so i'm not crazy!

mikerod03:06:23

yes, we have implemented some ideas around this. not in Clara directly, just in our own projects

devn03:06:26

i've been looking at that sideways and thinking: hmmm, this feels weird to me

devn03:06:47

particularly because over the course of the "chain", things can get a little sticky

mikerod03:06:48

we've defined protocols or something as well that can sort of walk the "contributing facts" chain

devn03:06:04

you might need to merge multiple :contributing-factors across the results of queries in order to get the real picture

mikerod03:06:10

yeah

devn03:06:12

and then you have to wrestle with order
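One way to sketch the merge described above, assuming each query result carries a `:contributing-factors` map from a category to a set of factors. That shape, the sample data, and the function name are assumptions for illustration, not something taken from Clara or from either project. Using sets sidesteps the ordering question, at the cost of losing any order the factors had.

```clojure
;; Assumed shape: category keyword -> set of factor keywords.
(def query-results
  [{:recommendation       :retest
    :contributing-factors {:patient #{:age} :provider #{:specialty}}}
   {:recommendation       :retest
    :contributing-factors {:patient #{:lab-result}}}])

(defn merge-contributing-factors
  "Unions the :contributing-factors maps of all results, category by category."
  [results]
  (apply merge-with into {} (map :contributing-factors results)))

(merge-contributing-factors query-results)
;; => {:patient #{:age :lab-result}, :provider #{:specialty}}
```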

mikerod03:06:19

visualizing it in the end can get interesting

devn03:06:28

or overwhelming 😄

mikerod03:06:36

yeah

mikerod03:06:56

we've sort of just "flattened" the chain up to certain points where there is a meaningful user-defined name

devn03:06:59

or at least, the visualizations i've come up with are like: oh god, no nurse is ever going to look at this

mikerod03:06:27

so it's sort of hierarchical, but flattened out just to where we have some sort of meaningful name to show end-users

devn03:06:30

@mikerod: yeah, i think i was close, but the user-defined name is important

devn03:06:59

InternalRecordThatExposesImplementationDetails {?x 1, ...} => no bueno

mikerod03:06:07

yeah, definitely don't want that

devn03:06:33

the bindings which wind up being worth a damn, again due to sparse data, can be troublesome also

mikerod03:06:44

yep

devn03:06:46

you don't want to expose bindings which didn't matter at a particular conceptual level

devn03:06:53

but then, they turn out to be valuable in some future rule

mikerod03:06:58

yeah, so it is hard to automate and just assume "all bindings"

mikerod03:06:22

you could mark fact types in some way or another that are "internal details" vs meaningful though

mikerod03:06:07

(defrule rule-with-support [?f <- ImplDetailThing] [?f2 <- Meaningful] => (insert! (map->Fact {:val something :support [?f ?f2]})))

devn03:06:09

I'm now generating records (using eval, don't hate me) that pull from a big old map of domain concepts => data definitions

mikerod03:06:14

so just like automate attaching all bindings as support

mikerod03:06:28

but then when walking the chain later, dropping out stuff in the chain/tree that isn't useful

devn03:06:29

@mikerod: ah, that's a good idea.

devn03:06:49

haven't played with a specific fact type

devn03:06:55

(for capturing the meaningful bits)

mikerod03:06:20

yeah, could think of a protocol like ISupported (depending on how you like to name things)
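A rough sketch of how such a protocol could drive the "walk the chain, drop the internal details" idea from the discussion above. Everything here is hypothetical: the two protocol methods, the record types, and the convention that a nil display name marks an internal fact are assumptions for illustration, and no Clara API is involved.

```clojure
;; Hypothetical protocol, named after the ISupported suggestion in the chat.
(defprotocol ISupported
  (display-name [fact] "User-facing name, or nil for an internal detail.")
  (supporting-facts [fact] "Facts that directly contributed to this fact."))

(defrecord LabResult [value]
  ISupported
  (display-name [_] "Lab result")
  (supporting-facts [_] []))

(defrecord ImplDetailThing [support]
  ISupported
  (display-name [_] nil)               ; internal, never shown to end users
  (supporting-facts [_] support))

(defrecord Recommendation [action support]
  ISupported
  (display-name [_] "Recommendation")
  (supporting-facts [_] support))

(defn meaningful-support
  "Walks the support chain of `fact`, skipping internal facts and stopping
  at the first fact with a display name on each branch."
  [fact]
  (mapcat (fn [s]
            (if (display-name s)
              [s]
              (meaningful-support s)))
          (supporting-facts fact)))

(def lab (->LabResult 7.2))
(def rec (->Recommendation "retest" [(->ImplDetailThing [lab])]))

(map display-name (meaningful-support rec))
;; => ("Lab result")
```

Stopping at the first named fact on each branch gives the flattened view; recursing past named facts as well would give the full hierarchy.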

devn03:06:21

i wonder: any ideas on a query that pulls "ordered" Meaningful facts?

mikerod03:06:40

yeah, not sure on ordered, unless you had something to order by

mikerod03:06:55

generating records with eval sounds fun 😛

devn03:06:57

right, but i mean, i am willing to throw some of my pride away

mikerod03:06:12

it may make sense if you are saying you are trying to make data types dynamically driven off of config

devn03:06:15

so an atom per session that increments using the RHS of rules could work, maybe

mikerod03:06:29

yeah, maybe something like a counter

mikerod03:06:43

just be aware that the rules engine has some liberties it can take as far as when it fires rules

mikerod03:06:01

rules may fire in different orders due to non-determinism (we'd like to minimize), optimizations, etc
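A small sketch of the counter idea, in plain Clojure with no Clara dependency. The names `make-stamper`, `stamp`, and `:fired-seq` are invented for illustration. As the caveat above says, the numbers only record the order in which right-hand sides happened to run, which the engine is free to change, so they are better suited to display than to logic.

```clojure
(defn make-stamper
  "Returns a function that tags a fact (any map or record) with the next
  value of a private counter. One stamper per session is assumed."
  []
  (let [counter (atom 0)]
    (fn [fact]
      (assoc fact :fired-seq (swap! counter inc)))))

(def stamp (make-stamper))

;; In a rule's RHS this would look something like
;; (insert! (stamp (map->AnotherFact {...}))).
(def facts [(stamp {:name "first"}) (stamp {:name "second"})])

(map :name (sort-by :fired-seq > facts)) ; most recently stamped first
;; => ("second" "first")
```

Note that stamping makes two otherwise equal facts unequal, which may matter if anything relies on fact equality.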

devn03:06:25

yeah, i desperately avoid order where i can

mikerod03:06:44

just never get too reliant on rule order unless you have logical rule order that will eventually be consistent due to truth maintenance

devn03:06:02

i'd rather not ever pay the cost of needing to think about that

mikerod03:06:04

and :salience when desperate

mikerod03:06:10

yeah, it is complexity

mikerod03:06:20

and ruins the declarative nature of the rules

mikerod03:06:35

and eventually becomes brittle when you get many rules involved etc

mikerod03:06:53

we certainly often see rules with logical dependencies between each other though

mikerod03:06:34

(defrule rule-1 [MyFact] => (insert! (->AnotherFact))) (defrule rule-2 [:not [AnotherFact]] => (insert! (->AlternativeFact))) etc

mikerod03:06:41

but I think that is obvious usage

devn03:06:45

yes, that's really what we deal with, but i do worry a bit that if that's not the case in the future, we're going to be sad pandas

devn03:06:56

99% of our stuff is a DAG

mikerod03:06:11

yep

devn03:06:13

err let's go with 90%

devn03:06:15

😉

mikerod03:06:17

hah

mikerod03:06:24

well, I'm logging off for the night though.

devn03:06:28

yeah, same here

mikerod03:06:36

good talking with you. It's nice to hear of others' use-cases etc.

mikerod03:06:56

Have a good night!

devn03:06:01

appreciate the conversation, and am pleased to hear that i'm not the only one who's doing this whole :support thing. people on our team just started using it

devn03:06:08

and i was like: oh god, did i just open pandora's box?

mikerod03:06:14

😛

devn03:06:16

but it's been doing the job, so ¯\_(ツ)_/¯

devn03:06:22

anyway, have a good night, see you around 🙂

devn03:06:24

👋

wparker21:06:17

@devn: FWIW, I agree with Mike that the engine and (I’ll add) the memory are probably the best places to start. When I was first getting familiar with the internals of Clara I found it useful, and still do if I’m unsure of what some rule will look like in the network, to just create a session from that rule and look at the rulebase generated. That is, something like (-> (mk-session [rule-in-question] :cache false) .rulebase :alpha-roots) and just poke around the rules network generated.

wparker22:06:52

I’ve found that if I’m not dealing with the compiler, it is very possible to just deal with the rules network and forget where it came from. The memory is even more isolated in some ways in that it mostly is just a composite data structure, albeit one very much tailored to the needs of the engine.

wparker22:06:13

The reason I put the :alpha-roots in the call above is that facts inserted start there and propagate downward to the children of those alpha-roots
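A toy illustration of that "facts enter at the alpha roots and flow downward" idea, for anyone starting with a dumb version of Rete as suggested earlier in the conversation. This is not Clara's data structure or API: `toy-alpha-roots`, `propagate`, and the fact types are invented, and the "network" is only one level deep.

```clojure
;; Toy model only. Clara's real rulebase and memory are far richer.
(defrecord Temperature [value])
(defrecord WindSpeed [value])

;; "Alpha roots": fact type -> alpha nodes, each testing a single fact.
(def toy-alpha-roots
  {Temperature [{:id :freezing :pred #(< (:value %) 0)}
                {:id :hot      :pred #(> (:value %) 30)}]
   WindSpeed   [{:id :windy    :pred #(> (:value %) 50)}]})

(defn propagate
  "Sends `fact` to the alpha nodes for its type and records it in `memory`
  under every node whose predicate matches."
  [memory fact]
  (reduce (fn [mem {:keys [id pred]}]
            (if (pred fact)
              (update mem id (fnil conj []) fact)
              mem))
          memory
          (get toy-alpha-roots (type fact))))

(def memory
  (reduce propagate {} [(->Temperature -5) (->WindSpeed 60) (->Temperature 10)]))
;; memory now holds the -5 reading under :freezing and the 60 reading under :windy
```

In a fuller Rete, the facts stored at each alpha node would then be passed on to join nodes that combine matches across conditions, which is the part this sketch leaves out.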
