This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2018-06-05
Channels
- # beginners (135)
- # cider (30)
- # clara (66)
- # cljs-dev (18)
- # cljsrn (6)
- # clojure (115)
- # clojure-austin (1)
- # clojure-dev (10)
- # clojure-italy (7)
- # clojure-nl (1)
- # clojure-spec (18)
- # clojure-uk (26)
- # clojurescript (76)
- # cursive (2)
- # datomic (4)
- # devops (1)
- # emacs (19)
- # fulcro (159)
- # garden (3)
- # klipse (5)
- # leiningen (5)
- # off-topic (61)
- # om (7)
- # pedestal (6)
- # re-frame (17)
- # reagent (73)
- # ring-swagger (6)
- # rum (5)
- # shadow-cljs (60)
- # spacemacs (31)
- # specter (4)
- # vim (8)
- # yada (1)
@dave.tenny should be possible to use Clara tracing to take a look at something like that. Although it’s exploratory to find what you want given the data structures it returns. You could get something like count of times things were evaluated that way.
Between each fire-rules
, the dispatcher will select one job per job type (so 15 in this case), dispatch it, update the ActiveUserJobCount and WorkerResource facts, and retract the dispatched JobRequest facts.
So I'm averaging about 1-3 seconds per job dispatch, and whacking the hell out of my CPU. (Memory footprint is good however... surprise!)
I had originally hoped to do the fact maintenance for active user job counts (used for round robin eligibility consideration), and worker resource stats (to track remaining worker capacity), in the RHS of rules. But I gave that up early in the process because it was a side-effect oriented process with a bunch of [:not A] => A
scenarios, which might have been doable, but it was really the timing of the rule LHS evaluations that killed it, since LHS evaluations are perceptually "in parallel" and not sequential with respect to cause and effect for a given rule.
So I compute some eligible things, then do the dispatching and accounting between calls to fire-rules
, then do it all over again (always saving and continuing from the updated session).
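A minimal sketch of that dispatch loop, assuming hypothetical helper names (`select-dispatchable`, `update-accounting-facts`) and an `initial-session` that already holds the JobRequest/WorkerResource facts, might look like:

```clojure
;; Hedged sketch of the "dispatch between fire-rules calls" loop described above.
;; `select-dispatchable` and `update-accounting-facts` are hypothetical helpers,
;; not part of Clara.
(require '[clara.rules :refer [fire-rules insert-all retract]])

(loop [session (fire-rules initial-session)]
  (let [jobs (select-dispatchable session)]   ; query the session for eligible jobs
    (if (empty? jobs)
      session
      (recur (-> (reduce retract session jobs)                 ; retract dispatched JobRequests
                 (insert-all (update-accounting-facts jobs))   ; updated counts/resources
                 fire-rules)))))                               ; always continue from the updated session
```

The key point is that each iteration saves and continues from the updated session, so Clara's incremental working-memory maintenance is preserved across dispatch rounds.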
> but it was really the timing of the rule LHS evaluations that killed it, since LHS evaluations are perceptually “in parallel” and not sequential with respect to cause and effect for a given rule.
Only the case for insert-unconditional!
(and possibly even a defect), but yeah. if you have to extract these facts after they are “done” it may make most sense to be an external thing anyways.
@dave.tenny I can take a look at your rules from above; that sounds too slow
Happy to share the whole module, nothing but some pretty rules and ugly code to do the bookkeeping and setup mock data and such
when something takes that long (in the range of seconds, like 10+) I tend to do CPU sampling
Job dispatch and completion is all about accounting then trying again with updated rules.
sometimes it gives quicker leads to what is causing issues. Sometimes it is too opaque unless you know the rule engine internals, but not necessarily always
Yeah, I'm having problems with VisualVM on my Linux system because of some JNI/Jar problem I can't figure out, and HPROF sampling is usually useless
but also, doing some just blunt counting of times that certain conditions in rules and/or rule RHS were fired sometimes can give you the outliers as well
Yeah, VisualVM with “cpu sampler” is what I have used. if things are whacky there, not sure hah
Yeah, I'm looking at instrumenting, is it possible to capture the time spent, via instrumentation, in each LHS condition?
Re the rules, any obvious stupid hash-join failures or things that might be better done in predicates or with accumulators?
when I’ve tried “counting condition/rhs” firings before in bad performance cases, there are often outliers
In my case I'm worried it's the tests in the conditions that are firing a lot, but I'll get more data.
For example, if I'm doing N^2 firings on the worker-viable-jobs conditions for the 100k job requests and 45 worker-resource facts.
you have 100K JobRequest facts and 45 workers, and you also have a :not
in that rule
This is where, if I could update the worker resources in the RHS and immediately prune the possibilities for the next firing of worker-viable-job, it would have been a win. But that doesn't work, because all the worker-viable-jobs are going to fire regardless of whether I update the worker resources, due to that seemingly parallel LHS evaluation protocol.
Extract a rule, find the oldest jobs first, then only bring those into the join with WorkerResource
rule here
in general, you have a lot of JobRequest
facts to deal with. You want to avoid any rule that may do a join across that set of facts with itself
Except we'll still need to potentially consider the next oldest, and so on, until we find one that fits the available worker resources, so does an extra rule really help?
[:not [JobRequest (< job-id ?job-id)
(= ?job-type job-type)
(= ?user-id user-id)
(<= threads ?threads)
(<= memory ?memory)]]
The memory and thread tests are to test only for jobs that can actually execute on the worker resources, i.e. for which enough resources exist.
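One hedged alternative to the :not self-join above (reusing the field names from the discussion) is to pick the oldest job per job-type/user group with an accumulator such as acc/min on :job-id, so the engine never pairwise-compares the 100K JobRequest facts against themselves:

```clojure
;; Sketch: select the minimum job-id per (job-type, user-id) group with an
;; accumulator instead of a :not self-join. Field names are assumed from the
;; discussion above; ->OldestJobRequest is a hypothetical fact type.
(require '[clara.rules :refer [defrule insert!]]
         '[clara.rules.accumulators :as acc])

(defrule oldest-job-per-user
  [?oldest <- (acc/min :job-id :returns-fact true)
   :from [JobRequest (= ?job-type job-type)
                     (= ?user-id user-id)]]
  =>
  (insert! (->OldestJobRequest ?job-type ?user-id ?oldest)))
```

Only the resulting per-group facts would then join against WorkerResource, which keeps the expensive join over the large JobRequest set out of the network.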
Any advice for a beginner on how to effectively use the tracing API here? The one time I tried it there was too much data to process, and that was on the simplest most minimal amount of facts.
re: the "oldest job" stuff, I was wondering if accumulators would in any way help, I have no idea how they're implemented w.r.t. incremental fact maintenance.
> Any advice for a beginner on how to effectively use the tracing API here? The one time I tried it there was too much data to process, and that was on the simplest most minimal amount of facts. I haven’t used it as much as I’d expect. I was used to rolling my own stuff prior to when tracing stuff was introduced. However, for counting, I believe you can do something like:
(let [traced (-> (clara.rules/mk-session <your rules>)
                 (clara.tools.tracing/with-tracing)
                 (insert <your facts>)
                 fire-rules
                 clara.tools.tracing/get-trace)]
  (frequencies (map :node-id traced)))
or perhaps better sorted:
(let [tr (-> (clara.rules/mk-session [temp-rule])
             (t/with-tracing)
             (insert (->Temperature 10 "MCI")
                     (->Temperature 20 "MCI"))
             (fire-rules)
             (t/get-trace))]
  (->> (map :node-id tr)
       frequencies
       (sort-by val)
       reverse))
once you know the highest-count :node-id
s you can look them up in the rulebase associated with the session
(let [session (-> (clara.rules/mk-session <your rules>)
                  (t/with-tracing)
                  (insert <your facts>)
                  (fire-rules))
      trace (t/get-trace session)
      node-id <whatever node id in question from `trace`>
      {:keys [rulebase]} (clara.rules.engine/components session)
      {:keys [id-to-node]} rulebase]
  (get id-to-node node-id))
This is how you could look up the node-id
to try to find what node in the engine it is. A node will be a defrecord of stuff, not all that readable to you, but you should be able to recognize aspects of it and align it back to your rules, typically.
@mikerod as I am still new at accumulators, I'm trying to discern the purpose of the ?user-id binding on line 30 in the above snippet. Is it used so that the rule will fire once for each distinct user id? Does it work given that ?user-id is not otherwise bound?
@dave.tenny > Is it used so that the rule will fire once for each distinct user id? yes, that's its purpose in that example
which makes me realize I had a typo there, it should have been (= ?user-id (-> this :worker-viable-job :user-id))
since it was nested one level lower, will update. Sorry if that caused confusion.
Accumulators behavior with field-level bindings like that is explained more in http://www.clara-rules.org/docs/accumulators/
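The grouping behavior described there can be illustrated with a small hedged example: a field-level binding like (= ?user-id user-id) inside an accumulator condition partitions the matched facts, so the accumulator runs (and the rule fires) once per distinct bound value:

```clojure
;; Sketch: acc/count fires once per distinct ?user-id because the field-level
;; binding groups the matched JobRequest facts. ->UserJobCount is a
;; hypothetical fact type used only for illustration.
(require '[clara.rules :refer [defrule insert!]]
         '[clara.rules.accumulators :as acc])

(defrule jobs-per-user
  [?n <- (acc/count) :from [JobRequest (= ?user-id user-id)]]
  =>
  (insert! (->UserJobCount ?user-id ?n)))
```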
I'm getting null pointer exceptions on calls to <
that I think come from the accumulator, but may be something else; unfortunately the stack trace doesn't clue me in, other than it's in (fire-rules)
, pretty much. However, acc/min
isn't documented to accept an :initial-value
argument, so it's probably something else. Just in case something obvious is missing above.
Ah wait, maybe it's because I removed an explicit binding of :user-id in the candidate rule.
@dave.tenny more typos, because I forgot the thing was nested