2018-10-31
Hi! So we're debugging a performance issue, and it seems like, in our example, fast-token-compare takes 142 seconds of self time, while PriorityQueue.remove() takes 139 seconds of self time.
It's not clear from VisualVM's sampling, but it looks like this is referring to the PriorityQueue.remove(Object) method, which is consistent with the fast-token-compare times.
This method is linear for a priority queue. It should be possible to wrap PriorityQueue to make it log(n) by checking for removed values when they are extracted. Is there some reason this won't work?
Alternatively, are we doing something wrong to get runtime dominated by PriorityQueue.remove(Object)?
@eraserhd > Alternatively, are we doing something wrong to get runtime dominated by PriorityQueue.remove(Object)? Seems like there is a lot of thrashing, with rules getting activated but then subsequently "deactivated"
> This method is linear for a priority queue. It should be possible to wrap PriorityQueue to make it log(n) by checking for removed values when they are extracted. Is there some reason this won't work? I don't understand what this is or how it'd make it log(n).
and lastly, fast-token-compare may be slowing down due to large tokens in these rules that keep flip-flopping between satisfied and not satisfied
@mikerod if remove/1 stores removed objects in a hash set, and remove/0 and add/1 skip over those removed tokens, then remove/1 becomes O(1) and I think remove/0 is still O(log N)
Dijkstra's algorithm is frequently implemented like this, because removing from the priority queue is expensive.
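A minimal sketch of that lazy-deletion wrapper over java.util.PriorityQueue (the names lazy-pq, pq-add!, pq-remove!, and pq-poll! are made up for illustration and are not Clara's API):

(ns example.lazy-pq
  (:import (java.util HashSet PriorityQueue)))

(defn lazy-pq
  "Priority queue plus a tombstone set for lazily removed items."
  [comparator]
  {:pq      (PriorityQueue. 11 comparator)
   :removed (HashSet.)})

(defn pq-add! [{:keys [^PriorityQueue pq]} x]
  (.add pq x))                        ; O(log n)

(defn pq-remove! [{:keys [^HashSet removed]} x]
  (.add removed x))                   ; O(1) tombstone instead of an O(n) scan

(defn pq-poll! [{:keys [^PriorityQueue pq ^HashSet removed]}]
  (loop []
    (when-some [x (.poll pq)]         ; O(log n)
      (if (.remove removed x)         ; skip anything that was tombstoned
        (recur)
        x))))

Each tombstoned entry is popped at most once, so polling stays O(log n) amortized; the trade-off is that the heap can hold stale entries until they reach the head, and equality/hashing of the tokens still has to be cheap enough for the HashSet lookups.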
in the particular case where the prio queue is used, the object to remove is a new instance too, so it wouldn't be able to have a reusable cached hashcode
not to say it isn’t worth consideration at any time, just saying hashing isn’t always as cheap as it seems
it uses identity-based equivalence as a first-pass, so via identical?
in clj semantics
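Roughly, that first pass looks like this (an illustration of the pattern only, not Clara's actual fast-token-compare):

;; cheap reference check first, full structural equality only as a fallback
(defn fastish-compare [a b]
  (or (identical? a b)   ; pointer comparison, no hashing or traversal
      (= a b)))          ; full structural equality as the fallback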
the activation stuff is what uses the prio queue; it has to do with how, when rules are satisfied, they are "staged" as available to have their RHS action executed at some future time
it is a prio queue since you can have precedence on which rules should be staged ahead of one another - e.g. :salience
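For example, salience is attached as a property map on the rule; here is a hypothetical pair of rules (the Alarm record and the handler bodies are made up), where the higher-salience one is staged ahead of the other when both are satisfied:

(ns example.salience
  (:require [clara.rules :as r]))

(defrecord Alarm [level])

(r/defrule handle-alarm-first
  {:salience 100}                 ; staged ahead of lower-salience activations
  [Alarm (= ?level level)]
  =>
  (println "urgent handling of" ?level))

(r/defrule handle-alarm-later
  {:salience -100}
  [Alarm (= ?level level)]
  =>
  (println "routine handling of" ?level))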
I think there are some related issues out there (though not explained the same) https://github.com/cerner/clara-rules/issues/385
I know I’ve messed around with this sort of problem before though, not sure I ever had a good answer
your hashing thing could be attempted at least; you'd just have to consider a lot of things if it ends up looking like a good path
the issue above talks about a “distinct accumulator”, I think that may be overly specific
a big token would come from a rule left-hand side (LHS) that had a lot of facts associated with a single activation of the rule
Take for example something like
(r/defrule big-activation-maker
  ;; `r` here is clara.rules, `acc` is clara.rules.accumulators
  [?xs <- (acc/all) :from [X]]
  =>
  (do-stuff ?xs))
then this rule may keep activating/cancelling/re-activating as the set of X grows - and each time there is a big token comparison to do these removals
Most of the time, the network's "batched propagation" semantics would lead you to a situation where all the X are around in one go, with only 1 rule activation resulting
perhaps the distinct accumulator case was a situation like this one, I just don't remember
This sounds like there's at least one thing we can rewrite as a custom accumulator to avoid retriggering a whole slew of rules.
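One possible shape for that, sketched with clara.rules.accumulators/accum (distinct-ids, the :id field, and the rule reusing the X/do-stuff placeholders from the earlier example are all hypothetical; whether this actually avoids the re-activations would need to be verified against the issue above):

(ns example.accum
  (:require [clara.rules :as r]
            [clara.rules.accumulators :as acc]))

;; Accumulate only the distinct :id values into a set, so inserting a
;; duplicate X leaves the accumulated result unchanged.
;; Note: disj in :retract-fn assumes ids are unique per fact; a real
;; version would need multiset-style bookkeeping.
(def distinct-ids
  (acc/accum
    {:initial-value #{}
     :reduce-fn  (fn [s x] (conj s (:id x)))
     :retract-fn (fn [s x] (disj s (:id x)))}))

(r/defrule react-to-distinct-xs
  [?ids <- distinct-ids :from [X]]
  =>
  (do-stuff ?ids))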