This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-04-10
On another note, we've been troubleshooting some serious performance issues loading our database, and I've just figured out it's memory. If I fire-rules on the worst kind of facts (the ones that chain the most) in batches of 5, it completes in about 25 minutes; otherwise, it takes about 4h10m.
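The batching approach described above can be sketched roughly like this, assuming the standard `clara.rules` API; `empty-session` and `worst-case-facts` are hypothetical placeholders for the real session and fact seq:

```clojure
(require '[clara.rules :refer [insert-all fire-rules]])

(defn fire-in-batches
  "Inserts facts in batches of `batch-size`, firing rules after each
  batch instead of inserting everything and firing once."
  [session facts batch-size]
  (reduce (fn [s batch]
            (-> s (insert-all batch) fire-rules))
          session
          (partition-all batch-size facts)))

;; e.g. (fire-in-batches empty-session worst-case-facts 5)
```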
@eraserhd Both a heap profile and a CPU sampling snapshot would be good - not sure what you use. VisualVM is good.
Another shot-in-the-dark suggestion - if you are making rules with a long LHS (sequence of conditions), perhaps you'd be better off breaking them into multiple rules that aren't as "deep".
This can often give better perf characteristics, and I generally think it's better modularization anyway. It may not be relevant to you at all, though. I just noticed the "chain" part and wasn't sure what it referred to.
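A minimal sketch of that splitting idea, with hypothetical record and rule names - instead of one rule whose LHS joins everything at once, the first rule derives an intermediate fact and a second, shallower rule consumes it:

```clojure
(require '[clara.rules :refer [defrule insert!]])

(defrecord Order [id total])
(defrecord Customer [id status])
(defrecord EligibleOrder [order-id]) ; hypothetical intermediate fact

;; First rule: do the expensive join once, emit an intermediate fact.
(defrule mark-eligible
  [Order (= ?id id) (> total 100)]
  [Customer (= ?id id) (= status :active)]
  =>
  (insert! (->EligibleOrder ?id)))

;; Second rule: shallow LHS over the derived fact.
(defrule notify-eligible
  [EligibleOrder (= ?order-id order-id)]
  =>
  (println "order" ?order-id "is eligible"))
```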
The clara inspection/tracing tools recently had an update (by Will, I believe) where they can try to report "counts" of operations happening at points in the network. If your chaining is resulting in a large "cartesian product" of facts joining together, these helpers may be useful for diagnosis too.
This is the relevant commit https://github.com/cerner/clara-rules/commit/dde409c446a731e364096440d02ea0f7e8442c04
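As a hedged sketch of using the inspection tooling on a fired session - the `:condition-matches` key is what I'd expect from `clara.tools.inspect/inspect`, but key names may differ across versions, and `fired-session` is a placeholder:

```clojure
(require '[clara.tools.inspect :as inspect])

;; `fired-session` stands in for a session after fire-rules has run.
(let [info (inspect/inspect fired-session)]
  ;; Conditions with very large match counts hint at cartesian blowups.
  (doseq [[condition matches] (:condition-matches info)]
    (println condition "->" (count matches) "matches")))
```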
by chain, I mean deep - rule A inserts a fact, then rule B fires and inserts a fact, etc...
@eraserhd Clara uses atoms in engine.cljc in some places to store pending operations; conceivably memory could accumulate there from data being put in those atoms, though getting into that state would be fairly hard. Then again, even the performant version of your rules session taking 25 minutes is striking. 😱 Obviously it could be unavoidable if you're just processing massive amounts of data, but I'd be curious to see any kind of reproducing example of the inordinate resource use - smaller examples obviously being easier to diagnose. I suspect you've uncovered some kind of bad pattern in the way the rule network fires, rather than this (poor) level of performance being inevitable. Most of Clara's optimization to date has been done against Cerner's use cases; having a broader sample of benchmarks would be useful.
AlphaNodes are also generated with an atom for the bindings that the node will propagate in the event that it's satisfied; however, it would be scoped to the function that determines the satisfaction of the node. The atom is probably unnecessary and could probably be replaced with shadowing within the function… I doubt that this is the growth you are seeing, though.
There might be a performance gain from not having to swap! that atom, though….
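The swap-vs-shadowing point can be illustrated generically (this is not Clara's actual code, just the shape of the pattern being discussed):

```clojure
;; Atom-based accumulation of bindings:
(defn bindings-with-atom [fact]
  (let [bindings (atom {})]
    (when (:a fact) (swap! bindings assoc :a (:a fact)))
    (when (:b fact) (swap! bindings assoc :b (:b fact)))
    @bindings))

;; Same result by shadowing a plain local - no swap! involved:
(defn bindings-with-shadowing [fact]
  (let [bindings {}
        bindings (cond-> bindings (:a fact) (assoc :a (:a fact)))
        bindings (cond-> bindings (:b fact) (assoc :b (:b fact)))]
    bindings))
```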
@U3KC48GHW Interesting on the use of an atom there - I'll have to check that out again. Overall, though, I wouldn't expect to see an Atom itself taking up memory. It's a pointer to something that may be large, but the atom itself shouldn't show up as holding that memory (at least, that's not what I'd expect in heap dumps).