This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-09-12
I have a singleton fact that I insert for referencing that contains a large map (hundreds of KB), and is a part of the LHS of most of my rules, so that the RHS can use the map when inserting more rules. Are there any performance issues to consider here? Or ways to exploit the fact that it is a singleton and will never need to be compared between multiple instances? I know with persistent structures it isn't inefficient to have the large map in multiple places, but was wondering last night if perhaps there was a better way to reference it. The LHS query for it is dead simple: [Tree (= ?root root)] in all cases
beyond that, assuming it is immutable (as facts should be), a cached hashcode may become important
You could also consider wrapping the big map in a custom type that optimizes equality checking and hashing too if you had noticeable perf issues come up
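A minimal sketch of the wrapper idea, assuming identity semantics are acceptable because the big map is a singleton. IdentityRef is a hypothetical name (not part of Clojure or Clara); it makes equality and hashing constant-time regardless of map size, and the record from the question holds the wrapper instead of the raw map:

```clojure
;; IdentityRef is a hypothetical helper, not part of Clojure or Clara.
;; Two wrappers are equal only if they are the same object, and the hash
;; comes from the JVM identity hash, so neither operation walks the map.
(deftype IdentityRef [value]
  Object
  (equals [this other] (identical? this other))
  (hashCode [this] (System/identityHashCode this))
  clojure.lang.IHashEq
  (hasheq [this] (System/identityHashCode this))
  clojure.lang.IDeref
  (deref [_] value))

;; Assumed shape of the Tree record: a single field holding the root.
(defrecord Tree [root])

(def big-map (zipmap (range 100000) (range 100000)))
(def tree (->Tree (IdentityRef. big-map)))
(def other-tree (->Tree (IdentityRef. big-map)))

(= tree tree)          ; true, same object
(= tree other-tree)    ; false, distinct wrappers even around the same map
(get @(:root tree) 42) ; 42, deref the wrapper to reach the map
```

The trade-off is that anything binding ?root would receive the wrapper and need to deref it, and two wrappers around equal maps no longer compare equal, which only makes sense for a true singleton.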
Your example does look like you have a type on it called Tree though, so I am not sure what implications that type has
Tree is just a record wrapper around a map so it works with the default Fact type identification
For clarity: There are basically 2 situations that come to mind in this scenario:
1) How often is this big fact going to be compared to other different facts?
2) This big fact will be part of a token that is stored in memory. This token is sometimes used as a hash map sort of key. If this is hashed too often, the hash code may become a bottleneck.
For (1)
If the fact has a unique type that no other facts have, it should only ever be compared to maybe itself. The equals check should be fast there, since most equals impls will short-circuit on the identical? (or Java ==) case
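A small illustration of that short-circuit, using a plain map as a stand-in for the fact (the names here are made up for the example). Clojure's = checks reference identity before doing any structural comparison, so comparing a value to itself does not walk its contents:

```clojure
(def big (zipmap (range 100000) (range 100000)))
(def same-ref big)              ; another name for the same object
(def rebuilt (into {} big))     ; equal contents, but a different object

(identical? big same-ref)  ; true
(identical? big rebuilt)   ; false

(= big same-ref)  ; true, answered by the identity check alone
(= big rebuilt)   ; true, but only after comparing the entries
```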
For (2)
Clojure maps default to caching hash codes. Clojure records do not, prior to what looks like 1.9 (not yet fully released at least (see https://dev.clojure.org/jira/browse/CLJ-1224))
So if anything, if your Tree wrapper is made via defrecord, you may be able to see some time spent recalculating the hash code if this fact ends up being involved in a lot of rules' LHS “tokens”
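One rough way to check whether hashing the record costs anything on a given Clojure version is to time it directly. This is only a sketch: elapsed-ms is a hypothetical helper, the Tree shape is assumed, and a single wall-clock sample is noisy, so a profiler or a benchmarking library will give more trustworthy numbers:

```clojure
;; Assumed shape of the Tree record from the discussion.
(defrecord Tree [root])

(defn elapsed-ms
  "Runs the zero-argument function f once and returns the elapsed
  wall-clock time in milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

(def tree (->Tree (zipmap (range 200000) (range 200000))))

;; Time the first hash and a repeated hash of the same record.
(def first-hash-ms (elapsed-ms #(hash tree)))
(def repeat-hash-ms (elapsed-ms #(hash tree)))

(println "Clojure version:" (clojure-version))
(println "first hash:" first-hash-ms "ms, repeat:" repeat-hash-ms "ms")
```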
Tokens contain a sequence of all the facts that matched for a particular successful invocation of the rule
Then again, I may not get worked up about this situation unless you can see measurable issues with perf
Thanks for the advice, I'm just trying to keep some performance considerations in my back pocket in case I need to improve performance. This Tree is somewhat like a database, so it will continue to grow over time. I'm already seeing processing times of 100ms or more for the rule engine and I generally want things to be below ~50ms for some soft real-time constraints we have
to clarify: the tree does not change within a rule session, only between rule sessions
then you take a “snapshot”, view the snapshot by drilling down through a fairly large callstack
For this particular concern I had above, I’d just try to get to the “bottom” of the larger time callstacks and see if I can find references to, in this example, the Tree object getting equiv or hash sort of calls on it
no problem. I’m always willing to take a look at a profiler snapshot, a portion of it, or just say “I see lots of time in Clara function X, any ideas?” too if you find those parts to be hard to sort through. I’m not sure how easy it may be to read profiler snapshots sometimes if not familiar with the codebase and/or the way clj compiles out java class names