This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-09-28
@hoppy may I ask for a shallow timecard structure? Are they multiple entities or a single one?
If you work with versions of a single entity you can use history-range, which has four boundaries: two for valid time and two for transaction time. If this doesn’t help, you’re welcome to elaborate on your thoughts in GitHub issues; this indices idea may be very helpful.
trampoline https://github.com/juxt/crux/issues
@hoppy My understanding is that this should make use of efficient range scans over the indexes. Have you benchmarked running this without using args?
ok, looked a bunch more at this. My conclusion is that the indexes are not getting leveraged at all for inequality queries.
thanks @hoppy, a minimal gist would be very helpful, the team will attempt to figure this out tomorrow
ok, I went a little past that. I put a project on github that is representative of my use case, which exhibits the same "shape"
Thanks! I will take a look 🙂 In the meantime I think this analysis addresses the immediate problem: >>> so we support java.util.Date for range scans, but not Joda Time. You can try adding some form of extension of ValueToBuffer for the date type, or you can model it on the normal date implementation; that might speed it up
See the existing ValueToBuffer implementations https://github.com/juxt/crux/blob/master/crux-core/src/crux/codec.clj#L146
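A possible alternative to extending ValueToBuffer, which is not proposed in the thread itself, is to convert date-like values to java.util.Date before submitting documents, since java.util.Date is the type stated above to support range scans. The sketch below uses java.time.Instant as a stand-in so that it runs without extra dependencies; with Joda Time the conversion would go through DateTime's toDate method instead. The names `instant->date` and `normalize-dates` and the `:timecard/clock-in` attribute are hypothetical.

```clojure
(ns example.dates
  (:require [clojure.walk :as walk])
  (:import (java.time Instant)
           (java.util Date)))

(defn instant->date
  "Converts a java.time.Instant to java.util.Date; returns any other value unchanged."
  [x]
  (if (instance? Instant x)
    (Date/from x)
    x))

(defn normalize-dates
  "Walks a document (hypothetical helper) and replaces every Instant with a Date."
  [doc]
  (walk/postwalk instant->date doc))

;; Hypothetical timecard document; the attribute name is an assumption.
(def example-doc
  {:crux.db/id :timecard-1
   :timecard/clock-in (Instant/ofEpochMilli 0)})

(def normalized-doc (normalize-dates example-doc))
```

Whether this actually helps depends on the index being used for the range clause in the first place, which is the open question in the rest of this thread.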
I had had that thought as well, however, in the case I sent you, it appears that even for integers the index isn't exploited, so I didn't go beyond that in the example.
@hoppy as a brief update: I can reproduce your results and timings on my machine (again, thanks for publishing the repo!). I'm currently trying to whittle it down to an even more minimal example
@U899JBRPF, I therefore assume that you agree this result is unexpected?
Yes, although that is just my own personal assessment right now. For instance, it seems that doing [e :r r] [(= r 308)] is slower than a direct [e :r 308]
:thinking_face:
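For reference, the two forms being compared can be written out as full query maps. This is only a sketch: the `:r` attribute and the value 308 come from the message above, the names `q-predicate` and `q-literal` are hypothetical, and the timing calls assume `crux` aliases crux.api and `node` is an atom holding a started node, as in the REPL session later in this thread.

```clojure
;; Form 1: bind the value to a logic variable, then constrain it with a predicate.
(def q-predicate
  {:find '[e]
   :where '[[e :r r]
            [(= r 308)]]})

;; Form 2: put the literal value directly in the triple clause.
(def q-literal
  {:find '[e]
   :where '[[e :r 308]]})

(comment
  ;; Not evaluated on load; run at a REPL with a populated node.
  (time (count (crux/q (crux/db @node) q-predicate)))
  (time (count (crux/q (crux/db @node) q-literal))))
```

Both queries should return the same entities, so any difference in elapsed time would point at how the query is planned rather than at the data.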
What is our workflow here? Are you basically triaging this to see if it needs to become a ticket?
That's my plan. I'll get a second opinion (internally) and open a ticket for this today if we can't resolve it.
@hoppy I've determined that this is due to an unfortunate miscalculation of the join order for certain trivial cases. The workaround for the current release (`1.4.0`) is to add extra dummy clauses so that the relevant lvar (logical variable) is more frequent. E.g. this is based on a lightly modified version of your example with 200k samples and runs in ~13ms
range-test=> (def q4a {:find '[e]
#_=> :where '[[e :counter ?r]
#_=> [_ :counter ?r]
#_=> [(<= ?r 263)]
#_=> [(>= ?r 263)]]})
#'range-test/q4a
range-test=> (time (def r1 (crux/q (crux/db @node) q4a))) (count r1)
15:22:22.474 [clojure-agent-send-off-pool-6] DEBUG crux.query - :query {:find [e], :where [[e :counter ?r] [_ :counter ?r] [(<= ?r 263)] [(>= ?r 263)]]}
15:22:22.475 [clojure-agent-send-off-pool-6] DEBUG crux.query - :join-order :ave ?r e {:e e, :a :counter, :v ?r}
15:22:22.475 [clojure-agent-send-off-pool-6] DEBUG crux.query - :join-order :ave ?r _13910 {:e _13910, :a :counter, :v ?r}
15:22:22.476 [clojure-agent-send-off-pool-6] DEBUG crux.query - :where [[:triple {:e e, :a :counter, :v ?r}] [:triple {:e _, :a :counter, :v ?r}] [:range [[:sym-val {:op <=, :sym ?r, :val 263}]]] [:range [[:sym-val {:op >=, :sym ?r, :val 263}]]]]
15:22:22.476 [clojure-agent-send-off-pool-6] DEBUG crux.query - :vars-in-join-order [?r e _13910]
15:22:22.476 [clojure-agent-send-off-pool-6] DEBUG crux.query - :attr-stats {:crux.db/id 200000, :counter 200000}
15:22:22.476 [clojure-agent-send-off-pool-6] DEBUG crux.query - :var->bindings {_13910 #crux.query.VarBinding{:e-var _13910, :var _13910, :attr :crux.db/id, :result-index 2, :join-depth 2, :result-name _13910, :type :entity, :value? false}, e #crux.query.VarBinding{:e-var e, :var e, :attr :crux.db/id, :result-index 1, :join-depth 1, :result-name e, :type :entity, :value? false}, ?r #crux.query.VarBinding{:e-var _13910, :var ?r, :attr :counter, :result-index 0, :join-depth 2, :result-name _13910, :type :entity, :value? false}}
15:22:22.480 [clojure-agent-send-off-pool-6] DEBUG crux.query - :query-time-ms 11
15:22:22.480 [clojure-agent-send-off-pool-6] DEBUG crux.query - :query-result-size 8
"Elapsed time: 12.565868 msecs"
#'range-test/r1
8
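The workaround shown in the session above amounts to adding one more triple clause that mentions the same logic variable. A minimal sketch of doing that programmatically follows; `with-dummy-clause` and `base-query` are hypothetical names, and the sketch assumes that the position of the extra clause within `:where` does not matter to the query engine (in the session above it appears second, here it is appended last).

```clojure
;; The range query without the workaround, modelled on q4a above.
(def base-query
  {:find '[e]
   :where '[[e :counter ?r]
            [(<= ?r 263)]
            [(>= ?r 263)]]})

(defn with-dummy-clause
  "Appends a [_ attr lvar] clause so that lvar occurs in one more triple.
  Hypothetical helper, not part of Crux."
  [query attr lvar]
  (update query :where conj ['_ attr lvar]))

(def q4a-like
  (with-dummy-clause base-query :counter '?r))
```

The resulting map can be passed to `crux/q` in the same way as `q4a` in the session above.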
@U899JBRPF I presume this is going to get fixed at some point?