This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-07-12
Hey all. I’m getting increasingly frustrated with mongodb as the backend for my little web app multiplayer turn based game. Are there any resources on transitioning from mongo to crux? I’m well versed in sql if it helps and very new to datalog
Hey @UEENNMX0T! We don't have resources on transitioning from Mongo specifically, but we did begin implementing a Mango->Datalog query compiler a few months ago, see https://github.com/jonpither/crux/blob/select/crux-test/test/crux/select_test.clj and https://github.com/jonpither/crux/blob/select/crux-core/src/crux/select.clj - unfortunately that work isn't available as part of Crux officially yet but perhaps you can borrow some ideas. Otherwise I'd suggest just running through our tutorials to start feeling comfortable with Datalog https://opencrux.com/tutorials/tutorials.html I'm very happy to answer any questions or help with specific problems 🙂
Do you have an example of a query from Mongo that you'd like to see ~roughly converted to Datalog?
Less query specific but how to move data from within a mongo database into a crux database
Ah! I misread 😄
Same answer though...no resources on performing the Mongo->Crux data migration. Although you probably want to model Mongo's collections using a :type attribute, since Crux's indexes are comparatively "global"
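A minimal sketch of the :type suggestion, assuming the Mongo documents have already been read into Clojure maps with keyword keys. The helper name mongo-doc->crux-doc and the use of a string id are assumptions for illustration, not anything Crux provides:

```clojure
;; Hypothetical helper: the Mongo collection name becomes a :type attribute
;; and Mongo's :_id becomes the mandatory :crux.db/id (stringified here,
;; which is an assumption of this sketch).
(defn mongo-doc->crux-doc
  [collection mongo-doc]
  (-> mongo-doc
      (dissoc :_id)
      (assoc :crux.db/id (str (:_id mongo-doc))
             :type (keyword collection))))

(mongo-doc->crux-doc "users" {:_id "60ec1f" :username "alice" :wins 3})
;; => {:username "alice", :wins 3, :crux.db/id "60ec1f", :type :users}
```

A query can then restrict itself to one former collection with a clause such as [?e :type :users].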
Which Mongo value types are you using?
no worries, I wasn’t specific enough. Right now it’s text, numbers, dates, and booleans, with lots of nested data
Is your game open source? It might help to take a look at how you're nesting documents, then folks could chip in with pointers and gotchas.
I should preface with “I inherited this mess!” lmao. We could normalize it, but that’s not great for mongo, so we’ve just leaned into the document storing
It actually looks pretty amenable to normalization to me, if you want to migrate to Crux. Although Crux can ingest nested maps, it prefers flat records. The contents of a nested map are opaque to the query engine. All the top-level attributes which correspond to further-nested maps could become references. As far as I can tell, the nested maps aren't themselves too nested, which might save you some grief in terms of mapping a big denormalized tree to normalized Crux docs.
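One way the "nested maps become references" idea could look, as a rough sketch. The function name normalize-doc and the "parent-id/key" scheme for child ids are assumptions made for this example only:

```clojure
;; Hypothetical helper: for every key in ref-keys whose value is a map,
;; move that map into its own document and leave the child's id behind
;; as a reference. Returns a vector with the parent first, then children.
(defn normalize-doc
  [doc ref-keys]
  (let [[parent children]
        (reduce (fn [[p cs] k]
                  (let [v (get p k)]
                    (if (map? v)
                      (let [child-id (str (:crux.db/id p) "/" (name k))]
                        [(assoc p k child-id)
                         (conj cs (assoc v :crux.db/id child-id))])
                      [p cs])))
                [doc []]
                ref-keys)]
    (into [parent] children)))

(normalize-doc {:crux.db/id "user-1"
                :username "alice"
                :stats {:games-started 10 :wins 3}}
               [:stats])
;; => [{:crux.db/id "user-1", :username "alice", :stats "user-1/stats"}
;;     {:games-started 10, :wins 3, :crux.db/id "user-1/stats"}]
```

Each returned map could then go into its own :crux.tx/put operation, and a query can follow the reference with a pair of clauses like [?u :stats ?s] [?s :wins ?w].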
The one thing to keep in mind is that small "mutations" (like this inc: https://github.com/mtgred/netrunner/blob/1b7a9001ed28201e5d70be42bcfb21ea51f65ce5/src/clj/web/stats.clj#L53) are probably something you want to localize to one document as much as possible, since Crux doesn't do structural sharing over a large doc receiving tiny incremental updates. That said, your data appears to be pretty ephemeral? If a game isn't revisited after it's finished, this might not be as big a concern. I suppose it depends how often you update stats mid-game.
that’s good to know, thank you
are range queries expected to work on instants? e.g.
(doseq [i (range 20)]
  (c/put node {:crux.db/id i :t (t/<< (t/now) (t/new-duration i :days))}))
@(c/q node '{:find [?e]
             :in [MAX]
             :where [[?e :t ?t]
                     [(< ?t MAX)]]}
      java.time.Instant/MAX)
the range predicate doesn't seem to affect the result order
#{[0] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [1] [2] [3] [4] [5] [6] [7] [8] [9]}
Hey, yep range queries should work, since Instants are encoded, as per https://github.com/juxt/crux/blob/b990d2f09a52636d885b894c3bdf1fdcdb5c012f/crux-core/src/crux/codec.clj#L340-L346
However, what is the type of (t/<< (t/now) (t/new-duration i :days))? Is that an Instant also? (I don't have a REPL with tick handy, sorry 😅 ) That could explain things though, since range constraints don't respect type boundaries intuitively, mentioned briefly here https://github.com/juxt/crux/pull/1281
(assert (= (type java.time.Instant/MAX) (type (t/<< (t/now) (t/new-duration 1 :days)))))
thanks for confirming. What is the vars-in-join-order? Available in the crux.query DEBUG log or via crux.query/query-plan-for
query-plan-for is new to me, that's nice 🙂
{:depth->constraints [nil nil nil nil],
 :in-bindings [{:bind-type :scalar, :idx-id in165610, :tuple-idxs-in-join-order [0]}],
 :var->bindings {?e #crux.query.VarBinding {:attr :crux.db/id,
                                            :e-var ?e,
                                            :result-index 1,
                                            :result-name ?e,
                                            :type :entity,
                                            :value? false,
                                            :var ?e},
                 ?t #crux.query.VarBinding {:attr :t,
                                            :e-var ?e,
                                            :result-index 0,
                                            :result-name ?e,
                                            :type :entity,
                                            :value? false,
                                            :var ?t},
                 MAX #crux.query.VarBinding {:attr nil,
                                             :e-var nil,
                                             :result-index 2,
                                             :result-name crux.query.value/MAX,
                                             :type :in-var,
                                             :value? true,
                                             :var MAX}},
 :var->cardinality {?e 100.80263456077057, ?t 11.935930428907975},
 :var->joins {?e [{:id triple165609, :idx-fn #<Fn@7b59dd27 crux.query/triple_joins[fn]>}],
              ?t [{:id triple165609, :idx-fn #<Fn@7b59dd27 crux.query/triple_joins[fn]>}],
              MAX [{:id in165610, :idx-fn #<Fn@1aa7c7b crux.query/in_joins[fn]>}]},
 :var->logic-var-range-constraint-fns
 {MAX [#<Fn@418ecdbd crux.query/build_logic_var_range_constraint_fns[fn]>]},
 :var->range-constraints {},
 :vars-in-join-order [?t ?e MAX]}
Ah, sorry, I realise what's happening now! The query planner doesn't prioritise joining against the MAX binding, because it ~can't in the general case (I'm not 100% sure of the details as to why), so if you want the range constraint to be prioritised you have to pass the MAX value in using a literal (not an :in param)
I regret that this isn't better documented. There's a note about it here https://github.com/juxt/crux/blob/b990d2f09a52636d885b894c3bdf1fdcdb5c012f/crux-test/test/crux/query_test.clj#L1794-L1797 and several other query tests show the literal behaviour working as expected
still seems to be the same order
(crux.query/query-plan-for
 (crux/db node)
 {:find '[?e]
  :where ['[?e :t ?t]
          [(list '< '?t java.time.Instant/MAX)]]})
{:depth->constraints [nil nil nil],
 :in-bindings [],
 :var->bindings {?e #crux.query.VarBinding {:attr :crux.db/id,
                                            :e-var ?e,
                                            :result-index 1,
                                            :result-name ?e,
                                            :type :entity,
                                            :value? false,
                                            :var ?e},
                 ?t #crux.query.VarBinding {:attr :t,
                                            :e-var ?e,
                                            :result-index 0,
                                            :result-name ?e,
                                            :type :entity,
                                            :value? false,
                                            :var ?t}},
 :var->cardinality {?e 100.80263456077057, ?t 15.53476867758077},
 :var->joins {?e [{:id triple169226, :idx-fn #<Fn@5fa65ae1 crux.query/triple_joins[fn]>}],
              ?t [{:id triple169226, :idx-fn #<Fn@5fa65ae1 crux.query/triple_joins[fn]>}]},
 :var->logic-var-range-constraint-fns {},
 :var->range-constraints {?t #<Fn@76d4290f crux.query/new_range_constraint_wrapper_fn[fn]>},
 :vars-in-join-order [?t ?e]}
that's because MAX is no longer a var, but if we pretend it was, the order would now be [MAX ?t ?e]
the results are in the same order as previously, #{[0] [10] [11] [12] [13] [14] [15] [16] [17] [18]
...
how about if you add :limit 100 to force it to be a vector/bag (in case the set just happens to be printing that way)?
cool 🙂
> in fact that was it the whole time, it seems
that may be true, but when MAX was a var the order coming out is more "undefined" than when there's a literal (i.e. don't rely on it)
There was a discussion about this a few weeks back, but tl;dr it only works in one direction https://github.com/juxt/crux/discussions/1514
interestingly, this instantly segfaults
(with-open [res (crux/open-q (crux/db node)
                             {:find '[?e]
                              :limit 100
                              :where ['[?e :t ?t]
                                      [(list '> '?t java.time.Instant/MIN)]]})]
  (iterator-seq res))
it's unlike any crux segfault I've ever seen! too tired to look into this atm, but here's a dump of what appears relevant from the crash log
Stack: [0x00007fe5da1fe000,0x00007fe5da2ff000], sp=0x00007fe5da2fb8c8, free space=1014k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x1c10b0]
J 27181 c1 crux.rocksdb.RocksKvIterator.next()Ljava/lang/Object; (33 bytes) @ 0x00007fe61f9b387c [0x00007fe61f9b3180+0x00000000000006fc]
J 27180 c1 crux.kv.index_store.PrefixKvIterator.next()Ljava/lang/Object; (110 bytes) @ 0x00007fe61f991bd4 [0x00007fe61f991000+0x0000000000000bd4]
J 27283 c1 crux.kv.index_store$step_fn$step__144431$fn__144432.invoke()Ljava/lang/Object; (66 bytes) @ 0x00007fe61f9dbe8c [0x00007fe61f9db040+0x0000000000000e4c]
J 11609 jvmci clojure.lang.LazySeq.sval()Ljava/lang/Object; (42 bytes) @ 0x00007fe623a25754 [0x00007fe623a256a0+0x00000000000000b4]
J 11610 jvmci clojure.lang.LazySeq.seq()Lclojure/lang/ISeq; (53 bytes) @ 0x00007fe623a60c54 [0x00007fe623a60bc0+0x0000000000000094]
J 3280 jvmci clojure.lang.Cons.next()Lclojure/lang/ISeq; (10 bytes) @ 0x00007fe6239de5f4 [0x00007fe6239de580+0x0000000000000074]
J 2047 jvmci clojure.core$next__5404.invoke(Ljava/lang/Object;)Ljava/lang/Object; (7 bytes) @ 0x00007fe6239ae4fc [0x00007fe6239ae420+0x00000000000000dc]
J 27326 c1 crux.index.SeekFnIndex.next_values()Ljava/lang/Object; (108 bytes) @ 0x00007fe61f9fc594 [0x00007fe61f9fb6a0+0x0000000000000ef4]
J 27577 c1 crux.index.DerefIndex.next_values()Ljava/lang/Object; (61 bytes) @ 0x00007fe61fad1774 [0x00007fe61fad0ba0+0x0000000000000bd4]
J 27325 c1 crux.index.NAryJoinLayeredVirtualIndex.next_values()Ljava/lang/Object; (152 bytes) @ 0x00007fe61f9f98c4 [0x00007fe61f9f8a40+0x0000000000000e84]
j crux.index.PredicateVirtualIndex.next_values()Ljava/lang/Object;+46
j crux.index.GreaterThanVirtualIndex.next_values()Ljava/lang/Object;+48
j crux.index.PredicateVirtualIndex.next_values()Ljava/lang/Object;+46
J 27325 c1 crux.index.NAryJoinLayeredVirtualIndex.next_values()Ljava/lang/Object; (152 bytes) @ 0x00007fe61f9f98c4 [0x00007fe61f9f8a40+0x0000000000000e84]
J 27550 c1 crux.index$layered_idx__GT_seq$step__95852.invokePrim(Ljava/lang/Object;JLjava/lang/Object;)Ljava/lang/Object; (446 bytes) @ 0x00007fe61fab185c [0x00007fe61faafee0+0x000000000000197c]
j crux.index$layered_idx__GT_seq$step__95852.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+10
j crux.index$layered_idx__GT_seq$step__95852$fn__95866.invoke()Ljava/lang/Object;+78
J 11609 jvmci clojure.lang.LazySeq.sval()Ljava/lang/Object; (42 bytes) @ 0x00007fe623a25754 [0x00007fe623a256a0+0x00000000000000b4]
J 11610 jvmci clojure.lang.LazySeq.seq()Lclojure/lang/ISeq; (53 bytes) @ 0x00007fe623a60eec [0x00007fe623a60bc0+0x000000000000032c]
J 12778 jvmci clojure.core$seq__5420.invoke(Ljava/lang/Object;)Ljava/lang/Object; (7 bytes) @ 0x00007fe623ffe7fc [0x00007fe623ffe720+0x00000000000000dc]
j crux.query$query$fn__99980$iter__99982__99986$fn__99987.invoke()Ljava/lang/Object;+22
J 11609 jvmci clojure.lang.LazySeq.sval()Ljava/lang/Object; (42 bytes) @ 0x00007fe623a25754 [0x00007fe623a256a0+0x00000000000000b4]
J 11610 jvmci clojure.lang.LazySeq.seq()Lclojure/lang/ISeq; (53 bytes) @ 0x00007fe623a60c54 [0x00007fe623a60bc0+0x0000000000000094]
J 3904 jvmci clojure.core$take$fn__5928.invoke()Ljava/lang/Object; (79 bytes) @ 0x00007fe623a63c7c [0x00007fe623a63500+0x000000000000077c]
J 11609 jvmci clojure.lang.LazySeq.sval()Ljava/lang/Object; (42 bytes) @ 0x00007fe623a25754 [0x00007fe623a256a0+0x00000000000000b4]
J 11610 jvmci clojure.lang.LazySeq.seq()Lclojure/lang/ISeq; (53 bytes) @ 0x00007fe623a60c54 [0x00007fe623a60bc0+0x0000000000000094]
J 3263 jvmci clojure.lang.SeqIterator.hasNext()Z (64 bytes) @ 0x00007fe6239a7a5c [0x00007fe6239a7760+0x00000000000002fc]
j crux.io.Cursor.hasNext()Z+9
J 23824 jvmci clojure.lang.RT$4.invoke()Ljava/lang/Object; (69 bytes) @ 0x00007fe6245dd51c [0x00007fe6245dd380+0x000000000000019c]
J 11609 jvmci clojure.lang.LazySeq.sval()Ljava/lang/Object; (42 bytes) @ 0x00007fe623a25754 [0x00007fe623a256a0+0x00000000000000b4]
J 11610 jvmci clojure.lang.LazySeq.seq()Lclojure/lang/ISeq; (53 bytes) @ 0x00007fe623a60c54 [0x00007fe623a60bc0+0x0000000000000094]
J 11617 jvmci clojure.core$seq__5420.invokeStatic(Ljava/lang/Object;)Ljava/lang/Object; (7 bytes) @ 0x00007fe623bbbd9c [0x00007fe623bbbd00+0x000000000000009c]
j clojure.core$print_sequential.invokeStatic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+195
I observe this being sorted by ?id instead of ?date
(crux.query/query-plan-for
 (crux/db node)
 {:find '[?date ?id]
  :where [['?e :entry/view 1]
          '[?e :entry/id ?id]
          '[?e :entry/updated-date ?date]
          [(list '< '?date java.time.Instant/MAX)]]
  :limit 250})
but the query plan has :vars-in-join-order [1 ?e ?date ?id]
it's hard to judge exactly why the planner chose that join order, but I suspect it will be due to the relative cardinality/selectivity of the :entry/updated-date values vs the :entry/view values
e.g. are there significantly more :entry/updated-date KVs with a more diverse set of values, than for :entry/view?
:var->cardinality {1 0.0, ?date 170.46747929617274, ?e 1.3768786594190381E-5, ?id 1.7976931348623157E308}
here's a repro for you for the segfault
(ns test
  (:require [crux.api :as crux]
            [clojure.java.io :as io]
            [tick.alpha.api :as t]))
(defn start-crux! []
  (letfn [(kv-store [dir]
            {:kv-store {:crux/module 'crux.rocksdb/->kv-store
                        :db-dir (io/file dir)
                        :sync? true}})]
    (crux/start-node
     {:crux/tx-log (kv-store "data/dev/tx-log")
      :crux/document-store (kv-store "data/dev/doc-store")
      :crux/index-store (kv-store "data/dev/index-store")})))
(defonce node (start-crux!))
(doseq [i (range 100)]
  (crux/submit-tx node [[:crux.tx/put {:crux.db/id i :t (t/ago (t/new-duration i :days))}]]))
(with-open [res (crux/open-q (crux/db node)
                             {:find '[?e]
                              :limit 100
                              :where ['[?e :t ?t]
                                      [(list '> '?t java.time.Instant/MIN)]]})]
  (iterator-seq res))
deps
{:deps
 {pro.juxt.crux/crux-core {:mvn/version "1.17.1"}
  pro.juxt.crux/crux-rocksdb {:mvn/version "1.17.1"}
  tick/tick {:mvn/version "0.4.32"}}}
thanks for the repro, I've added it to the project board as a gist https://gist.github.com/refset/8f09f7ff0bf553b08e7428daa2c820c8
> e.g. are there significantly more `:entry/updated-date` KVs with a more diverse set of values, than for `:entry/view`? do you have anecdotal/relative numbers for these in mind? (ignoring what's actually in the index)
if you really want to process the query in your preferred order you could decompose it into 2 queries, or handle the range constraint in a subquery (though this means realising the full result set before the outer query can start processing)
really I'd like to get rid of updated-date completely (it's just an indexed vt), but I guess having a separate vt index wouldn't help here
I feel like re-ordering of the result set is almost always going to have to be the final step in any serious db query, yeah, unless you have some super exact knowledge of the data evolution ahead of time and build some kind of ideal index structure to support it
> really I'd like to get rid of updated-date completely (it's just an indexed vt) As discussed before, it's coming 😄
fortunately I've got a denormalized data store I can use, so I can in fact get rid of updated-date 🙂 I think it must be categorically wrong to expect any kind of ordering from crux, as a data modeling rule
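If ordering has to be imposed as a final step, one option is to sort the realised result tuples in application code. This is only a sketch: the function name newest-first and the sample tuples are made up, and it assumes results shaped like the [?date ?id] tuples from the query above:

```clojure
(import 'java.time.Instant)

;; Sorts [date id] tuples by date, newest first. Instants implement
;; Comparable, so `compare` works on them; swapping the arguments
;; reverses the order.
(defn newest-first
  [tuples]
  (sort-by first #(compare %2 %1) tuples))

;; Made-up sample data standing in for a query result set.
(def example-results
  #{[(Instant/parse "2021-07-10T00:00:00Z") 2]
    [(Instant/parse "2021-07-12T00:00:00Z") 10]
    [(Instant/parse "2021-07-11T00:00:00Z") 1]})

(map second (newest-first example-results))
;; => (10 1 2)
```

As far as I know, Crux queries also accept an :order-by clause, which sorts the results after the join rather than relying on index order, so the same "sort last" principle applies there.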
re the segfault - this is because iterator-seq is lazy, so it's trying to access the query results after the with-open has closed the query's resources
I wouldn't expect an iterator-seq with nothing else going on to even do anything (should be a lazy view over the iterator?), much less have some action outside the with-open block, but I'll gladly take your word for it here
it is a lazy view over the iterator - if you're running this from the REPL, it might be that your REPL is trying to print out the seq?
another way to check it would be to def it, and then see whether it blows up when you eval the (def foo ...) or when you then check the value of foo
ahh, right, the block would return a reference and the repl doesn't know that it's been "closed". that makes sense, thanks 🙂
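The same pitfall can be reproduced without Crux. In this sketch closeable-iterator is a made-up stand-in for the query cursor: it throws once closed, where the real RocksDB-backed iterator crashed the JVM instead. Realising the seq inside with-open avoids the problem:

```clojure
;; Hypothetical stand-in for a query cursor: an Iterator over coll that
;; refuses to be used after close.
(defn closeable-iterator
  [coll]
  (let [it (.iterator ^Iterable coll)
        closed? (atom false)]
    (reify
      java.util.Iterator
      (hasNext [_]
        (when @closed? (throw (IllegalStateException. "cursor closed")))
        (.hasNext it))
      (next [_]
        (when @closed? (throw (IllegalStateException. "cursor closed")))
        (.next it))
      java.io.Closeable
      (close [_] (reset! closed? true)))))

;; Leaky: the lazy seq escapes with-open, so it is only realised (for
;; example by the REPL printer) after the cursor has been closed.
(defn leaky []
  (with-open [cursor (closeable-iterator [1 2 3])]
    (iterator-seq cursor)))

;; Safe: vec realises every element while the cursor is still open.
(defn safe []
  (with-open [cursor (closeable-iterator [1 2 3])]
    (vec (iterator-seq cursor))))

(safe)
;; => [1 2 3]
```

Calling (doall (leaky)) throws the IllegalStateException, which mirrors what the REPL printer triggered in the repro above.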
Those almost look string-sorted. 😕 If you give it a few more days, do they show up between [1] and [2]?
#{[0] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [1] [20] [21] [22] [23] [24] [25] [26] [27]
[28] [29] [2] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [3] [40] [41] [42] [43] [44] [45]
[46] [47] [48] [49] [4] [50] [51] [52] [53] [54] [55] [56] [57] [58] [59] [5] [60] [61] [62] [63]
[64] [65] [66] [67] [68] [69] [6] [70] [71] [72] [73] [74] [75] [76] [77] [78] [79] [7] [80] [81]
[82] [83] [84] [85] [86] [87] [88] [89] [8] [90] [91] [92] [93] [94] [95] [96] [97] [98] [99] [9]}
I don't know whether it's been mentioned here but coming up this week: https://www.meetup.com/Los-Angeles-Clojure-Users-Group/events/279378615/
(I attend this meetup sometimes, even tho' I'm in the San Francisco area)
@kevin842 I'm a bit confused by this on a number of levels. The Clojure less-than doesn't like java.time.Instant at all, so I'm a little surprised Crux doesn't get angry for the same reasons. But I really wouldn't expect it to sort in that (string-ified, presumably) order, if at all.
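For reference, a small sketch of the plain-Clojure behaviour being described. clojure.core/< only accepts numbers, so comparing Instants needs compare or the Instant methods. As noted earlier in the thread, Crux range predicates operate on its encoded index values rather than calling clojure.core/<, which would explain why the query itself doesn't throw. The names earlier, later and try-less-than are made up for this example:

```clojure
(import 'java.time.Instant)

(def earlier (Instant/parse "2021-07-11T00:00:00Z"))
(def later   (Instant/parse "2021-07-12T00:00:00Z"))

;; clojure.core/< casts its arguments to Number, so Instants fail.
(defn try-less-than
  [a b]
  (try
    (< a b)
    (catch ClassCastException _ :class-cast-exception)))

(try-less-than earlier later)
;; => :class-cast-exception

;; Instants are Comparable, so compare works.
(neg? (compare earlier later))
;; => true

;; The java.time method works too.
(.isBefore ^Instant earlier later)
;; => true
```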
@seancorfield This is probably okay to tweet to a wider audience?
Sure, I think they only have 100 seats on their Zoom license for this tho' 🙂
Pretty sure we don't have quite that much reach. 😉 I just wanted to make sure a bunch of strangers wouldn't throw off the LA CUG vibe.
They already get a bunch of non-local folks joining. It's been really interesting to see how user groups have expanded geographically after going virtual.
Hi there question about crux-console
...The repo is read-only - is the plan to take it out of Crux proper?
Hey, for others' context, you're talking about https://github.com/crux-labs/crux-console - that repo is deprecated in favour of the more recent UI now built into crux-http-server, although the two aren't equivalent in power in various dimensions
yep thanks @taylor.jeremydavid - was referring to that repo as I did not notice the other crux-http-server one 😉