2021-06-06
Channels
- # announcements (1)
- # atom-editor (2)
- # babashka (6)
- # beginners (30)
- # calva (12)
- # chlorine-clover (2)
- # clojure (88)
- # clojure-australia (2)
- # clojure-europe (9)
- # clojure-germany (4)
- # conjure (3)
- # cursive (12)
- # datomic (4)
- # lsp (86)
- # off-topic (48)
- # play-clj (8)
- # polylith (6)
- # reagent (11)
- # reitit (8)
- # shadow-cljs (19)
- # specter (6)
- # sql (13)
- # xtdb (25)
Hey all! I'm currently working on a raw index from a snapshot and seemingly have managed to crash the JVM on my machine when using the rocksdb backend. Presumably the C library is statically linked in `org.rocksdb/rocksdbjni`, although I'd seen the query work fine on a Mac that I was working with. The following call syntax should be fine, right?
;; eagerly scan and decode every value of :person/name via the AV index
(with-open [index-snapshot (db/open-index-snapshot index-store)]
  (->> (db/av index-snapshot :person/name nil)
       (map (partial db/decode-value index-snapshot))
       (into [])))
just a guess, but in some places the crux code checks if it should `open-nested-index-snapshot` instead
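(I don't know the exact signature off-hand, but roughly something like this - assuming `db/open-nested-index-snapshot` takes the outer snapshot and returns another Closeable snapshot, mirroring the call above:)

```clojure
;; rough guess, assuming db/open-nested-index-snapshot takes the outer snapshot
;; and returns another Closeable snapshot like the other db/open-* calls
(with-open [index-snapshot  (db/open-index-snapshot index-store)
            nested-snapshot (db/open-nested-index-snapshot index-snapshot)]
  (->> (db/av nested-snapshot :person/name nil)
       (map (partial db/decode-value nested-snapshot))
       (into [])))
```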
Nice tip. I'm actually doing this scan pretty close to node start up so I think I shouldn't need to. It looks like it may have something to do with a lock on the DB not being cleared from a previous shutdown. Digging further still
Can confirm that the code works if I manually delete the `db.rocksdb/LOCK` file after the node has been shut down with `(.close node)`. I would have expected `.close` to clean that up for me, so digging further...
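For reference, the manual workaround looks roughly like this (`"data/db.rocksdb"` is a placeholder - use whatever directory your topology config points at):

```clojure
;; manual workaround: delete the stale RocksDB LOCK file after shutdown.
;; "data/db.rocksdb" is a hypothetical path - adjust to your configured :db-dir.
(require '[clojure.java.io :as io])

(.close node)

(let [lock-file (io/file "data/db.rocksdb" "LOCK")]
  (when (.exists lock-file)
    (io/delete-file lock-file)))
```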
Ah, LOCK files came up recently on Zulip: https://juxt-oss.zulipchat.com/#narrow/stream/194466-crux/topic/Lock.20file.20cleanups/near/235162945
I don't think the file ever gets removed once it's created, even after closing everything down, but the OS should release the lock after you `.close` (verifiable with `lslocks` or equivalent)
> I'm currently working on a raw index from a snapshot
Exciting - this has me very curious 🙂
Turns out that it was actually laziness that was biting me. Forgot to `doall` a `for`.
> Forgot to `doall` a `for`.
I've taken to using `x/for` from cgrand/xforms instead, and generally just always pass eductions around instead of lazy sequences. Too many footguns with lazy stuff for me, and the extra perf is nice.
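For anyone hitting the same thing, here's a minimal reconstruction of the footgun plus two eager alternatives (not the exact original code, just the shape of it):

```clojure
;; the footgun: `for` is lazy, so nothing is actually read until after
;; `with-open` has already closed the snapshot
(with-open [index-snapshot (db/open-index-snapshot index-store)]
  (for [v (db/av index-snapshot :person/name nil)]
    (db/decode-value index-snapshot v)))

;; fix 1: realise the sequence before the snapshot closes
(with-open [index-snapshot (db/open-index-snapshot index-store)]
  (doall
   (for [v (db/av index-snapshot :person/name nil)]
     (db/decode-value index-snapshot v))))

;; fix 2: skip laziness entirely with an eager transducing form
(with-open [index-snapshot (db/open-index-snapshot index-store)]
  (into []
        (map #(db/decode-value index-snapshot %))
        (db/av index-snapshot :person/name nil)))
```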
thanks for releasing crux-geo, could be really useful for me too :) I really wish we could easily take advantage of checkpoints for these extra index stores!
> I really wish we could easily take advantage of checkpoints for these extra index stores
Me too 🙂 but we'll get to it soon! As per https://github.com/juxt/crux/issues/1221
somewhat related question while I've got you here: would the proposed index store sharding strategy also extend to other index stores or is it specific to the triples?
Definitely will add checkpointing once it lands. Perhaps Crux could define an abstract protocol, `Checkpointable`, that would allow any system implementing it to hook into the checkpointer...
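Something along these lines, say (names entirely made up - this isn't an existing Crux API, just the shape I have in mind):

```clojure
;; purely hypothetical sketch - Checkpointable is not an existing Crux API
(defprotocol Checkpointable
  (checkpoint! [this dir]
    "Write a consistent snapshot of this store's on-disk state into dir.")
  (restore! [this dir]
    "Restore the store's state from a checkpoint previously written to dir."))

;; a secondary index store (e.g. a geo index) could then opt in and be picked
;; up by the same checkpointer that handles the main index store
(defrecord GeoIndexStore [rtree]
  Checkpointable
  (checkpoint! [_ dir]
    ;; serialise the spatial index into dir
    )
  (restore! [_ dir]
    ;; rebuild the spatial index from dir instead of replaying the whole tx log
    ))
```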
> would the proposed index store sharding strategy also extend to other index stores or is it specific to the triples?
(The context here is R&D for a next-gen index.) It's still very early days on that front, but it would be focused on triples at first. I know Arrow offers a lot of options for extension types though 🙂
ah, so you'd be thinking of storing extra indices in the same storage layer as the triples, unlike today where it's managed independently? interesting
Something along those lines, yes. The goal is to have multiple storage layers, all implemented with Arrow, and then ideally custom indexes can also hook into this "adaptive indexing" architecture.
Just opened up access to https://github.com/teknql/crux-geo - thanks Crux team for the awesome foundation
Awesome, nice job!
> allows querying across time
It's great to see this, and I'm really happy that `crux-lucene` was able to pave the way 🙂
It does! Thanks again for all the work on this and open-sourcing Crux. I remember seeing it being unveiled at Clojure North (if you remember someone asking about additional dimensions of temporality 🙂) and I was so happy (I'd just spent a year approximating something similar in Haskell, so it was really cool to see it arrive in fuller form). Following along from `crux-lucene` made implementing this quite painless. This bug (https://github.com/juxt/crux/issues/1523) may prove to be an issue for us
(Not so much as it relates to that exact problem, but when the joins are in the wrong order, a nested traversal is super expensive if you have hundreds of thousands of entities that could be filtered down to like 200 with one of the indices (lucene, geo-spatial).)
I think an easier workaround to #1523 is just to use subqueries for the expensive predicates, but yeah we'll look into it more this week
I had considered that as well but actually got the same dependency error (or a derivative one) when I had tried it, if I recall correctly
Although perhaps if I had moved everything into the subquery (i.e. not passing in `?a` as an arg) that would resolve it. I didn't wrestle with it too long.
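For the record, the shape I mean is roughly this - a self-contained subquery that doesn't pass `?a` in from the outer query. `expensive-pred` and the attributes are hypothetical stand-ins for a crux-lucene / crux-geo style predicate, so treat it as a sketch rather than a verified fix for #1523:

```clojure
;; sketch of the subquery workaround - expensive-pred and the attributes are
;; hypothetical placeholders for a lucene / geo module predicate
(require '[crux.api :as crux])

(crux/q (crux/db node)
        '{:find [?e ?name]
          :where [;; evaluate the selective predicate once, inside a subquery,
                  ;; instead of letting it run nested per candidate entity
                  [(q {:find [?e]
                       :where [[(expensive-pred :person/location "...") [[?e]]]]})
                   [[?e]]]
                  ;; join the (already small) result set back into the outer query
                  [?e :person/name ?name]]})
```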