#xtdb
2021-03-11
Aleksander Rendtslev 23:03:07

Node startup time when using Lucene: I managed to get my local node into an unusable state. There was nothing to see in the logs; my node just kept failing whenever I tried to start it. Wiping the Lucene and index stores did the trick, and it’s now working as expected again. But it took roughly 2-3 minutes for my Lucene db to be repopulated on start, and all I have in it is test content I’ve generated over the last few days (it’s only 217kb). I can see that being a pretty big problem once I roll this out to other users. Is this normal?

Aleksander Rendtslev 00:03:34

Hmm, something is definitely messed up. Server can’t start the node again. Where would I start debugging this?

Aleksander Rendtslev 00:03:33

Oh I see… It’s Lucene: “Lucene store latest tx mismatch”. I’ve just wiped it; it worked for a little bit and then it broke again. I started hitting this after I introduced a match on document creation. No issues before that.

Aleksander Rendtslev 00:03:47

I can report this on GitHub, but just wanted to share it here first and gather as much information as I can: Lucene keeps ending up in a mismatched state. It takes roughly 5 minutes to restart and recreate the Lucene index, and it gets into a mismatched state again after roughly 5 minutes (I don’t know if time or one of my writes is a factor here).

Steven Deobald 01:03:27

@U01DH13SK8E Which versions of crux and crux-lucene are you on?

Aleksander Rendtslev 01:03:02

juxt/crux-core        {:mvn/version "21.02-1.15.0-beta"}
juxt/crux-rocksdb     {:mvn/version "21.02-1.15.0-beta"}
juxt/crux-jdbc        {:mvn/version "21.02-1.15.0-beta"}
juxt/crux-http-server {:mvn/version "21.02-1.15.0-alpha"}
juxt/crux-lucene      {:mvn/version "21.02-1.15.0-alpha"}

Steven Deobald 01:03:26

> I started hitting this after I introduced a match on document creation. No issues before that.
Do you know if removing the match on document creation corrects this? (Obviously not ideal, but it would be helpful to isolate the behaviour you're seeing, in case there is a bug.)

Aleksander Rendtslev 01:03:50

Yeah, I’ve found out how to replicate it: if a match fails while “putting” a document, the Lucene index gets out of sync. In 5 minutes, when my index has caught up (sigh 😛), I can send another screenshot documenting the behaviour.

Aleksander Rendtslev 01:03:13

(I’m wondering if some of the errors that have happened are causing the sync time for the index to take longer. It’s taking progressively longer every time it crashes.)

Steven Deobald 01:03:32

In general, text is better than screenshots. It's very hard to copy/paste out of a screenshot... they also take up too much space on disk to have one in every GitHub issue. 😉

Aleksander Rendtslev 01:03:23

Okay, I’ll paste the code and the REPL output separately. I just wanted to show it in progression.

Steven Deobald 01:03:56

Sure thing. In both Slack and GitHub you can use three backticks to create a multi-line code block.

Aleksander Rendtslev 01:03:52

Slack just lacks the formatting to make it readable (and I can’t seem to do snippets in threads). Anyways, I posted this one on an issue as well: https://github.com/juxt/crux/issues/1456
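
Roughly, the pattern described above is a `:crux.tx/match` guarding a `:crux.tx/put` in the same transaction. This is a minimal sketch, not the exact code from the issue: `create-doc-ops` and the `:entry-1` id are hypothetical, and it assumes that matching against `nil` asserts the entity does not exist yet, so submitting the same document a second time makes the match fail.

```clojure
;; Hypothetical helper: ops that create `doc` only if no entity
;; with its id exists yet (match against nil, then put).
(defn create-doc-ops
  [doc]
  [[:crux.tx/match (:crux.db/id doc) nil]
   [:crux.tx/put doc]])

(comment
  ;; Assumes a started node and (require '[crux.api :as crux]).
  ;; The second submit should fail its match, since :entry-1 now exists.
  (crux/submit-tx node (create-doc-ops {:crux.db/id :entry-1 :entry "hello"}))
  (crux/submit-tx node (create-doc-ops {:crux.db/id :entry-1 :entry "hello"})))
```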

Aleksander Rendtslev 03:03:17

On the initial question: Sync times for my very humble dev node is now upwards of 7 minutes. It’s roughly 1000 transactions at this point. That obviously won’t scale. That, and the crashes caused by Lucene mismatches, leave me somewhat concerned. I realize Lucene is in alpha at this stage, so I suppose that is to be expected. But is there any viable alternative if I want to do full-text search with Crux?

Steven Deobald 03:03:52

I'm not sure it's time to go hunting for an alternative just yet... this isn't normal behaviour.

Steven Deobald 03:03:29

The dev team are in timezones that are asleep right now. Hopefully they'll get a chance to look at this issue soon. 🙂

Aleksander Rendtslev 03:03:31

Sounds good. Thank you for being responsive - I'm a big fan of your team!

☺️ 3
refset 16:03:23

Hey @U01DH13SK8E 🙂
> Sync times for my very humble dev node is now upwards of 7 minutes. It’s roughly 1000 transactions at this point.
Roughly how big are the transactions (# of ops / docs, size of docs)? Are you running any transaction functions that involve queries?

Aleksander Rendtslev 16:03:48

The docs are still relatively small. The biggest field is the :entry field, which is free-form text. Unless I cap it, it could be quite long, but in all my tests I've kept it below 160 chars. So that means an average of 5 fields per doc, with the largest one having 160 chars in it. There are potentially a few ops per doc on the entry side. Let's say an average of 5 (it's free text entry; I do debounce it though, to keep it slightly down).

Any transaction functions that involve queries? I'm registering this one right now, but I haven't been using it:

;; ---- Transactions ---------------------------------------
(defn register-update-doc
  [node]
  (crux/submit-tx
    node
    [[:crux.tx/put {:crux.db/id :update-doc
                    :crux.db/fn '(fn [ctx eid new-attrs valid-time]
                                   (let [db (crux.api/db ctx)
                                         entity (crux.api/entity db eid)]
                                     [[:crux.tx/put
                                       (merge entity new-attrs)
                                       valid-time]]))}]]))

;; TODO: Create transaction function for validating ownership?
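
Once registered, a function like this would be invoked through a `:crux.tx/fn` operation rather than called directly. A minimal sketch, assuming the Crux 1.x op shape `[:crux.tx/fn fn-id & args]`; the `update-doc-op` and `merged-doc` helpers and the `:entry-1` id are hypothetical, and the merge step is reproduced as plain Clojure only to show what the resulting put would contain:

```clojure
;; Hypothetical helper: builds the op that invokes the :update-doc
;; transaction function registered above.
(defn update-doc-op
  [eid new-attrs valid-time]
  [:crux.tx/fn :update-doc eid new-attrs valid-time])

;; Plain-Clojure illustration of the merge the function performs:
;; new attributes win, untouched attributes are kept.
(defn merged-doc
  [entity new-attrs]
  (merge entity new-attrs))

(comment
  ;; Assumes a started node and (require '[crux.api :as crux]).
  (crux/submit-tx node [(update-doc-op :entry-1
                                       {:entry "edited"}
                                       #inst "2021-03-11")]))
```

Because the function reads the entity via `crux.api/entity` only, it does not run a query itself.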

refset 21:03:20

Thanks. And when you say your dev node is very humble...how much RAM? You need to leave enough native memory free, outside of the JVM allocations, otherwise Rocks might be swapping

Aleksander Rendtslev 22:03:36

Hmm, I'm just running it on my MacBook? 32gb ram. I haven't configured anything specifically for Crux. Running it through my REPL in this case

Aleksander Rendtslev 22:03:14

Humble as in: I'm the only one who's been writing to the DB and it's not more than 1000 transactions. Done over the last two weeks

refset 23:03:31

🙂 cool, yeah that kind of humble is fine, no concerns there. I'll dig into this further tomorrow

refset 23:03:47

I'm somewhat lost for ideas as to what could be causing your slow ingestion experience, ~2.5 transactions per second is abysmal. How was/is the ingestion speed without Lucene in the picture?

Aleksander Rendtslev 00:03:44

Hmm, I'll try to do some experiments and gather some more documentation for you. But yeah, abysmal was the feeling I was left with as well, and I couldn't believe that should be the case. Is there any particular information, logs, etc. that would be helpful for you? Jumping in a private channel.

👍 3