This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-11-10
Channels
- # xtdb (37)
I have a problem starting XTDB in ECS from a checkpoint... I see logs that it is restoring from the checkpoint, but then it hangs before returning from xt/start-node
... I don't know if it is just slow. I have grace period set to 300 seconds and it doesn't start in that time
{ "context": "default", "level": "INFO", "logger": "xtdb.tx", "message": "Started tx-ingester", "thread": "main", "timestamp": "2021-11-10T06:52:16.430Z" }
[2021-11-10T08:52:17+02:00]
{ "context": "default", "level": "DEBUG", "logger": "xtdb.hash", "message": "Using libgcrypt for ID hashing.", "thread": "main", "timestamp": "2021-11-10T06:52:17.578Z" }
{ "context": "default", "level": "DEBUG", "logger": "xtdb.lucene", "message": "Committing Lucene IndexWriter...", "thread": "xtdb-lucene-fsync-1", "timestamp": "2021-11-10T06:54:18.635Z" }
{ "context": "default", "level": "DEBUG", "logger": "xtdb.lucene", "message": "Committed Lucene IndexWriter.", "thread": "xtdb-lucene-fsync-1", "timestamp": "2021-11-10T06:54:18.635Z" }
those are the last logs, and 4 minutes later ECS just kills it and starts another, perhaps I'll just increase the startup time yet again
I can verify that this happens locally as well, I configured my local env to use the same S3 checkpoint bucket and tx-log/doc store and starting up the node hangs after it has downloaded the checkpoint... no CPU usage to speak of so it isn't doing anything intensive at least
the LMDB index directory seems to be stable and the correct size, but the lucene folder seems to be fluctuating weirdly and doesn't seem to reach the size of the snapshot
A profiler would probably help here (YourKit is pretty great if you haven't tried it), but just to rule this out could you try increasing the Lucene refresh-frequency as per https://docs.xtdb.com/extensions/full-text-search/#_parameters
it doesn't look like lucene uses the checkpoint, the folder should be over 40M but it is only a few and growing slowly
can confirm that the lucene module must be the culprit here... I just tried simply commenting out the lucene module completely and startup continues immediately after the download is complete
> the `cp/try-restore` is never called from lucene module
oh, oops! I agree this looks to be missing, when compared to https://github.com/xtdb/xtdb/blob/e2f51ed99fc2716faa8ad254c0b18166c937b134/core/src/xtdb/mem_kv.clj#L134-L135
increasing the refresh-frequency almost certainly will help with the replay speed though
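For anyone reading along later, a rough sketch of what raising the Lucene refresh frequency might look like in the node config. The `:db-dir` value and the five-minute duration are illustrative assumptions, not values from this thread; check the parameters page linked above for the exact option names and defaults in your XTDB version.

```clojure
;; Sketch only: node config fragment for the Lucene module.
;; "lucene-dir" and "PT5M" are placeholder values (assumptions).
{:xtdb.lucene/lucene-store
 {:db-dir "lucene-dir"
  ;; ISO-8601 duration string; a longer interval means fewer
  ;; IO-heavy searcher refreshes while the tx-log is replayed
  :refresh-frequency "PT5M"}}
```

The trade-off is that full-text queries may see newly indexed documents later, so this is probably best treated as a tuning knob rather than a fix for the missing checkpoint restore.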
thanks! I won't merge the PR myself now, but I will make sure it gets some brain cycles soon
Would you be okay to sign the CLA pdf and email it to us? Instructions here https://github.com/xtdb/xtdb/blob/master/CONTRIBUTING.adoc#how-to-contribute
there doesn't seem to be deps.edn files for the modules... it would be much easier to use forked fix versions as git deps without waiting for official release
noted, feel free to open an issue (not PR 😅) for that also - I'm not really sure what it would entail, but we do make rather extensive use of lein features currently. We can also publish a snapshot release very easily once the fix is merged in the meantime, if it helps at all
You can, by adding a `:deps/root` key
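A hedged sketch of what that could look like in a consuming project's deps.edn. The fork URL and the SHA are placeholders, and `modules/lucene` is assumed to be the module's directory in the repo. tools.deps still needs a manifest it can read (a deps.edn or pom.xml) in that directory, so this may not work against the lein-only modules as they are today.

```clojure
;; deps.edn of the consuming project (sketch, not a tested setup)
{:deps
 {com.xtdb/xtdb-lucene
  {:git/url "https://github.com/<your-fork>/xtdb" ; hypothetical fork URL
   :git/sha "<full-commit-sha>"                   ; placeholder, use the real commit
   ;; resolve the dependency from a subdirectory of the repo
   :deps/root "modules/lucene"}}}
```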
Has anyone seen issues with xtdb, rocksdb, and last year’s M1 Mac? I keep getting java errors, and can’t find much info on google about it.
When I try to follow the in-memory tutorial I get this error
Caused by java.lang.UnsatisfiedLinkError
'long org.rocksdb.LRUCache.newLRUCache(long, int, boolean, double)'
Maybe I’m too dumb to perceive it, but is there a workaround mentioned here?
It's not obvious. I just clicked through to the linked RocksDB issue, and it looks like the suggestion there (which is echoed on the above, but I didn't realize it at first) is to run an x86_64 JDK under Rosetta. Someone with more knowledge will hopefully chime in, as I have no first-hand experience.
I've used sdkman successfully to make switching JDKs easy https://itnext.io/how-to-install-x86-and-arm-jdks-on-the-mac-m1-apple-silicon-using-sdkman-872a5adc050d
@UJL94RYSW I commented on the issue thread for future users, but switching to an x86 JDK is the current work-around, as other folks have suggested. The Java bindings for Rocks apparently need very little work to run on ARM... they just haven't been released yet.
Thanks y’all 🙂