#xtdb
2022-02-02
fugbix08:02:06

Hi everyone ☀️ — a couple of questions about xt:fn — assuming I submit the transaction

[[:xtdb.api/fn :my-func {...}] [:xtdb.api/fn :my-func {...}] ...]
where :my-func returns false when it detects an inconsistency, and otherwise returns a :xtdb.api/put transaction. • does the context passed to each function call hold the speculative database so far? • is the entire transaction aborted if any of the db function calls returns false? Intuitively I’d assume the answer is yes to both questions, but I’m having a hard time verifying this. Thank you!!

fugbix08:02:21

never mind, my bad. I was returning nil rather than false ..

jarohen09:02:16

> Intuitively I’d assume the answer is yes to both questions
yes and yes 🙂 glad you got to the bottom of it in the end!
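
A minimal sketch of the pattern under discussion, assuming a started XTDB `node`; `check-consistent?` is a hypothetical predicate, and the document shapes are illustrative:

```clojure
(require '[xtdb.api :as xt])

;; install the transaction function: a document whose :xt/fn
;; is a quoted fn of [ctx & args]
(xt/submit-tx node
  [[::xt/put
    {:xt/id :my-func
     :xt/fn '(fn [ctx doc]
               ;; `ctx` exposes the *speculative* db: it already reflects
               ;; the ops of earlier entries in the same transaction
               (let [db (xtdb.api/db ctx)]
                 (if (check-consistent? db doc) ; hypothetical predicate
                   [[:xtdb.api/put doc]]        ; emit further tx-ops
                   false)))}]])                 ; false aborts the whole tx

;; invoking it several times in one transaction, as in the question:
(xt/submit-tx node
  [[::xt/fn :my-func {:xt/id :a}]
   [::xt/fn :my-func {:xt/id :b}]])
```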

fugbix10:02:20

I just had to read the doc better :))

jarohen10:02:26

you're not the first person that's been caught out by that, if that helps, what with Clojure often treating false and nil as equivalent
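
Plain Clojure, for anyone following along: both values are falsey in conditionals, which is what makes the mix-up easy, but they are distinct values, so code (like XTDB's handling of a tx-fn's return value) can still check for `false` specifically:

```clojure
;; both are falsey in a conditional...
(if nil :then :else)   ; => :else
(if false :then :else) ; => :else

;; ...but they are different values and can be told apart
(= nil false)  ; => false
(false? nil)   ; => false
(false? false) ; => true
(nil? nil)     ; => true
```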

fugbix10:02:08

yup falsity

fugbix10:02:53

But I guess in our case returning nil is also valid --> treated the same as empty tx-ops {}

tatut13:02:09

How costly is xt/db? I'm getting OutOfMemoryErrors from simple calls to it with a largeish LMDB index (details in thread)

tatut13:02:41

java.lang.OutOfMemoryError: Direct buffer memory
at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
at org.lwjgl.BufferUtils.createByteBuffer(BufferUtils.java:75)
at org.lwjgl.system.MemoryStack.create(MemoryStack.java:86)
at org.lwjgl.system.MemoryStack.create(MemoryStack.java:75)
at java.base/java.lang.ThreadLocal$SuppliedThreadLocal.initialValue(ThreadLocal.java:305)
at java.base/java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:195)
at java.base/java.lang.ThreadLocal.get(ThreadLocal.java:172)
at org.lwjgl.system.MemoryStack.stackGet(MemoryStack.java:790)
at org.lwjgl.system.MemoryStack.stackPush(MemoryStack.java:799)
at xtdb.lmdb$new_transaction.invokeStatic(lmdb.clj:69)
at xtdb.lmdb$new_transaction.invoke(lmdb.clj:66)
at xtdb.lmdb.LMDBKv.new_snapshot(lmdb.clj:220)
at xtdb.kv.index_store.KvIndexStore.open_index_snapshot(index_store.clj:1110)
at xtdb.query.QueryEngine.db(query.clj:1944)
at xtdb.node.XtdbNode.db(node.clj:104)
at xtdb.node.XtdbNode.db(node.clj:100)
at <my code calling (xt/db node)>

tatut13:02:18

the db checkpoint was around 1.5gb and the machine has 8gb of memory with Java max heap set to 6gb

tatut13:02:01

I'm adding a db to each request, as nearly all service calls will need to do some queries. This is happening in AWS Fargate linux env, never encountered this locally on my dev macbook

tatut13:02:26

any pointers on where to look?

tatut13:02:40

version is 1.20.0

refset13:02:53

The xt/db call itself shouldn't be too costly, but I suspect you may need to leave more native memory in reserve for LMDB itself to use. For instance, you could try setting the max heap to 3GB, which leaves 3GB for off-heap (by default it's 1:1 IIRC), and implicitly somewhere in the region of 1.5GB-2GB for native allocations

tatut13:02:13

ah, I had it backwards... my first intuition was to increase the JVM max heap as I was getting OOME thrown

tatut13:02:57

but is the cost of xt/db directly proportional to the db size? as this only happens when the db is big

refset13:02:39

well, I suppose that may yet turn out to be a valid intuition 🙂

refset13:02:19

> is the cost of `xt/db` directly proportional to the db size?
no, it shouldn't be. There are per-db caches that are created, but by default they are quite conservative

tatut13:02:25

any recommendations on how much to leave for Linux kernel buffers? if it's memory-mapped files, I guess those should benefit from the kernel buffers

✔️ 1
tatut13:02:00

and another peculiar thing is that it doesn't seem to resolve itself: the instance needs to be restarted for it to recover... so it's not just a load spike with many requests taking a lot of memory

tatut13:02:39

the system started up, replayed a long tx log, was idle for a while and saved a checkpoint... then served a few requests and started throwing OOME.

refset13:02:33

> any recommendations with how much to leave for linux kernel buffers
hmm, I guess that's what I really meant by "native allocations"(!)

tatut13:02:00

I'll need to investigate the heap dumps more to see what's up with it.

👍 1
refset13:02:17

feel free to open an issue to dig into this further, and if you are able to generate a ~minimal repro that would certainly be helpful 🙏

tatut13:02:32

yeah, that might be difficult, but I'll share what I find later (luckily we are not in production now, so this isn't causing actual down time for anyone... just an issue that we need to resolve before too long)

tatut13:02:56

it might just be that we need to throw more memory at the problem, but I want to understand the reasons before doing that

refset13:02:25

one other idea to try is setting -XX:MaxDirectMemorySize to a good value (`== -Xmx`), since by default it is (probably) unbounded
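
A sketch of what that split might look like for the 8GB box described above; the numbers are illustrative, not a recommendation:

```shell
# 3GB heap + 3GB direct buffers, leaving ~2GB for LMDB's mmap'd
# files / kernel page cache and other native allocations
java -Xmx3g -XX:MaxDirectMemorySize=3g -jar app.jar
```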

tatut13:02:39

yeah, found that useful article at the same time

tatut13:02:53

It wasn't the Java heap but a separate config for direct memory buffers

tatut13:02:26

so I'll take a couple of gb from heap and move that to max direct memory... that should probably fix it

🤞 1
tatut13:02:10

> By default, JVM allows 64MB for direct buffer memory
that sounds like a very small default, but I guess most programs don't use big memory-mapped databases

🙂 1
jarohen13:02:03

IIRC the 64MB is just a placeholder that gets overwritten quite early on when the JVM starts up - either to -XX:MaxDirectMemorySize, unbounded, or to the same as -Xmx depending on the flavour and version of JVM

jarohen13:02:45

but yes, in either of the two latter cases, this could potentially cause an OOM if you have 6GB heap and an 8GB memory limit

jarohen13:02:27

I doubt xt/db itself is the overall cause, tbh - it does very little beyond creating a handful of small objects - but it could well be the straw that broke the camel's back

jarohen13:02:40

if you can, it might be worth checking some of the JVM monitoring tools - you can get usage stats from JMX, a few profilers show it if it's a local JVM, and I daresay any other monitoring tools you have attached may well show it too
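
For what it's worth, those buffer-pool stats are also reachable directly from the JVM's platform MXBeans, without extra tooling; a small sketch:

```clojure
(import '[java.lang.management ManagementFactory BufferPoolMXBean])

;; the "direct" pool covers ByteBuffer/allocateDirect (what the OOME
;; above ran out of); the "mapped" pool covers memory-mapped files
(doseq [^BufferPoolMXBean pool
        (ManagementFactory/getPlatformMXBeans BufferPoolMXBean)]
  (println (.getName pool)
           "used:" (.getMemoryUsed pool)
           "capacity:" (.getTotalCapacity pool)))
```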

tatut06:02:29

thanks, I've been meaning to add the cloudwatch reporter with jvm metrics... I've only had the CPU/mem % that fargate reports so far

lgessler22:02:07

leaving this here for Googlers in the future: got a curious error message java.lang.IllegalArgumentException: No implementation of method: :latest-completed-tx of protocol: #'xtdb.db/LatestCompletedTx found for class: xtdb.kv.index_store.KvIndexStoreTx, ended up being that I was using xt/with-tx inside a transaction function, which (very reasonably) seems to be unsupported. Managed to work around it with no issue

👍 2
refset23:02:53

It's definitely something we hope to be able to support at some point, and is (I think) covered by this existing issue https://github.com/xtdb/xtdb/issues/1434 Out of interest, were you trying to implement integrity constraint validations using Datalog?

👍 1
lgessler03:02:53

cool! nah, not doing that, doing something else that I don't even know how to put concisely

🙂 1
Steven Deobald05:02:11

@U49U72C4V We're huge fans of the Clojurians archives — the volunteers who maintain them are saints. But, fwiw, http://discuss.xtdb.com is a saner (and mutable, which in this case is a good thing) home for tidbits you'd like to show up in Google searches. 🙂 Please feel free to cross-post there.

👍 1
lgessler16:02:13

I hesitated to post a little scrap like this there but it's good to know you think it belongs! I'll post it there later today

♥️ 1