datomic

simongray 2025-04-25T11:07:22.427189Z

Is it possible to apply a rule optionally? e.g. I have an ancestor rule that is serving me well, but I also want the results for the query sans this rule.

simongray 2025-04-25T11:08:07.273939Z

Am I forced to execute two separate queries and create a union of the results?

simongray 2025-04-25T11:12:52.406449Z

Currently I do this:

(set/union
  (d/q (conj query '(ancestor ?msItem ?e)) db manuscript-ancestor-rule)
  (d/q query db []))

favila 2025-04-25T12:51:18.518499Z

Use an or-join (which is just a rule you don’t name) where one of the branches uses identity to pass the input through to the output unchanged. See this thread, which is probably similar to your ancestor rule: https://clojurians.slack.com/archives/C03RZMDSH/p1685543111246059?thread_ts=1685541923.981759&channel=C03RZMDSH&message_ts=1685543111.246059
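A minimal sketch of the or-join idea, run against plain tuples rather than a real database so it stays self-contained. The `ancestor` rule body, the `:parent` attribute, and the sample data are hypothetical stand-ins, not the rule or schema from this thread, and it assumes the Datomic peer library is on the classpath.

```clojure
(require '[datomic.api :as d])

;; Hypothetical recursive rule: ?anc is a direct or indirect parent of ?item.
(def ancestor-rule
  '[[(ancestor ?item ?anc)
     [?item :parent ?anc]]
    [(ancestor ?item ?anc)
     [?item :parent ?p]
     (ancestor ?p ?anc)]])

;; Hypothetical data as [e a v] tuples: 3 -> 2 -> 1.
(def data
  [[3 :parent 2]
   [2 :parent 1]])

(def query
  '[:find ?e
    :in $ % ?msItem
    :where
    (or-join [?msItem ?e]
      ;; Branch 1: bind ?e to every ancestor of ?msItem.
      (ancestor ?msItem ?e)
      ;; Branch 2: pass ?msItem through unchanged as ?e.
      [(identity ?msItem) ?e])])

;; One query instead of a set/union over two queries.
(d/q query data ancestor-rule 3)
```

The intent is that `?e` covers the item itself plus its ancestors in a single result set, so the rule is effectively optional.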

2025-04-25T20:36:23.367579Z

We’re seeing occasional OOM kills in Kubernetes with our valcache-enabled peers and are starting to suspect direct memory allocation could be related. Is there any recommendation for setting MaxDirectMemorySize when using valcache with Pro (not Cloud)? It looks like it may be set in the Datomic Cloud defaults.

2025-04-30T14:45:23.687979Z

@joe.lane just following up here on the OOM problem. Basically, we’re seeing our valcache-enabled peers hit OOM limits in Kubernetes in what appear to be off-heap allocation areas. This week we ran an experiment with a smaller-than-normal heap to give the pod plenty of headroom (heap is around 6 GB, pod has 28 GB). Datadog memory tracking shows that the heap is stable, but the actual RSS of the pod grows over time.
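One way to check from inside the peer whether NIO buffers account for part of that growth is to read the JVM's buffer pool MXBeans. This is only a sketch using standard JDK management classes; `buffer-pool-stats` is a hypothetical helper name, and it says nothing about memory allocated outside these pools (for example the OS page cache or native allocations made by libraries).

```clojure
(import '(java.lang.management ManagementFactory BufferPoolMXBean))

(defn buffer-pool-stats
  "Returns one map per JVM buffer pool (typically \"direct\" and \"mapped\")."
  []
  (mapv (fn [^BufferPoolMXBean pool]
          {:name           (.getName pool)
           :count          (.getCount pool)          ; number of live buffers
           :used-bytes     (.getMemoryUsed pool)     ; memory the JVM reports as used
           :capacity-bytes (.getTotalCapacity pool)}) ; total capacity of the buffers
        (ManagementFactory/getPlatformMXBeans BufferPoolMXBean)))

;; Print the current numbers; logging these periodically shows a trend.
(doseq [stats (buffer-pool-stats)]
  (println stats))
```

If the "direct" pool stays flat while RSS keeps climbing, MaxDirectMemorySize is less likely to be the relevant knob.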

2025-04-30T14:46:07.611079Z

in other experiments, we’ve observed a large OS-level file page cache, which is not surprising given our valcache usage

2025-04-30T14:46:37.213419Z

-Ddatomic.valcacheMaxGb=64

Joe Lane 2025-04-30T14:47:12.090669Z

And do you get different results when you disable valcache on peers? Could you run the same analysis on the transactor? What Datomic version are you running?

2025-04-30T14:48:37.565849Z

we also have direct allocation profiling enabled, and we can see that valcache (and redis) are high contributors there

2025-04-30T14:48:40.516799Z

com.datomic/peer {:mvn/version "1.0.7277"}

2025-04-30T14:48:53.893229Z

we will try disabling valcache entirely today

Joe Lane 2025-04-30T14:49:18.779359Z

What version of core.async are you using in your peers? Can you find it? (It may be pulled in transitively.)

2025-04-30T14:50:03.184489Z

checking

2025-04-30T14:50:30.078969Z

clojure -Stree | grep async
    . org.clojure/core.async 1.5.648
  X org.clojure/core.async 1.5.648 :superseded
        . org.clojure/core.async 1.6.681
            X org.clojure/core.async 1.5.648 :older-version
          X org.clojure/core.async 1.5.648 :older-version
            X org.clojure/core.async 1.5.648 :older-version
          X org.clojure/core.async 1.5.648 :older-version
        X org.clojure/core.async 1.5.648 :older-version
    . org.clojure/core.async 1.6.681 :newer-version

Joe Lane 2025-04-30T14:51:07.152459Z

On mobile now, can’t parse. which is being used?

2025-04-30T14:51:59.727969Z

looks like we don’t pin the core.async dep ourselves, so it’s likely coming in transitively

2025-04-30T14:54:22.887359Z

looks like org.clojure/core.async 1.6.681

2025-04-30T14:59:19.456679Z

looks like com.cognitect.aws/api also brings it in, but that copy should be superseded

Joe Lane 2025-04-30T15:02:29.337369Z

Could you upgrade core.async to 1.8.741?
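Since core.async is only arriving transitively here, one way to try that version is to pin it as a top-level dependency, which tools.deps prefers over transitive versions. A sketch of the relevant deps.edn fragment, assuming the project otherwise matches the versions quoted in this thread:

```clojure
;; deps.edn (fragment): a top-level entry overrides transitive versions.
{:deps {com.datomic/peer       {:mvn/version "1.0.7277"}
        org.clojure/core.async {:mvn/version "1.8.741"}}}
```

Re-running `clojure -Stree | grep async` afterwards should confirm which version is selected.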

2025-04-30T15:02:35.653229Z

yes, we can try that

Joe Lane 2025-04-30T15:03:13.703909Z

Please also a/b test the transactor and valcache on/off in peers

2025-04-30T15:03:36.440609Z

will do

Joe Lane 2025-04-30T15:04:41.374679Z

Was it the JVM that OOMed or the Linux OOM killer?

2025-04-30T15:18:36.271669Z

not the JVM. Kubernetes/Linux is identifying that the container’s memory usage is breaching the defined limit

2025-04-30T15:20:01.507979Z

which is why this is peculiar to us: the heap remains fixed at a reasonable size, and there’s plenty of headroom for GC, compressed code, etc. It’s the “other” category that’s growing without bound, which is why we’re investigating the off-heap allocation and filesystem areas
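To see how the container's own accounting splits that "other" memory between anonymous memory and file page cache, the cgroup `memory.stat` file can be read from inside the pod. A sketch only: `parse-memory-stat` is a hypothetical helper, and the path and the `anon`/`file` keys assume cgroup v2 (cgroup v1 typically uses `/sys/fs/cgroup/memory/memory.stat` with `rss` and `cache`).

```clojure
(require '[clojure.string :as str])

(defn parse-memory-stat
  "Parses the text of a cgroup memory.stat file (\"key value\" per line)
  into a map of keyword -> long. Blank or malformed lines are skipped."
  [text]
  (into {}
        (for [line (str/split-lines text)
              :let [[k v] (str/split (str/trim line) #"\s+")]
              :when (and (seq k) v)]
          [(keyword k) (Long/parseLong v)])))

;; Assumed cgroup v2 path; adjust for the node's cgroup version.
(let [stats (parse-memory-stat (slurp "/sys/fs/cgroup/memory.stat"))]
  ;; :anon is anonymous memory (heap, direct buffers, native allocations);
  ;; :file is page cache charged to this container.
  (println (select-keys stats [:anon :file])))
```

Page cache is charged to the container's cgroup, though the kernel can usually reclaim it under pressure, so a growing `:anon` figure is the more telling signal for an off-heap leak.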

2025-04-30T15:26:07.887349Z

just rolled out a change to disable valcache, monitoring today

Joe Lane 2025-04-30T15:32:05.264809Z

FWIW, we run valcache in peers in k8s at Nu and afaik we don’t have this problem.

2025-04-30T15:32:34.310049Z

that’s great to know

2025-04-25T21:28:14.767349Z

I recognize a recommended value is probably highly specific to the application and not generalizable, but simply knowing if this is commonly restricted at all might help

Joe Lane 2025-04-25T21:31:03.378819Z

Not generally, no, there isn’t a recommended setting. What version are you running and do you have any diagnostics information? Is there a reason you suspect MaxDirectMemorySize matters here?

2025-04-28T21:25:28.486399Z

let me confirm a few more things this week and respond
