datahike

john 2026-03-03T17:43:17.385689Z

Hello. I'm working on a solution to a problem that I think floats in a similar area to concepts I've heard discussed in this community in the past. Basically, doing stuff on the stack. For cljs-thread I wanted shared persistent data between web workers, so I invented self-GCing stack-atom kind of things: https://github.com/johnmn3/cljs-thread/blob/eve/README.md

I'm going to port it to Clojure soon, though, and target mmap'd files. That should theoretically allow uncoordinated OS processes to independently bash on 500GB Clojure maps in parallel. I'm curious, given y'all's expertise and experience in this area, where you see this going, and whether you have any advice or interest in planning.

Here's a report on prior art from Claude: https://gist.github.com/johnmn3/d2b60b3c6ed08d9972b83c7138e60a6e

tl;dr:

> The closest thing to EVE in the wild is LMDB on the mmap+multi-reader-epoch axis, combined with immer on the persistent-HAMT axis — but no single system fuses these. Specifically, cross-process epoch-based GC of persistent immutable HAMT nodes does not appear to exist as prior art outside of EVE. The academic EBR literature (DEBRA, PEBR, crossbeam-epoch) is all single-process/single-address-space. The mmap databases (LMDB, Metall) use different reclamation strategies (freelist B-trees, no GC respectively). The persistent structure libraries (immer, im-rs) are fully in-process.

Basically, the way I look at it, these are like "stack atoms": they provide consistent, low-latency, writable windows into process-independent data in a way that might be innovative and new. But I'm not entirely sure... I'm curious if Claude is missing any prior art. Does this method seem similar to anything you've seen before?

john 2026-03-09T02:07:48.593459Z

Here's what I'm talking about: 4 jvm-clojure-eve threads and 4 node-cljs-eve workers banging on 10 MB and 100 MB file atoms in parallel, with almost the same read/write times. Can theoretically scale to terabytes. Haven't tried yet. https://gist.github.com/johnmn3/dca4f571a0b310ed2cb58f31983c70d7

john 2026-03-09T02:17:19.196439Z

Well, theoretically to "8 exbibytes"
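
For reference, 8 EiB is 2^63 bytes, which is the span a non-negative signed 64-bit offset can address. Whether that is the limit meant here is an assumption; the message doesn't say where the figure comes from. A quick check of the arithmetic (the function name is made up for illustration):

```python
EIB = 2**60  # bytes in one exbibyte


def max_signed_64bit_span_eib():
    """Bytes addressable by a non-negative signed 64-bit offset, in EiB."""
    return 2**63 // EIB
```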

whilo 2026-04-14T03:56:04.108629Z

I guess you arrived at https://github.com/SeniorCareMarket/eve now @john? This indeed looks interesting, how do you think about synching across machines? You rely on shared mmap'ed files, right? I assume you implemented a custom edn pointer format.

whilo 2026-03-05T22:15:08.447229Z

Sorry for replying late, was super busy. I will take a look. Definitely looks very related and cool. The tricky bit with off heap memory is to handle it well, but I am also interested in adaptively compiling to WASM.

john 2026-03-05T22:39:58.107039Z

I'm working on a native version now. I'll show you when I have something

john 2026-04-14T15:49:55.417439Z

Cross machine syncing is coming

john 2026-04-14T15:51:53.940479Z

The filebacked version of the atom uses mmap, yeah.
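
A rough sketch of the general idea behind a file-backed cell over mmap, not EVE's actual code: two independent mappings of the same file see each other's writes. The 8-byte root slot at offset 0 and all function names are assumptions for illustration. A real atom also needs an atomic compare-and-swap on that slot, which Python's standard library does not expose, so this only shows the shared-visibility part.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: one 8-byte little-endian "root pointer" at offset 0.
ROOT = struct.Struct("<Q")
SIZE = 4096


def open_mapping(path):
    """Map `path` read/write; the default mapping is shared, so writes reach the file."""
    f = open(path, "r+b")
    return f, mmap.mmap(f.fileno(), SIZE)


def read_root(m):
    return ROOT.unpack_from(m, 0)[0]


def write_root(m, value):
    # Plain store, NOT a compare-and-swap: fine for a demo, not for real concurrency.
    ROOT.pack_into(m, 0, value)


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "r+b") as f:
        f.truncate(SIZE)  # the file must be at least as large as the mapping
    fa, a = open_mapping(path)  # stands in for "process A"
    fb, b = open_mapping(path)  # stands in for "process B"
    try:
        write_root(a, 42)
        return read_root(b)  # B observes A's write through the shared file
    finally:
        a.close()
        b.close()
        fa.close()
        fb.close()
        os.remove(path)
```

Here both mappings live in one process for simplicity; the same calls work from separate processes that open the same path.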

john 2026-04-14T15:52:11.615159Z

It's not a custom edn pointer format

john 2026-04-14T15:53:06.115609Z

IIUC

john 2026-04-14T15:53:42.675519Z

It's HAMT pointers

john 2026-04-14T15:54:04.293539Z

Pretty similar to clojure, just off-heap
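
To make "HAMT pointers, just off-heap" concrete, here is a toy sketch in which child pointers are byte offsets into a flat buffer rather than object references. The node layout (a u32 bitmap followed by u32 slots, with the low bit tagging a leaf) is invented for illustration and is not EVE's format; the key also doubles as its own hash to keep things short.

```python
import struct

BITS = 5                  # 32-way branching, as in Clojure's persistent maps
MASK = (1 << BITS) - 1
LEAF_TAG = 1              # hypothetical: low bit of a slot marks a leaf record


def append_leaf(buf, key, value):
    """Append a (key, value) record of two u32s; return its tagged offset."""
    off = len(buf)
    buf.extend(struct.pack("<II", key, value))
    return off | LEAF_TAG


def append_node(buf, children):
    """Append a node: u32 bitmap, then one u32 slot per set bit, in index order.
    `children` maps a 5-bit index to a slot (node offset or tagged leaf offset)."""
    bitmap = 0
    for idx in children:
        bitmap |= 1 << idx
    off = len(buf)
    buf.extend(struct.pack("<I", bitmap))
    for idx in sorted(children):
        buf.extend(struct.pack("<I", children[idx]))
    return off


def lookup(buf, root, key):
    """Walk from `root`, consuming 5 bits of the key per level."""
    node, shift = root, 0
    while True:
        (bitmap,) = struct.unpack_from("<I", buf, node)
        bit = 1 << ((key >> shift) & MASK)
        if not bitmap & bit:
            return None
        pos = bin(bitmap & (bit - 1)).count("1")  # popcount of lower bits
        (slot,) = struct.unpack_from("<I", buf, node + 4 + 4 * pos)
        if slot & LEAF_TAG:
            k, v = struct.unpack_from("<II", buf, slot & ~LEAF_TAG)
            return v if k == key else None
        node, shift = slot, shift + BITS


def build_demo():
    """Keys 3 and 35 share their low 5 bits, so they sit under one inner node."""
    buf = bytearray()
    leaf3 = append_leaf(buf, 3, 300)
    leaf35 = append_leaf(buf, 35, 3500)
    leaf7 = append_leaf(buf, 7, 700)
    inner = append_node(buf, {0: leaf3, 1: leaf35})
    root = append_node(buf, {3: inner, 7: leaf7})
    return buf, root
```

Because every pointer is an offset relative to the buffer, the same bytes mean the same thing in any process that maps them, which is what makes this shape a fit for an mmap'd file. The `bytearray` here could be swapped for an `mmap` object for reads.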

john 2026-03-09T15:39:24.258699Z

BTW, Claude had to explain to me that "stack atoms" is the wrong term here, lol. I was thinking off-heap and stack were the same thing... What I mean are mmap'd file atoms.