Ah, found it :)
Yes, you can do so, but most databases let you keep only a small part of the data in memory and the rest on disk. The smaller your memory budget, the slower things will typically be, because you have to load things from storage every time you hit something that is not loaded.
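To make the "small part in memory, rest on disk" idea concrete, here is a minimal sketch using Python's standard `mmap` module. It is not how any particular database does it, and `read_slice` and the temp file are names made up for the example. The OS only pages in the parts of the file you actually touch:

```python
import mmap
import os
import tempfile


def read_slice(path: str, offset: int, length: int) -> bytes:
    """Return `length` bytes starting at `offset` without reading the whole file."""
    with open(path, "rb") as f:
        # Map the file read-only; the OS loads pages from disk on first access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


# Small temporary file standing in for a much larger one (illustration only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 16)  # 4096 bytes in total
    demo_path = tmp.name

piece = read_slice(demo_path, 256, 4)
os.remove(demo_path)
```

Pages that are not resident have to be fetched from storage on access, which is where the slowdown with a small memory budget comes from.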
Yeah, I'm just going to be storing chunks of binary data. Maybe DuckDB? I don't need any db functionality. Not even kv. Just indexed access into terabytes of binary or character data
How do you want to address this data?
For numerical arrays HDF5 is a good choice.
Chunks are generally read out sequentially. They're decrypted just in time, then erased/overwritten after being read out
Okay I'll check that
I'm using files right now, but I'd rather use an existing solution
If it is just blobs, then something like kv probably makes sense, because you need some way to individually address pieces of memory.
Looks like there's an HDF5 Clojure lib, nice. I'll give it a spin. Thanks!
Your data needs to be tensor-shaped, i.e. not have varying-size blobs, but be fairly regular in its nested size.
You can pad the space if needed, but that can get inefficient.
Sorry, that is too algebraic. Basically just nested arrays of fixed size. I guess it will make sense when you look at HDF5.
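Roughly why the fixed size matters: with regular nesting, the position of any element can be computed from its indices alone, no lookup table needed. A small sketch of that arithmetic in Python follows. This is just plain row-major indexing, not HDF5's actual on-disk format, and `flat_offset` and the example shape are made up for illustration:

```python
def flat_offset(shape, index, item_size):
    """Byte offset of `index` in a row-major array with fixed `shape`."""
    if len(shape) != len(index):
        raise ValueError("index must have one entry per dimension")
    position = 0
    for dim, i in zip(shape, index):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension of size {dim}")
        # Step over `dim` slots per unit of the outer index, then add the inner one.
        position = position * dim + i
    return position * item_size


# Hypothetical layout: 1000 records, each 4 rows by 8 columns of 8-byte values.
shape = (1000, 4, 8)
example = flat_offset(shape, (2, 1, 3), item_size=8)
```

With varying-size blobs this breaks down, because the offset of blob N depends on the sizes of all the blobs before it.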
Yeah that works fine
I'll be using a chunk size that's ideally small, but large enough for high throughput. But all the same size. I'll dig into it.
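For what it's worth, with equal-size chunks the plain-file version is only a few lines. A sketch in Python, assuming one flat file of back-to-back chunks; `CHUNK_SIZE`, `read_chunk`, and the demo file are hypothetical, and decryption and wiping are left out:

```python
import os
import tempfile

CHUNK_SIZE = 4096  # hypothetical value; tune for throughput


def read_chunk(f, index, chunk_size=CHUNK_SIZE):
    """Read chunk number `index` from a file of back-to-back fixed-size chunks."""
    f.seek(index * chunk_size)  # the offset follows directly from the index
    data = f.read(chunk_size)
    if len(data) != chunk_size:
        raise EOFError(f"chunk {index} is missing or truncated")
    return data


# Demo: three chunks, each filled with its own index as the byte value.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for n in range(3):
        tmp.write(bytes([n]) * CHUNK_SIZE)
    demo_path = tmp.name

with open(demo_path, "rb") as f:
    third = read_chunk(f, 2)
    # Sequential read-out is just consecutive indices.
    first_bytes = [read_chunk(f, n)[0] for n in range(3)]

os.remove(demo_path)
```

An existing container format mostly adds metadata, integrity checks, and tooling on top of this kind of offset arithmetic.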