datahike

john 2024-11-25T17:32:05.871369Z

Ah, found it :)

whilo 2024-11-25T17:36:07.149869Z

Yes, you can do so, but most databases allow you to only keep a small part of the data in memory and the rest on disk. The smaller your memory budget, the slower things will typically be, as you need to load things from storage every time you hit something that is not loaded.
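
(Editor's aside: the memory-budget trade-off described above can be sketched with a small least-recently-used cache. This is only an illustration, not how any particular database does it. `ChunkCache`, `load_chunk`, and `count_misses` are hypothetical names, and the "storage" here is just a lambda.)

```python
from collections import OrderedDict


class ChunkCache:
    """Keep at most `capacity` chunks in memory, evicting the least recently used."""

    def __init__(self, load_chunk, capacity):
        self.load_chunk = load_chunk  # hypothetical callable: index -> bytes
        self.capacity = capacity
        self.cache = OrderedDict()
        self.misses = 0

    def get(self, index):
        if index in self.cache:
            self.cache.move_to_end(index)  # mark as most recently used
            return self.cache[index]
        self.misses += 1  # a miss stands in for a slow trip to storage
        data = self.load_chunk(index)
        self.cache[index] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # drop the least recently used entry
        return data


def count_misses(capacity, accesses):
    """Replay an access pattern and report how often storage had to be hit."""
    cache = ChunkCache(lambda i: bytes([i % 256]) * 4, capacity)
    for i in accesses:
        cache.get(i)
    return cache.misses
```

Replaying the same access pattern with a smaller `capacity` gives at least as many misses, which is the slowdown mentioned above.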

john 2024-11-25T17:42:23.394279Z

Yeah, I'm just going to be storing chunks of binary data. Maybe DuckDB? I don't need any db functionality. Not even kv. Just indexed access into terabytes of binary or character data.

whilo 2024-11-25T17:44:43.158289Z

How do you want to address this data?

whilo 2024-11-25T17:44:57.871229Z

For numerical arrays HDF5 is a good choice.

john 2024-11-25T17:45:05.845379Z

Chunks are generally read out sequentially. They're decrypted just in time, then erased/overwritten after being read out.
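
(Editor's aside: a rough sketch of the read-then-overwrite pattern described above, assuming a plain binary stream. Decryption is left out entirely. `read_chunks` and `handle` are hypothetical names, and overwriting the buffer in Python does not guarantee that no other copy of the bytes exists in memory.)

```python
import io


def read_chunks(stream, chunk_size, handle):
    """Read a stream sequentially into one reusable buffer and wipe it after each use."""
    buf = bytearray(chunk_size)
    count = 0
    while True:
        n = stream.readinto(buf)  # fills the same buffer on every iteration
        if not n:
            break
        try:
            # Hand out a view instead of a copy; it is released when the block exits.
            with memoryview(buf)[:n] as chunk:
                handle(chunk)
        finally:
            buf[:n] = bytes(n)  # overwrite the bytes that were just read with zeros
        count += 1
    return count


# Usage: the handler must copy anything it wants to keep past the call.
collected = []
total = read_chunks(io.BytesIO(b"abcdefghij"), 4, lambda c: collected.append(bytes(c)))
```

The final chunk may be shorter than `chunk_size` if the stream length is not a multiple of it.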

john 2024-11-25T17:45:18.259099Z

Okay I'll check that

john 2024-11-25T17:45:51.472109Z

I'm using files right now, but I'd rather use an existing solution

whilo 2024-11-25T17:45:56.280889Z

If it is just blobs, then probably something like a kv store makes sense, because you need to be able to somehow individually address pieces of memory.

john 2024-11-25T17:53:40.623659Z

Looks like there's an HDF5 Clojure lib, nice. I'll give it a spin. Thanks!

whilo 2024-11-25T17:54:51.500999Z

Your data needs to be tensor shaped, i.e. not have blobs of varying size, but be fairly regular in its nested size.

whilo 2024-11-25T17:55:02.478899Z

You can pad the space if needed, but that can get inefficient.
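
(Editor's aside: a small sketch of what padding to a fixed slot size costs, using only plain bytes. `pad_blobs` and `padding_overhead` are hypothetical helpers, not part of HDF5 or any library. The original lengths have to be kept separately so the padding can be stripped again.)

```python
def pad_blobs(blobs, slot_size=None):
    """Pad every blob with zero bytes to a common slot size; return (padded, lengths)."""
    if slot_size is None:
        slot_size = max(len(b) for b in blobs)
    if any(len(b) > slot_size for b in blobs):
        raise ValueError("a blob is larger than the slot size")
    lengths = [len(b) for b in blobs]  # needed later to recover the original blobs
    padded = [b.ljust(slot_size, b"\x00") for b in blobs]
    return padded, lengths


def padding_overhead(blobs, slot_size=None):
    """Fraction of the stored bytes that is padding rather than data."""
    padded, lengths = pad_blobs(blobs, slot_size)
    stored = sum(len(p) for p in padded)
    return (stored - sum(lengths)) / stored
```

When one blob is much larger than the rest, most of the stored bytes end up being padding, which is where the inefficiency comes from.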

whilo 2024-11-25T17:55:28.808119Z

https://en.wikipedia.org/wiki/Tensor

whilo 2024-11-25T17:56:03.951769Z

Sorry, that is too algebraic. Basically just nested arrays of fixed size. I guess it will make sense when you look at HDF5.

john 2024-11-25T17:57:19.805089Z

Yeah that works fine

john 2024-11-25T17:58:47.146929Z

I'll be using a chunk size that's ideally small, but large enough for high throughput. But all the same size. I'll dig into it.
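
(Editor's aside: with equal-sized chunks, indexed access reduces to offset arithmetic, since chunk `i` starts at byte `i * chunk_size`. The sketch below shows this on a plain file, not on HDF5. The helper names and the 4096-byte default are assumptions for illustration only.)

```python
import tempfile


def write_chunk(f, index, data, chunk_size):
    """Store one chunk at slot `index`; every chunk must be exactly chunk_size bytes."""
    if len(data) != chunk_size:
        raise ValueError("chunk has the wrong size")
    f.seek(index * chunk_size)
    f.write(data)


def read_chunk(f, index, chunk_size):
    """Fetch the chunk at slot `index` by seeking straight to its offset."""
    f.seek(index * chunk_size)
    data = f.read(chunk_size)
    if len(data) != chunk_size:
        raise IndexError("no complete chunk at this index")
    return data


def demo(chunk_size=4096, count=8, wanted=5):
    """Write `count` chunks to a temporary file and read one back by index."""
    with tempfile.TemporaryFile() as f:
        for i in range(count):
            write_chunk(f, i, bytes([i]) * chunk_size, chunk_size)
        return read_chunk(f, wanted, chunk_size)
```

Larger chunks mean fewer seeks and reads per byte transferred, while smaller chunks keep less data in memory at once, which is the throughput trade-off mentioned above.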
