datahike

grounded_sage 2026-03-29T01:55:04.429309Z

You mentioned integrating Stratum into Datahike as a separate index, @whilo. What does that mean? Is it leveraging the fact that you can do fairly arbitrary things inside of Datahike, like function calling, or are you talking about an even deeper integration?

whilo 2026-03-30T10:11:06.349959Z

https://github.com/replikativ/datahike/pull/795, hopefully I can merge this in the next few days.

👍 1
Casey 2026-03-29T06:52:37.872769Z

@whilo in the new datahike ecosystem (love your efforts here btw!), where do our large (> 4096 bytes, < 50 KB) strings belong? Can we put them in datahike? Or does the limitation Datomic has still apply to datahike? scriptum looks really great for indexing/search, but we still need a place to store them.

👀 1
whilo 2026-03-30T10:37:22.104899Z

Hey @ramblurr and @grounded_sage. I think ideally it should be a configurable setting, but inputs should be bounded. It depends on how much one trusts the user to understand the problem; I think it might be better to reject long inputs.

2026-03-30T13:32:14.132739Z

Hey @grounded_sage, I know this is beside the point of this thread, but do you have a quick example of a "sub-query" to another store inside a datalog query?

Casey 2026-03-30T16:25:11.016999Z

I was going to ask the same question!

😁 1
whilo 2026-03-30T19:04:18.202899Z

What @grounded_sage means is that you can call a blob store as a function in a clause, e.g. [:find ... :where [(k/get blob-store ?e) ?s]], to load a string blob ?s for entity ?e. You can also do this with the file system, etc. Storing data in Datahike is most useful if it is comparable and can therefore be scanned or searched. We might want to integrate blob store functionality too; so far I have avoided this because it is more flexible to query other stores like this. Note that if these stores are mutable and you overwrite values, you lose the persistent memory semantics. That would be a reason to provide a dedicated blob store, or to put things into Datahike for now.
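A minimal sketch of that pattern, under some loud assumptions: the blob store here is just an in-memory map standing in for konserve or the file system, the attribute :doc/blob-key and the function load-blob are made-up names, and the loader is passed in as a query input (?load) rather than called by a namespaced symbol like k/get. The data source is a plain collection of tuples to keep it short; a real database value from a connection would take its place.

```clojure
(require '[datahike.api :as d])

;; Stand-in for an external blob store (hypothetical): blob key -> large string.
;; A real setup would read from konserve, the file system, object storage, etc.
(def blobs
  {"blob-1" "first large string ..."
   "blob-2" "second large string ..."})

(defn load-blob
  "Looks up the string stored under blob-key; returns nil if it is missing."
  [blob-key]
  (get blobs blob-key))

;; Plain [e a v] tuples used as the data source for this sketch.
;; :doc/blob-key is an invented attribute that holds only the small key,
;; so the large string itself never enters the index.
(def datoms
  [[1 :doc/blob-key "blob-1"]
   [2 :doc/blob-key "blob-2"]])

;; The function clause [(?load ?k) ?s] calls the loader once per matching
;; entity and binds the returned string to ?s.
(def result
  (d/q '[:find ?e ?s
         :in $ ?load
         :where
         [?e :doc/blob-key ?k]
         [(?load ?k) ?s]]
       datoms
       load-blob))
```

The caveat about mutability applies directly: if the value behind "blob-1" is overwritten, the same query against an older database snapshot returns the new string, so the external store should be treated as append-only if history matters.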

JAtkins 2026-03-31T04:27:11.339399Z

oh, that's awesome. I'll need to give this a spin

grounded_sage 2026-03-31T09:11:34.799679Z

correct. Sub-query wasn't the right terminology 🙂

2026-03-31T12:26:43.297199Z

Thanks

Casey 2026-04-01T08:57:41.932989Z

here is a working example of this for anyone interested: https://github.com/Ramblurr/playground/blob/main/datahike/blob-store/src/examples/blob_store.clj

🚀 1
grounded_sage 2026-03-30T01:20:50.223539Z

I had a similar question some time ago. @whilo mentioned that anything > 2 KB can cause the tree to become unbalanced, which affects queries. I don't know if that is still valid now. You can tweak things, but generally it is better to store large strings externally and use Datahike for the graph relations. You can perform "sub-queries" to other stores within Datahike queries.