
(This is a repost from #C03S1KBA2, but posting here as it is more specific.) I am building something where each user owns a graph with (in expectation) 5000 nodes and 5000 labeled edges. Nodes are generally 20–50 characters, and I am looking for database options. I am looking at DataScript and Asami, but I am wondering if there are any other options or heuristics to keep in mind. I am using CLJS with re-frame, so DataScript with re-posh (and datsync?) is a consideration (I might need some sync between client and server). It would be nice if the database allowed for separate, user-specific graphs.


The way it works now is that each graph is wrapped in a “database” object, so multiple graphs are handled as multiple databases. Unfortunately, Asami does not currently support queries against multiple graphs. It is on the roadmap, but hasn’t happened yet
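A sketch of the multiple-databases approach, assuming Asami's `connect`/`transact`/`q` API (the URI scheme and attribute names here are illustrative, not prescribed):

```clojure
(require '[asami.core :as d])

;; One in-memory database per user; only the URI differs.
(defn user-conn [user-id]
  (d/connect (str "asami:mem://user-" user-id)))

(def alice (user-conn "alice"))

;; Transact some nodes into Alice's graph only.
@(d/transact alice {:tx-data [{:db/ident :n1 :node/label "Clojure"}
                              {:db/ident :n2 :node/label "Datalog"}]})

;; A query runs against a single database value, so each
;; user's graph stays isolated from the others.
(d/q '[:find ?label
       :where [?n :node/label ?label]]
     (d/db alice))
```

Since each user's data lives behind its own connection, per-user isolation comes for free; the limitation is only that one query cannot span two of these databases.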


5000 is small, and should not be a problem on any system


If DataScript being integrated with other systems matters to you, then that may be the deciding factor. To my knowledge, no one has tried to integrate Asami like that


How heavy are the database objects? I might have one per user, but I don't need to run any queries against multiple graphs.


They’re a light wrapper around the graph. They do contain an array of the state of the graph after every transaction, but because of the way immutability works, that is not very expensive. 🤔 Unless you were to build large graphs, completely empty them, then fill them again, over and over. But in that case you would be better off just creating a new graph. I.e., the wrapper isn’t a big deal
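The reason keeping every post-transaction state is cheap is structural sharing in persistent data structures. A minimal Clojure illustration of the idea (plain sets, not Asami's actual index types):

```clojure
;; Keep every intermediate state of a growing set, one per "transaction".
(def states
  (reductions conj #{} (range 5000)))

;; 5001 states exist, but consecutive states share almost all of their
;; structure; only the path to each newly added element is copied,
;; so total memory is far less than 5001 independent copies.
(count states)         ;; => 5001
(count (last states))  ;; => 5000
```

The same principle applies to immutable graph indexes: each state is a thin layer over the previous one, not a full copy.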


Speaking of which: when importing something “big”, e.g. SNOMED, having a history of inserts isn’t really logically relevant. Maybe that’s an unproductive thought, but is there any use in, and/or way to, avoiding the state-change history when importing data?


In a case like that you’ll have a previous state of “empty” and the next state is “SNOMED”, so there’s not really any waste. Even if you then load a whole lot of stuff in the next transaction (e.g. I load the inferred types and instantiate all classes into prototypical instances), it doesn’t really waste anything at all: it’s almost entirely additive, with a few extra index nodes. If you compare the disk (or memory) usage between loading a few large datasets and doing a single load, it’s a tiny difference


This is a weird one, but can nodes point to edges? Would that just be done by having a node store edge IDs (or point to edge IDs)? I'm curious how you might implement this. In other words, I want every node to have its set of "foreign edges"


The indexes are capable of this, but the API doesn’t provide a mechanism for it yet. I should have done it months ago, but my new job and health issues slowed me down 😕


No problem, I can probably figure out a workaround
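One such workaround in DataScript: reify each edge as its own entity, so a node can hold refs to edge entities directly. This is a sketch, and the schema and attribute names (`:edge/from`, `:node/foreign-edges`, etc.) are illustrative:

```clojure
(require '[datascript.core :as d])

;; Edges are entities, so anything -- including a node -- can point at one.
(def schema
  {:edge/from          {:db/valueType :db.type/ref}
   :edge/to            {:db/valueType :db.type/ref}
   :node/foreign-edges {:db/valueType   :db.type/ref
                        :db/cardinality :db.cardinality/many}})

(def db
  (d/db-with (d/empty-db schema)
             [{:db/id -1 :node/label "a"}
              {:db/id -2 :node/label "b"}
              {:db/id -3 :edge/from -1 :edge/to -2 :edge/label "knows"}
              ;; node "a" keeps a pointer to the edge entity itself
              {:db/id -1 :node/foreign-edges -3}]))

;; Find the labels of the edges foreign to node "a".
(d/q '[:find ?elabel
       :where
       [?n :node/label "a"]
       [?n :node/foreign-edges ?e]
       [?e :edge/label ?elabel]]
     db)
;; => #{["knows"]}
```

Because the edge is a first-class entity, the same pattern supports edge metadata (weights, timestamps) without waiting for dedicated API support.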