Hey there 👋
I’m scanning through some asami code and I noticed that the d/entity function has an arity-3 variant that includes a nested? boolean. What does this do exactly? I’m experimenting with setting it to false in an attempt to achieve a bit of laziness while crawling the graph, but it still seems to auto-pull nested entities.
Hmmm, I thought it wasn’t supposed to when set to false? I haven’t looked at it in a long while though!
Ah, I see. It’s not about nested entities in general
Seems like the key lies on these lines here: https://github.com/threatgrid/asami/blob/40e8cac1eb8dd87ddf1e975455da27b9d688ab6b/src/asami/entities/reader.cljc#L82-L83
I wonder if my nested entities do not possess the :tg/entity key?
it covers the case of things like:
[{:db/ident "first"
  :val 1}
 {:db/ident "second"
  :val 2
  :inner {:db/ident "first"}}]
This has 2 “top-level” individually addressable entities. If you call:
(entity db "second")
you’ll get: {:val 2 :inner {:db/ident "first"}}
But if you use:
(entity db "second" true)
you’ll get: {:val 2 :inner {:val 1}}
Ohh okay I see
If it finds a “top-level” entity, then it just includes a reference to the entity (e.g. {:db/ident "first"})
But if you say, “No, I really do want you to go in and retrieve this other thing” then it will
I see. So maybe this is a side effect of how I’m transacting my data. My use case is language AST analysis, so I’m directly transacting the AST straight into asami. I don’t think I have any true “top-level” entities in this case…
Are you transacting triples, or are you transacting a sequence of maps?
It’s a sequence of a single map, which is very deeply nested. I’m basically transacting one big tree.
OK. Only the very top object will be considered “top level” then
You could always add on the attribute of {:tg/entity true} if you want to tell Asami to treat the entity as a top-level thing
“top level” was an idea that was just created to deal with ingesting and exporting JSON
Ah nice, I was just going to ask whether I should normalize my data manually ahead of time. I should be able to walk this tree and assoc :tg/entity easily.
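A minimal sketch of that tree walk, assuming every map in the AST should be marked as a top-level entity (that may be more than you want in practice). The names mark-entities and ast are made up for illustration; only clojure.walk is used, nothing from Asami itself.

```clojure
(require '[clojure.walk :as walk])

(defn mark-entities
  "Adds :tg/entity true to every map in a nested structure.
  Assumption: every map node should be individually addressable."
  [tree]
  (walk/postwalk
   (fn [node]
     (if (map? node)
       (assoc node :tg/entity true)
       node))
   tree))

;; Hypothetical AST fragment, for illustration only.
(def ast
  {:op   :if
   :test {:op :local :name "x"}
   :then {:op :const :val 1}})

(mark-entities ast)
;; => {:op :if, :tg/entity true,
;;     :test {:op :local, :name "x", :tg/entity true},
;;     :then {:op :const, :val 1, :tg/entity true}}
```

The marked tree would then be transacted in place of the raw AST.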
Out of curiosity, are there ways to “walk” the graph other than with d/entity? Maybe something lower level, like with d/graph which I admittedly haven’t looked far into yet.
I’m shooting for something lazy. I saw your comment in the docs that hinted at d/entity being lazy one day.
The entity function really does just walk the graph 🙂
Haha ok cool, then I’m in the right place 🙂
“One day” yes
For now… not so much
I’m wondering if I could contrib something here, but I think my use-case is too narrow. I know asami supports durable storage now too, so that might complicate things considerably in the lazy department…
Not really. At least, not with the current architecture
Naively I want to slap a lazy-seq in there and call it a day, but surely that’s overly simplistic?
Well, the entities are based on key-value pairs that are lazy. Right now, they get turned into an entity via (into {} …)
But that only matters if you want to iterate over keys
what it needs is a lazy-seq type of approach where you have an object that meets the clojure.lang.Associative interface, and holds onto a map of what has been read so far, plus the context to load up anything that’s missing
Yea, sounds a lot like Wilker’s https://github.com/wilkerlucio/pathom3/blob/master/src/main/com/wsscode/pathom3/interface/smart_map.cljc
This is basically what Datomic entities do
That “lazy map” would be awesome to have as its own library even. I wonder if there is prior art in generalizing it.
To generalize it, I imagine you’d need to have the user pass in their own lazy-seq’d thunk that produces the next kv pair? Hm…that wouldn’t really work though…it’d be O(N) on key access…
Both Datomic and Pathom have the advantage of a schema-like hint data structure.
https://github.com/tonsky/datascript/blob/master/src/datascript/impl/entity.cljc
I think it needs an API for looking up keys to get a value. So basically a cached map
Yea you’re right. Datascript uses a (volatile! {}) as the internal cache
Even datascript has a schema upfront for things like refs…asami’s would need to be more dynamic
That would make me nervous on durable storage. I think I’d prefer an atom
Does asami support any special keywords on d/entity maps? For example, datascript lets you walk backwards using reverse refs with an underscore: :person/_friends
No, but I need to
However, I think I can only do that if I do the cached-map-plus-query-context objects
Yea I think so too. As it stands I don’t see how it could happen. Are you picturing a separate API for this, like d/lazy-entity kind of thing?
nope 🙂
I’ll just make it another instance of Associative/IPersistentMap
That makes sense. For some reason I thought it would change existing behavior, but if it does it means the implementation is wrong 🙂
I might take a crack at this and if I come up with something promising I will definitely share. It sounds like an interesting puzzle.
I think it should be relatively tractable. Reading an entity would return an object that holds both an atom (so the cache can be shared) and a db (which is an immutable read-only structure). Getting any value by key is just a lookup in the cached map, and if it’s not there, then do a lookup in the graph associated with the db. There are 3 cases here:
• If it’s a simple value, then just associate the key with the value in the cache, and return the value.
• If it’s a sequential type (these are stored as linked lists), then I think it’s appropriate to get all of these eagerly.
• If it’s an object type, then create a new cached entity, just like this one, with the same db, and a new root ID to build on.
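A rough sketch of those three cases, not Asami’s implementation. It assumes a toy graph (a plain map of node id to attributes) in place of a real db, and a made-up Ref record to mark “this value is another node”. Ref, toy-graph, load-value and lazy-entity are all hypothetical names, and the object only implements clojure.lang.ILookup rather than a full map interface.

```clojure
;; Hypothetical marker for a value that points at another node.
(defrecord Ref [id])

;; Toy stand-in for the graph: node id -> attribute -> value.
(def toy-graph
  {:root {:op :if, :test (->Ref :n1), :args [1 2 3]}
   :n1   {:op :local, :name "x"}})

(declare lazy-entity)

(defn- load-value
  "Turns a stored value into what the entity should return."
  [graph v]
  (cond
    (instance? Ref v) (lazy-entity graph (:id v))       ; object: new cached entity
    (sequential? v)   (mapv #(load-value graph %) v)    ; sequence: fetched eagerly
    :else             v))                               ; simple value

(defn lazy-entity
  "Returns a lookup-only entity rooted at id. Values are read from the
  graph on first access and kept in an atom afterwards."
  [graph id]
  (let [cache (atom {})]
    (reify clojure.lang.ILookup
      (valAt [this k] (.valAt this k nil))
      (valAt [_ k not-found]
        (if-let [hit (find @cache k)]
          (val hit)                                     ; already read
          (if-let [entry (find (get graph id) k)]
            (let [v (load-value graph (val entry))]
              (swap! cache assoc k v)
              v)
            not-found))))))

(def e (lazy-entity toy-graph :root))
(:op e)               ;; => :if
(:name (:test e))     ;; => "x"
(:args e)             ;; => [1 2 3]
```

Nothing under :test is read until it is asked for, and asking twice returns the same cached child entity.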
Cool, I think we’re on the same page. What is your reasoning behind eagerly fetching sequences? Not disagreeing at all, just curious. Does that have to do with durability?
Not at all. It just didn’t seem to be worth the hassle of laziness. But it would be so easy to wrap it in a lazy-seq that perhaps it’s no big deal
Can do eager at first and then reach for lazy-seq after maybe. That would keep the first draft focused on one thing.
I’m thinking that it would be nice to have a LazySeq that let you provide a count, because that can be returned immediately without having to traverse the linked list 🤔
But, yes. I think eager is fine
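For what it’s worth, the “LazySeq with a count” idea could look roughly like this. It assumes the count is already known from somewhere else (passed in as n here); counted-seq is a made-up name, and the wrapper only supports count and seq-based access, not the full collection interfaces.

```clojure
(defn counted-seq
  "Wraps coll so that count returns n immediately, without realizing coll.
  Assumption: the caller guarantees n matches the real length."
  [n coll]
  (reify
    clojure.lang.Counted
    (count [_] n)
    clojure.lang.Seqable
    (seq [_] (seq coll))))

;; Stand-in for a linked list that is expensive to traverse.
(def items (lazy-seq [:a :b :c]))
(def cs (counted-seq 3 items))

(count cs)         ;; => 3
(realized? items)  ;; => false, count did not touch the list
(first cs)         ;; => :a
(realized? items)  ;; => true
```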
Thanks a lot for your help on the :tg/entity tag, I will try that. Asami is awesome btw. Your code is a joy to read 🙂