Sanity check: if I want to "delete" an entity/node, I should retract all attributes associated to that entity/node, correct?
Yes
Thanks! I'm thinking through representing an objects-within-objects structure in Asami. Specifically, I'm wondering whether child objects should be stored in an array as objects or merely as references. Here are some observations related to this, hoping they are helpful. Let's consider the following data:
[{:db/id :tg/node-0
:text "Text 0"
:child {:db/id :tg/node-1 :text "Text 1"}}]
I'm assuming "deleting" an entity is realised by retracting all attributes of the entity:
(d/transact conn [[:db/retract :tg/node-0 :text "Text 0"]
[:db/retract :tg/node-0 :tg/owns :tg/node-1]
[:db/retract :tg/node-0 :child :tg/node-1]
[:db/retract :tg/node-0 :db/ident :tg/node-0]
[:db/retract :tg/node-0 :tg/entity true]])
1) I would expect deleting node-0 would also delete node-1 from the db.
2) I would expect deleting node-1 would remove references to it in :tg/owns and :child attributes of node-0.
3) These 'issues' multiply when arrays are introduced. For instance, deleting the array does not delete the elements of the array, nor the references to the array or its elements from array's parent.
Of course, I shouldn't have expected this behaviour to come built-in. I guess I was thinking in terms of JSON or Datomic's :db/isComponent.
Retracting an attribute requires us to specify its current value, unless I'm mistaken. The reason for this is not apparent to me, but I know Datomic had the same requirement originally. If I'm storing objects-within-objects and I want to retract a :children attribute, I would have to pass in the whole children subtree?
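For what it's worth, here is a minimal sketch of the "delete by retracting every attribute" idea, in plain Clojure. The `retractions` helper and the `node-0-attrs` sample data are hypothetical names used only for illustration; they are not part of Asami. The sketch assumes the current attribute/value pairs were fetched first, for example with a query like the one in the comment.

```clojure
;; Hypothetical helper (not part of Asami): builds one
;; [:db/retract e a v] statement per [a v] pair of entity e.
(defn retractions
  [e attr-vals]
  (mapv (fn [[a v]] [:db/retract e a v]) attr-vals))

;; Assumption: these pairs would come from a query such as
;; (d/q '[:find ?a ?v :where [:tg/node-0 ?a ?v]] (d/db conn))
(def node-0-attrs
  [[:text "Text 0"]
   [:child :tg/node-1]])

(retractions :tg/node-0 node-0-attrs)
;; => [[:db/retract :tg/node-0 :text "Text 0"]
;;     [:db/retract :tg/node-0 :child :tg/node-1]]
```

The resulting vector could then be passed to d/transact. Note that this only retracts the entity's own statements; child nodes would need the same treatment.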
So, unless my analysis is wrong, it seems simpler to store children as references. Are there any performance benefits to storing them as objects?
Trying to store children as an array of references, I've no idea how to obtain a path between nodes. Consider:
[{:db/id :tg/node-0 :children [{:db/id :tg/node-1}]}
{:db/id :tg/node-1 :children [{:db/id :tg/node-2}]}
{:db/id :tg/node-2 :children []}]
To find the parent of node-2:
(d/q '[:find ?p :where [?p :children ?a] [?a :tg/contains :tg/node-2]] (d/db conn))
But it doesn't seem I can use transitive attributes here. Any suggestions?
Then I thought I could use :tg/owns if I stored them as an array of objects, but I found out that only node-0 owns everything: node-1 owns neither the array (containing node-2) nor node-2.
[{:db/id :tg/node-0
:children [{:db/id :tg/node-1
:children [{:db/id :tg/node-2
:children []}]}]}]
I also tried setting :tg/entity true on child nodes.
:tg/owns is to track objects under the top-level objects.
Try:
(d/q '[:find ?p :where [?p ?a+ :tg/node-2] [?p :children]] (d/db conn))
That should get every parent of node-2 and then cut it back to only the nodes with a :children attribute
(I hope)
That works. Thank you!
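To make the transitive lookup concrete, here is a simplified sketch in plain Clojure of what "every node that reaches node-2 through one or more hops" means. It is only an illustration of the idea, not how Asami evaluates `?a+`. The `triples` data and the `ancestors-of` function are hypothetical, and the intermediate array nodes (:tg/first, :tg/rest, :tg/contains) are left out for brevity.

```clojure
;; Simplified, hypothetical triples: array nodes are omitted.
(def triples
  [[:tg/node-0 :children :tg/node-1]
   [:tg/node-1 :children :tg/node-2]])

(defn ancestors-of
  "Returns the set of nodes that reach `target` through one or more triples,
  regardless of which attribute connects them."
  [triples target]
  (loop [frontier #{target}
         seen     #{}]
    (let [parents (set (for [[e _ v] triples
                             :when (and (frontier v) (not (seen e)))]
                         e))]
      (if (empty? parents)
        seen
        (recur parents (into seen parents))))))

(ancestors-of triples :tg/node-2)
;; => #{:tg/node-0 :tg/node-1}
```

The second pattern in the query, [?p :children], then plays the role of a filter over this set.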
No, no performance benefits at all
TBH, I haven’t really given thought to removing entire entities and their children. It seems like an obvious thing, but it’s not been a use case for anything I’ve done!
I imagine it would be useful when Asami is used to hold app state, CRUD, etc. I think I've identified the source of my expectations :) From https://github.com/threatgrid/asami/wiki/5.-Entity-Structure#arrays:
> Typically, if a node like `:tg/node14856` were to be deleted, then every node that it owns will also be deleted. However, dual ownership like this should ensure that the `:tg/node-14855` node is left alone.
I know I have lots of ongoing work to release, but in the meantime…
I think datomic uses the :db/isComponent schema feature for this?
Makes sense. Not that it really matters unless I introduce :db.fn/retractEntity
this would tie into another change I’m considering
I don’t want to introduce schemas into Asami, but I’ve been thinking about allowing temporary schemas during transactions
something like:
(transact conn {:tx-data [{:db/ident "first" :data 1} {:db/ident "second" :data 2}]})
(transact conn {:tx-data [{:db/ident "first" :data 42}]
:schema {:db/ident :data :db/cardinality :db.cardinality/one}})
(entity conn "first")
;; {:data 42}
someone brought up the idea of temporary schemas when I was showing off datalog support for pyramid. seems like it would be welcome
I don’t know what I’ll do with schema properties I don’t care about.
Maybe use them for validity checking
(transact conn {:tx-data [{:db/ident "first" :data "forty-two"}]
:schema {:db/ident :data
:db/cardinality :db.cardinality/one
:db/valueType :db.type/long}})
;; Exception "Attribute :data cannot accept data: \"forty-two\". Requires Long value"
Sounds useful!
Could :tg/owns be used instead of :db/isComponent? That is, if :tg/owns were changed so that node-1 owns node-2 in the example below:
[{:db/id :tg/node-0
:children [{:db/id :tg/node-1
:children [{:db/id :tg/node-2
:children []}]}]}]
I'm guessing you have other reasons for this behaviour, though, but it is a little confusing to me.
It specifically goes to the root object. It's an internal method of tracking things that are components
If a sub node is not a component, then it shouldn't be inserted that way
That's a flippant reply, but it's not meant to be. Hopefully I can explain better in the morning
Not at all, and I had taken enough of your time yesterday! What I'm trying to say is that, looking at my example above naively, node-2 is in precisely the same relation to node-1 as node-1 is in relation to node-0 (and through the same attribute). Generally, if object 2 is nested within object 1, it feels natural to me to treat object 2 as a component of object 1 by default. For things that are not components, I would use references instead of nesting. But don't mind me. And thanks for your help!
Thanks for the detailed explanation!
Well, my comment was based on what :tg/owns is trying to do. It’s a system property that I use to track relationships between embedded objects. Every “top level” entity (entities that are described independently, and not exclusively inside another entity) will “own” everything underneath it that isn’t also a top level entity.
One reason for this is for an unpublished index (I need to finish more work before I put it out there, but it’s being stored on disk already). Entities are serialized to bytes, and stored in this index. If you request an entity, then there will be a lookup for a triple that doesn’t yet exist (but should do… eventually):
[your-entity :tg/index ?index]
If there is a result for ?index then this will be looked up in the new index. This is a deserialization operation, and much faster than building an object out of triples. If it’s not there, then it can be rebuilt using triples.
But what if you update the object? Well, like any good immutable store, this requires that a new object be written to the index. It also means that anything it is embedded in will also need to be rewritten, up to the root embedding. I can find that recursively, by looking up:
[?parent ?attribute+ your-entity]
But this is tricky, since I don’t want to include anything that refers to your entity, but doesn’t actually own it. That means tracking the original root entity that this was created under. And that’s the :tg/owns attribute.
Incidentally, one of the reasons for the deserialization mess is because if entities look like:
{:id 1, :a "a property", :b 2, :c [{:id 11, :x 1, :y 2}, {:id 12, :x 10, :y 15}]}
Then there will be a pointer to the bytes that serialize all of this, but the serialization step also pulls out the offsets where the :id 11 and the :id 12 entities start, and these can be deserialized independently of the surrounding entity.
As per usual, the above structure would create triples something like:
:tg/node-101 :id 1
:tg/node-101 :a "a property"
:tg/node-101 :b 2
:tg/node-101 :c :tg/node-102
:tg/node-102 :tg/first :tg/node-105
:tg/node-102 :tg/rest :tg/node-103
:tg/node-103 :tg/first :tg/node-106
:tg/node-102 :tg/contains :tg/node-105
:tg/node-102 :tg/contains :tg/node-106
:tg/node-101 :tg/owns :tg/node-102
:tg/node-101 :tg/owns :tg/node-105
:tg/node-101 :tg/owns :tg/node-106
:tg/node-105 :id 11
:tg/node-105 :x 1
:tg/node-105 :y 2
:tg/node-106 :id 12
:tg/node-106 :x 10
:tg/node-106 :y 15
This isn’t exact, but it’s close.
But now we’re also going to see new triples along the lines of:
:tg/node-101 :tg/index 555
:tg/node-105 :tg/index 580
:tg/node-106 :tg/index 594
(I can’t recall the exact relative offsets here, but it’s close to that).
[Asami 2.2.3 has been released](https://github.com/threatgrid/asami/blob/main/CHANGELOG.md#223---2021-11-10). The main change here is a fix for using :id for entities in updates.
Awww. Thank you! I appreciate hearing this!
I would like to use a UUID to identify objects in my in-memory db. Reading the wiki, it seems like `:id` is the attribute I should use. However, I'm not getting the behaviour I am expecting when using `:id` to update.
Thanks for this. :id is reasonably new but it should have worked. I’ll look at this soon
Thanks!
Think I’ve found the bug. Testing now
The problem with these things is that writing tests always takes longer than fixing the bug!
Example:
(d/transact conn [{:id "1" :text "text"}]) ; create
(d/transact conn [{:id "1" :text "update text"}]) ; update
The second line creates a new node, instead of updating the existing one:
;; create
(#datom [:tg/node-21976 :id "1" 1 true]
#datom [:tg/node-21976 :text "text" 1 true]
#datom [:tg/node-21976 :db/ident :tg/node-21976 1 true]
#datom [:tg/node-21976 :tg/entity true 1 true])
;; update
(#datom [:tg/node-21979 :id "1" 2 true]
#datom [:tg/node-21979 :text "update text" 2 true]
#datom [:tg/node-21979 :db/ident :tg/node-21979 2 true]
#datom [:tg/node-21979 :tg/entity true 2 true])
Additionally, annotating the attribute in the second line (:text' instead of :text):
(d/transact conn [{:id "1" :text' "update text"}]) ; update
actually retracts :text from the original node, then creates a new node with updated text:
;; create
(#datom [:tg/node-22003 :id "1" 1 true]
#datom [:tg/node-22003 :text "text" 1 true]
#datom [:tg/node-22003 :db/ident :tg/node-22003 1 true]
#datom [:tg/node-22003 :tg/entity true 1 true])
;; update
(#datom [:tg/node-22003 :text "text" 2 false]
#datom [:tg/node-22006 :id "1" 2 true]
#datom [:tg/node-22006 :text "update text" 2 true]
#datom [:tg/node-22006 :db/ident :tg/node-22006 2 true])
Using :db/ident instead of :id gives the expected behaviour in both cases.