asami

zeitstein 2021-11-10T12:34:57.083800Z

Sanity check: if I want to "delete" an entity/node, I should retract all attributes associated with that entity/node, correct?

quoll 2021-11-10T12:57:03.084Z

Yes

zeitstein 2021-11-10T13:33:44.084600Z

Thanks! I'm thinking through representing an objects-within-objects structure in Asami. Specifically, I'm wondering whether child objects should be stored in an array as objects or merely as references. Here are some observations related to this, hoping they are helpful. Let's consider the following data:

[{:db/id :tg/node-0
  :text "Text 0"
  :child {:db/id :tg/node-1 :text "Text 1"}}]
I'm assuming "deleting" an entity is realised by retracting all attributes of the entity:
(d/transact conn [[:db/retract :tg/node-0 :text "Text 0"]
                  [:db/retract :tg/node-0 :tg/owns :tg/node-1]
                  [:db/retract :tg/node-0 :child :tg/node-1]
                  [:db/retract :tg/node-0 :db/ident :tg/node-0]
                  [:db/retract :tg/node-0 :tg/entity true]])
1) I would expect deleting node-0 would also delete node-1 from the db.
2) I would expect deleting node-1 would remove references to it in the :tg/owns and :child attributes of node-0.
3) These 'issues' multiply when arrays are introduced. For instance, deleting the array does not delete the elements of the array, nor the references to the array or its elements from the array's parent.
Of course, I shouldn't have expected this behaviour to come built-in 🙂 I guess I was thinking in terms of JSON or Datomic's :db/isComponent.
Retracting an attribute requires us to specify its current value, unless I'm mistaken. The reason for this is not apparent to me, but I know Datomic had the same requirement originally. If I'm storing objects-within-objects and I want to retract a :children attribute, I would have to pass in the whole children subtree? So, unless my analysis is wrong, it seems simpler to store children as references. Are there any performance benefits to storing them as objects?
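A minimal sketch of the "retract every attribute" approach described above, assuming the entity's attribute/value pairs have already been fetched (for example with a query pattern such as `[:tg/node-0 ?a ?v]`). The helper name `retraction-statements` is made up for illustration and is not part of Asami.

```clojure
;; Hypothetical helper, not part of Asami: turns an entity's attribute/value
;; pairs into the :db/retract statements needed to "delete" that entity.
(defn retraction-statements
  [node attr-vals]
  (mapv (fn [[a v]] [:db/retract node a v]) attr-vals))

;; Attribute/value pairs as they might come back from a query on :tg/node-0
;; (hand-written here, so this runs without a database).
(def node-0-pairs
  [[:text "Text 0"]
   [:tg/owns :tg/node-1]
   [:child :tg/node-1]
   [:db/ident :tg/node-0]
   [:tg/entity true]])

;; Builds one [:db/retract ...] statement per pair.
(def node-0-retractions
  (retraction-statements :tg/node-0 node-0-pairs))
```

The resulting vector has the same shape as the hand-written transaction above and could be passed to `d/transact`. The triples of :tg/node-1 itself are untouched, so the child would need the same treatment.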

zeitstein 2021-11-10T21:57:00.093500Z

Trying to store children as array of references, I've no idea how to obtain a path between nodes. Consider:

[{:db/id :tg/node-0 :children [{:db/id :tg/node-1}]}
 {:db/id :tg/node-1 :children [{:db/id :tg/node-2}]}
 {:db/id :tg/node-2 :children []}]
To find the parent of node-2:
(d/q '[:find ?p :where [?p :children ?a] [?a :tg/contains :tg/node-2]] (d/db conn))
But it doesn't seem I can use transitive attributes here. Any suggestions?

zeitstein 2021-11-10T22:08:00.094Z

Then I thought I could use :tg/owns if I stored them as array of objects, but I found out only node-0 owns everything: node-1 owns neither the array (containing node-2), nor node-2.

[{:db/id :tg/node-0
  :children [{:db/id :tg/node-1
              :children [{:db/id :tg/node-2
                          :children []}]}]}]
I also tried setting :tg/entity true on child nodes.

quoll 2021-11-11T03:39:09.095500Z

:tg/owns is to track objects under the top-level objects.

quoll 2021-11-11T03:48:37.097500Z

Try:

(d/q '[:find ?p :where [?p ?a+ :tg/node-2] [?p :children]] (d/db conn))

quoll 2021-11-11T03:49:39.099Z

That should get every parent of node-2 and then cut it back to only the nodes with a :children attribute
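To illustrate what the two patterns are doing conceptually, here is a plain-Clojure sketch over a hand-written set of triples. It is not how Asami evaluates queries, the triples are a simplified stand-in for the real array encoding, and the names `ancestors-of` and `has-attr?` are made up for this example.

```clojure
;; Simplified triples: each :children attribute points at an array node,
;; and the array node :tg/contains its element.
(def example-triples
  #{[:tg/node-0 :children :tg/arr-0]
    [:tg/arr-0 :tg/contains :tg/node-1]
    [:tg/node-1 :children :tg/arr-1]
    [:tg/arr-1 :tg/contains :tg/node-2]})

(defn ancestors-of
  "Every node that reaches `target` by following one or more triples,
  whatever the attribute (the role played by ?a+ in the query)."
  [triples target]
  (loop [found #{} frontier #{target}]
    (let [parents (set (for [[e _ v] triples
                             :when (contains? frontier v)]
                         e))
          fresh   (remove found parents)]
      (if (empty? fresh)
        found
        (recur (into found fresh) (set fresh))))))

(defn has-attr?
  "True when `node` has at least one triple with attribute `attr`."
  [triples node attr]
  (boolean (some (fn [[e a _]] (and (= e node) (= a attr))) triples)))

;; Step 1: everything above node-2 (array nodes included).
;; Step 2: keep only the nodes that have a :children attribute.
(def parents-with-children
  (set (filter #(has-attr? example-triples % :children)
               (ancestors-of example-triples :tg/node-2))))
```

The first step also picks up the intermediate array nodes, which is why the second pattern is needed to cut the result back to the actual parents.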

quoll 2021-11-11T03:49:50.099400Z

(I hope)

zeitstein 2021-11-11T07:41:30.103Z

That works. Thank you!

quoll 2021-11-10T13:49:19.085Z

No, no performance benefits at all

quoll 2021-11-10T13:50:03.085900Z

TBH, I haven’t really given thought to removing entire entities and their children. It seems like an obvious thing, but it’s not been a use case for anything I’ve done! 😳

zeitstein 2021-11-10T17:32:28.088500Z

I imagine it would be useful when Asami is used to hold app state, CRUD, etc. I think I've identified the source of my expectations :) From https://github.com/threatgrid/asami/wiki/5.-Entity-Structure#arrays:
> Typically, if a node like `:tg/node14856` were to be deleted, then every node that it owns will also be deleted. However, dual ownership like this should ensure that the `:tg/node-14855` node is left alone.

quoll 2021-11-10T18:29:22.089Z

I know I have lots of ongoing work to release, but in the meantime…

lilactown 2021-11-10T18:30:47.090100Z

I think datomic uses the :db/isComponent schema feature for this?

quoll 2021-11-10T18:34:20.090900Z

Makes sense. Not that it really matters unless I introduce :db.fn/retractEntity

quoll 2021-11-10T18:34:39.091100Z

this would tie into another change I’m considering

quoll 2021-11-10T18:35:20.091300Z

I don’t want to introduce schemas into Asami, but I’ve been thinking about allowing temporary schemas during transactions

quoll 2021-11-10T18:38:47.091500Z

something like:

(transact conn {:tx-data [{:db/ident "first" :data 1} {:db/ident "second" :data 2}]})
(transact conn {:tx-data [{:db/ident "first" :data 42}]
                :schema {:db/ident :data :db/cardinality :db.cardinality/one}})

(entity conn "first")
;; {:data 42}

lilactown 2021-11-10T18:39:01.091700Z

someone brought up the idea of temporary schemas when I was showing off datalog support for pyramid. seems like it would be welcome

quoll 2021-11-10T18:39:48.091900Z

I don’t know what I’ll do with schema properties I don’t care about.

quoll 2021-11-10T18:40:01.092100Z

Maybe use them for validity checking

quoll 2021-11-10T18:42:20.092300Z

(transact conn {:tx-data [{:db/ident "first" :data "forty-two"}]
                :schema {:db/ident :data
                         :db/cardinality :db.cardinality/one
                         :db/valueType :db.type/long}})
;; Exception "Attribute :data cannot accept data: \"forty-two\"". Requires Long value
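A rough sketch of what that kind of validity check could look like in plain Clojure. This is only an illustration of the idea, not Asami's implementation or API: `type-preds`, `check-value` and `data-schema` are hypothetical names, and only a few value types are covered.

```clojure
;; Hypothetical: maps a few Datomic-style value types to predicates.
(def type-preds
  {:db.type/long    integer?
   :db.type/string  string?
   :db.type/boolean boolean?})

(defn check-value
  "Returns `value` if it satisfies the :db/valueType declared in `schema`,
  otherwise throws. Unknown or missing value types accept anything."
  [schema value]
  (let [ident      (:db/ident schema)
        value-type (:db/valueType schema)
        valid?     (get type-preds value-type (constantly true))]
    (when-not (valid? value)
      (throw (ex-info (str "Attribute " ident " cannot accept data: " (pr-str value))
                      {:attribute ident :value value :expected value-type})))
    value))

;; Schema fragment in the shape proposed above.
(def data-schema
  {:db/ident       :data
   :db/cardinality :db.cardinality/one
   :db/valueType   :db.type/long})

(check-value data-schema 42) ; accepted, returns the value unchanged
```

Calling `(check-value data-schema "forty-two")` throws an `ex-info` whose message names the attribute and the rejected value.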

lilactown 2021-11-10T18:57:03.092600Z

sounds useful!

quoll 2021-11-10T20:29:11.093100Z

https://github.com/threatgrid/asami/issues/223

👍🏻 1
zeitstein 2021-11-10T22:21:41.094200Z

Could :tg/owns be used instead of :db/isComponent? If :tg/owns was changed so that node-1 owns node-2 in the example below:

[{:db/id :tg/node-0
  :children [{:db/id :tg/node-1
              :children [{:db/id :tg/node-2
                          :children []}]}]}]
I'm guessing you have other reasons for this behaviour, though, but it is a little confusing to me.

quoll 2021-11-11T03:55:34.100500Z

It specifically goes to the root object. It's an internal method of tracking things that are components

quoll 2021-11-11T03:56:22.101600Z

If a sub node is not a component, then it shouldn't be inserted that way

quoll 2021-11-11T04:02:49.102600Z

That's a flippant reply, but it's not meant to be. Hopefully I can explain better in the morning

zeitstein 2021-11-11T06:59:42.102800Z

Not at all, and I had taken enough of your time yesterday! What I'm trying to say is that, looking at my example above naively, node-2 is in precisely the same relation to node-1 as node-1 is in relation to node-0 (and through the same attribute). Generally, if object 2 is nested within object 1, it feels natural to me to treat object 2 as a component of object 1 by default. For things that are not components, I would use references instead of nesting. But don't mind me 🙂 And thanks for your help!

zeitstein 2021-11-12T10:52:08.108700Z

Thanks for the detailed explanation!

quoll 2021-11-11T20:13:40.103300Z

Well, my comment was based on what :tg/owns is trying to do. It’s a system property that I use to track relationships between embedded objects. Every “top level” entity (entities that are described independently, and not exclusively inside another entity) will “own” everything underneath it that isn’t also a top level entity.
One reason for this is for an unpublished index (I need to finish more work before I put it out there, but it’s being stored on disk already). Entities are serialized to bytes, and stored in this index. If you request an entity, then there will be a lookup for a triple that doesn’t yet exist (but should do… eventually):
[your-entity :tg/index ?index]
If there is a result for ?index then this will be looked up in the new index. This is a deserialization operation, and much faster than building an object out of triples. If it’s not there, then it can be rebuilt using triples.
But what if you update the object? Well, like any good immutable store, this requires a new object be written to the index. It also means that anything it is embedded in will also need to be rewritten, up to the root embedding. I can find that recursively, by looking up:
[?parent ?attribute+ your-entity]
But this is tricky, since I don’t want to include anything that refers to your entity, but doesn’t actually own it. That means tracking the original root entity that this was created under. And that’s the :tg/owns attribute.
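The difference between "refers to" and "owns" can be sketched in plain Clojure over a few hand-written triples. The data and the function names `referrers` and `owners` are invented for this illustration, and this is not Asami's internal code.

```clojure
;; :tg/node-0 embeds :tg/node-1 (and so owns it), while :tg/node-9 is a
;; separate top-level entity that merely refers to :tg/node-1.
(def ownership-triples
  #{[:tg/node-0 :child    :tg/node-1]
    [:tg/node-0 :tg/owns  :tg/node-1]
    [:tg/node-9 :see-also :tg/node-1]})

(defn referrers
  "Every node with any attribute pointing at `node`."
  [triples node]
  (set (for [[e _ v] triples :when (= v node)] e)))

(defn owners
  "Only the nodes that point at `node` through :tg/owns."
  [triples node]
  (set (for [[e a v] triples
             :when (and (= a :tg/owns) (= v node))]
         e)))

;; Following every attribute finds both nodes; following :tg/owns finds
;; only the root entity that would need rewriting.
(def all-referrers (referrers ownership-triples :tg/node-1))
(def root-owners   (owners ownership-triples :tg/node-1))
```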

quoll 2021-11-11T20:28:19.103500Z

Incidentally, one of the reasons for the deserialization mess is because if entities look like:
{:id 1, :a "a property", :b 2, :c [{:id 11, :x 1, :y 2}, {:id 12, :x 10, :y 15}]}
then there will be a pointer to the bytes that serialize all of this, but the serialization step also pulls out the offsets where the :id 11 and the :id 12 entities start, and these can be deserialized independently of the surrounding entity. As per usual, the above structure would create triples something like:

:tg/node-101 :id 1
:tg/node-101 :a "a property"
:tg/node-101 :b 2
:tg/node-101 :c :tg/node-102
:tg/node-102 :tg/first :tg/node-105
:tg/node-102 :tg/rest :tg/node-103
:tg/node-103 :tg/first :tg/node-106
:tg/node-102 :tg/contains :tg/node-105
:tg/node-102 :tg/contains :tg/node-106
:tg/node-101 :tg/owns :tg/node-102
:tg/node-101 :tg/owns :tg/node-105
:tg/node-101 :tg/owns :tg/node-106
:tg/node-105 :id 11
:tg/node-105 :x 1
:tg/node-105 :y 2
:tg/node-106 :id 12
:tg/node-106 :x 10
:tg/node-106 :y 15
This isn’t exact, but it’s close. But now we’re also going to see new triples along the lines of:
:tg/node-101 :tg/index 555
:tg/node-105 :tg/index 580
:tg/node-106 :tg/index 594
(I can’t recall the exact relative offsets here, but it’s close to that).

quoll 2021-11-10T18:30:52.090300Z

[Asami 2.2.3 has been released](https://github.com/threatgrid/asami/blob/main/CHANGELOG.md#223---2021-11-10). The main change here is a fix for using :id for entities in updates.

🎉 4
quoll 2021-11-10T00:07:28.078400Z

Awww. Thank you! I appreciate hearing this!

zeitstein 2021-11-10T07:56:45.080800Z

I would like to use a UUID to identify objects in my in-memory db. Reading the wiki, it seems like `:id` is the attribute I should use. However, I'm not getting the behaviour I am expecting when using `:id` to update.

quoll 2021-11-10T11:12:44.082400Z

Thanks for this. :id is reasonably new but it should have worked. I’ll look at this soon

zeitstein 2021-11-10T12:27:36.082600Z

Thanks!

quoll 2021-11-10T13:53:26.086Z

Think I’ve found the bug. Testing now

👍 1
quoll 2021-11-10T14:06:52.086200Z

The problem with these things is that writing tests always takes longer than fixing the bug! 😆

😄 1
zeitstein 2021-11-10T07:57:16.080900Z

Example:

(d/transact conn [{:id "1" :text "text"}])  ; create
(d/transact conn [{:id "1" :text "update text"}])  ; update
The second line creates a new node, instead of updating the existing one:
;; create
(#datom [:tg/node-21976 :id "1" 1 true]
 #datom [:tg/node-21976 :text "text" 1 true]
 #datom [:tg/node-21976 :db/ident :tg/node-21976 1 true]
 #datom [:tg/node-21976 :tg/entity true 1 true])
;; update
(#datom [:tg/node-21979 :id "1" 2 true]
 #datom [:tg/node-21979 :text "update text" 2 true]
 #datom [:tg/node-21979 :db/ident :tg/node-21979 2 true]
 #datom [:tg/node-21979 :tg/entity true 2 true])
Additionally, annotating :text as an update (:text') in the second line:
(d/transact conn [{:id "1" :text' "update text"}])  ; update
actually retracts :text from the original node, then creates a new node with updated text:
;; create
(#datom [:tg/node-22003 :id "1" 1 true]
 #datom [:tg/node-22003 :text "text" 1 true]
 #datom [:tg/node-22003 :db/ident :tg/node-22003 1 true]
 #datom [:tg/node-22003 :tg/entity true 1 true])
;; update
(#datom [:tg/node-22003 :text "text" 2 false]
 #datom [:tg/node-22006 :id "1" 2 true]
 #datom [:tg/node-22006 :text "update text" 2 true]
 #datom [:tg/node-22006 :db/ident :tg/node-22006 2 true])
Using :db/ident instead of :id gives the expected behaviour in both cases.