Fork me on GitHub

A data modelling question. Here's an example, expressed as clj data for simplicity:

[{:entity/id 1
  :entity/props-and-vals [{:prop 2 :val "today"}
                          {:prop 3 :enum 4}]}

 {:prop/id 2 :prop/name "due date"}
 {:prop/id 3 :prop/name "status" :prop/enums [4 5]}

 {:enum/id 4 :enum/val "todo"}
 {:enum/id 5 :enum/val "done"}]
Requirements: 1. 3 is a prop which has enumerated values, 2 is a prop whose value is a string. 2. Props are defined db-wide, enumerated values are defined per prop, while string values are defined per each entity. 3. Maps inside props-and-vals need to be ordered per each entity. 4. Queries need to answer questions like "find all entities that have this :prop and this :val/`:enum`" . Matter of fact, these will probably be among the most frequent queries. The questions is: how should props-and-vals be modelled?


Given requirements and especially 4, I'd make it a "multi-cardinality join", i.e.

[{:entity/id 1 :entity/props-and-vals [-1 -2]}

 {:pv/id -1 :pv/prop 2 :pv/val "today"}
 {:pv/id -2 :pv/prop 3 :pv/enum 4}]
But I dislike the idea of storing these :pv/id docs per each entity. But I probably dislike slowing down queries even more. What would you do? Is there a more creative solution? Only thing that comes to mind is a minor optimization: :pv/id -2 is not really necessary – having the ref to enum 4 would be enough to reconstruct what the corresponding prop is. But maybe there's a solution that looks less like the clj example and so is non-obvious?


bikeshedding, but might this work?

[{:entity/id 1
  :due-date "today"
  :status :done
  :prop-order [:due-date :status]}]


I don't think so. What if the user changes the :prop/name? (Maybe should've included that in the requirements.)


maybe the following, if you wanted to query the metadata too:

[{:entity/id 1
  :prop/due-date "today"
  :prop/status :status/done
  :props [:prop/due-date :prop/status]}

 {:xt/id :prop/due-date
  :label "due date"}

 {:xt/id :prop/status
  :label "status"}

 {:xt/id :status/done
  :enum :prop/status
  :label "done"}]


heh, messages crossed 🙂

🙂 1

yeah - could introduce a surrogate id for props if you didn't want to include it in the name itself


Interesting! Really need random ids for the props though. But they would need to be keywords, correct? And queries like these would work?

(xt/q db
      '{:find [e]
        :in [enum-id enum-val simple-id simple-val]
        :where [[e enum-id enum-val] [e simple-id simple-val]]}
      :enum-prop-1 :val-id   ; status
      :simple-prop-1 "today" ; due date


I'll also just add that it's not metadata, really. Quite important for the prop to be linked to its enumerated values. (I messed up the requirements again!)


yes, they need to be keywords to be document attributes - but :prop/<uuid-str> is a valid keyword, for example


Well I'm chuffed 🙂 Shows the flexibility of the schema-less approach. I will give this a go, thank you! (This has been on my mind for months...)

☺️ 1

> And queries like these would work? those and many others, I suspect - you could have some good fun with graph/pull queries there


I often use the document's :xt/id for such things, fwiw - means pull becomes more powerful


Yeah, of course they would, I was just a bit stunned 🙂


> I often use the document's :xt/id for such things, fwiw - means pull becomes more powerful I plan on adopting the strategy

{:xt/id uuid :entity/uuid uuid ...}
for that reason. And to be able to easily distinguish between doc types.

👌 2

You know, I might try not using this. God forbid there is any duplication 😆 All I need to distinguish doc types is for each type to have one always-appearing keyword 🙂 E.g. [e :prop/name] would filter to props.


Actually the :prop/<uuid-str> seems like an interesting convention to adopt for :xt/id generally – can tell the type just by looking at the id. Downsides?


I don't like using :xt/id for anything in your application logic because it complects entities and documents, information and data. so I always use maps, like :xt/id {:type uuid}. Then when we want another document for that logical entity, to avoid write duplication for example, we can do something like :xt/id {:type uuid :qualifier :x}


this does mean that pull maybe becomes more trouble than its worth, but imo EQL doesn't make sense without a schema anyway


That's interesting. I might have a use case for that, but I'll have to think about it. @U050V1N74 This query throws Query didn't match expected structure

  (xt/db node)
  '{:find [e]
    :in [enum-id]
    :where [[e enum-id]]}
I don't think you can pass in attributes as args? This works:
  (xt/db node)
  '{:find [e]
    :where [[e :prop/status-id]]})


From the > XTDB does not support: > • vars in the attribute position, such as [e ?a "Ivan"] or [e _ "Ivan"] So I guess I'll just have to build :where before running the query.


Also, if one doesn't know beforehand which props will be in an entity, can't join the props in pull. E.g:

['* {:prop/status-id [:prop/name]}]
Can't work if I don't know :prop/status-id in advance. Not really an issue for me (not going to use pull), but worth mentioning.