#datomic
2023-02-22
Gonzalo Rafael Acosta 02:02:10

Hello, I have a query that runs over 7M entities and takes 15 seconds to complete. It has around 30 clauses: 3 ranges, one missing? and one not clause, 2 ands, one or, etc. I am using 2 t3.2xlarge instances within a query group. A colleague of mine says Postgres would be much faster at this. What's your experience / opinion? Thanks

jaret 14:02:59

This is a can of worms without specifics. It's impossible to comment on performance characteristics without schema/data size/the specific query. 30 clauses seems like a lot. If you're on cloud we'll have query-stats out shortly and that should make this the sort of thing you can easily reason about.

Gonzalo Rafael Acosta 17:02:48

Yes, that is why I asked for an opinion, not a scientific study.

Gonzalo Rafael Acosta 17:02:11

I will be contacting support on my own.

Gustavo A. 17:02:41

Hello, I just want to confirm with you guys how the :db/isComponent flag works. After a second read of its definition, and an unfortunate experience, my understanding is that isComponent means something like "the entity I'm referring to is part of me". Take this context: Parent entity -> Child entity

```clojure
{:db/ident :parent/name
 ....}
{:db/ident :parent/child
 :db/valueType :db.type/ref
 :db/isComponent true
 ....}
```

Here the Parent refers to a Child entity and flags it as being a component/part of it. But if I define something like this:

```clojure
{:db/ident :child/parent
 :db/valueType :db.type/ref
 :db/isComponent true
 ....}
```

it would mean the opposite: that the child entity owns the parent entity. Am I right?
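[For reference, a complete version of the first schema might look like the sketch below; the cardinality and the string type for :parent/name are assumptions, not taken from the message above.]

```clojure
;; Parent owns Child: with :db/isComponent true on :parent/child,
;; the child is treated as part of the parent.
[{:db/ident       :parent/name
  :db/valueType   :db.type/string   ; assumed type
  :db/cardinality :db.cardinality/one}
 {:db/ident       :parent/child
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one  ; assumed cardinality
  :db/isComponent true}]
```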

favila 17:02:36

another way of thinking of it: in a system using isComponent "properly", there is only ever one reference to the entity, and it's via the component attribute. I.e. the datom [E component-attr V] likely occurs only once per V, and at most once per component-attr + V pair

favila 17:02:02

that is why the reverse-ref of isComponent is cardinality one

favila 17:02:15

the expectation is that there’s no other inward edge to that entity

favila 17:02:04

(this is not enforced though)
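[A small illustration of both points above, using the on-prem peer API; `conn`, `db`, `parent-id`, and `child-id` are assumed to exist.]

```clojure
(require '[datomic.api :as d])

;; Pulling the reverse ref of a component attribute returns a single map,
;; not a vector, because the reverse ref of isComponent is cardinality one:
(d/pull db [:parent/_child] child-id)
;; => {:parent/_child {:db/id <parent-id>}}

;; Retracting the parent also retracts its component entities:
(d/transact conn [[:db/retractEntity parent-id]])
```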

Gustavo A. 17:02:25

kk, nice to know that too

Gustavo A. 17:02:58

Also, thanks for confirming. I'm glad I've got it right now.

dvingo 17:02:01

Hi, I'm looking for some ideas with the following problem:
• given I am working on a large, old Clojure codebase with Datomic as the system of record
• and there is no well-defined place where schemas are stored, nor specs to determine what the shapes of entities are

How do you determine:
1. All the schema in a database? (a list of attributes and their attributes)
2. The statistics of how many datoms per attribute are in the db?
3. The "shapes" of the entities - what groups of attributes are highly correlated to be asserted on an entity?

Thanks!

dvingo 17:02:47

I would expect a database product to provide these details. For example, with Postgres you can easily determine all the tables and the row counts of those tables, and XTDB includes https://docs.xtdb.com/clients/clojure/#_attribute_stats

Keith 18:02:29

> All the schema in a database? (list of attributes and their attributes)
You can query for schema in the same way you query for data - is that insufficient?

> The statistics of how many datoms per attribute are in the db
d/db-stats should give you that kind of information

> The "shapes" of the entities - what groups of attributes are highly correlated to be asserted on an entity.
See @U0508JRJC’s reply above for at least a partial answer to that question
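[For the first question, a query along these lines lists installed attributes while dropping most of Datomic's internal ones - a sketch using the on-prem peer API, where `db` is an existing database value.]

```clojure
(require '[datomic.api :as d]
         '[clojure.string :as str])

;; Attributes are ordinary entities: anything with both
;; :db/ident and :db/valueType is an installed attribute.
(defn user-attributes [db]
  (->> (d/q '[:find [?ident ...]
              :where
              [?e :db/ident ?ident]
              [?e :db/valueType]]
            db)
       ;; drop Datomic's own namespaces (db, db.*, fressian)
       (remove #(str/starts-with? (namespace %) "db"))
       (remove #(= "fressian" (namespace %)))
       sort))
```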

dvingo 19:02:45

Thanks for the replies.

For 1: yes, I am aware you can query for the attributes, but this also returns lots of internal attributes, which are very noisy. This means I have to write custom code to learn basic facts about my database.

For 2: we are using on-prem 1.0.6242 and this function is not available. Apparently it just returns the total number of datoms anyway, which would provide zero information about the distribution of attribute counts.

For 3: thank you for this reference, but it is very strange to recommend someone's personal project in order to provide key functionality of a commercial database product.

It seems like I'll have to craft custom code. I would expect a database to be aware of its own data.
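[On versions without db-stats, per-attribute counts and rough entity "shapes" can be computed by hand. A sketch with the on-prem peer API; the :aevt walk touches every datom of each attribute, and the shapes query scans the whole database, so expect both to be slow on large databases.]

```clojure
(require '[datomic.api :as d])

;; Datoms per attribute: the :aevt index groups datoms by attribute,
;; so seeking to one attribute yields exactly its datoms.
(defn attribute-counts [db attrs]
  (->> attrs
       (map (fn [attr] [attr (count (seq (d/datoms db :aevt attr)))]))
       (sort-by second >)))

;; Entity "shapes": frequency of each distinct set of asserted attributes.
(defn entity-shapes [db]
  (->> (d/q '[:find ?e ?ident
              :where [?e ?a] [?a :db/ident ?ident]]
            db)
       (group-by first)
       vals
       (map #(into #{} (map second) %))
       frequencies))
```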