#datomic
2023-02-01
octahedrion 14:02:11

if one tests a potential schema and profiles some queries with test data using dev-local, do the results predict real-world performance once deployed to AWS?

ghadi 14:02:39

the query engine will work the same way, but the cardinalities of your entities are almost certainly smaller in your tests, and the storage and caching characteristics differ as well

ghadi 14:02:29

given enough realistic sample data, you can use the newly released query-stats functionality (https://blog.datomic.com/2023/01/Query-Stats.html) to gain confidence

ghadi 14:02:01

basically what joe said, no

octahedrion 14:02:35

but does local query performance vs cardinality not at least scale similarly to that on AWS?

octahedrion 14:02:57

and surely a change in :where order that improved a query's performance locally would also improve its performance on AWS?

ghadi 15:02:47

yes the query execution plans will translate, it's the same engine

ghadi 15:02:11

storage stack is very different though

octahedrion 15:02:05

yes I get that

enn 15:02:19

does the query planner consider cardinality when constructing its execution plan?

favila 16:02:49

this seems unlikely, because of what I know about the separation between the query engine and the datasource. The query planner doesn’t even make index choices or resolve keywords to attributes

favila 16:02:14

it gives the datasource a pattern, and gets (ultimately) an iterable of things matching that pattern

Keith 15:02:28

Datomic executes your :where clauses in the order you specify (see https://docs.datomic.com/on-prem/best-practices.html#most-selective-clauses-first). That said, with the addition of query-stats (https://docs.datomic.com/on-prem/api/query-stats.html), you can now observe per-clause cardinalities and adjust your clause ordering accordingly.
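
As a rough sketch of how one might request those stats: the snippet below assumes the client API (`datomic.client.api`, which dev-local also uses) and the arg-map form of `d/q` with `:query-stats true`, and it assumes the result then comes back as a map containing a `:query-stats` key. The helper name `query-with-stats` is made up for illustration; check the linked query-stats docs for the exact shape on your Datomic version.

```clojure
(require '[datomic.client.api :as d])

(defn query-with-stats
  "Hypothetical helper: runs `query` against `db` with query-stats enabled
  and returns only the stats map. Assumes the arg-map form of d/q and a
  result map with a :query-stats key."
  [db query & args]
  (-> (d/q {:query       query
            :args        (into [db] args)
            :query-stats true})
      :query-stats))

;; Example call (needs a real db value; :item/id is taken from this thread):
;; (query-with-stats db
;;                   '[:find ?item :in $ ?id :where [?item :item/id ?id]]
;;                   123)
```

Running the same call with two different :where orderings against realistic data would let you compare the per-clause numbers before settling on an order.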

joshkh 15:02:28

is grounding a unique value enough to bust the cache when running a query?

'{:where [[?item :item/id 123]
          [(ground "some-unique-value") ?grounded]]}

Keith 15:02:25

Are you referring to the caching mechanism mentioned in the best practices (https://docs.datomic.com/on-prem/best-practices.html#parameterize-queries)?

Keith 16:02:20

Any change to the query's structure will cause a cache miss.
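
To make "change to the query's structure" concrete, here is a small sketch in plain Clojure (no Datomic required). It assumes the cache in question is the one described in the linked parameterize-queries section; the thread does not say whether other caches are affected. The `:find` clauses and the `"unique-a"` / `"unique-b"` strings are added for illustration only.

```clojure
;; Two queries that differ only in the grounded literal are different data,
;; so under Keith's description each would be a distinct cache entry.
(def q-a '{:find  [?item]
           :where [[?item :item/id 123]
                   [(ground "unique-a") ?grounded]]})

(def q-b '{:find  [?item]
           :where [[?item :item/id 123]
                   [(ground "unique-b") ?grounded]]})

(= q-a q-b) ;; => false

;; Parameterized form: the structure stays the same across calls and the
;; varying value is passed as an argument instead of being embedded.
(def q-param '{:find  [?item]
               :in    [$ ?id]
               :where [[?item :item/id ?id]]})
```

The parameterized form is what the best-practices page recommends for normal use; embedding a fresh literal does the opposite on purpose.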

enn 15:02:17

With the new query-stats feature (which is great!) I’ve been able to observe that it doesn’t always execute the :where clauses in the specified order. Here is a real example from a debugging session yesterday. Note that the [(!= ?group ?group2)] clause has been moved up in the :sched plan compared to its location in the original :where clause. I’m just wondering if that reordering is predictable/stable/consistent or if it depends on the cardinality of the data in some way.

:query-stats
 {:query
  {:find [[?group2 ...]],
   :in [$ ?group],
   :where
   [(not [?group :group/archived?])
    [?group :group/workspace2 ?wksp2]
    [?group :group/name ?name]
    [?group2 :group/workspace2 ?wksp2]
    (not [?group2 :group/archived?])
    [?group2 :group/name ?name]
    [(!= ?group ?group2)]]},
  :phases
  [{:sched
    (([(ground $__in__2) ?group]
      (not-join [?group] [?group :group/archived?])
      [?group :group/workspace2 ?wksp2]
      [?group :group/name ?name]
      [?group2 :group/workspace2 ?wksp2]
      [(!= ?group ?group2)]
      (not-join [?group2] [?group2 :group/archived?])
      [?group2 :group/name ?name])),

Keith 15:02:44

Correct! Predicate clauses, such as the != clause you mentioned, do get reordered when the query schedule is created: each one is moved close to the clause where it will be used, so that all of its required logic variables are bound before it runs. You should also see that [(!= ?group ?group2)] isn't listed as a clause under :clauses, but instead is attached to the [?group2 :group/workspace2 ?wksp2] clause underneath the :preds key.
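
For anyone wanting to find attached predicates in a stats map, a minimal sketch follows. Only `:phases`, `:clauses` and `:preds` are named in this thread; the `:clause` key on each entry is an assumption, and `sample-stats` is a hand-written illustration, not real Datomic output.

```clojure
;; Hand-written sample, NOT real output. The :clause key is assumed.
(def sample-stats
  {:phases
   [{:clauses
     [{:clause '[?group :group/name ?name]}
      {:clause '[?group2 :group/workspace2 ?wksp2]
       :preds  '([(!= ?group ?group2)])}]}]})

(defn clauses-with-preds
  "Returns [clause preds] pairs for every clause entry that carries :preds."
  [query-stats]
  (for [phase (:phases query-stats)
        entry (:clauses phase)
        :when (seq (:preds entry))]
    [(:clause entry) (:preds entry)]))

(clauses-with-preds sample-stats)
```

Against the sample above this returns a single pair: the [?group2 :group/workspace2 ?wksp2] clause together with its attached != predicate.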
