This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-05-07
Are there docs anywhere about the expected CPU use of queries vs transactions? Our current setup doesn't yet have query groups, and we're performing a lot more writes (i.e. transacts) than we are queries. I'm seeing CPU hitting 98+% on the transactors, and then everything falls over. I'm curious if creating a query group to offload the queries could/would drop CPU on the transactors a lot more than the ratio of queries/transacts would suggest, because maybe queries are a lot more CPU intensive?
Also, is there documentation anywhere on all the standard graphs on the Datomic Cloud dashboard? Like, TxBytes. Is that a per-second average or an aggregate of all the data transmitted since the last datapoint? I'm assuming the latter, as changing the dashboard period, and therefore the interval between datapoints, alters the value significantly.
A wish question (I wish-and-hope-this-exists): Does anyone have something that allows me to edit a Datomic cloud database as a spreadsheet? Or as a simple CRUD app? We have a bunch of static information that we display to the internal users on Metabase - and they want to change the values they see.
https://github.com/hyperfiddle/hyperfiddle This may be what you're looking for!
In Datomic, what is the best-practice way to model this relationship: object A contains references (i.e. many instances) of object B and we want a field in object B to be unique within the context of object A. From the documentation, it does not seem like :db/unique (either with :db.unique/identity or :db.unique/value), by itself, is appropriate. Wondering how to correctly model this constraint within the Datomic schema.
@U6SN41SJC Look into using :db.unique/identity tuples for this, either heterogeneous or composite.
Also, depending on how many "many instances" is, maybe B should point to A?
True. It doesn't matter which direction the index points, and that would probably be easier.
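A minimal sketch of the composite-tuple suggestion, assuming B points to A. The attribute names (:service/environment, :service/name, :service/environment+name) are hypothetical and not from the thread; :db.type/tuple and :db/tupleAttrs are standard Datomic schema attributes. Datomic maintains the composite value itself, so it is never asserted directly.

```clojure
;; Hypothetical schema: a service (B) points at its environment (A),
;; and the pair [environment, name] must be unique.
(def service-schema
  [{:db/ident       :service/environment
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/one}
   {:db/ident       :service/name
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}
   ;; Composite tuple derived from the two attributes above.
   {:db/ident       :service/environment+name
    :db/valueType   :db.type/tuple
    :db/tupleAttrs  [:service/environment :service/name]
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity}])
```

With this in place, two services in the same environment should not be able to share a name, while the same name can still appear in different environments.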
How many is "many instances"? The answer to which direction it should go depends on the required selectivity of the access patterns. Again, all predicated on "many instances" 🙂
10 or less, as a guess. A is an environment for our testing platform and B is the metadata for each service that will be tested in that environment. Our platform currently consists of ~8 services and I don’t see that number going up significantly.
Then performance doesn't matter here and you should do whatever is most convenient for you. That entire dataset will fit in memory, yay!
Is there a way for me to know which Datomic Cloud query group node a client api request went to?
We are receiving ~20 datomic client timeouts all on the exact same d/pull call within a 3 minute window, which is surprising because that call doesn't actually pull that much data. I was curious if the node those client api requests went to was overwhelmed.
Not at that time. The query is set to a 15s timeout and it's hitting that on every one of those calls.
It's a query with a pull. e.g.,
(d/q {:query '[:find (pull ?p [* {::props-v1/filter-set [*]}])
               :where
               [_ :customer/prop-group1s ?p]]
      :args [db]
      :timeout 15000})
From looking at the query group dashboard, I can see that the group was overwhelmed at the time: min CPU of 99 and max of 100. There were only 2 nodes in the group. I also observe that at least one other query resulted in a 50.4k count. The overwhelmed system simply manifests itself in those frequent but small queries. Thinking the fix is to scale the system up at the time of the 50.4k query. Separately, does the Query Result Counts graph show the number of datoms a query returns or something else?
So if that query is pull'ing in the :find, it could actually be some scalar * the reported number?
Assuming all the results are uniform, yes, that many datoms would be returned. Datoms isn't really the right measurement here though.
If I know each pull returns exactly 3 datoms, then the returned datoms is:
reported number * 3 = "that many datoms"
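The same arithmetic as a tiny sketch, plugging in the 50.4k result count from the dashboard and the assumed 3 datoms per pull. The function name is hypothetical, and the estimate only holds if every pull result is uniform.

```clojure
(defn estimated-datoms
  "Rough datom estimate: reported result count times datoms per pull."
  [result-count datoms-per-pull]
  (* result-count datoms-per-pull))

(estimated-datoms 50400 3)
;; => 151200
```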
Instead of scaling the qg up, can you make a separate QG for that other query so they don't affect each other?
Yes, that is an option. I'd like a bit more data on which queries are causing that huge result set. I have a couple ideas but need more data to know how to split. Why would you tend to prefer splitting over scaling?
Is one of them a scheduled batch job? You can always spin the QG up just for that job 🙂
Another option I've been considering is "filling out" my query group with spot instances. It's likely that would solve this problem as well, at a fraction of the cost.
e.g., cpu spikes to near 100, some small number of queries timeout, then the event is over.
> Getting timeouts due to hitting peak capacity
^^ That is a symptom, and we still don't know why it occurred do we?
FWIW, a shorter timeout on your pulls with retry wrapped around it would also alleviate the above symptom because the request would (eventually, but how unlucky can you be?) be routed to a different node.
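A rough sketch of the "shorter timeout with retry" idea, in plain Clojure with no Datomic dependency. The helper name call-with-retry and its options are hypothetical; which exceptions count as retryable is left to the caller through a predicate, since the thread does not say how the client reports a timeout.

```clojure
(defn call-with-retry
  "Calls (f) up to max-attempts times. Retries only when (retryable? ex)
   is truthy, sleeping backoff-ms between attempts. Rethrows otherwise,
   or once the attempts run out."
  [f {:keys [max-attempts backoff-ms retryable?]
      :or   {max-attempts 3
             backoff-ms   100
             retryable?   (constantly true)}}]
  (loop [attempt 1]
    ;; recur is not allowed inside catch, so the catch returns a marker.
    (let [result (try
                   {:value (f)}
                   (catch Exception e
                     (if (and (< attempt max-attempts) (retryable? e))
                       ::retry
                       (throw e))))]
      (if (= ::retry result)
        (do (Thread/sleep (long backoff-ms))
            (recur (inc attempt)))
        (:value result)))))

;; Hypothetical usage with the client API (d aliased to the Datomic client
;; namespace, my-query standing in for the real query), using a timeout
;; shorter than 15s:
;; (call-with-retry #(d/q {:query my-query :args [db] :timeout 5000})
;;                  {:max-attempts 3 :backoff-ms 200})
```

Each retry is a fresh request, so it has a chance of landing on a less loaded node. It does nothing about the underlying capacity problem.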
& there's only 2 nodes in the group at the event time. So if both nodes are processing 1+ 50.4k queries, perhaps pretty unlucky.
So there are only 2 nodes in the QG and there are 2 queries returning 50.4k results being issued at the same time?
I don't know for certain since I don't have that data instrumented right now but, yes it is likely. There's up to 5 queries that could all run in the same 10s window that are of that size.
Datomic Cloud currently uses the older launch configuration setup in creating ASGs so a mixed group of Spot & On-Demand is not possible 😢 I created a feature request here: https://ask.datomic.com/index.php/607/use-launch-template-instead-of-launch-configuration.