This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-02-05
:b/c is card one or many?
whoa! the memcached solution reduced the cold query time from my house (25 ms ping time) from ~30 s to 2.2 s. I think we have our solution!
cool, good to hear. I wonder if there’s a cost in the structure of that pull that’s non-obvious. I’m doing testing against a larger mbrainz than the sample we provide, I see several orders of magnitude bump in perf to put in the second pull statement, I’ll discuss that with the dev team, though, too.
actually never mind, that time is only introduced when I have a typo in one of the pulled attributes, interesting.
thanks, we'll keep that in mind and see if we notice differences with two-pull queries.
sorry thinking aloud
yeah, I’m not sure, I see < 150 msec w/local postgres storage for this query (larger mbrainz than public) with 10,340 count:
(time
  (count
    (d/q '[:find (pull ?t [:track/name :track/release]) (pull ?a [:artist/sortName :artist/startYear])
           :where
           [?a :artist/name "Pink Floyd"]
           [?t :track/artists ?a]]
         (d/db conn))))
anyways, glad the memcached option seems to be helping!
~500 msec with reverse ref in first pull instead of typo 😛 (again 10,340 total results)
(time
  (count
    (d/q '[:find (pull ?t [:track/name :medium/_tracks]) (pull ?a [:artist/sortName :artist/startYear])
           :where
           [?a :artist/name "Pink Floyd"]
           [?t :track/artists ?a]]
         (d/db conn))))
@sonnyto: you could maybe have a look at https://github.com/cddr/integrity#integritydatomic
Based on this stack overflow post I understand how I can get updated-at values using the history db. http://stackoverflow.com/questions/24645758/has-entities-in-datomic-metadata-like-creation-and-update-time But for performance I wanted to retrieve these timestamps together and part of another query. So is that possible? And is this the correct way to do it?
(d/q '[:find (pull ?a structure) ?created-at (max ?updated-at)
       :in $ structure
       :where
       [?a :action/status "foo"]
       [?a :action/id _ ?id-tx]
       [?id-tx :db/txInstant ?created-at]
       [?a _ _ ?all-tx]
       [?all-tx :db/txInstant ?updated-at]]
     (d/db conn)
     ent/ActionStructure)
Assuming :action/id is a unique attribute that is only set when the entity is created.
@currentoor: "for performance I wanted to retrieve these timestamps together and part of another query" There is usually no need to combine queries for performance reasons.
Smaller, simpler queries usually perform better than large, complex queries.
Yeah I can totally see where you're coming from @stuartsierra but for this specific use-case I'm fetching about 1000 entities from the DB then mapping over them to get their created-at and updated-at timestamps. The timestamp loop makes up about half my total execution time.
Individually these created-at updated-at queries are negligible but in aggregate they take a significant amount of time.
Do you think they would still take just as long if I put them inside the larger query?
@currentoor: As with any performance question, measure first. But I would not expect the combined queries to perform any better than separate queries.
I would look at the size of the ?updated-at query results. If you have many transactions updating each entity, that could account for some of the cost of the query.
Hmm. So I know this is heresy but I'm getting pressured to store created-at and updated-at attributes directly on the entity, just like other DBs. I know this is re-inventing stuff but what about performance, do you suspect this would be faster than using Datomic's built-in time facilities?
@currentoor: As always, test and measure. Make sure you have realistic-sized data to test.
Will do, thanks.
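For anyone following along, here is a minimal sketch of what "measure first" could look like in plain Clojure. The helper name mean-ms and its options are made up for illustration (they are not part of Datomic or clojure.core), and it only reports mean wall-clock time, so treat it as a rough starting point rather than a proper benchmark.

```clojure
(defn mean-ms
  "Hypothetical helper: call thunk f `warmup` times unmeasured, then `n` times
  measured, and return the mean wall-clock milliseconds per measured call."
  [f {:keys [warmup n] :or {warmup 5 n 20}}]
  ;; Warm-up runs give the JIT and any caches a chance to settle.
  (dotimes [_ warmup] (f))
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    ;; Nanoseconds -> milliseconds, averaged over the n measured calls.
    (/ (- (System/nanoTime) start) 1e6 n)))

;; Usage sketch; `per-entity-timestamps` and `combined-query` are placeholders
;; for whichever two approaches are being compared:
;; (mean-ms #(per-entity-timestamps db eids) {:warmup 3 :n 10})
;; (mean-ms #(combined-query db) {:warmup 3 :n 10})
```

Running both approaches against realistic-sized data with the same warm-up settings should make the comparison less sensitive to cold caches.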
I'm having trouble getting a set of tx-times with this query.
(defn timestamps [db lookup-refs]
  (d/q '[:find (min ?tx-time) (max ?tx-time)
         :in $ [?eid ...]
         :where
         [?eid _ _ ?tx _]
         [?tx :db/txInstant ?tx-time]]
       (d/history db)
       lookup-refs))
I'm passing in four lookup-refs so I would expect the result to be four tuples, one for each of the lookup-refs. But instead I get this.
[[#inst "2016-02-05T22:22:31.085-00:00" #inst "2016-02-05T22:31:29.292-00:00"]]
Can a query be used to take in a collection and return a collection in the same ordering?
Oh I get it, uniqueness is the issue. This works.
(defn timestamps [db lookup-refs]
  (d/q '[:find ?id (min ?tx-time) (max ?tx-time)
         :in $ [?eid ...]
         :where
         [?eid _ _ ?tx _]
         [?eid :action/id ?id ?tx _]
         [?tx :db/txInstant ?tx-time]]
       (d/history db)
       lookup-refs))
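A note on why the first version returned a single tuple: aggregates are computed per group of the non-aggregated :find variables, so with only (min ...) and (max ...) in :find there is exactly one group. The plain-Clojure sketch below mimics that grouping on made-up rows. The data and function names are hypothetical, integers stand in for the #inst values, and no Datomic connection is involved. It also shows one way to put the per-id results back in the caller's order, since query results come back as a set with no ordering guarantee.

```clojure
;; Hypothetical rows shaped like [?id ?tx-time] bindings before aggregation.
;; Integers stand in for transaction instants to keep the sketch self-contained.
(def rows
  [[:a 10] [:a 30] [:b 20] [:b 25] [:c 5]])

(defn min-max
  "Collapse every row into one [min max] pair, like a :find with only aggregates."
  [rows]
  (let [ts (map second rows)]
    [(apply min ts) (apply max ts)]))

(defn min-max-by-id
  "Return {id [min max]}, like adding ?id to :find so aggregates group per id."
  [rows]
  (into {}
        (map (fn [[id grp]]
               (let [ts (map second grp)]
                 [id [(apply min ts) (apply max ts)]])))
        (group-by first rows)))

(defn in-input-order
  "Look up each requested id in turn so the output follows the input order."
  [ids by-id]
  (mapv (fn [id] [id (get by-id id)]) ids))

;; Usage:
;; (in-input-order [:c :a] (min-max-by-id rows))
```

With the real query, the same idea would be to pour the ?id tuples into a map and then map over the original ids, rather than relying on the order of the result set.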