2019-09-10
I notice the client API version of d/datoms seems to ignore the fourth (tx) part of :components. Is this a known issue, a bug, or by design?
Yes, known. By design, I believe: if you have the full EAVT you already have the datom, so you don't need index access
I would at least expect the docs to mention that only the first three components in the vector are inspected
You make a good point, though. The only additional information the datoms could give you is whether each was an assertion or a retraction
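For readers following along, here is a minimal sketch of the two call shapes being compared. The db values, entity/tx ids, and the :inv/sku attribute are placeholders, not from this conversation.

```clojure
(require '[datomic.api :as peer]
         '[datomic.client.api :as client])

;; peer-db, client-db, an-entity-id, and a-tx-id below are placeholders
;; you would bind to real values.

;; Peer API: the fourth (tx) component further narrows the EAVT lookup.
(peer/datoms peer-db :eavt an-entity-id :inv/sku "SKU-123" a-tx-id)

;; Client API: the same four components, but as reported above the fourth
;; (tx) element of :components appears to be ignored, so this behaves like
;; a three-component lookup.
(client/datoms client-db {:index      :eavt
                          :components [an-entity-id :inv/sku "SKU-123" a-tx-id]})
```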
Also, is there a better way to get all datoms matching a pattern on the client API other than repeatedly adjusting :offset? This feels like O(n^2). If the client had seek-datoms or a "starting at" argument to datoms, one could use the last seen datom as the start of the next chunk
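A sketch of the offset-adjusting loop being described, assuming the sync datomic.client.api namespace; index, components, and page size are whatever the caller supplies. If the server implements :offset by scanning and discarding, each page re-reads everything before it, which is the O(n^2) concern.

```clojure
(require '[datomic.client.api :as d])

(defn all-datoms-by-offset
  "Eagerly pages through every datom matching components, page-size at a time,
  by bumping :offset on each request."
  [db index components page-size]
  (loop [offset 0
         acc    []]
    (let [page (vec (d/datoms db {:index      index
                                  :components components
                                  :offset     offset
                                  :limit      page-size}))]
      (if (seq page)
        (recur (+ offset (count page)) (into acc page))
        acc))))
```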
I’m struggling with workloads that are too large for a single query, where on a peer I would normally use d/datoms lazily to produce an intermediate chunk or aggregate
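A sketch of the peer-side pattern referred to here: reduce over the lazy d/datoms iterable so that only the running aggregate is held in memory. The :order/amount attribute is purely illustrative.

```clojure
(require '[datomic.api :as d])

(defn sum-attr
  "Sums the values of attr across the whole db without realizing all the
  datoms at once; only the running total is held in memory."
  [db attr]
  (reduce (fn [total datom] (+ total (:v datom)))
          0
          (d/datoms db :aevt attr)))
```

Usage would look like (sum-attr (d/db conn) :order/amount).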
@U09R86PA4 How big is the total DB and how big are the results you’re looking for? Also, how frequently are you running this query (or ones like it)?
Use cases vary, but they follow one of two patterns: aggregating as you go, where the aggregate is much smaller than the input; or preparing subsets of the input for the same query rerun many times
When I ran it on a peer, the result took a few minutes but memory stayed bounded, and the result set was 60 out of 120 million input datoms (cold instance, no valcache or memcached)
Are you running the peer server with the same memory settings as you did for the peer?
These are queries I couldn’t run on even a really large peer, so I don’t fault the peer server for not being able to handle them naively. I just can’t use my usual d/datoms workaround of controlling the size of intermediate results by being lazy
Hrm. I don’t quite understand what the 4th component has to do with it. Do you have large #s of datoms with the same AVE that only differ in T?
I discovered it while doing thought experiments about a client API with seek-datoms: I could use it to construct the start of the next chunk instead of re-reading the whole result again from the given offset (which seems to be how it behaves; is that actually how the client’s datoms is implemented?)
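A sketch of the peer-side seek-datoms idea being described: resume each chunk from the components of the last datom seen, rather than rescanning by a growing offset. This assumes that seek-datoms called with no components starts at the beginning of the index; the chunk size and the choice of :eavt are illustrative.

```clojure
(require '[datomic.api :as d])

(defn eavt-chunks
  "Lazy seq of vectors of EAVT datoms, at most chunk-size each, where each
  chunk resumes from the last datom of the previous chunk via seek-datoms."
  [db chunk-size]
  (letfn [(chunks-from [start-components skip]
            (lazy-seq
             (let [chunk (->> (apply d/seek-datoms db :eavt start-components)
                              (drop skip)          ; skip the resume datom itself
                              (take chunk-size)
                              vec)]
               (when (seq chunk)
                 (let [{:keys [e a v tx]} (peek chunk)]
                   (cons chunk (chunks-from [e a v tx] 1)))))))]
    (chunks-from [] 0)))
```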
Does the server’s impl of the client d/datoms have the same time complexity as the peer expression (->> (apply d/datoms db index components) (drop offset) (take limit))
or is it more efficient than that?
further down: “Synchronous API functions are designed for convenience. They return a single collection or iterable and do not expose chunks directly. The chunk size argument is nevertheless available and relevant for performance tuning.”
Yeah, I thought async’s chunking was just doing offset adjustment for you, like a normal REST API would
OK, it works! Thanks! A large :chunk makes a huge difference in the sync API and is not ignored, so I consider that a doc bug
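For reference, a sketch of passing :chunk to the sync client API call being discussed; db, the index, the component, and the chunk size here are illustrative values.

```clojure
(require '[datomic.client.api :as d])

;; Per the doc quoted above, the sync API still accepts a chunk-size
;; argument even though it does not expose chunks directly.
(d/datoms db {:index      :aevt
              :components [:inv/sku]
              :chunk      10000})   ; larger chunks, fewer round trips to the server
```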
I think that would be a good candidate for a feature request on the feature request portal
I can’t find the doc link you shared in the on-prem section of the Datomic docs website
Is there a Datomic certification path anywhere? 😮