This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-10-23
Channels
- # aws-lambda (1)
- # bangalore-clj (3)
- # beginners (80)
- # boot (8)
- # clojars (1)
- # clojure (200)
- # clojure-dev (37)
- # clojure-greece (26)
- # clojure-italy (11)
- # clojure-norway (3)
- # clojure-russia (14)
- # clojure-spec (21)
- # clojure-uk (30)
- # clojurescript (50)
- # core-logic (10)
- # core-matrix (1)
- # cursive (15)
- # data-science (21)
- # datomic (45)
- # devcards (2)
- # emacs (4)
- # fulcro (12)
- # garden (2)
- # jobs (5)
- # juxt (1)
- # lambdaisland (1)
- # leiningen (4)
- # luminus (20)
- # lumo (26)
- # off-topic (33)
- # onyx (27)
- # parinfer (1)
- # pedestal (3)
- # perun (5)
- # re-frame (20)
- # reagent (27)
- # ring (1)
- # ring-swagger (21)
- # shadow-cljs (259)
- # spacemacs (14)
- # yada (3)
hi, I have a very specific problem to solve which might sound very generic at first. In short, I’m trying to implement LIMIT functionality with Datomic, but my use case really doesn’t allow me to use for example the datoms API, and the pull-based approach doesn’t work either since that does the limiting at the attribute level
so, I’m kinda trying to solve the LIMIT problem just by using q-based queries
since queries done with it are eager and it doesn’t support limit itself, I’m kinda out of ideas
the reason why I need to stick with q and really cannot use the datoms API is that we are creating a SPARQL endpoint to our data, and thus we really need to actually perform datalog queries
also, I think I wouldn’t be able to solve this via the datoms API either, since when making actual queries I think the query engine uses all the indices available, while with datoms I’m restricting myself to a single index
so even if I do get access to all datoms via the datoms API, I can’t exactly do stuff like:
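for example, something like this hypothetical query over a capped seq of datoms (`db` and the `:person/name` attribute are made up for illustration):

```clojure
;; hypothetical sketch — feed a capped collection of datoms into q
;; as a plain in-memory relation instead of querying the whole db
(d/q '[:find ?e ?name
       :in [[?e ?name]]
       :where [(some? ?name)]]
     (map (juxt :e :v)
          (take 1000 (d/datoms db :aevt :person/name))))
```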
that executes, but will not provide the same results as a straight query due to different indices (or well, that exact query just might, but in general you can’t rely on that)
and besides, doing limit functionality at the datoms level wouldn’t give me correct results anyway, since then the limiting is done too early
so, am I just screwed, or is there some hidden feature somewhere which would allow me to do LIMIT?
Would sample work for you? http://docs.datomic.com/query.html#aggregates-returning-collections
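(`sample` is one of Datomic’s collection-returning aggregates; a minimal sketch, with `:person/name` and `db` invented for illustration — note that, as far as I know, the underlying query still runs over the full result set before sampling:)

```clojure
;; returns up to 5 distinct ?name values, chosen pseudo-randomly
(d/q '[:find (sample 5 ?name)
       :where [_ :person/name ?name]]
     db)
```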
LIMIT somewhat implies order afaik, which I don't think Datomic has. Datomic uses sets. We've run into a desire for LIMIT before, and had to work around it using d/datoms
(and we could, in our situation).
Perhaps you could write a custom aggregate for this, but I suspect there isn't one for a reason.
I think your LIMIT solution is coupled to your ordering situation. You can't LIMIT until you've ORDERed
did a quick test with sample; execution time is the same with sample as when you just get all the data out
if you need to sort and limit a dataset, you'll need to have the whole thing in memory at some point, don't you?
I can't think of any way for an RDBMS to sort and then limit, on the database server, without having the whole working set in memory
(and if I can't think of an algorithm to do that off the top of my head, then obviously it cannot exist, right?)
I guess doing it naively after making the query is our best bet here; at least we can reduce the amount of data sent over the wire that way
and at least it would improve performance in the ORDER BY + LIMIT scenario, since one only needs to keep sorting until the LIMIT has been reached
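that sort-only-up-to-LIMIT idea can be sketched in plain Clojure with a bounded priority queue, so at most LIMIT items are ever held sorted (function name and shape are illustrative, not from any library):

```clojure
(defn top-n
  "Return the n smallest items of coll by keyfn, keeping at most
  n items in the queue while scanning."
  [n keyfn coll]
  (let [pq (java.util.PriorityQueue.
             (max 1 n)
             ;; reversed comparator: the queue head is the current largest
             (fn [a b] (compare (keyfn b) (keyfn a))))]
    (doseq [x coll]
      (.offer pq x)
      (when (> (.size pq) n)
        (.poll pq)))                ; evict the largest, keep the n smallest
    (sort-by keyfn (vec (.toArray pq)))))

;; e.g. (top-n 3 identity [5 1 4 2 8 3]) => (1 2 3)
```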
@niklas.collin so, I haven’t gotten into a case where I had to implement this yet, but this problem bothered me a bit and I thought of two solutions:
(1) if there is only one ordering you care about (e.g. a newsfeed, where you want to retrieve the top N entries), store the data in a format which makes that specific query efficient, e.g. a linked list
(2) if (1) does not work (e.g. because you need to sort arbitrarily), build a materialised view of your database using the tx report queue
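a hedged sketch of (2), assuming an invented `:entity/score` attribute to order by (error handling and shutdown of the consumer are omitted):

```clojure
;; maintain an in-memory index ordered by :entity/score, kept up to
;; date from Datomic's transaction report queue
(def by-score (atom (sorted-map)))   ; score -> set of entity ids

(defn start-view! [conn]
  (let [queue (d/tx-report-queue conn)]
    (future
      (loop []
        (let [{:keys [tx-data db-after]} (.take queue)]
          (doseq [[e a v _tx added?] tx-data
                  :when (= :entity/score (d/ident db-after a))]
            (swap! by-score
                   (fn [m]
                     (if added?
                       (update m v (fnil conj #{}) e)
                       (update m v disj e)))))
          (recur))))))

;; LIMIT 10 over the materialised ordering:
;; (->> @by-score vals (mapcat identity) (take 10))
```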
@hmaurer both good ideas but unfortunately not usable in this case, thanks for your input and ideas though 👍
I was trying to look for information on AWS cross-region fail-over support. https://github.com/awslabs/dynamodb-cross-region-library from Amazon can clone from one DynamoDB table to another across regions. And DynamoDB streams (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) guarantee that the records appear once and in sequence. Does this satisfy the consistent copy options for HA in Datomic, or is something else missing? http://docs.datomic.com/ha.html#other-consistent-copy-options
Hi, Datomic Team! I would like to recommend a new host DB for Datomic to you: Tarantool.
- Sub-1 ms latency
- 100K–300K QPS per CPU core
- 100K updates per node
- Small number of nodes (money saver)
- Expiration
- Always up, no maintenance windows
- Optimized for heavy parallel workloads
- Full ACID DB
You can avoid HornetQ, because Tarantool can work as a queue too.
Tarantool is a cache + ACID DB in one solution. Proven in production for many years on high-load services: Badoo, Avito, http://Mail.ru
It has an app server inside the DB, so you can write stored procedures in a high-level language.