#datomic
2017-10-27
Empperi09:10:51

I was talking about LIMIT functionality and the lack of it in Datomic here a few days ago. I want to continue a bit on the subject and say that after thinking about this in detail I can totally get the basic reason why it is not supported. However, I think it should be relatively easy to support if the query API returned a lazy sequence instead of an eager one. So, my question goes to that department now: why exactly are the query API results eager and not lazy?

Empperi09:10:48

I’m guessing it has something to do with the different indices within Datomic, the query planner, and reliably combining the datasets from those indices, but these are mostly just guesses and I would love to hear some ideas

Empperi09:10:24

I would guess that the where clauses are handled via reducers within Datomic (that would just make sense) and based on that assumption creating a lazy sequence shouldn’t be too much of a problem

Empperi09:10:26

but I’m pretty certain I’m missing something here, otherwise we would be receiving lazy sequences already. I want to understand the internals of Datomic a bit better so that I can circumvent its limitations and use its advantages more efficiently

augustl09:10:17

I would imagine it's something that could be included in the query engine if you're OK with limiting when a certain condition is met and the ordering is "whatever the order the query engine iterates the indices in"

augustl09:10:36

it is fundamentally walking a lazy tree of chunks, after all

Empperi09:10:43

I actually think it is not, as long as the result set is eager

Empperi09:10:09

because the where clauses are applied one by one

Empperi09:10:21

so in order to do the LIMIT you need to apply all of them

Empperi09:10:00

and at that point you already have all the data processed without the LIMIT, and since it is in memory due to the peer cache, what’s the point in returning just a subset? Just return it all and let the client do the limit functionality
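(As code, that client-side limiting amounts to a `take` over the eager result. A minimal sketch, assuming a connected Datomic peer and a hypothetical `:person/name` attribute; `d/q` itself is eager, so this only trims what the caller sees:)

```clojure
(require '[datomic.api :as d])

;; d/q realizes the full result set in the peer's memory before
;; returning, so this `take` does not reduce the work done by the
;; query engine; it only limits what we hand back to the caller.
(defn limited-names
  [db limit]
  (->> (d/q '[:find ?name
              :where [?e :person/name ?name]]
            db)
       (take limit)))
```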

Empperi09:10:35

but if this processing were lazy then you could do a depth-first traversal of the where clauses instead of breadth-first (which I guess is currently happening)

Empperi09:10:06

then one could just simply do (take limit query-results) at Datomic level

Empperi09:10:21

and it would work exactly the way people would want it to work
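(For cases where laziness matters today, the lower-level datoms API, unlike `d/q`, does walk an index lazily, so a limit can short-circuit the traversal. A sketch, again assuming a hypothetical `:person/name` attribute, which must have `:db/index true` for the `:avet` index to cover it:)

```clojure
(require '[datomic.api :as d])

;; d/datoms walks an index lazily, chunk by chunk, so taking `limit`
;; entries only touches the index segments it actually needs,
;; instead of realizing a full query result first.
(defn first-n-named-entities
  [db limit]
  (->> (d/datoms db :avet :person/name)
       (map :e)
       (take limit)))
```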

Empperi09:10:40

but, actually just got another idea why it is like this: to optimize the peer cache population

Empperi09:10:45

right, it must be actually because of that

Empperi09:10:21

because you actually want to do the breadth-first traversal: that way, after doing the first where clause you know the absolute worst case of data you’re going to need in order to do the rest of the query

Empperi09:10:42

then you can retrieve that from the storage backend with one sweep and do the rest of the stuff in memory

jfntn19:10:02

We’re currently running an index creation migration and would like to get a sense of how long it will take to finish, but I’m not sure how to check on that?

currentoor19:10:14

In the docs it says the client library can support non-JVM languages. Are there any examples of that? Can we, for example, use Datomic’s new client library from a Ruby process? http://docs.datomic.com/architecture.html#clients