datomic 2016-04-12 | Slack Archive

Any idea what I’m doing wrong with this pull query?

(d/q `[:find [(pull ?person [:odata/first_name :odata/last_name])]
       :where [?person :odata/ssn "343-34-3434"]] (d/db conn))

Can’t seem to get anything other than an "Argument [:odata/first_name :odata/last_name] in :find is not a variable" exception being thrown

dominicm08:04:17

I'm not sure, but is it possible you mean :find (pull ?person...

gurdas09:04:50

Getting the same exception when omitting the wrapping vector

dominicm11:04:21

@gurdas: I'm not sure if this is it either, but I tend to use the ' not the ` when writing datomic queries, I can never remember the distinction, but it may have some effect.

dominicm11:04:09

@gurdas: yeah, just tried locally, that looks like it's the problem.

Lambda/Sierra12:04:13

Syntax-quote (backquote) is used when you want to substitute in some values. It also automatically namespace-qualifies any bare symbol, which makes it impractical for Datomic queries.
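The difference can be seen in plain Clojure, without Datomic on the classpath. The sketch below reads the query from the top of this thread with both quote styles and inspects the symbol at the head of the :find element. `find-head` is a helper made up for this illustration.

```clojure
;; Plain quote leaves every symbol exactly as written.
(def quoted-query
  '[:find [(pull ?person [:odata/first_name :odata/last_name])]
    :where [?person :odata/ssn "343-34-3434"]])

;; Syntax-quote qualifies bare symbols with the namespace the form is read in.
(def syntax-quoted-query
  `[:find [(pull ?person [:odata/first_name :odata/last_name])]
    :where [?person :odata/ssn "343-34-3434"]])

(defn find-head
  "Hypothetical helper: returns the first symbol of the first :find element."
  [query]
  (-> query (nth 1) first first))

(find-head quoted-query)                     ; the bare symbol pull
(namespace (find-head quoted-query))         ; nil
(namespace (find-head syntax-quoted-query))  ; the current namespace's name
```

With the plain quote the query parser sees the bare symbol `pull`; with syntax-quote it sees a namespace-qualified symbol instead, and `?person` is qualified in the same way.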

dominicm12:04:44

@stuartsierra: I'm guessing that pull was becoming some.namespace/pull then? and therefore, it was looking for variables to pass in, not values?

Lambda/Sierra12:04:56

possibly

dominicm12:04:01

I guess pull is a "special form" if that is true.

Lambda/Sierra12:04:37

“special form” generally means symbols that are built-in to the Clojure compiler, so nothing to do with Datomic really.

dominicm12:04:48

A special form within the datomic datalog query syntax. I might be wrong, I don't understand datomic's rules engine, and all the custom functions you can implement.

Lambda/Sierra13:04:00

yes, pull is something Datomic's datalog parser recognizes.

Ben Kamphaus13:04:29

Just jumped in here right after typing this up: http://stackoverflow.com/a/36574439/3801886 — that answer has the link you’re looking for, to the grammar ( http://docs.datomic.com/query.html#grammar ) and pull expressions ( http://docs.datomic.com/query.html#pull-expressions ).

casperc15:04:58

I need to implement limit and paging functionality on resultsets that can potentially be in the millions. Is there a recommended way of doing this, since it is not directly supported by datomic? The best thing I can come up with is doing the query finding only entity ids, doing the limiting and then doing pulls on the limited resultset.

casperc15:04:10

Is there any better way, or will this actually perform well even if the query matches a lot of entities?
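A minimal sketch of the approach described above: query for entity ids only, cut the page out of the ids, then pull just that page. `page-ids` is a made-up helper, and `conn` and the `:person/*` attributes in the `comment` form are assumptions for illustration. The Datomic calls sit inside `comment` so the helper loads without Datomic on the classpath.

```clojure
(defn page-ids
  "Sorts entity ids for a stable (if arbitrary) order, then keeps one page."
  [eids offset limit]
  (->> eids sort (drop offset) (take limit) vec))

(comment
  ;; Wiring with the Datomic peer API; `conn` and :person/* are hypothetical.
  (require '[datomic.api :as d])
  (let [db   (d/db conn)
        eids (d/q '[:find [?e ...]
                    :where [?e :person/last-name]]
                  db)]
    ;; Only the entities on the requested page are pulled.
    (d/pull-many db
                 [:person/first-name :person/last-name]
                 (page-ids eids 40 20))))
```

Note that the query still materializes every matching entity id; only the pull step is limited to the page.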

Lambda/Sierra15:04:02

@casperc: That's a good start.

Lambda/Sierra15:04:25

Another possibility is to start with iteration, using d/datoms or d/index-range, get batches of candidate entities, then use queries to filter them.

Lambda/Sierra15:04:01

Stop as soon as you have a “page” full of results.
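The iteration idea might look roughly like this. `first-page` is a hypothetical helper that takes any seq of datom-like values (anything that answers `:e`) plus a function that says which ids in a batch qualify. The Datomic wiring in the `comment` form assumes a `conn` and made-up `:person/*` attributes.

```clojure
(defn first-page
  "Walks `datoms` in index order, filters each batch of entity ids with
  `matching-ids` (a function from a batch to the set of ids that qualify),
  and returns the first `page-size` ids that pass."
  [datoms matching-ids batch-size page-size]
  (->> datoms
       (map :e)
       dedupe                          ; adjacent repeats of the same entity
       (partition-all batch-size)
       (mapcat (fn [batch]
                 (let [keep? (matching-ids batch)]
                   (filter keep? batch))))  ; filter keeps index order
       (take page-size)
       vec))

(comment
  ;; Hypothetical wiring: iterate one attribute, filter batches with a query.
  (require '[datomic.api :as d])
  (let [db (d/db conn)]
    (first-page (d/datoms db :aevt :person/last-name)
                (fn [batch]
                  (set (d/q '[:find [?e ...]
                              :in $ [?e ...]
                              :where [?e :person/active true]]
                            db batch)))
                1000
                20)))
```

Because the pipeline is lazy, batches are only queried until the page is full, although chunking in Clojure's lazy seqs can cause a few extra batches to be processed.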

casperc15:04:58

@stuartsierra: Hmm, and use the list of entity ids as input for the query?

Lambda/Sierra15:04:16

yes

Lambda/Sierra15:04:34

But I would try the straightforward query approach first.

casperc15:04:53

Thanks, I’ll do that

casperc15:04:58

I would hope that limit and/or paging is added as a built in functionality eventually though. Seems a bit strange to have to implement it yourself tbh.

Ben Kamphaus15:04:15

+1 to @stuartsierra ’s suggestions. A lot of this depends on the shape of your query. I.e. if it’s equivalent to filtering by Attribute or Attribute + Value and you want to get the same attributes from each entity, you could use d/datoms or d/index-range filtered to match the queries pretty easily and then map a d/pull for attributes. But definitely stick with query unless you’re sure it is/will be a performance bottleneck as it’s the simplest approach.

Ben Kamphaus15:04:37

Understood re: limit/offset/order by equivalents, it’s a feature request under consideration.

casperc15:04:25

+1 to that feature request from here then 😉

Ben Kamphaus15:04:19

There are some obvious advantages to expressing those in declarative form, but do note that the work for it (perf cost) will still be done in the peer given Datomic’s architecture.

casperc15:04:19

I understand, but it is natural to compare features with other databases and through that lens it seems like an oversight.

casperc15:04:48

(though it is still completely doable)

casperc15:04:23

Another way to address it would be to add a section on how to do paging/limits to the best practices section of the docs. This way it would at least be addressed, which I haven’t seen anywhere currently.

Ben Kamphaus15:04:16

Yep, like I said we’re investigating adding it 😉 It does run into the difference between “get everything done in one query” vs “use queries and other datomic api to build composable pieces of your application” approaches, which runs into the difference of Datomic being a db in your app vs. having roundtrip pressure inform a lot of design decisions. I definitely understand the request and that there’s potential value, just trying to point to some of the context that informs how features are prioritized.

dominicm15:04:29

@bkamphaus: Just so I'm sure that I understand correctly, are you saying that any implementation of limit/offset/sort would be similar/the same as what Stuart suggested for this problem?

Ben Kamphaus15:04:59

@dominicm: nope not offering any precise comments specifically on what the implementation would look like. Just that it would necessarily involve sorting query results on the peer/in the app (because query work isn’t done on some server somewhere).

Ben Kamphaus15:04:30

Obviously you can skip a sorting step if you are mapping pull or entity over a seq from datoms, provided you can guarantee that (1) your query amounts to filtering by attr or attr + value, (2) you’re ordering by value, (3) you have the index set for the attr, and (4) the sort order of datoms aligns precisely with the sort order you want.

dominicm15:04:38

@bkamphaus: ah, so it would be possible for the datomic peer to be smart about things like sorting then. That would make it a worthwhile feature to implement.

Lambda/Sierra15:04:29

Yeah, right now Datomic doesn't have any built-in support for “secondary indexes,” another oft-requested feature. That would probably make it easier to support efficient limit/offset. But you can create your own secondary index by adding another attribute with a computed value (e.g. “Join Date, Lastname, Firstname”) if you need to maximize the efficiency of iterating over that specific thing.
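One way to picture the computed-value attribute: build a single string whose lexicographic order matches the order you want to page in. This is only a sketch. `sort-key`, the tab separator, and the `:person/sort-key` attribute (assumed to be indexed) are all choices made for this illustration.

```clojure
(require '[clojure.string :as str])

(defn sort-key
  "Builds a string that sorts by join date, then last name, then first name.
  `join-date` is assumed to be an ISO-8601 string such as \"2016-04-12\",
  which sorts chronologically when compared as text. A tab separates the
  parts because it sorts before any printable character, so \"Smith\"
  stays ahead of \"Smithson\"."
  [join-date last-name first-name]
  (str/join "\t" [join-date last-name first-name]))

(comment
  ;; Hypothetical wiring: :person/sort-key is an assumed indexed attribute
  ;; that the application keeps in sync whenever the source values change.
  (require '[datomic.api :as d])
  (->> (d/index-range (d/db conn) :person/sort-key "2016-04" nil)
       (map :e)
       (take 20)))
```

The cost of this design is that the application has to rewrite the computed value on every change to any of its parts.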

Lambda/Sierra15:04:06

Still though, start with simple queries and see if performance is adequate first.

dominicm16:04:43

Datomic is pretty good for paging now that I think about it. If you've ever used reddit, you'll notice that on page 2+ they include an "after" as part of the query params, which is a reference to the last posting on the previous page. This obviously doesn't work perfectly if the last posting has changed position, but because datomic's database is a value, you would just have an "at" parameter for the point in time at which page 2 should be evaluated.
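A rough sketch of that "after" plus "at" cursor. `page-token`, `next-page`, and `ordered-post-ids` are hypothetical names. The Datomic part, shown in the `comment` form, relies on `d/basis-t` to capture the point in time of the first page and `d/as-of` to view the database at that point again.

```clojure
(defn page-token
  "Cursor handed back to the client: the basis t the first page was
  computed at, plus the id of the last item on the previous page."
  [basis-t last-id]
  {:at basis-t :after last-id})

(defn next-page
  "Returns up to `limit` ids that follow the cursor's :after id in
  `ordered-ids`. Returns an empty vector if the id is not present."
  [ordered-ids {:keys [after]} limit]
  (->> ordered-ids
       (drop-while #(not= % after))
       rest
       (take limit)
       vec))

(comment
  ;; Hypothetical wiring; `ordered-post-ids` stands in for whatever query
  ;; and sort produce the listing.
  (require '[datomic.api :as d])
  (let [db     (d/db conn)
        page-1 (take 20 (ordered-post-ids db))
        token  (page-token (d/basis-t db) (last page-1))
        ;; Later request: same listing order as page 1, whatever has changed.
        old-db (d/as-of (d/db conn) (:at token))]
    (next-page (ordered-post-ids old-db) token 20)))
```

Since the ordering is computed against the as-of database, the last item of page 1 cannot have moved by the time page 2 is requested.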

dominicm16:04:46

You might want to get the "current" value of those entities, for changes to upvotes or deletions (spam). That would also be pretty easy in datomic. I like it.

gurdas18:04:44

@stuartsierra: @dominicm Thanks for the clarification; didn’t know about the differences between backtick and single quote when it came to namespace resolution
