#datomic
2017-03-17
thomas12:03:44

hi, I have been running into the problem where Datomic Entities are not maps....

thomas12:03:26

and I have found this bit of code, https://gist.github.com/jtmarmon/0a644fbca15a1742964c, that seems to solve it (almost)

thomas12:03:52

but now I get NPEs

thomas12:03:35

I added a (if (nil? e)... to my code, but then it looks like I get way too much data.

thomas12:03:46

what is the best way of dealing with this?

Lambda/Sierra13:03:40

@thomas What are you trying to accomplish? If you want maps to start with, d/pull may be an easier API to use, since it does return normal Clojure maps.

Lambda/Sierra13:03:01

As a side note, (-> e class .toString (clojure.string/split #" ") second (= "datomic.query.EntityMap")) is a complicated way to write instance?.

thomas13:03:37

I am using a rather old version of datomic... so can't really use d/pull 😞

thomas13:03:02

and yes... re the second item I think you are correct.

thomas13:03:10

and I am trying to do a postwalk on the EntityMap, and that was causing problems.

thomas13:03:37

then @dominicm pointed out that EntityMaps are not Clojure maps... there is a (into {} entity) in the code, but from what I understand that won't work on nested maps

thomas13:03:52

So what is the best way of turning EntityMaps into Clojure maps? With (into {} entity) I can't do a postwalk and with the above mentioned function I am getting different results it seems and those result in failure higher up the stack.

dominicm14:03:18

(clojure.walk/postwalk (fn [x] (if (instance? datomic.query.EntityMap x) (into {} x) x)) input)
Does this work @thomas?

thomas14:03:58

let me try

thomas14:03:27

no, that gives me a java.lang.AbstractMethodError again 😞

thomas14:03:49

from what I understand a prewalk should do the trick....

thomas14:03:51

let me try that

thomas14:03:54

that looks like it...

thomas14:03:17

prewalk instead of postwalk... let me try that in our slightly bigger env

thomas14:03:44

that looks ok!

thomas14:03:00

prewalk it is...

thomas15:03:46

Now it is working as expected.

dominicm15:03:40

oh, of course. Makes sense.

thomas16:03:49

most things make sense after you have fixed them 😉
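A likely explanation for why prewalk works where postwalk fails: postwalk descends into a node before applying the function, so clojure.walk tries to rebuild the entity itself with (empty form), which the entity does not support. prewalk applies the function first, so the entity is already a plain map by the time the walk descends into it. The sketch below illustrates this without Datomic on the classpath. FakeEntity is a hypothetical stand-in for datomic.query.EntityMap (not the real class), and it throws UnsupportedOperationException rather than AbstractMethodError.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical stand-in for datomic.query.EntityMap: it can be seq'd as
;; key/value pairs, but it cannot produce an empty copy of itself.
(deftype FakeEntity [m]
  clojure.lang.Seqable
  (seq [_] (seq m))
  clojure.lang.IPersistentCollection
  (count [_] (count m))
  (cons [_ o] (throw (UnsupportedOperationException. "read-only")))
  (empty [_] (throw (UnsupportedOperationException. "empty not supported")))
  (equiv [this other] (identical? this other)))

(defn entity->map
  "Recursively converts FakeEntity values (including nested ones) into
  plain Clojure maps. prewalk converts each node before descending into it."
  [x]
  (walk/prewalk
    (fn [node]
      (if (instance? FakeEntity node)
        (into {} node)
        node))
    x))

(def nested
  (FakeEntity. {:player/name   "a"
                :player/region (FakeEntity. {:region/key :eu})}))

(entity->map nested)
;; => {:player/name "a", :player/region {:region/key :eu}}
```

With real entities the predicate would be (instance? datomic.query.EntityMap node), as in the postwalk snippet above. Note that converting eagerly this way realizes every attribute reachable from the entity, so cyclic references would need extra care.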

a.espolov16:03:33

Guys for query [:find (max 5 ?e) (pull ?r [*]) :with ?player :where [?player :player/region ?r] [?player :player/attr.stats.wins ?e]]

a.espolov16:03:36

Do I understand correctly that, in order to get all of the attributes for the :player entity, I'll have to write my own aggregation function?

favila17:03:47

I don't understand the question

favila17:03:03

what is ?e here? why are you calling max on it?

a.espolov17:03:31

@favila I want to choose the 5 entities that have the highest values of the attribute :player/attr.stats.wins

a.espolov17:03:16

the query works, but instead of one attribute (:player/attr.stats.wins) I am interested in the whole :player entity

favila17:03:48

what is the type of :player/attr.stats.wins?

favila17:03:11

cardinality-one?

favila17:03:27

so you want (d/pull db [* {:player/region [*]}] ?player) for the 5 players with the most wins?

a.espolov17:03:20

@favila sorry, here is the full query: (d/q '[:find (max 5 ?e) (pull ?r [:region/key]) :with ?player :where [?player :player/attr.stats.wins ?e] [?player :player/region ?r]] (d/db conn))

favila17:03:49

top 5 players per region?

favila17:03:13

[:find (max 5 ?n-wins) ?r (pull [*] ?player)
         :where [?player :player/attr.stats.wins ?n-wins]
                      [?player :player/region ?r]

favila17:03:03

sorry hit enter too soon

favila17:03:08

is that what you are thinking?

favila17:03:45

since you are aggregating two different ways I'm not sure you can do it in a single query

a.espolov18:03:37

@favila the query above does not work: 'invalid expression pull'

favila18:03:01

did you remove the :with?

favila18:03:17

I don't think that query gives you what you want anyway

favila18:03:30

oh, you have to reverse arg order

favila18:03:37

man I hate that so much, I always get that wrong

a.espolov18:03:42

Actually, I would settle for getting just the :db/id of each :player

favila18:03:46

d/pull and pull reverse arg order

a.espolov18:03:04

@favila I don't quite understand how to use that. Would it be difficult to show a minimal example?

favila18:03:49

anything in :with is kept for result-set, but removed for aggregation in :find

favila18:03:06

so you can't both aggregate by a subset and also keep the full tuple

favila18:03:21

So you must either query twice, or do the aggregation yourself afterwards

a.espolov18:03:01

@favila Thank you, but I still don't understand how, with two queries, you can get at least the :db/id of the :player with the highest number of wins

favila18:03:52

In first query, you get [?region #{?max-five-wins}]

favila18:03:08

in second query, you find players whose regions and wins match

favila18:03:32

you may get more than 5 results per region if multiple players have the same number of wins

favila18:03:47

but you can sort+limit that yourself

favila18:03:59

I will write an example

favila18:03:42

(let [max-wins-per-region           (d/q '[:find ?r (max 5 ?n-wins)
                                           ;; No point to :with ?player
                                           :where
                                           [?player :player/attr.stats.wins ?n-wins]
                                           [?player :player/region ?r]] db)
      max-wins-per-region-relations (into []
                                      (mapcat (fn [[region-id n-wins]]
                                                (map #(do [region-id %]) n-wins)))
                                      max-wins-per-region)]
  (d/q '[:find ?r-key (pull ?player [*])
         :in $ [[?r ?n-wins]]
         :where
         [?player :player/region ?r]
         [?r :region/key ?r-key]
         [?player :player/attr.stats.wins ?n-wins]]
    db
    max-wins-per-region-relations))
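The mapcat step in the example above can be tried without a database. This is a minimal sketch of just that reshaping: it assumes the first query returns rows of [region-id collection-of-wins], and the ids and win counts below are made-up sample values. The function name region-wins->relation is hypothetical.

```clojure
(defn region-wins->relation
  "Flattens rows of [region-id [n-wins ...]] into [region-id n-wins] pairs,
  the shape expected by a relation binding such as [[?r ?n-wins]]."
  [rows]
  (into []
        (mapcat (fn [[region-id wins]]
                  (map (fn [w] [region-id w]) wins)))
        rows))

(region-wins->relation [[101 [10 9]]
                        [102 [7]]])
;; => [[101 10] [101 9] [102 7]]
```

Each resulting pair becomes one row of the relation passed as the second input, so the second query only matches players whose region and win count both appear in a pair.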

favila18:03:20

(I assume :player/region is cardinality-one? players have only one region?)

a.espolov18:03:56

the answer to both questions is yes

favila18:03:18

so here is another approach

a.espolov18:03:16

@favila thanks) I have learned a lot of new things in a short time

favila18:03:55

You can always do the aggregation yourself, too, if you can hold the intermediate result set

favila18:03:42

query that returns [?player-id ?region-id n-wins], then do a reduce over it

favila18:03:25

in that case the player cutoff will be arbitrary. If 6 players in a region all have 10 wins, which player is not listed?
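A minimal sketch of doing the aggregation in plain Clojure, assuming the query result is a collection of [player-id region-id n-wins] tuples. The function name top-n-per-region and the sample tuples are hypothetical. It uses group-by plus reduce-kv rather than a single reduce, and ties at the cutoff are broken by input order, which is exactly the arbitrary cutoff mentioned above.

```clojure
(defn top-n-per-region
  "Given tuples of [player-id region-id n-wins], returns a map of
  region-id -> vector of the n tuples with the most wins in that region."
  [n tuples]
  (reduce-kv
    (fn [acc region rows]
      ;; sort descending by n-wins (third element), keep the first n
      (assoc acc region (vec (take n (sort-by #(nth % 2) > rows)))))
    {}
    (group-by second tuples)))

(top-n-per-region 2 [[1 10 5]
                     [2 10 9]
                     [3 20 4]
                     [4 10 7]])
;; => {10 [[2 10 9] [4 10 7]], 20 [[3 20 4]]}
```

The player ids in the result can then be passed to d/pull (or a second query) to fetch the full player data.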

favila18:03:47

if that doesn't fit in memory, you can use (d/datoms db :aevt :player/region) and reduce over it, grabbing player data and aggregating at the same time

favila18:03:03

Remember, queries run on the peers. You can do everything with code instead of queries if you want. The client is going to download every intermediate value either way.

favila18:03:33

There's no need to fit everything into queries, certainly not into one single query

a.espolov18:03:56

@favila with 305,000 players in the database, the query to get the top players takes 2.2 minutes in total

favila19:03:59

@a.espolov if you have an index on :player/attr.stats.wins, it may be faster to put that clause first

favila19:03:27

(depends on the selectivity of the index)

a.espolov19:03:53

@favila can you add an index to an attribute if entities with that attribute already exist in the database?

favila19:03:45

If :db/index = false, you can add an index

favila19:03:04

By selectivity I just mean how many records you will get for a given value

favila19:03:47

e.g., if number-of-wins clusters around a few common values, it may touch fewer datoms to find the region then the number of wins. But if number of wins tends to vary widely the index is more selective, so getting by number of wins, then the region will touch fewer datoms

a.espolov19:03:08

@favila can I disable the index on an attribute without problems for the existing records in the database?

favila19:03:20

@a.espolov I don't understand?

a.espolov20:03:27

Changing :db/index = true to :db/index = false: does that operation run without problems for the data already in the database?

favila20:03:15

you can change :db/index by itself at will

favila20:03:30

But removing an index in your scenario will not help query performance. The reasons to remove an index are: less storage for the index, less indexing time.

favila20:03:00

I think it's usually better to index everything, then turn it off later if either of those becomes a problem