#datomic
2023-01-17
icemanmelting12:01:30

Hi guys, I was testing Datomic on-prem with a Postgres backend, and I have a few million datoms inserted that I would like to count. I have come up with the following query: [:find (count ?e) :where [?e :tweet/id]] Is there a reason for this query to take so long that it either runs out of memory or never returns a result?

favila12:01:03

The reason is that d/q is eager and must hold the entire result in memory before aggregation. It is not useful for large results like this. Use d/datoms instead

icemanmelting12:01:53

I was using the console to do that query. Do you know how I would go about using d/datoms in the Datomic console?

favila12:01:24

I don’t think you can. Use a repl

icemanmelting12:01:34

ok, thanks for your help

jaret14:01:01

also if you would like the total count of datoms in the history database you can use https://docs.datomic.com/on-prem/clojure/index.html#datomic.api/db-stats
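
A minimal sketch of reading counts out of the db-stats result. It assumes the returned map has a :datoms key (total datoms, history included) and, on recent releases, an :attrs map of attribute ident to {:count N}; the :attrs key is an assumption to check against your version. The helper name stats->counts and the sample numbers are made up for illustration.

```clojure
;; Pure helper over the map returned by db-stats; no Datomic dependency here.
(defn stats->counts
  "Pulls the total datom count and one attribute's count out of a stats map."
  [stats attr]
  {:total-datoms (:datoms stats)
   :attr-count   (get-in stats [:attrs attr :count])})

;; Hypothetical sample shaped like db-stats output (numbers are invented).
(def sample-stats
  {:datoms 5000000
   :attrs  {:tweet/id {:count 1200000}}})

(stats->counts sample-stats :tweet/id)

;; At a REPL with the peer library loaded (not run here):
;; (require '[datomic.api :as d])
;; (stats->counts (d/db-stats (d/db conn)) :tweet/id)
```

If :attrs is missing in your version, :attr-count simply comes back nil and the d/datoms approach is still needed for a per-attribute count.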

icemanmelting14:01:17

thanks for that 🙂

icemanmelting16:01:58

It also didn’t help that my id attribute wasn’t set as unique…

icemanmelting16:01:12

@U09R86PA4 question, I have used datoms like so

(d/datoms (d/db (d/connect client {:db-name "twitter"})) {:index :aevt :components [:tweet/id]})

icemanmelting16:01:59

But how can I know how many elements there are in that structure? If I try to iterate and increment a counter, it always returns 1000, which I think might be a default chunk size

favila16:01:42

if you are using the sync api:

favila16:01:43

(count (seq (d/datoms (d/db (d/connect client {:db-name "twitter"})) {:index :aevt :components [:tweet/id] :limit -1})))

favila16:01:00

add :limit -1 and just coerce to a seq and count it

icemanmelting16:01:39

ok, that was what I was missing, thanks!

Phillip Mates16:01:18

Hiya, I'm looking to create a heterogeneous tuple and am confused about how to properly read tuples from entities afterwards. For example, I have

{:db/ident :my/tuple
  :db/valueType :db.type/tuple
  :db/tupleTypes [:db.type/long :db.type/ref]
  :db/cardinality :db.cardinality/one}
and if I transact
(datomic.api/transact datomic-connection [{:my/id 123 :my/tuple [12 [:my/user "foofoo"]]}])
I get a set, not an ordered sequence, for :my/tuple
(datomic.api/entity (datomic.api/db datomic-connection) [:my/id 123])
; =>
{:my/id 123 :my/tuple #{17592186045528 12}}
How do I then get the first or second tuple values from the entity? Also curious why it is a set in the first place. I know you can do this with a custom Datomic query + the untuple query function thing, but I'm not sure if that plays well with a query returning a collection that also includes the entities. Plus I'm curious about the possibility of doing this at the entity level. Thanks!

Phillip Mates11:01:43

I found a workaround by not going through the entity API. For instance, I wanted to do

;; users with name "foofoo" aged 12-13
(->> (datomic.api/q
       '[:find ?my-entity
         :in $
         :where
         [?my-entity :my/tuple ?tup]
         [?user :my/user "foofoo"]
         (or [(tuple 12 ?user) ?tup]
             [(tuple 13 ?user) ?tup])]
       (datomic.api/db datomic-connection))
     (map (partial datomic.api/entity (datomic.api/db datomic-connection)))
     (map (fn [entity] {:age (get-in entity [:my/tuple 0])
                        :my/id (:my/id entity)})))
but the fact that :my/tuple results in a set means (get-in entity [:my/tuple 0]) is non-deterministic with regard to which element of the tuple it evaluates to.
After re-reading the different options available to :find (https://docs.datomic.com/on-prem/query/query.html#find-specifications), I realized that I could do:
(->> (datomic.api/q
       '[:find ?my-id ?age
         :in $
         :where
         [(untuple ?tup) [?age _]]
         [?my-entity :my/id ?my-id]
         [?my-entity :my/tuple ?tup]
         [?user :my/user "foofoo"]
         (or [(tuple 12 ?user) ?tup]
             [(tuple 13 ?user) ?tup])]
       (datomic.api/db datomic-connection))
     (map (fn [[id age]] {:age age
                          :my/id id})))
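
A small plain-Clojure sketch of why the set matters for (get-in entity [:my/tuple 0]). No Datomic is involved; the maps below are hypothetical stand-ins for the entity, with the entity id reused from the earlier message. On a set, get is a membership lookup by value, not a positional read, so positional access only works once the value is a vector.

```clojure
;; Stand-in for the entity as seen through the wrapper: tuple shows up as a set.
(def entity-with-set {:my/id 123 :my/tuple #{17592186045528 12}})

;; Stand-in for the expected shape: tuple as an ordered vector [long ref].
(def entity-with-vec {:my/id 123 :my/tuple [12 17592186045528]})

;; On a set, the key 0 is looked up as a member; 0 is not in the set.
(def from-set (get-in entity-with-set [:my/tuple 0]))

;; On a vector, the key 0 is an index.
(def from-vec (get-in entity-with-vec [:my/tuple 0]))

;; Destructuring reads both positions of the vector at once.
(def age-and-user
  (let [[age user-eid] (:my/tuple entity-with-vec)]
    {:age age :user user-eid}))
```

Here from-set is nil, from-vec is 12, and age-and-user is {:age 12, :user 17592186045528}.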

Phillip Mates11:01:38

hmm, I tried for a minimal reproduction and actually am not able to get it to return a set instead of a vector. So I guess it is some issue on my side with intermediate wrapper logic

Phillip Mates11:01:44

(ns tuple-fun
  (:require [datomic.api :as d]))


(def schema
  [{:db/ident :oblique/text
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one}
   {:db/ident :oblique/strategy
    :db/valueType :db.type/tuple
    :db/tupleTypes [:db.type/long :db.type/ref]
    :db/cardinality :db.cardinality/one}])

(defn setup-connection! []
  (let [uri "datomic:"
        _ (d/create-database uri)
        conn (d/connect uri)]
    @(d/transact conn schema)
    conn))

(def conn (setup-connection!))

@(d/transact conn [{:db/id "os1" :oblique/text "Only a part, not the whole"}
                   {:oblique/strategy [1 "os1"]}])

(def oblique-entity
  (d/entity (d/db conn)
            (d/q '[:find ?oblique-strategy .
                   :where [?oblique-strategy :oblique/strategy _]]
                 (d/db conn))))

(:oblique/strategy oblique-entity)
;; results in a vector as expected

moe22:01:36

https://tonsky.me/blog/datascript-internals/ is a great write-up. I was wondering if anyone was aware of other resources (articles, papers, clearly written code) which may provide additional insight into what's going on under the covers when unifying queries and esp. accessing/representing history in any Datalog implementation.

moe22:01:14

i.e. I'm specifically looking for implementation details

jjttjj01:01:49

It's a video, and coming at things from the opposite end vs that datascript writeup (more emphasis on how to fit it into their distributed architecture), but I always liked this talk about #xtdb (formerly crux): https://www.youtube.com/watch?v=YjAVsvYGbuU

moe02:01:33

thanks for that