#datomic
2022-07-13
Athan22:07:43

Hi, I am exploring the Datomic system partition while learning its query language, and I have a couple of questions. Here is a simple query that fetches all system idents:

(sort-by first (d/q '[:find ?e ?ident
                      :where
                      [?e :db/ident ?ident]
                      [(< ?e 72)]]
                    (d/db conn)))
and here is a modified version to read doc strings
(sort-by first (d/q '[:find ?e ?ident ?description
                      :where
                      [?e :db/ident ?ident]
                      [(< ?e 72)]
                      [?e :db/doc ?description]]
                    (d/db conn)))
1. How can I modify the second query so that it returns the same set of ids as the first one, plus the doc string for those ids that have one? What is the equivalent of SPARQL OPTIONAL (see https://w.wiki/5Sp7)?
2. I noticed there is a gap in the :db/id values:
[4 :db.part/user]
[7 :db/system-tx]
and I tried to pull the following entities with
(d/pull (d/db conn) '[*] 6)
(d/pull (d/db conn) '[*] 7)
Nothing was returned. Is there a particular reason why the entities with [:db/id 6, :db/id 7] do not exist in the Datomic bootstrap schema?
Another question: what if Datomic wants to expand its bootstrap schema in the future, i.e. add extra functionality, system idents, etc.? Have you reserved an id range for that? I counted 71 system idents. Could they collide with a user-defined schema, which always starts from 72? Is that correct, or am I missing something here?

Athan08:07:41

1. It seems I found the equivalent of SPARQL OPTIONAL, which is get-else (see https://docs.datomic.com/on-prem/query/query.html#get-some), and the query becomes:

(sort-by first (d/q '[:find ?e ?ident ?description
                      :where
                      [?e :db/ident ?ident]
                      [(< ?e 72)]
                      [(get-else $ ?e :db/doc false) ?description]]
                    (d/db conn)))

favila15:07:47

Another way to do this is to pull. Generally it's preferred to use query for finding entities and pull to retrieve data from them, rather than to produce a tabular result that combines entity-finding with field extraction.

(d/q '[:find (pull ?e [:db/id :db/ident :db/doc])
       :where
       [?e :db/ident]
       [(< ?e 72)]]
     (d/db conn))

👍 1
favila15:07:21

There is also d/qseq, which performs the pull lazily and can save memory for large results.

👍 1
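
For reference, a minimal sketch of the lazy variant. It assumes the on-prem peer library (`datomic.api`) at a version that includes `d/qseq`, and the same `conn` used earlier in the thread. Unlike `d/q`, `d/qseq` takes a query map with `:query` and `:args` rather than positional arguments:

```clojure
(require '[datomic.api :as d])

;; Assumes `conn` is the connection used earlier in this thread.
;; d/qseq takes a map with :query and :args, like d/query.
(def system-idents
  (d/qseq {:query '[:find (pull ?e [:db/id :db/ident :db/doc])
                    :where
                    [?e :db/ident]
                    [(< ?e 72)]]
           :args  [(d/db conn)]}))

;; The pull is applied as the sequence is consumed,
;; so only the realized part is held in memory.
(take 5 system-idents)
```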
favila15:07:38

and you can parameterize the pull

👍 1
favila15:07:15

(d/q '[:find (pull ?e pull-expr)
       :in $ pull-expr
       :where
       [?e :db/ident]
       [(< ?e 72)]]
     (d/db conn) [:db/id :db/ident :db/doc])

Athan16:07:47

Smashing, thanks Francis, great tips indeed for a newcomer like me. I did a quick benchmark with 1000 repetitions; the version with pull in the find specification took about a third less time, although the data set is too small to draw safe conclusions.
1. Without pull - Elapsed time: 1.233 secs
2. With pull - Elapsed time: 0.827 secs
So it first finds all the eids and then maps a pull over the result set to retrieve the data, something like:

(map #(d/pull (d/db conn) '[:db/id :db/ident :db/doc] %) eids)
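
The same shape can also be written with `d/pull-many`, which takes the whole collection of entity ids in one call. This is only a sketch of an equivalent expression, not a statement about how the query engine implements pull in `:find`; it assumes the on-prem peer API and the `conn` from the thread, and `db` and `eids` are names introduced here for illustration:

```clojure
(require '[datomic.api :as d])

;; Take one database value so the query and the pull see the same data.
(def db (d/db conn))

;; Collection find spec ([?e ...]) returns a flat collection of entity ids.
(def eids
  (d/q '[:find [?e ...]
         :where
         [?e :db/ident]
         [(< ?e 72)]]
       db))

;; Pull the same pattern for every id in a single call.
(d/pull-many db [:db/id :db/ident :db/doc] eids)
```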