#datomic
2016-09-15
magnars10:09:34

@robert-stuttaford Hi! The ever-decreasing index for latest entities is working great, but I stumbled over this from DataScript: https://github.com/tonsky/datascript/wiki/Tips-&-tricks#getting-top-10-entities-by-some-attribute - Quote Nikita: > "Reverse return a special view on an index that allows walking it in the reverse direction. This operation is allocation free and about as fast as direct index walking." Would this be possible in Datomic as well? Or is this possible for DataScript because it keeps everything in memory? Any thoughts? 🙂

robert-stuttaford11:09:17

i'll try it and let you know, @magnars 🙂

robert-stuttaford11:09:09

very quick testing seems to suggest that the same thing works in Datomic, @magnars

robert-stuttaford11:09:18

which is -ing awesome

magnars11:09:32

whoa, that is impressive

robert-stuttaford11:09:40

with a warm cache, counting 50k datoms forwards and backwards takes the same amount of time

robert-stuttaford11:09:42

trying a bigger index

magnars11:09:39

is that using a special trick, or does it work out of the box with reverse?

robert-stuttaford11:09:34

(time (->> (d/datoms (db/db) :aevt :chat.event/client-uuid)
             seq
             reverse
             count))

  ;; with no reverse
  "Elapsed time: 925.572616 msecs"
  3809777

  ;; with reverse
  "Elapsed time: 2667.018015 msecs"
  3809777

robert-stuttaford11:09:58

nearly 3x slower for a larger index

robert-stuttaford11:09:04

but still quite quick

magnars11:09:15

yeah, not bad

robert-stuttaford11:09:21

(time (->> (d/datoms (db/db :events) :aevt :event/uuid)
             seq
             reverse
             count))

  ;; with no reverse
  "Elapsed time: 1492.233743 msecs"
  6576672

  ;; with reverse
  "Elapsed time: 4446.358167 msecs"
  6576672

magnars11:09:54

But the implementation of reverse in clojure.core is not lazy, and looks like (reduce1 conj () coll)

robert-stuttaford11:09:39

that-shrug-emoji 🙂

magnars11:09:28

Nikita says "This operation is allocation free", so he can't be talking about that implementation of reverse.

robert-stuttaford11:09:24

i guess we'll have to ask the fine Cognitects if a similar thing happens for Datomic indexes.. @marshall, @jaret? 🙂

Niki11:09:40

no idea what datomic does

Niki11:09:05

I think you can get a pretty good idea just by trying to iterate reasonably big database in reverse

Niki11:09:03

@robert-stuttaford you clearly don’t want to do seq before reverse

Niki11:09:23

if there were any kind of optimizations that might kill it

Niki11:09:11

user=> (time (first (d/datoms (db/db) :aevt :event/uuid)))
"Elapsed time: 0.322985 msecs"

user=> (time (first (reverse (d/datoms (db/db) :aevt :event/uuid))))
"Elapsed time: 124.845047 msecs"

user=> (time (count (seq (d/datoms (db/db) :aevt :event/uuid))))
"Elapsed time: 188.143681 msecs"

Niki11:09:22

pretty sure there’s no optimization for it

Niki11:09:59

which is a real bummer: all they had to do is implement a reverse iterator

Niki11:09:21

you can write to mailing list and they might consider that for the next version

magnars11:09:05

That's very interesting. Thanks for the answer, @tonsky. I'll keep my ever-decreasing index for now then. 🙂

Niki11:09:50

I remember reading (that was ~2 years ago, but still) that they recommend to store negative values if you want quick access to latest, not first, datoms

Niki11:09:25

we even used to store negative timestamps (as longs) back in those days, for that reason

Niki11:09:02

yeah, that’s the solution
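
A minimal sketch of the negated-timestamp idea, using only clojure.core. A `sorted-map` stands in for a sorted index here; it is not Datomic's index structure, and the event data and the name `newest-first-key` are hypothetical. The point is only that negating the timestamp makes an ordinary ascending walk yield the newest entries first.

```clojure
(defn newest-first-key
  "Negates an epoch-millis timestamp so that newer (larger) timestamps sort first."
  [epoch-millis]
  (- epoch-millis))

;; Hypothetical events as [epoch-millis id] pairs, oldest first.
(def events
  [[1473930000000 :a]
   [1473933600000 :b]
   [1473937200000 :c]])

;; Stand-in for a sorted index, keyed by the negated timestamp.
(def index
  (into (sorted-map)
        (map (fn [[ts id]] [(newest-first-key ts) id]))
        events))

;; An ascending walk now starts at the newest event.
(def latest-two (take 2 (vals index)))
;; latest-two => (:c :b)
```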

Niki11:09:43

I believe a B-Tree index can handle insertions at the head just as well as it does insertions at the tail

Lambda/Sierra14:09:52

The structure of Datomic's indexes in storage makes iterating in reverse non-trivial. When you call reverse you're just using ordinary Clojure sequence reverse, which realizes the whole sequence in memory.
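
A small illustration of that point, using only clojure.core and no Datomic at all. `counting-seq` and `realized-by` are hypothetical helpers: the first is an unchunked lazy sequence standing in for an index walk, the second reports how many elements a given operation forced into existence. Taking the first element realizes one item, while taking the first element of the `reverse` realizes all of them.

```clojure
;; Counts how many elements of the lazy sequence have been realized.
(def realized (atom 0))

(defn counting-seq
  "Hypothetical stand-in for an index walk: an unchunked lazy seq of
  0..n-1 that bumps `realized` each time an element is produced."
  [n]
  (letfn [(step [i]
            (lazy-seq
              (when (< i n)
                (swap! realized inc)
                (cons i (step (inc i))))))]
    (step 0)))

(defn realized-by
  "Runs f on a fresh n-element counting-seq and returns
  [result number-of-elements-realized]."
  [n f]
  (reset! realized 0)
  (let [result (f (counting-seq n))]
    [result @realized]))

(def forward  (realized-by 1000 first))
;; forward  => [0 1]

(def backward (realized-by 1000 #(first (reverse %))))
;; backward => [999 1000]
```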

magnars14:09:21

Thanks for the clarification, Stuart. 👍

iwillig18:09:44

is 100k datoms still considered the rough size limit for datomic transactions?

jgdavey18:09:18

10 billion is the figure I hear most often. Ignore me.

marshall18:09:38

per transaction?

jgdavey18:09:46

^whoops. misread

marshall18:09:02

I wouldn't suggest over 100k. That actually is a bit high, but of course, like everything, it depends

marshall18:09:17

ideally you’re in the thousands to tens of thousands max

marshall18:09:34

per transaction
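
One way to stay within that range is to split a large import into batches before transacting. The sketch below is plain Clojure with a stubbed `transact!` function; the names `transact-in-batches`, `sent`, and the batch size are hypothetical. With a real peer connection, `transact!` could be something like `#(deref (d/transact conn %))`, which is an assumption about the surrounding setup. Note that the batch size counts items of transaction data, which is only an approximation of the datom count, and that splitting gives up atomicity across batches.

```clojure
(defn transact-in-batches
  "Splits tx-data into batches of at most batch-size items and passes each
  batch to transact!. Returns the number of batches sent."
  [transact! batch-size tx-data]
  (reduce (fn [n batch]
            (transact! (vec batch))
            (inc n))
          0
          (partition-all batch-size tx-data)))

;; Stub that records the size of each batch instead of talking to a database.
(def sent (atom []))

(def batch-count
  (transact-in-batches #(swap! sent conj (count %))
                       10000
                       (range 25000))) ; stand-in for real transaction data
;; batch-count => 3
;; @sent       => [10000 10000 5000]
```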

jdkealy19:09:33

I'm trying to excise an entity, though in my tests the entity is available immediately to (d/touch ) on a new db instance... Is there some kind of indexing period when the entity will still be there?

jdkealy19:09:41

I'm successfully excising in other parts of my app... My excised entity does have some ref'd entities that are components, I don't know if that makes a difference.

jaret19:09:16

How many datoms are you excising? At some point after the excision the indexing job runs. The resulting index will no longer contain the datoms excised. The indexing job however is proportional to the size of the entire database.

jaret19:09:49

you can utilize sync-index to force the execution of the indexing job

jdkealy19:09:58

Cool. I'll look into that... It's in my tests, just a single entity with 5 datoms, and a component with 2. I don't totally need to excise it, I'm just a little paranoid after reading about 10B datoms and I'm not particularly worried about preserving history on this individual entity type.

Ben Kamphaus20:09:53

@jdkealy everything in Datomic is optimized around immutability so in general it’s a terrible idea to excise. I’d leave it to the “legally compelled to remove this fact” case.

jaret20:09:28

@jdkealy To echo what Ben said I highly recommend that you not go that route. @marshall and I have been discussing this “problem” with other clients and if you think you’d benefit maybe we can chat about solutions

jdkealy20:09:28

Yes, I'd like that... I'm fine with adding some attribute like archived:true/false or something like that... I'm really just concerned with the 10BN datoms I've been reading about. I think this kind of data I might want to use a different data store for.

marshall20:09:39

@jdkealy We’re happy to set up a call. Can you email me at [email protected]? Also, you could consider noHistory on some attributes if you don’t want/need any history tracking on them
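
For reference, a sketch of what a noHistory attribute might look like as schema data. The attribute name `:user/last-seen` is hypothetical, and the exact map shape depends on the Datomic version: older releases may also need an explicit `:db/id` and `:db.install/_attribute` entry, so treat this as an illustration, not a drop-in definition.

```clojure
;; Hypothetical attribute whose past values are not retained in history.
(def last-seen-attr
  {:db/ident       :user/last-seen
   :db/valueType   :db.type/instant
   :db/cardinality :db.cardinality/one
   :db/noHistory   true})

;; Schema is ordinary transaction data, so it would be transacted
;; as a vector of such maps.
(def schema-tx [last-seen-attr])
```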

jdkealy20:09:11

Cool, will do, marshall

eggsyntax21:09:34

I think I'm missing something obvious here: why is it that

'[:find ?id :where [?id :foo/bar 33]]
works (finds several ids), but
'[:find ?id :where [?id _ 33]]
returns an empty set? I could imagine that being disallowed by datomic, but if so, I'd expect an error rather than just getting no results.