#datomic
2019-05-23
Ivar Refsdal09:05:54

How fast/slow is datomic.api/as-of supposed to be? What affects its performance when d/pull is used? Without as-of (using the current database), my d/pull of 6K entities takes about 10 secs. With as-of, it takes about 3 minutes. Is this expected, or does it indicate something else is wrong (too little memory?)? Thanks

Joe Lane14:05:28

@ivar.refsdal Is this for On-Prem? I don’t have any experience with On-Prem specifically but I can imagine the performance here depends on several factors such as memory, caching, client vs peer, number of datoms, etc. Do you have any additional information you can provide?

Ivar Refsdal10:05:13

Thanks for replying! Yes, this is On-Prem and I'm using the peer library. In VisualVM I see that quite some time is being spent inside Fressian/readUTF8Chars (or something like that). Does that mean it is accessing the network? How would I count the total number of datoms? I did

(format "%,3d" (reduce (fn [cnt _] (inc cnt)) 0 (d/datoms (d/history db) :eavt)))
=> "37,605,542"
What is considered a large number of datoms? Edit: I'm doing a pull * on the as-of databases. Would it considerably improve performance if the pull pattern was narrowed? Thanks.
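
A minimal sketch of what narrowing the pull could look like, assuming the peer API; conn, eid, the date, and the attribute names are all illustrative:

(require '[datomic.api :as d])

;; '[*] walks every attribute of each entity, including any large string
;; values; an explicit pattern only reads the datoms it names.
(let [db (d/as-of (d/db conn) #inst "2019-05-01")]
  ;; instead of (d/pull db '[*] eid):
  (d/pull db [:user/name :user/email] eid))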

favila13:05:46

Fressian/readUTF8Chars is just decoding a string from a block of fressian-encoded values

favila13:05:17

do you maybe have any very large or numerous string values that are only seen in as-of?

favila13:05:22

otherwise I think this is a red herring

jeroenvandijk15:05:02

Thanks! Are there case studies for datomic ions out there?

benoit15:05:02

@marshall FYI it was not clear to me you had to press the spacebar to start the tutorial

Joe Lane15:05:11

@marshall The live tutorial is fantastic. Kudos

👍 4
benoit15:05:20

Ok I didn't see the controls at the bottom 🙂

Alex Miller (Clojure team)15:05:25

hopefully the first of many...

johnj17:05:40

We need this for solo! 😉 is there any reason the NLB can't be added?

Joe Lane17:05:05

@marshall I see in the latest ion-starter the clojure version was bumped from 1.9 to 1.10. Does that imply datomic cloud now supports clojure 1.10?

Joe Lane18:05:39

Great! Thanks.

joshkh17:05:31

is :db/fulltext supported in Datomic Cloud? i didn't see it in the schema reference docs, but i thought i'd ask just in case.

marshall17:05:42

No fulltext in Cloud

joshkh17:05:37

okay, thanks Marshall 🙂

joshkh17:05:27

i think i've seen some on-prem examples where some loop function was used to stream values from the transaction log. is something like that possible with Cloud?

ghadi17:05:42

yes you can do that with d/tx-range

ghadi17:05:54

which has slightly different arguments in cloud
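
For comparison, a sketch of the two call shapes (d is datomic.api on the peer side and datomic.client.api on the Cloud side; the :start value is illustrative):

;; On-Prem peer: a log object plus start/end t values (nil = open-ended)
(d/tx-range (d/log conn) 1000 nil)

;; Cloud client: the connection plus one arg-map; :limit -1 means "no limit"
(d/tx-range conn {:start 1000 :end nil :limit -1})
;; => each element is a map like {:t 1000, :data [datoms ...]}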

ghadi17:05:17

In fact, I am building a pump from the tx-log to ElasticSearch for full-text searching @joshkh

joshkh17:05:39

that's exactly what i'm trying to do!

Joe Lane17:05:52

That is strangely what I am also doing, but into lucene.

ghadi17:05:59

basically filter all the datoms where the value is a string

ghadi17:05:10

and index raw datoms into ElasticSearch

ghadi17:05:43

{"e": 42, "a": "whatever/attr.txt", "v": "full text"}

ghadi17:05:09

then when you get hits you have to know how an attribute relates to a larger entity
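
A sketch of that filter over one tx-range entry, assuming the Cloud client API; the helper name is made up and the ES call itself is elided:

(defn string-datoms->docs
  "Docs in the {\"e\" .. \"a\" .. \"v\" ..} shape above, one per string datom."
  [db {:keys [data]}]
  (for [datom data
        ;; retractions need separate handling -- see further down the thread
        :when (and (:added datom) (string? (:v datom)))]
    {:e (:e datom)
     ;; resolve the attribute's entity id to its ident, e.g. :whatever/attr
     :a (str (:db/ident (d/pull db [:db/ident] (:a datom))))
     :v (:v datom)}))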

joshkh17:05:16

how might that tx-range work efficiently? are you grabbing chunks and storing the latest :t somewhere? wouldn't you have to provide it with new starts and ends?

Joe Lane17:05:20

So you’re going with the approach of making the datom the document, instead of reifying the entity into a document?

ghadi17:05:34

yes you have to keep track of highwater marks @joshkh

ghadi17:05:44

i'm going to put them in a dynamo table
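
A sketch of that bookkeeping with Cognitect's aws-api; the table name, key names, and pump-id are all made up:

(require '[cognitect.aws.client.api :as aws])

(def ddb (aws/client {:api :dynamodb}))

(defn read-high-water
  "Last t this pump has indexed, or nil on first run."
  [pump-id]
  (some-> (aws/invoke ddb {:op :GetItem
                           :request {:TableName "es-pump-marks"
                                     :Key {"pump-id" {:S pump-id}}}})
          ;; aws-api keywordizes item attribute names in responses
          (get-in [:Item :t :N])
          Long/parseLong))

(defn write-high-water [pump-id t]
  (aws/invoke ddb {:op :PutItem
                   :request {:TableName "es-pump-marks"
                             :Item {"pump-id" {:S pump-id}
                                    "t"       {:N (str t)}}}}))

Each pump cycle would then read the mark, tx-range from just past it, index the hits, and write back the highest :t it saw.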

ghadi17:05:04

@joe.lane yeah the advantage of raw datom is that you never have to change your indexing code

joshkh17:05:33

brilliant. thanks @ghadi

ghadi17:05:36

if you were making full documents in ES, you'd have to rematerialize and re-index every time the needs changed

ghadi17:05:49

it's a Rich design, I take no credit

ghadi17:05:48

when the needs change, you can change the query side that knows how to turn "leaf" hits into the larger entity
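
In that design the re-materialization is just a pull at query time; a sketch with an illustrative pattern, where hits are ES results in the datom-shaped form above:

;; each ES hit carries the entity id, so the "larger entity" is one pull away
(defn hits->entities [db hits]
  (map #(d/pull db [:article/title :article/body] (:e %)) hits))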

ghadi17:05:11

it might not work for all needs, but this is what I'm trying first

joshkh17:05:57

i was working on a little trick to rematerialize entities but without references or components (which can be massive). worked out pretty well when i was manually moving entities to elasticsearch.

ghadi17:05:05

you have to handle retractions too, beware

ghadi17:05:19

[e a v t false] -> means delete the document in ES

ghadi17:05:02

and cardinality many attrs might make several documents with the same e + a
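
One way to square both of those with ES, as a sketch (the id scheme is made up): give each datom a deterministic document id, so an assertion and its later retraction address the same document, and each value of a cardinality-many attribute gets its own:

(defn doc-id [e a v]
  ;; the same [e a v] always yields the same _id; distinct values of a
  ;; cardinality-many attribute yield distinct _ids
  (str e "|" a "|" (hash v)))

(defn datom->es-op [datom]
  (if (:added datom)
    [:index  (doc-id (:e datom) (:a datom) (:v datom))]
    [:delete (doc-id (:e datom) (:a datom) (:v datom))]))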

joshkh18:05:56

i was imagining storing the high water mark in datomic but realised that would cause a loop 😉

ghadi18:05:24

it's possible, but yeah i'd put it elsewhere

ghadi18:05:58

note that S3 doesn't guarantee read-your-writes when you update the same object

ghadi18:05:20

so Dynamo > S3 for this 🙂

souenzzo20:05:52

New datomic video-tutorial is awesome! thanks!
