#datomic
2017-08-15
kennethkalmer09:08:57

I’m curious how folks are dealing with slow queries, more specifically slow queries that eagerly load a ton of results for further processing. I’ve done as much as I can to speed up queries (liberally adding indexes, isolating huge swaths of data in separate partitions to help the peer load fewer segments), and now I’m starting to hit limits due to the size of the data…

kennethkalmer09:08:02

My next experiment is to try and simplify the queries to only give me a starting point, and then try various combinations of core.async and transducers to walk the graph and see if that speeds things up

kennethkalmer09:08:23

Just curious if anyone else has gone down this path and has any advice to offer

val_waeselynck09:08:11

I've had the same problem, my approach has been to offload the work to ElasticSearch. I think Datomic is just not well suited for low-latency analytical queries that span a lot of data; fortunately, thanks to the txReportQueue and the Log API, it's very well suited to be a source for derived data systems.
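
As a rough illustration of the "source for derived data" idea, here is a minimal sketch using the peer API's `d/tx-report-queue`. It assumes the Datomic peer library is on the classpath; the in-memory URI, the `derived-index` atom (a stand-in for something like ElasticSearch) and `index-report!` are hypothetical names, not anything from this conversation.

```clojure
(require '[datomic.api :as d])
(import '(java.util.concurrent TimeUnit))

;; Assumption: an in-memory database is enough for the sketch.
(def uri "datomic:mem://derived-data-sketch")
(d/create-database uri)
(def conn (d/connect uri))

;; Hypothetical stand-in for an external index such as ElasticSearch.
(def derived-index (atom []))

(defn index-report!
  "Copy the [e a v tx added] tuples of one transaction report into the stand-in index."
  [report]
  (swap! derived-index into
         (map (fn [datom]
                [(:e datom) (:a datom) (:v datom) (:tx datom) (:added datom)]))
         (:tx-data report)))

;; The queue receives one report per transaction, with the same keys
;; as the result of d/transact (:db-before, :db-after, :tx-data, ...).
(def queue (d/tx-report-queue conn))

@(d/transact conn [{:db/doc "hello"}])

;; Poll with a timeout so the sketch cannot block forever.
(when-let [report (.poll queue 5 TimeUnit/SECONDS)]
  (index-report! report))
```

A real consumer would loop on the queue in its own thread, and would likely use the Log API to catch up on transactions it missed while offline.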

val_waeselynck09:08:59

Also note that the current Datalog engine is not completely immune to the N+1 problem; I've observed that running a Datalog query which only needs one index access is still 100x slower than using the raw index API - as if there was some startup time associated with the Datalog engine

val_waeselynck09:08:07

Of course I encourage you to profile and draw your own conclusions

kennethkalmer09:08:40

thanks, these are very useful insights! I’m already working off derived data (source data is also in datomic but in different partitions), but there are just some conditions that datalog seems to fall flat under

kennethkalmer09:08:18

I’m also trying to keep the stack very flat and simple, it is not a huge app, just a lot of varied data (but not “big data” either)

robert-stuttaford09:08:05

you should use Datalog when you need to model joins. if you know you’ll only be using one index, you can almost certainly do it faster with d/datoms, because Datalog will always produce two result sets: one for the clause, and one for the find expressions. reductions over d/datoms produce only one.
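
To make the comparison concrete, here is a minimal sketch of the same single-index aggregation done both ways. It assumes the Datomic peer library is on the classpath; the in-memory URI, the `:item/score` attribute and both function names are hypothetical, and the sketch says nothing about actual timings, which you would need to profile yourself.

```clojure
(require '[datomic.api :as d])

;; Assumption: an in-memory database with one hypothetical attribute.
(def uri "datomic:mem://datoms-vs-query")
(d/create-database uri)
(def conn (d/connect uri))

@(d/transact conn [{:db/ident       :item/score
                    :db/valueType   :db.type/long
                    :db/cardinality :db.cardinality/one}])

;; 100 entities with scores 1..100.
@(d/transact conn (mapv (fn [n] {:item/score n}) (range 1 101)))

(def db (d/db conn))

(defn total-via-query
  "Datalog version: the query materialises a set of [?e ?s] tuples,
  which is then reduced in a second pass."
  [db]
  (transduce (map second) + 0
             (d/q '[:find ?e ?s
                    :where [?e :item/score ?s]]
                  db)))

(defn total-via-datoms
  "Raw index version: a single reduction straight over the AEVT index."
  [db]
  (transduce (map :v) + 0
             (d/datoms db :aevt :item/score)))
```

Both functions should return the same total for the same `db`; the difference is only in how many intermediate collections get built along the way.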

kennethkalmer10:08:35

Thanks Rob, I’ll have a look. I’m really sure I can do it with one index, if not, it won’t be a big jump to restructure the derivatives to make this possible

kennethkalmer10:08:11

ZOMG Rob! You just opened my eyes to something amazing

ibarrick13:08:43

Is there a way to add :db/doc to a transaction using the client API?

favila13:08:21

{:db/id "datomic.tx" :db/doc "my doc"}

ibarrick13:08:15

Could you point me to a place I can go to understand that?

favila13:08:08

>the temporary id "datomic.tx" always identifies the current transaction
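
For context, a minimal sketch of how that map might sit alongside the rest of the transaction data. The `annotate-tx` helper, the `"new-user"` tempid and the `:user/name` attribute are hypothetical, and the commented-out `transact` call assumes a client connection named `conn` with the client API aliased as `d`.

```clojure
;; Hypothetical helper: prepend a map about the transaction entity itself.
;; "datomic.tx" is the tempid that resolves to the current transaction.
(defn annotate-tx
  [tx-data doc]
  (into [{:db/id "datomic.tx" :db/doc doc}] tx-data))

(def tx-data
  (annotate-tx [{:db/id "new-user" :user/name "alice"}] ; :user/name is hypothetical
               "my doc"))

;; With the client API, the vector would be passed under :tx-data, roughly:
;; (d/transact conn {:tx-data tx-data})
```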

ibarrick13:08:25

Perfect, that's exactly what I needed. Thank you!

ibarrick20:08:03

I'm getting: "The following forms do not name predicates or fns: (tx-ids)". Am I not able to use the Log API helper functions from the Client API?