#datomic
2017-03-09
marshall14:03:23

@csm do you have a small repro case? Also, what version of datomic?

dominicm15:03:52

I'm doing some profiling, and I'm seeing some threads: query-1, query-2, query-3, query-4. They're holding the bulk of our allocations right now. I wanted to know whether they're from Datomic and what they do, and whether I should be concerned about their usage.

dominicm15:03:29

They aren't actually growing, so it may be caches or something. But thought I'd ask.

favila15:03:25

@dominicm I believe they are the threads that actually execute queries

favila15:03:16

I always see them light up (and no other threads) during a long-running query

dominicm15:03:41

I see… Particularly for d/q or for all queries e.g. d/datoms? @favila

favila15:03:50

there are no other queries

favila15:03:57

or am I missing something?

dominicm15:03:05

Sorry, do they get used for d/datoms do you know?

favila15:03:12

I believe not

favila15:03:50

d/datoms has hardly any cpu impact

favila15:03:54

so I can't be sure

dominicm15:03:27

No, I thought it shouldn't.

dominicm15:03:38

Bit surprised at what would be querying right now is all.

dominicm15:03:44

Anything big should have been converted to d/datoms already

favila15:03:39

I think you get a number of them equal to the number of cores

favila15:03:33

and a single query will run on many of those threads at once
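
One way to check the "one thread per core" guess from a REPL inside the peer process is to list live JVM threads by name prefix and compare the count with the core count. A minimal sketch, assuming plain Clojure (1.8+) on the JVM; `threads-named` and `cpu-count` are hypothetical helper names, and the `query-` prefix is only taken from the thread names reported earlier in this conversation. It lists threads by name and says nothing about who created them or what they allocate.

```clojure
(require '[clojure.string :as str])

(defn threads-named
  "Returns a sorted vector of the names of live JVM threads
  whose name starts with `prefix`."
  [prefix]
  (->> (Thread/getAllStackTraces)        ; map of live Thread -> stack trace
       keys
       (map #(.getName ^Thread %))
       (filter #(str/starts-with? % prefix))
       sort
       vec))

(defn cpu-count
  "Number of processors the JVM reports as available."
  []
  (.availableProcessors (Runtime/getRuntime)))
```

Evaluating `(count (threads-named "query-"))` next to `(cpu-count)` in the running process would show whether the two numbers match on a given machine.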

dominicm15:03:01

Hmm, I'm only seeing a single query thread light up allocations at once. Also, all 4 threads (before my job started) were sat at ~25% usage of the memory.

dominicm15:03:09

Now they're at 20% (big job running)

favila15:03:37

maybe there's no parallelism to exploit?

dominicm15:03:41

The big job isn't supposed to allocate much (lazy sequence), so I suspect we're doing something wrong there.

dominicm15:03:51

Ah, only when parallelism can happen. That makes sense. 🙂

dominicm15:03:33

I kinda thought 99% of datomic was parallelisable.

favila15:03:56

only large intermediate sets benefit

favila15:03:26

if a clause is very selective, there isn't anything to parallelize over
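
The point about selective clauses can be illustrated with a toy model. This is not Datomic's query engine, just a sketch of chunked data parallelism over a fixed thread pool, assuming plain Clojure on the JVM. `chunked-filter` and the chunk size are hypothetical. A large set of intermediate rows splits into many tasks that can run on several threads, while a tiny set yields a single task, so only one thread has work to do.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

(defn chunked-filter
  "Splits `rows` into chunks of `chunk-size`, filters each chunk with `pred`
  as a separate task on `pool`, and returns the task count plus the
  surviving rows in their original order."
  [^ExecutorService pool chunk-size pred rows]
  (let [tasks   (mapv (fn [chunk] (fn [] (filterv pred chunk))) ; Clojure fns are Callable
                      (partition-all chunk-size rows))
        futures (.invokeAll pool tasks)]  ; blocks until every task has finished
    {:chunks (count tasks)
     :rows   (into [] (mapcat #(.get ^Future %)) futures)}))
```

With a pool from `(Executors/newFixedThreadPool 4)` and a chunk size of 1000, `(range 10000)` produces 10 tasks, while `(range 3)` produces just 1, so extra threads stay idle. Call `.shutdown` on the pool when done so its threads do not keep the JVM alive.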

dominicm15:03:22

Makes sense. I think we're doing enrichment of data from other entities, inferring certain properties. I'm not overly familiar with the intimate details of the queries we run over these datoms, though.

devth17:03:02

watching google next keynote live stream – spanner looks really cool. i know it's been discussed before, but my next thought was immediately "i wonder if this would work as a datomic backend"

csm18:03:13

@marshall I don’t have a repro case yet (this was provoked by two µ-services, one writing, one reading) though I would like to put one together. This is with 0.9.5561.

csm18:03:25

I actually tried (dissoc (d/db conn) :t :next-t), and at least couldn’t reproduce the issue

shaun-mahood18:03:16

Has anyone integrated Datomic as part of a project that includes a separate GIS system? I'm going to be building a PostGIS-backed system and I want to see if I can use Datomic as well for as much of the non-spatial data as possible, but I'm not sure exactly where I should be putting those boundaries, or whether there are any real-life issues that I'll run into that I haven't considered yet.

Lambda/Sierra20:03:33

@shaun-mahood I don't know about GIS specifically, but I do know people have used Datomic alongside other storage/indexing systems.

Lambda/Sierra20:03:29

The general rule is to store things in the system that's optimized for storing that kind of data: small, relational, transactional values in Datomic; geo-spatial data in a geo-spatial system; binary blobs in a blob store; etc.

shaun-mahood23:03:33

@stuartsierra: Thanks, that's kind of what I'm leaning towards - good to know there is some success out there for other areas and it's not just a plain old bad idea. In unrelated news, the Lambda Island video on component finally got me over the hump to start using it 🙂