#datomic
2017-04-24
maleghast08:04:49

OK, so I have a weird one... (I am new to Datomic and may just be making a fool of myself, but hey-ho). I created a Datomic in-memory DB from a vector of 23 maps, having defined the following schema:

(def meteorological-observation-stations-schema
  [{:db/ident :station/id
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/doc "The Unique Identifier for the monitoring station"}
   {:db/ident :station/name
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/doc "Long name for the monitoring station"}
   {:db/ident :station/elevation
    :db/valueType :db.type/double
    :db/cardinality :db.cardinality/one
    :db/doc "The Elevation at which the monitoring station is placed in metres."}
   {:db/ident :station/latitude
    :db/valueType :db.type/double
    :db/cardinality :db.cardinality/one
    :db/doc "Latitude for the monitoring station"}
   {:db/ident :station/longitude
    :db/valueType :db.type/double
    :db/cardinality :db.cardinality/one
    :db/doc "Longitude for the monitoring station"}])
but, as far as I can tell, the 23 maps in that ^^ format have created 161 nodes. I wrote a simple Datalog query to check that the nodes had been created correctly, and it comes back with 161 results, almost none of which are correct 😞

maleghast08:04:43

I've visually checked my input (CIDER + Emacs ctl+x-e) so I know that I am putting the correct data into the d/transact...
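(For reference, a minimal sketch of the kind of setup being described, assuming the Peer API; the in-memory database name and the station-data var are illustrative, not taken from the conversation:)

(require '[datomic.api :as d])

;; Illustrative in-memory database name
(def uri "datomic:mem://stations")

(d/create-database uri)
(def conn (d/connect uri))

;; Transact the schema first, then the vector of 23 station maps
@(d/transact conn meteorological-observation-stations-schema)
@(d/transact conn station-data) ; station-data: hypothetical var holding the 23 maps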

maleghast08:04:49

*confused*

dominicm08:04:25

@maleghast when you say "nodes" do you mean "entities"? What query did you use?

maleghast08:04:09

Hold on, will paste:

(def stations-query '[:find ?station-name ?elevation
                      :where [_ :station/name ?station-name]
                             [_ :station/elevation ?elevation]
                             [(> ?elevation 200)]])

maleghast08:04:22

and yes, I mean entities, with attributes

maleghast08:04:57

I did wonder if the schema was creating an entity for each definition, but that would be 115, not 161 - 161 is (* 23 7)

maleghast09:04:14

What I am getting back is 7 "answers" per station, and 6 of the values for elevation are wrong.

maleghast09:04:28

Like, they are numbers that I have not put into the DB at all

maleghast09:04:11

I am fairly certain that I am simply missing something about how to write the query, tbh, but after a day's worth of banging my head against it I thought that I ought to just ask... 😉

maleghast09:04:49

Here's the data I am pushing in after I've added / created the schema: https://pastebin.com/0QCLBj9b (Side-note, when did refheap.com stop working..?)

kirill.salykin10:04:15

:where [_ :station/name ?station-name]
       [_ :station/elevation ?elevation]
you want names for stations with elevations > 200, is this correct? I think you should join them like this:
:where [?id :station/name ?station-name]
       [?id :station/elevation ?elevation]

maleghast10:04:11

@kirill.salykin - OK, thanks, I will try that.

dominicm10:04:48

@maleghast To expand on what @kirill.salykin said: I think you want to make sure that the name and the elevation come from the same entity. If you don't, every name binding pairs with every elevation binding that passes the predicate, so you get a cross product, which is where the 161 (23 × 7) results come from.

dominicm10:04:18

Sorry, I worded that poorly, hopefully my meaning comes across 😛

maleghast10:04:51

@kirill.salykin - That works, but I need to understand how to write a query that does the above but also only returns the entities with an elevation greater than 200

maleghast10:04:11

@dominicm - Yeah, I think that I see what you mean

dominicm10:04:13

That's the query that @kirill.salykin suggested I think

dominicm10:04:46

You can include your [(> ?elevation 200)] still

maleghast10:04:10

Nope, if I do include it I get an error about not being able to resolve the symbol ?elevation

maleghast10:04:19

Hold on I will put it back in and get the actual error...

maleghast10:04:25

Nope, I was being an idiot - forgot to put the s-expression for the predicate inside a vector. *facepalm*
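(Putting the pieces together, a sketch of the corrected query, combining @kirill.salykin's join with the predicate clause:)

(def stations-query '[:find ?station-name ?elevation
                      :where [?id :station/name ?station-name]
                             [?id :station/elevation ?elevation]
                             [(> ?elevation 200)]])

;; e.g. (d/q stations-query (d/db conn))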

maleghast10:04:58

Thanks both 🙂

maleghast10:04:52

Do either of you have a strong recommendation for learning Datalog - I found the Cognitect docs / tutorial to be less than optimal... 😉

kirill.salykin10:04:17

this is pretty good

maleghast10:04:52

@kirill.salykin - Thanks very much; will have a look now(ish)

dominicm14:04:09

http://docs.datomic.com/best-practices.html#sec-14 I think the pre-processor got confused here. Not sure who to ping about it? 😄

jaret14:04:47

The link is broken

jaret14:04:50

I can fix it

jaret14:04:42

Thanks @dominicm for the report

kschrader14:04:49

Can anyone from Cognitect comment on the following line: "Note that large caches can cause GC issues, so there is a tradeoff here." (from http://docs.datomic.com/capacity.html)

kschrader14:04:13

we’re thinking about trying a big Object Cache to keep most of our data locally on our peers

kschrader14:04:57

but I’m wondering what that’s going to do as far as adding a lot of GC overhead

kschrader14:04:28

we can profile this ourselves, I’m just wondering if there’s anything obvious that I should be thinking about here

marshall15:04:22

One thing to consider is you could also use some proportion of the box memory for a local memcached instance

kschrader15:04:30

right now we have 8GB allocated, and 4GB goes to the ObjectCache

kschrader15:04:11

so we were just thinking about making it ⬆️
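(For context, a sketch of where the object cache is sized on each side; the values are illustrative, and the property names are the ones described in the capacity docs:)

;; Peer: JVM system property, e.g. -Ddatomic.objectCacheMax=8g
;; (when unset, the peer defaults it to a proportion of max heap)
;; Transactor: object-cache-max in the transactor properties file, e.g.
;;   object-cache-max=4g

;; From a running peer REPL you can check what is currently set:
(System/getProperty "datomic.objectCacheMax")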

marshall15:04:00

well, I definitely know folks running 16G in prod. You just want to make sure to keep an eye on your system and ensure you’re not getting into GC hell

marshall15:04:20

how much “extra” ram do you have?

kschrader15:04:27

we run on EC2, so hypothetically I have as much extra ram as I want 🙂

kschrader15:04:55

@marshall do you mean a 16G object cache in production?

kschrader15:04:24

and how would I configure a local memcached instance? 🤔

souenzzo15:04:31

(d/transact conn [[:db.fn/retractEntity (d/tempid :db.part/user)]]) doesn't throw an error... Can I use it?

marshall15:04:45

@kschrader i meant a 16g heap, half of it ocache. You can run memcached on the same instance

marshall15:04:57

just install and launch it with whatever excess memory you have

marshall15:04:11

configure both your peer and your txor to use it

marshall15:04:34

since it’s local to the peer you’ll get really great read throughput and latency

marshall15:04:58

for example, I’m running a test right now on an m4.2xlarge, which has 32G of memory (IIRC); My jvm has an 8g heap and i’ve got a 22GB memcached instance running on it

marshall15:04:18

then i configure the memcached endpoint in both the peer and transactor to that instance address & the memcached port
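(A sketch of that wiring; the host and port are illustrative, and the property names are the documented memcached settings:)

;; Transactor properties file:
;;   memcached=127.0.0.1:11211
;; Peer JVM:
;;   -Ddatomic.memcachedServers=127.0.0.1:11211

;; The peer-side property can also be set programmatically, assuming it is
;; done before the first connection is made:
(System/setProperty "datomic.memcachedServers" "127.0.0.1:11211")
(def conn (d/connect db-uri)) ; db-uri: your database URI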

kschrader15:04:57

I see, that’s an interesting idea

kschrader16:04:11

so the transactor would be configured to point at several memcached instances?

marshall16:04:34

if you have a separate system-wide memcached instance

kschrader16:04:36

the problem is that our peer boxes scale up and down during the day

kschrader16:04:00

and I don’t think that there’s a dynamic way to update the memcached config on the transactor

marshall16:04:19

no, there isnt

marshall16:04:29

you could still do it

kschrader16:04:55

we have a memcached cluster though, I don’t think that that’s our problem

marshall16:04:57

you just wouldn’t get the txor pushing new segments to your peer local memcached instance

kschrader16:04:11

I think that our problem is churning segments during some of our queries

kschrader16:04:22

which seems to be a high-overhead activity

kschrader16:04:32

when I profile locally in a memory constrained environment I see a lot of time spent in org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse and com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText

kschrader16:04:58

if I increase the memory allocation then those hotspots go away

kschrader16:04:22

and once the cache is warmed the response time is about 40x faster for the queries that I’m profiling with

kschrader16:04:13

(which obviously isn’t the same load as our production infrastructure, but it’s clearly something)

kschrader16:04:50

we also see a bunch of time spent in java.io.BufferedInputStream.read, java.io.DataInputStream.readFully, and java.io.DataInputStream.readInt

kschrader16:04:28

which all feels like cache churn to me, but I could be misunderstanding something

robert-stuttaford16:04:23

@marshall @jaret any way we can configure a longer timeout for S3 restores?

Copied 0 segments, skipped 128128 segments.
Copied 0 segments, skipped 128128 segments.
Copied 0 segments, skipped 128128 segments.
java.util.concurrent.ExecutionException: java.net.SocketTimeoutException: Read timed out

robert-stuttaford16:04:36

these are becoming tiresome to retry-a-thon our way through

robert-stuttaford16:04:52

this is when we restore to a dev machine

kschrader16:04:26

☝️ we also have this problem on a regular basis

marshall16:04:25

@kschrader that may or may not be cache ‘churn’ - it is potentially just the cost of reading a ton of data

marshall16:04:40

which may also be alleviated by local memcached

marshall16:04:03

@robert-stuttaford and @kschrader I don’t believe that is configurable currently - I’d suggest adding it as a feature request

marshall16:04:08

i will also pass it along

kschrader16:04:26

ok, but if I increase the heap size then the problem goes away

marshall16:04:00

how do your object cache hit rate metrics look?

marshall16:04:11

in the prod env

marshall16:04:55

unlikely to be churn then. Is it possible you’re under memory pressure from your app?

kschrader16:04:07

that’s also a possibility

marshall16:04:20

there’s nothing horrible about running a 12 or 16 GB heap

marshall16:04:27

just make sure you keep an eye on it at first

kschrader16:04:57

I think that we’ll try to bring up another cluster with 16GB heaps and see what happens

marshall16:04:09

and have a reasonably scaled compute with it - you don’t want to have a huge heap with, e.g., a 1 or 2 core processor

marshall16:04:17

you can get in a situation where G1 can’t keep up

marshall16:04:24

without more compute

kschrader16:04:28

yeah, we’d move to 8 CPUs along with it
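(A minimal way to keep an eye on GC from a peer REPL while trialling the bigger heap; this uses the standard JMX beans and is not Datomic-specific:)

(import 'java.lang.management.ManagementFactory)

;; Print cumulative collection counts and time for each collector (e.g. G1)
(doseq [gc (ManagementFactory/getGarbageCollectorMXBeans)]
  (println (.getName gc)
           "collections:" (.getCollectionCount gc)
           "time-ms:" (.getCollectionTime gc)))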

robert-stuttaford17:04:04

@marshall i hope you give substantial weight to each vote on http://Receptive.io because each one counts for an organisation which represents many people 🙂

jaret17:04:51

@robert-stuttaford We are absolutely considering organizational weight and power users when looking at our feature request feedback

erichmond21:04:21

this isn’t technically datomic support, I know, but can we ask questions here?