#datomic
2022-10-25
frankitox12:10:46

Hi! Is there any caching going on in peers? Is there any way to force that caching? I'm having some problems with new deployments. When I deploy, I initialize around 15 new machines that connect to the database, and the first responses from the webserver take a really long time.

frankitox12:10:55

Charting these responses correlates with the 'Read usage' metric in DynamoDB, so I'm thinking the problem might be that each new server requests data from the transactor, which then pulls from Dynamo.

enn17:10:15

nit: the peer is not reading from the transactor, it's reading directly from Dynamo. In addition to adding another caching layer like memcached or Valcache, if you know the data that the peer is going to need to serve web requests, you can try to force it to be loaded into the object cache by reading that data in your app before it starts handling traffic.
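
A minimal warm-up sketch along those lines (assuming the peer library datomic.api; `hot-entity-ids` is a hypothetical seq of entity ids the first requests will hit):

```clojure
(require '[datomic.api :as d])

(defn warm-object-cache!
  "Read the hot data up front so the index segments backing it are
  pulled from storage into the peer's object cache before the server
  starts handling traffic."
  [conn hot-entity-ids]
  (let [db (d/db conn)]
    (doseq [eid hot-entity-ids]
      ;; d/touch realizes every attribute of the entity, forcing the
      ;; relevant segments to load.
      (d/touch (d/entity db eid)))))
```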

frankitox19:10:43

@U060QM7AA so peers may read straight from Dynamo? Yes, that seems like the most sensible idea. Something like preloading the database may help.

enn19:10:55

That’s my understanding. Writes are centralized through the transactor but each peer can read from storage.

jdkealy18:10:20

what would happen if someone manually edited a record in the DynamoDB table?

ghadi18:10:15

that's a bad kitty

jdkealy18:10:31

would that require an entire database restore?

ghadi18:10:28

potentially

Joe Lane18:10:02

did this actually happen? if so, contact support.

jdkealy19:10:01

no. I just saw someone clicking around and inspecting records in the AWS Dynamo console, and I noticed the update button, and it dawned on me… “it would be really easy to make my day horrible”

onetom19:10:55

that's a good point... i think we are also overly liberal with our AWS access permissions...

Dustin Getz20:10:06

For the entity API, is the perf cost of not maximizing reuse of the entity ref an issue? By which I mean only locally saving the ref returned by (d/entity db e), but in larger scopes passing around scalar ids and re-establishing the entity ref?

Dustin Getz20:10:00

I understand that the entity essentially memoizes its access to (d/datoms db :eavt e), but at what scale does this start to matter, especially with the SSD cache 1ms away?
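
For concreteness, a sketch of the two patterns being compared (attribute names are made up):

```clojure
(require '[datomic.api :as d])

;; Pattern A: establish the entity ref once and reuse it.
(defn summary-a [db eid]
  (let [e (d/entity db eid)]
    ;; the repeat read of the same attribute is memoized on `e`
    [(:order/status e) (:order/status e)]))

;; Pattern B: pass the scalar id around and re-establish the ref at
;; each use site; every fresh entity starts with an empty attribute
;; cache, so nothing is memoized across reads.
(defn summary-b [db eid]
  [(:order/status (d/entity db eid))
   (:order/status (d/entity db eid))])
```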

favila20:10:45

It really only pays off if you repeat reads to the same attributes, because that guarantees you never even go to the object cache. If you're always reading new attributes, there's no difference.

favila20:10:22

d/touch forces all attributes to be read if you want to do that eagerly
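
i.e., assuming `db` and `eid` are in scope:

```clojure
;; Eagerly realize every attribute of the entity.
(d/touch (d/entity db eid))
```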

Dustin Getz20:10:45

do you know what the cost difference is between the object cache and SSD?

favila21:10:09

It’s the difference between pure B+tree pointer chasing (assuming every pointer is loaded) vs. IO scheduling and decoding Fressian into objects on a miss

👍 1
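
A rough way to see that difference from the REPL (`:some/attr` and `eid` are placeholders):

```clojure
(let [e (d/entity db eid)]
  (time (:some/attr e))   ; cold: possible IO + Fressian decoding
  (time (:some/attr e)))  ; warm: memoized on the entity ref
```
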
Dustin Getz21:10:19

And is it true that for “large” dbs the object cache might be entirely evicted and rebuilt on a request-to-request basis, because the object cache is significantly smaller than the actual indexes?

favila21:10:03

if your entire workload does not fit in OC, something will get evicted

favila21:10:29

also new indexes create new segments, and are an inherent source of eviction

👍 1
Dustin Getz21:10:51

is it common for the workload to not fit in the OC in a prod configuration (thus permitting eng to partition the query load across multiple query boxes, if that is a thing people do to make Datomic fast)?

favila21:10:26

You’re asking if people use big or small dbs with Datomic. 🤷

👍 1
favila21:10:40

I know that for us, we have a 17+ billion datom db, and routing request loads for locality became essential for performance at < 4 billion datoms
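
A hedged sketch of what routing for locality can look like at the application layer (simple hash-based placement; real deployments often do this at the load balancer instead):

```clojure
;; Send all requests for the same key (e.g. a tenant id) to the same
;; peer, so that peer's object cache stays hot for that key's data.
(defn peer-for [peers routing-key]
  (nth peers (mod (hash routing-key) (count peers))))

;; (peer-for ["peer-a" "peer-b" "peer-c"] "tenant-42")
;; => the same peer every time for "tenant-42"
```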

Dustin Getz21:10:35

and is it even possible to make it so the OC is mostly not thrashing on every request?

favila21:10:36

but we have a lot of dumb select *-like workloads which probably thrash the cache horribly anyway, and we have new indexes every 10-15 minutes

👍 1
favila21:10:01

yes it is, certainly, by controlling your read workload and locality

👍 1
favila21:10:28

watch your object-cache hit rate

👍 1
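
For on-prem peers, one way to watch it is a custom metrics callback (a sketch only; it assumes the `:ObjectCache` entry reports hits as `:sum` over `:count`, per the Datomic monitoring docs):

```clojure
(ns my.metrics) ; hypothetical namespace

;; Enable with the JVM option:
;;   -Ddatomic.metricsCallback=my.metrics/callback
(defn callback [metrics]
  ;; Assumption: sum/count of the :ObjectCache metric approximates
  ;; the hit ratio over the reporting interval.
  (when-let [{:keys [sum count]} (:ObjectCache metrics)]
    (println "object-cache hit ratio:" (double (/ sum (max count 1))))))
```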