#datomic
2020-07-13
kschltz12:07:17

Hi there. I have the following predicament: we have a production cloud setup with over 3B datoms, and I need to "update" the whole base without interfering with, or at least causing as little impact as possible to, ongoing operation. I need to transform something like:

{:ns.some-field "foo"
 :ns.time {:zone "America/Sao Paulo"
           :utc  #inst "2020-01"}}
into:
{:ns.some-field "foo"
 :ns.utc-time   #inst "2020-01"}

kschltz12:07:36

I was considering deploying a transaction function to do that, but I wasn't sure it wouldn't interfere with my transactor. Any advice is welcome

favila12:07:53

1. make normal operation use old and new style; 2. backfill the old style; 3. make normal operation use only new style; 4. retract the old style (optional)
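Step 1 above can be sketched as a dual-write: every transaction asserts both representations so readers of either style keep working during the backfill. Attribute names are from the example above, and modeling :ns.time as a component ref here is an assumption:

```clojure
;; Dual-write sketch for the migration window: assert the new flat
;; attribute alongside the old nested one in the same transaction.
(def tx-both-styles
  [{:db/id         "new-entity"
    :ns.some-field "foo"
    :ns.utc-time   #inst "2020-01"             ; new style
    :ns.time       {:zone "America/Sao Paulo"  ; old style, still written
                    :utc  #inst "2020-01"}}])
```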

favila12:07:23

if you can insert an abstraction layer between the application and the database, you could probably produce your new style on-the-fly from the old style. If so, you can cut or simplify some of these steps

kschltz12:07:14

We're already on step 1

kschltz12:07:39

Now we're trying to figure out what would be the best approach to accomplish the backfill

favila12:07:59

best approach in what sense?

kschltz12:07:54

we tried to walk through the indexes, retrieve the old style, then transact it back, doing what we needed via :db/cas

kschltz12:07:29

but it proved harmful, since it took a toll on our write throughput

favila12:07:51

how large were your transactions?

kschltz12:07:10

500 swaps each

favila12:07:31

did you retract also?

favila12:07:55

so about 500 datoms per tx? the rule of thumb is ~1000, that may help. you can also avoid pipelining if you are worried about working too fast
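The batching advice above can be sketched with the client API (the setup in question is Datomic Cloud). `entity->tx-data` is a hypothetical function producing the :db/cas (or plain assert-and-retract) ops for one entity; batch sizes target roughly 1000 datoms per transaction, and client-API transact is synchronous, so this loop is naturally unpipelined:

```clojure
(require '[datomic.client.api :as d])

;; Hedged backfill sketch: migrate entities in serial batches.
(defn backfill! [conn entity-ids entity->tx-data]
  (doseq [batch (partition-all 500 entity-ids)] ; ~2 ops/entity => ~1000 datoms/tx
    (d/transact conn {:tx-data (into [] (mapcat entity->tx-data) batch)})))
```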

favila12:07:25

What is your estimate of how many datoms you need to commit?

kschltz12:07:24

the best would be to convert the whole base

kschltz12:07:40

which is over 3B datoms

favila12:07:06

you have no other datoms than these?

kschltz12:07:24

in this particular database, no

favila12:07:04

well, then writing 3b additional datoms is unavoidable

favila12:07:23

plan your write capacity accordingly?

favila12:07:50

if this database is really this simple, perhaps you could copy it to a new database?

favila12:07:57

then switch the application over

kschltz12:07:47

I really liked the idea of copying the database

kschltz12:07:55

seems to be the most practical one

favila12:07:19

do you care about transaction history?

favila12:07:29

i.e. preserving transaction times and metadata

kschltz12:07:00

for a portion of it, yes

kschltz12:07:13

I'll discuss it with our team

favila12:07:32

ok, this is fraught with edge cases, but you can do something called “decanting”

favila13:07:13

you read the transaction log and transform it before transacting each transaction to a new db
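The decanting loop described above might look roughly like this with the client API. `rewrite-datom` is a hypothetical transform from an old-style datom to new-style tx data; a real decant must also remap entity and tx ids and handle schema transactions, which this sketch glosses over:

```clojure
(require '[datomic.client.api :as d])

;; Decant sketch: walk the source tx log in order, rewrite each
;; transaction's datoms, and replay them into a fresh database.
(defn decant! [source-conn target-conn rewrite-datom]
  (doseq [{:keys [data]} (d/tx-range source-conn {})] ; whole log
    (let [tx-data (into [] (keep rewrite-datom) data)]
      (when (seq tx-data)
        (d/transact target-conn {:tx-data tx-data})))))
```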

favila13:07:22

think of it as a git-rebase

kschltz13:07:04

I was just thinking about it

favila13:07:15

the advantage is only that you avoid having a single db with old+new style sum of datoms

favila13:07:53

you could also perform most of it offline (writing into a dev database if you are on-prem), then backup+restore to get it into production

favila13:07:07

you will need some catchup and downtime to switch the application over, though

kschltz13:07:19

that's manageable

favila13:07:39

if you can afford it, it’s probably better to just go the way you are going now, honestly

favila13:07:07

I assume you’re using dynamodb with provisioned write capacity?

favila13:07:31

your call, but bumping it up for a few days will probably be cheaper than the engineer time to get a decant running smoothly

kschltz13:07:33

Well, I think we have a few paths to consider now

kschltz13:07:44

I much appreciate your insights

kschltz13:07:46

thanks a lot

favila13:07:06

glad to help

Lone Ranger22:07:24

finally have #datomic deployed in production. HUGE shout-out to @favila for a tremendous amount of help along the way!! :partywombat: thanks so much to @marshall for getting us hooked up with the license 🙂

kenny23:07:33

Composite tuple attrs don't support reverse lookups, right?

kenny15:07:56

It seems like a useful addition 🙂 I'd like to express "all entities in a card many attribute must be unique by a particular other attribute."

favila16:07:26

could you add that invariant to the attribute itself and ensure it with an attribute predicate?

favila16:07:43

“the attribute itself” = a normal cardinality-many ref

kenny16:07:54

Perhaps. It's not clear that enough context is passed to the predicate to be able to check something like that.

favila16:07:06

ah, yeah, attr predicate may not work

favila16:07:11

db/ensure will though

kenny16:07:18

Yeah, looks possible with an entity predicate. It'd be "built in" if composite tuples supported reverse lookups though 🙂

favila16:07:41

the semantics are a bit unclear though

kenny16:07:07

Even if the card many ref is required to be a component?

favila16:07:11

I’m not sure I follow?

favila16:07:35

a composite tuple is non-homogenous, so which backref is it following?

favila16:07:02

it puts you in a weird situation where :eavt and :vaet are inconsistent

kenny16:07:33

I think I'm missing something. What's the inconsistency?

favila16:07:26

Suppose you have a composite tuple :foo+bar consisting of refs :foo and :bar from the same entity

favila17:07:14

:eavt entries will be [123 :foo 456] [123 :bar 789] [123 :foo+bar [456 789]]

favila17:07:14

if :_foo+bar worked and brought you to 123, from where could you follow it, and what would the :vaet index look like?

kenny17:07:58

I think I see what you're getting at but I was after something different. I see how my question was poorly worded. I'm after something like this:

[{:db/ident       :user/addresses
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/many}
 {:db/ident       :address/type
  :db/valueType   :db.type/keyword
  :db/cardinality :db.cardinality/one}
 {:db/ident       :address/user+type
  :db/valueType   :db.type/tuple
  :db/tupleAttrs  [:user/_addresses :address/type]
  :db/cardinality :db.cardinality/one
  :db/unique      :db.unique/identity}]

kenny17:07:38

i.e., a particular user can have many address types but only one of each type.
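That invariant can be sketched with an entity predicate checked via :db/ensure, rather than the (unsupported) reverse lookup inside a composite tuple. The function name and entity-spec ident are hypothetical; the attributes are the ones from the schema above:

```clojure
(require '[datomic.client.api :as d])

;; Entity predicate: a user's addresses must be distinct by :address/type.
(defn distinct-address-types?
  [db eid]
  (let [types (->> (d/pull db '[{:user/addresses [:address/type]}] eid)
                   :user/addresses
                   (map :address/type))]
    (or (empty? types)
        (apply distinct? types))))

;; Transactions then assert :db/ensure on the user entity, e.g.
;; [{:db/id user-id :db/ensure :user/validate}]
;; where :user/validate is an entity spec whose :db.entity/preds
;; points at this predicate.
```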

favila17:07:10

why not :user/address_type1 :user/address_type2 etc?

favila17:07:15

is the type open?

kenny17:07:58

It is open. But you're right - if it were closed I'd go that route.

favila17:07:46

ah, ok, you want this index to keep an invariant

favila17:07:57

I think I get it now

favila17:07:05

wait, I still don’t get it

favila17:07:23

how would your attribute ever not be unique?

favila17:07:32

your hypothetical composite attr

kenny17:07:09

If you try asserting one that already exists?

favila17:07:08

?user :user/addresses ?address has two ?address with the same type?

kenny17:07:50

That would fail or upsert (if :db/id is specified) on transact

favila17:07:19

is that the scenario you are trying to exclude by this index?

kenny17:07:44

Having multiple of the same type in this card many coll would result in undefined behavior for us. The card many must be "distinct-by" the :address/type.

favila17:07:24

ok I can see this making sense as a feature. I completely misunderstood what you were asking for originally. Although I think db/ensure is the right way to tackle this.

favila17:07:54

you could also introduce another joining entity

kenny17:07:03

Yeah, sorry... Reading my question again I realize how ambiguous it is 😬

kenny17:07:31

I thought about that. It feels a bit gross though...