@magnars we use d/with in our stats processor to check for empty txes prior to submitting them, to prevent tx noise. empty txes have 1 datom: :db/txInstant
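
for anyone reading along, here's a rough Python model of that empty-tx check. it is not Datomic API code: the `Datom` tuple and the `submit` callback are made up for illustration, and it assumes you already have the speculative tx data (what d/with hands back) as a list

```python
from typing import NamedTuple


class Datom(NamedTuple):
    # Hypothetical stand-in for a datom; field names are illustrative only.
    e: int
    a: str
    v: object
    tx: int
    added: bool


TX_INSTANT = ":db/txInstant"


def is_empty_tx(tx_data):
    """True when the speculative result holds nothing but the tx timestamp."""
    return all(d.a == TX_INSTANT for d in tx_data)


def submit_if_nonempty(tx_data, submit):
    """Call submit(tx_data) only when the tx would change something.

    Returns True if the tx was submitted, False if it was skipped as noise.
    """
    if is_empty_tx(tx_data):
        return False
    submit(tx_data)
    return True
```

the speculative run costs an extra pass per tx, so it only pays off if no-op txes are common enough to be worth filtering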


I'm wondering. If I configure my datomic peer library using datomic:, how does it know which transactor to connect to?


@pesterhazy: the transactor URI is written to storage.


@jgdavey: interesting. what happens if I have two transactors connect to the same storage uri?


One will become a fallback (HA), and won’t write its location unless the other one stops phoning home


that's all automatic, then?


Well, HA technically only kicks in with the paid licenses. Anyone else care to expound?


I just created a new Auto Scaling Group with a new license key. Based on what you said, if I disable the old ASG, the new one should kick in automagically


with no client reconfiguration required


But yes, so long as peers and transactors share storage, the transactor location is “communicated” to the peer through storage.


I guess that requires that the transactor has a sort-of public IP address


Well, it just needs to be accessible to the peer.


Transactors actually write two IPs to storage: host is normally the internal network address, and alt-host is usually the public IP


a reassuring word about this in the docs would be great (though maybe I didn't look hard enough)


Peers try host first, then use alt if the first isn’t accessible.


I don’t want to misspeak here, though. Other thoughts, @bkamphaus ?


In this case I'm actually fine with things working out of the box (as they seem to be)


@jgdavey: a slight correction: alt-host is not usually the public IP; it's only provided if a different public IP is needed. Of course, with Docker (or containerization in general) and more VMs-in-the-cloud setups, this does show up more.
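

to make the host / alt-host fallback concrete, a minimal Python sketch of the lookup order as described above. this is a model, not how the peer library is actually implemented: the `record` dict shape, the function names and the plain TCP reachability check are all assumptions

```python
import socket


def reachable(host, port, timeout=1.0):
    """Best-effort TCP check; an assumption, not what the peer really does."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_transactor_address(record, can_connect=reachable):
    """Try `host` first, then `alt-host` if one was written at all.

    `record` is a hypothetical dict standing in for whatever the active
    transactor wrote to storage, e.g.
    {"host": "10.0.0.5", "alt-host": "203.0.113.7", "port": 4334}.
    Returns the first address that accepts a connection, or None.
    """
    port = record["port"]
    for key in ("host", "alt-host"):
        addr = record.get(key)  # alt-host is optional
        if addr and can_connect(addr, port):
            return addr
    return None
```

passing your own `can_connect` keeps the ordering logic testable without touching the network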


High availability is documented in fairly high detail here:


I do think there is an organizational deficiency in the docs at present around the heartbeat mechanism and how peers determine which transactor to connect to, including the alt-host mechanism (we’re transitioning this from an implementation detail to a public-facing transactor property). We’re considering how we want to address it.


yeah, for me it wasn't clear how peers discover the transactor in the case of dynamodb


I considered the idea that the address is written to storage, but rejected it as unlikely :)


@robert-stuttaford: have you had time to look into turning your datomic-backup script into a gist yet?


the use case is to get a partial backup of a prod db for development, which doesn't include credit card information or db sessions


sorry s/db sessions/session data/


one way I can think of is to get the data on a test system, excise everything you don't want, and then do a backup. Is that what people do?


@pesterhazy: for a few different reasons, to build a dev db I would avoid anything that implicitly “forks” the db (excise on a backup) and do something like replay the log, filtering out datoms that should not go in the other copy.


at the connection and storage level, dbs are unique, and there’s no accommodation in Datomic for the concept of “two different versions of the same database” with forked or missing data, etc. Filtered dbs, as-of dbs, etc. (e.g. in queries through the API) are ways of dealing with db values.


yeah I'm also inclined to think that excision is not the right tool for the job


I've looked into filtering the tx log, but haven't found an obvious way to determine what kind of entity a :db/add refers to


and that's what I want -- filter out certain kinds of entities (payment records, session data), not filter out a specific attribute


plus an attribute (like :user) might be a possible attribute of both payments (which I want to discard) and addresses (which I want to keep)


you can always pull the entity in question to see what’s associated with it. does everything in the db (or at least that has refs to/from it) have some kind of UUID - any unique identifier other than the entity id?


you can also pull the entity as of the time immediately before/after a tx to inspect it (using the as-of filter on a db). doing it a lot can get expensive perf-wise, but whether that really matters depends on the overall size of the db you’re filtering. Also, since it’s going to dev, it doesn’t impact a liveness window for prod.
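

a rough Python model of the "replay the log and filter by entity kind" idea, in case it helps. none of this is Datomic API: the log is just a list of transactions, each a list of (e, a, v, added) tuples, and "kind" is decided by hypothetical marker attributes such as `:payment/card-number`. it also skips entity id remapping and refs that point at dropped entities, which a real replay would have to handle

```python
def excluded_entities(log, marker_attrs):
    """First pass: collect every entity that ever carries a marker attribute."""
    skip = set()
    for tx in log:
        for e, a, _v, _added in tx:
            if a in marker_attrs:
                skip.add(e)
    return skip


def filtered_replay(log, marker_attrs):
    """Second pass: yield each tx without datoms about excluded entities.

    Two passes are used because a shared attribute (like :user) can show up
    in the log before the attribute that reveals what kind of entity it is.
    Transactions left with no datoms are dropped entirely.
    """
    skip = excluded_entities(log, marker_attrs)
    for tx in log:
        kept = [d for d in tx if d[0] not in skip]
        if kept:
            yield kept
```

classifying by entity rather than by attribute is what lets :user survive on addresses while being dropped on payments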


that's useful


many things have a unique identifier, though maybe not all


haven't had a chance, sorry, @pesterhazy !


I'm trying my hand at a simple edn dumper for datomic


that might get the job done as well


that’s what i have, except it writes transit instead of edn