I'm curious if there's a recommendation for backup/restore for datomic cloud. I'm looking at potentially launching a new product on it, but the existence of d/delete-database scares the everloving tar outa me. I have REPL workarounds for it, but I'd rather have incremental backups available as well.
I second the concerns voiced in this thread. If someone asked me for reasons not to use Datomic Cloud, this always has been, is, and will continue to be at the top of my list, and I've been perplexed that this topic does not come up more often :S
1. Over the years we have accumulated a bunch of churned customers and unused demo accounts in our production system, and there's a database for each of them. While I'd like to delete them, it is just not worth the risk of something going wrong. That risk is extremely low, but not being able to recover if something did happen means it is not even worth thinking about.
2. Also, having data backed up in another region is mandated for us too. There's no way around it for us. The following just does not cut it for the clients and the board:
> All data in Datomic Cloud is stored in S3 (as well as other highly durable storages), which provides very high durability guarantees, making secondary data loss prevention (i.e. backup) unnecessary.
https://forum.datomic.com/t/cloud-backups-recovery/370
Even if restoring a database were not a straightforward process and required the assistance of the Datomic team, just being able to back up customer data (a Datomic system or all the databases in our case) using something straightforward like the AWS Backup service would tick so many boxes 🙂
I hope it did not come off as complaining/negative though. Probably small (EU based?) Cloud customers just do not move the roadmap needle enough to warrant the effort on certain features e.g. backups, excision... so it is what it is. We are still out there though and I wish there were more 🙂
Yeah I’m wondering what the suggested 3p would be
there's this https://github.com/fulcrologic/datomic-cloud-backup, but it is unmaintained... I also wrote one way back in the day https://github.com/solita/mnt-teet/blob/af98a88d908c22a89adb7f69b26fba3fadc17cea/app/backend/src/clj/teet/backup/backup_ion.clj idk if there are any truly battle tested and maintained solutions out there
yeah I was hoping someone on the datomic team might chime in and say, "the mnt-teet approach will work, at least in theory" or smth like that.
In my case, after 1-2 years of developing an ERP, we lost our primary (and only) Cloud database, probably due to a d/delete-database operation that neither any developer nor CloudTrail let us trace back, so we never learned what actually happened.
I had put in place a wrapper over all datomic apis, such that e.g. d/delete-database wasn't accessible to the application; only a developer with access to a REPL connection to the prod environment, and knowing how to dis-disallow the delete-db op, could operate it.
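For anyone wanting to picture that kind of wrapper, here is a minimal sketch, not the actual code from that project. It assumes the application only ever calls a wrapper fn (here the hypothetical `safe-delete-database`) and that destructive ops are gated behind a dynamic var (the hypothetical `*allow-destructive*`) that only someone at a REPL would rebind. `raw-delete-database` is a local stand-in so the snippet runs without the Datomic client on the classpath.

```clojure
;; Stand-in for the real delete call (assumption: in a real project this
;; would delegate to the Datomic client API instead).
(defn raw-delete-database [client arg-map]
  {:deleted (:db-name arg-map)})

;; Destructive operations stay off unless someone explicitly rebinds this.
(def ^:dynamic *allow-destructive* false)

(defn safe-delete-database
  "The only delete fn the application namespace is given.
  Throws unless *allow-destructive* has been rebound to true."
  [client arg-map]
  (when-not *allow-destructive*
    (throw (ex-info "delete-database is disabled in this environment"
                    {:db-name (:db-name arg-map)})))
  (raw-delete-database client arg-map))
```

At a prod REPL, a developer who knows about the switch would have to opt in deliberately, e.g. `(binding [*allow-destructive* true] (safe-delete-database client {:db-name "scratch"}))`. Note this only protects callers that go through the wrapper; anything calling the underlying API directly bypasses it.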
If I remember correctly, this happened just before the factory started using the new ERP, or shortly after, while the factory was still testing its processes, so we didn't lose important data. But we lost management's confidence in the system, and the CTO decided to rewrite the entire system on DynamoDB. That was the beginning of the end for Clojure and Datomic at this company!!
Scary story, scar-ry story.
I stopped maintaining datomic cloud backup at Fulcrologic because it wasn’t performant. All the tests pass and the logic is sound. That said, I keep hearing promises of a Cloud “Data Portability” feature that is supposed to give you that and much more, so it seemed “inevitable” that my tool would be outdated. I’m still waiting as well. Cloud needs a number of things whose status I’d love to hear an update on, @joe.lane:
• Backing up data
• Data excision (esp tuples we don’t need anymore and other cruft)
• Moving data from one db system to another
I agree very much with the other sentiments: just because the db runs on highly redundant AWS systems, that does not rule out mistakes and nefarious acts (someone getting access to AWS creds and deleting the S3 bucket). Any reasonably serious business that cannot afford to lose customer data really needs these tools.
Also: I’m not sure, but the Cloud team may have some internal hack solution you could use to effect a backup to another region?
afaict, it's still 3rd party solutions only
Is the primary concern you have @potetm just around user error deleting a given Datomic DB out of the catalog?
yeah that's the one that scares me the most. I get that the data itself is replicated to high heaven. (Though I will add having a copy-able snapshot I can port around is a secondary concern. Mostly for peace of mind, but also for portability.)
Interesting. Is this honestly blocking your decision on whether to use Cloud or not? We have not forgotten about this problem space, and have been working on solutions to problems in this space for a very long time.
Re-reading that first sentence, it sounds like I was being passive-aggressive. Allow me to try again 🙂
Is the possibility of your engineers accidentally invoking d/delete-database the primary factor in whether you will use Cloud?
If so, I can think of a few approaches, both at and below user space, that could be implemented to mitigate this risk.
I'm curious what the "below user space" option is. I have a user-space patch already, but if the DB could forbid that action, my primary concern would be met.
What does your user-space patch consist of?
How many databases do you plan on creating? Do you need this per database or per system?
overwrite the d/delete-database var with a fn that checks the class of the connection.
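Roughly, that kind of patch could look like the sketch below. This is an illustration under assumptions, not the exact patch: `delete-database` here is a local stand-in so the snippet runs on its own (in a real project you would presumably target `#'datomic.client.api/delete-database`), and `LocalClient` / `CloudClient` are hypothetical record types standing in for whatever classes your local and Cloud clients/connections actually have.

```clojure
;; Stand-in for the real delete fn (assumption, see above).
(defn delete-database [client arg-map]
  {:deleted (:db-name arg-map)})

;; Hypothetical types representing a local/test client and a Cloud client.
(defrecord LocalClient [])
(defrecord CloudClient [])

;; Only clients of these classes may delete databases.
(def allowed-client-classes #{LocalClient})

(defn guard-delete!
  "Replaces the root binding of var `v` with a fn that calls the original
  only when the client's class is in `allowed`, and throws otherwise."
  [v allowed]
  (alter-var-root
   v
   (fn [original]
     (fn [client arg-map]
       (if (contains? allowed (class client))
         (original client arg-map)
         (throw (ex-info "delete-database is disabled for this client"
                         {:client-class (class client)
                          :db-name (:db-name arg-map)})))))))

;; Install the guard once at startup.
(guard-delete! #'delete-database allowed-client-classes)
```

After this, `(delete-database (->LocalClient) {:db-name "demo"})` still goes through, while the same call with `(->CloudClient)` throws. Caveats: it only affects callers that resolve the var after the patch is installed (code compiled with direct linking, or code holding a reference to the original fn, would not be guarded), and it does nothing against deletes issued from another process.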
uh, per system would be nice, but afaik there's only gonna be one database on this project, so they're synonymous.
in my case, the backup to another physical region was client mandated... it was a must that a database can be backed up and restored in a completely new environment