2024-01-25
The Datomic on-prem storage docs for SQL mention PostgreSQL, MySQL, and Oracle… will MariaDB do? It should be mostly compatible with MySQL, but not exactly.
Prior to Free, we had one customer I knew of using MariaDB. The only downside of Free for me is not being able to check in with these projects at license renewal time to see how things are going. Please do let me know if you encounter any hiccups setting up MariaDB if you go that route.
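For illustration only, a hedged sketch of what the connection URI might look like if MariaDB is swapped in via Datomic's SQL storage protocol. The embedded JDBC URL, database name, credentials, and the MariaDB driver dependency are all assumptions, not anything the storage docs list; the transactor is assumed to be configured against the same storage.

```clojure
;; Hypothetical sketch: the db URI for SQL storage embeds a JDBC URL, so using
;; MariaDB mostly means swapping that URL and the driver on the classpath.
(require '[datomic.api :as d])

(def db-uri
  "datomic:sql://my-db?jdbc:mariadb://localhost:3306/datomic?user=datomic&password=datomic")

;; Assuming the storage table and transactor are already set up:
(def conn (d/connect db-uri))
```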
I'm coming back to programming after a hiatus of several years to build an application that may scale, perhaps significantly, as a do-it-yourself startup focused on climate solutions. I've used Datomic before on several toy projects and am fully sold on the concept of using datoms to store data rather than a "place-oriented programming" approach, so it makes sense to me to use Datomic Pro. What I'm in the dark about, from lack of experience, is which storage service to start with to keep costs low initially while still allowing for rapid scaling, if that happens, before there is enough revenue to pay someone to help. I'm tempted to start with an SQL database: I know how to move those around from server to server if necessary, and if this thing doesn't scale, that would be an easily affordable option. I don't know how to move Datomic data from one storage solution to another. I spent a few hours on Google and looked through the docs in a cursory way, but didn't find what I was looking for. If this is easy to do, then my initial choice of storage service may not matter much. Thanks in advance!
One way to move storage is to do a datomic backup and then restore into a new transactor. It requires you to stop the world during the transfer, but it gives you full freedom in choice of storage.
@U07FCNURX recently moved a Datomic pro transactor to SQLite, which seems like a good place to start - cheap and easy to get up and going.
Yes, using SQLite as the storage for Datomic Pro was an elegant solution for me. SQLite is optimised for a single writer and many readers, which is a good fit for Datomic's model, and there is no need for a separate running process. It does require you to run on hardware with direct access to disk, however. If your startup needs to scale up later, moving to a different storage solution is no issue.
@U07FCNURX So moving to another storage solution effectively means restoring a datomic backup into a new transactor as @U9MKYDN4Q suggested? Is that the method generally used?
Just saw no one had added the link. Backup and restore, and moving underlying storages, is a great superpower IMO (providing future leverage and choices). Datomic uses all supported underlying storages the same way, as a K/V store, so you can use whatever works best for you.
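For anyone searching later, the flow being described is roughly the backup-db/restore-db pair shipped with the Datomic Pro distribution. The URIs below are placeholders (a Postgres-backed source and an SQLite target are just examples), and writes should be stopped while the copy happens:

```
# Placeholder URIs; run from the Datomic Pro distribution directory.
bin/datomic backup-db  "datomic:sql://my-db?jdbc:postgresql://old-host/datomic?user=datomic&password=datomic" "file:/backups/my-db"
bin/datomic restore-db "file:/backups/my-db" "datomic:sql://my-db?jdbc:sqlite:/var/data/datomic-storage.db"
```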
Reading through SQLite’s documentation, I get the impression that this piece of software is actually quite robust within its limitations and has the potential to support a substantial amount of traffic. As @U07FCNURX noted, it is single-writer, but to that end I found this statement in its documentation: “Actually, SQLite will easily do 50,000 or more INSERT (https://www.sqlite.org/lang_insert.html) statements per second on an average desktop computer. But it will only do a few dozen transactions per second. Transaction speed is limited by the rotational speed of your disk drive. A transaction normally requires two complete rotations of the disk platter, which on a 7200RPM disk drive limits you to about 60 transactions per second. Transaction speed is limited by disk drive speed because (by default) SQLite actually waits until the data really is safely stored on the disk surface before the transaction is complete. That way, if you suddenly lose power or if your OS crashes, your data is still safe.” This seems to imply that it may handle more than 60 write transactions per second on an SSD.

Here’s another statement from the documentation: “SQLite works great as the database engine for most low to medium traffic websites (which is to say, most websites). The amount of web traffic that SQLite can handle depends on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100K hits/day should work fine with SQLite. The 100K hits/day figure is a conservative estimate, not a hard upper bound. SQLite has been demonstrated to work with 10 times that amount of traffic.”

Given that Datomic Pro has been released without licensing charges to promote its use, I think it would be nice to see documentation on how to use SQLite as its storage service, for first-time users and small-to-medium-sized applications. The SQLite docs say that, contrary to what many might assume from its name, it is by far the most used database engine in the world, and is most likely among the top five most used pieces of software in the world.
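Not official documentation, but a rough sketch of what the SQLite route can look like from the peer side. The file path, database name, and attribute are placeholders; a sqlite-jdbc driver on the classpath and a transactor whose SQL storage points at the same file are assumed.

```clojure
;; Sketch only: a peer talking to a database whose SQL storage is a local
;; SQLite file.
(require '[datomic.api :as d])

(def db-uri "datomic:sql://app-db?jdbc:sqlite:/var/data/datomic-storage.db")

(d/create-database db-uri)
(def conn (d/connect db-uri))

;; Tiny smoke test: install one attribute, then assert a fact.
@(d/transact conn [{:db/ident       :note/text
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one}])
@(d/transact conn [{:note/text "hello from SQLite-backed storage"}])
```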
Hi Datomic folks, I’m running into a use case that seems likely to be somewhat common, and I’m wondering if anyone in the community has found a good solution.
Our transactions generally have two categories of data: updates to domain objects and audit metadata. For example, if we set some attribute `:foo/title`, we will also add audit attributes on the transaction recording auth and tenancy information, as well as incrementing a `:foo/last-updated` attribute.
Sometimes, we try to perform an update to a domain object that turns out to be a no-op: `:foo/title` already has the value we are transacting. We can try to defend against this in the peer, but of course that is vulnerable to races. In this case, Datomic removes the `:foo/title` datom from the transaction since it has no effect. However, we then end up with an “empty” transaction that has no real content, but does have a bunch of audit metadata. This is not optimal because we have various things that consume the transaction log and take action based on it, and now those consumers need to know to ignore these “empty” transactions.
Ideally we would be able to determine that there is no real domain operation needed and abstain from transacting in the first place. I’m thinking we might be able to do something like this by dry-running the transaction using `d/with` in a transaction function, then looking at the datoms in the transaction to check whether it contains only audit data. But that seems a little hairy and I also worry about the performance.
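A rough sketch of that idea, with hypothetical names (`tx-if-effective`, and the assumption that `domain-tx` and `audit-tx` are ordinary transaction data passed in by the caller): the function speculatively applies the domain datoms with `d/with`, and if nothing beyond the transaction entity itself would change, it throws, aborting the transaction rather than writing an “empty” one.

```clojure
(require '[datomic.api :as d])

;; Hypothetical classic database function, not a drop-in.
(def tx-if-effective
  (d/function
    '{:lang :clojure
      :params [db domain-tx audit-tx]
      :code (let [{:keys [tx-data]} (datomic.api/with db domain-tx)
                  tx-eid            (:tx (first tx-data))]
              (if (some #(not= (:e %) tx-eid) tx-data)
                ;; the domain datoms have an effect: transact them plus audit data
                (concat domain-tx audit-tx)
                ;; otherwise abort the whole transaction
                (throw (ex-info "no-op: domain datoms would have no effect" {}))))}))

;; Install once, then call it instead of transacting domain + audit data directly:
;; @(d/transact conn [{:db/ident :myapp/tx-if-effective :db/fn tx-if-effective}])
;; @(d/transact conn [[:myapp/tx-if-effective domain-tx audit-tx]])
```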
I haven’t had issues using `with` for this in the past (check to see if the tx is a no-op and abort if so); not sure I follow why you’d anticipate that being difficult or slow?
On the peer I am not too concerned … on the transactor I’d worry that I’m basically asking the transactor to do double the work (a dry run of every transaction followed by actually transacting it).
Ah, I see. I probably wouldn’t approach it that way in a transaction function? Though I guess it depends. At a high level, it would kind of flow from the domain semantics. I.e., two simple cases: either everything getting into the system does so as part of some guarded curation process, in which case speculative work slowing things down to keep noise out of the system is justified; or it’s a noisy stream of events being collected, in which case e.g. tx-queue-subscribing peers are already filtering down to only a subset of interest.
I wouldn’t want to satisfy too many constraints broadly across the system at the same time. It’s also not clear to me how you want this to proceed from the peer side, domain-semantics-wise. I.e., what does it mean if two peers submit a tx at the same time and there’s order-dependent behavior: if one peer’s tx gets in first, the other’s is a no-op, but if it gets in later, it would overwrite a transaction it wasn’t aware of.
If you just mean e.g. a race to commit the same fact, I’d probably prefer to use [cas](https://docs.datomic.com/pro/transactions/transaction-functions.html#dbfn-cas) or something with the same semantics: reject if the value has changed since the peer last saw it before submitting the tx, rather than relying on the assumption that a tx I expected to result in datoms didn’t, and treating that as an implicit cause for rejection.
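For reference, the built-in compare-and-swap looks roughly like this; the entity id, the previously read value, and the surrounding tx are placeholders. If `:foo/last-updated` no longer holds the value the peer last read, the whole transaction is rejected:

```clojure
(require '[datomic.api :as d])

;; Placeholder entity id and values; [:db/cas e attr old new] aborts the tx
;; when attr's current value is not `last-seen`.
@(d/transact conn
   [[:db/cas order-eid :foo/last-updated last-seen (java.util.Date.)]
    {:db/id order-eid :foo/title "New title"}])
```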
Yes, we’re using cas here already on the last-updated timestamp--I guess that does make it safe to do the `d/with` on the peer and rely on the result. I had not really thought about that.
We do dry-run all transactions using `d/with` on the peers. It roughly halves our throughput and doubles the latency of API requests to the peers, but it’s horizontally scalable, whereas the transactor is not, and it reduces transactor load if you have a lot of invalid transactions or duplicate facts.
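A condensed sketch of that peer-side pattern, mirroring the transaction-function sketch above but running before submission; names are illustrative, and the cas guard discussed earlier is what keeps the check-then-transact window safe against races.

```clojure
(require '[datomic.api :as d])

;; Speculatively apply the domain datoms; true when anything beyond the
;; transaction entity itself would change.
(defn effective? [db domain-tx]
  (let [{:keys [tx-data]} (d/with db domain-tx)
        tx-eid            (:tx (first tx-data))]
    (boolean (some #(not= (:e %) tx-eid) tx-data))))

;; Only submit the real transaction (domain + audit data) when it would
;; actually change something.
(defn transact-if-effective! [conn domain-tx audit-tx]
  (when (effective? (d/db conn) domain-tx)
    @(d/transact conn (concat domain-tx audit-tx))))
```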