Got this thought after reading https://clojurians.slack.com/archives/C1Q164V29/p1767608105617649?thread_ts=1767558223.369749&cid=C1Q164V29 in the thread above. Maybe we're at the start of the "ad-hoc DB connectivity renaissance". We started with ad-hoc libraries, got to JDBC, and are now moving away from it, at least in some projects. Similar to how we started with plain SQL, got to ORMs, and have gone back (although of course not via string manipulation).
I was keen on testing out the pg2 library, but didn't find its speed ups to be very noticeable in real world applications, as it seems to get most of its speed from lazily realizing result sets, and we generally need all the results we can get.
We use it in one place IIRC where we need the listen/notify stuff, which jdbc does not provide.
I haven't looked into pg2 internals, but I'd also expect for it to be much more debuggable. Jumping around all the JDBC interfaces and implementations is not fun.
Deffo. I get lost the moment I start going into jdbc.next
Yeah, I've ended up spelunking across next.jdbc, xerial's SQLite driver, and HikariCP to figure out how to best make use of the combination for SQLite. PG is well supported, and perhaps the happy path with this library set. However, SQLite configuration and use is not obvious. For example, loading extensions and setting certain PRAGMAs (like auto_vacuum)... I've hand coded these steps.
All this said, I'd still prefer "one RDBMs connector library to rule them all", because I don't want to learn several different interfaces because that's were my "DB-interfacing and using" mistakes happen, in application code.
Not sure though, Like, at least in app code, you don’t just willy nilly switch out the database.
No, but it tends to be that one adds use cases for the same database software. and/or adds a new database to offload some work from the "primary" database.
So one PG installation for "primary" OLTP. And over time, another for OLAP, another for Sessions, another for Job queues, another for metrics, perhaps a vector store, perhaps a NoSQL-like document store. Even search.
We went down that path -- multiple databases for multiple use cases -- and decided the cost and complexity wasn't worth it and migrated everything back to MySQL. That said, we do use Elastic Search and Redis as well. Redis is mostly TTL K/V stuff and our pub/sub hub. Elastic Search is for complex, weighted searches (dating profile matches). But MySQL is a lot more capable than many people seem to think and having a single SQL dialect to deal with reduces the mental burden of working with the system.
Oh, sorry, I meant another PG installation for each new use-case.
We have multiple databases inside a single MySQL installation. I would have thought schemas would be the closest in PG? We have primary/secondary w/replication and we run reporting stuff off readonly accounts on the secondary to avoid heavy query load on the primary.
I don't know how MySQL works, but with PG, ideally • Start off with schemas in a single installation • Migrate schema out to dedicated installation once it starts messing with the rest of the needs
But only after beefing up the installation reaches its limits. • Make single box bigger till it doesn't help any more. • Move to leader/follower replication (trade off with eventual consistency of reads). • Beef that up until that doesn't work. • Now, figure out which use case to move off into its own thing (ideally one schema / namespace per use case)
Until that time, be happy merely building billion-dollar business, using Postgres mainly. Case study 1: Wave. https://danluu.com/simple-architectures/ > Wave is a $1.7B company with 70 engineershttps://danluu.com/simple-architectures/#fn:R whose product is a CRUD app that adds and subtracts numbers. In keeping with this, our architecture is a standard CRUD app architecture, a Python monolith on top of Postgres. https://twitter.com/danluu/status/1462607028585525249 where possible has allowed us to scale to this size while engineers mostly focus on work that delivers value to users. Case study 2: Zerodha. Scaling with common sense https://zerodha.tech/blog/scaling-with-common-sense/ > In April, when I wrote our https://zerodha.tech/blog/hello-world, a 7+ million trade day was our highest ever. Since then, we have had multiple 8-9 million trade days with hundreds of thousands of additional concurrent users daily. We have been scaling with simplicity and common sense.
Heh, Case study 3: World Singles Networks -- run 40 dating web sites and iOS/Android around the world on a single primary MySQL server (with a single replicated secondary) 🙂
> Elastic Search is for complex, weighted searches (dating profile matches). 😮 - surprised to read this, have you written about this elsewhere @seancorfield? I'd love to learn more.
Our approach, on top of ES, is kinda proprietary so, no, I haven't written much about it, nor talked about it in any depth on podcasts.
We used to use Transparensee which was a proprietary search engine originally aimed at the real estate market and we were an early adopter outside that market and built our first gen matching algorithm on top of that. But it was super expensive to license, so I figured out a way to replicate it on top of ES for a much lower cost 🙂
Amazing.
as it seems to get most of its speed from lazily realizing result setsThat's right, pg2 is faster in that direction: it parses messages lazily. I could not beat Java JDBC as it's pretty optimized
But it also allows to process items on the fly without saturating memory