sql

p-himik 2026-01-05T10:23:02.154459Z

Got this thought after reading https://clojurians.slack.com/archives/C1Q164V29/p1767608105617649?thread_ts=1767558223.369749&cid=C1Q164V29 in the thread above. Maybe we're at the start of the "ad-hoc DB connectivity renaissance". We started with ad-hoc libraries, got to JDBC, and are now moving away from it, at least in some projects. Similar to how we started with plain SQL, got to ORMs, and have gone back (although of course not via string manipulation).

slipset 2026-01-05T11:05:54.447409Z

I was keen on testing out the pg2 library, but didn't find its speed ups to be very noticeable in real world applications, as it seems to get most of its speed from lazily realizing result sets, and we generally need all the results we can get.

slipset 2026-01-05T11:06:44.177539Z

We use it in one place IIRC where we need the listen/notify stuff, which jdbc does not provide.

p-himik 2026-01-05T11:08:14.578469Z

I haven't looked into pg2 internals, but I'd also expect for it to be much more debuggable. Jumping around all the JDBC interfaces and implementations is not fun.

slipset 2026-01-05T11:09:07.885399Z

Deffo. I get lost the moment I start going into jdbc.next

adi 2026-01-05T13:15:33.141619Z

Yeah, I've ended up spelunking across next.jdbc, xerial's SQLite driver, and HikariCP to figure out how to best make use of the combination for SQLite. PG is well supported, and perhaps the happy path with this library set. However, SQLite configuration and use is not obvious. For example, loading extensions and setting certain PRAGMAs (like auto_vacuum)... I've hand coded these steps.

adi 2026-01-05T13:21:38.326539Z

All this said, I'd still prefer "one RDBMs connector library to rule them all", because I don't want to learn several different interfaces because that's were my "DB-interfacing and using" mistakes happen, in application code.

slipset 2026-01-05T13:35:30.160809Z

Not sure though, Like, at least in app code, you don’t just willy nilly switch out the database.

adi 2026-01-05T14:18:50.962129Z

No, but it tends to be that one adds use cases for the same database software. and/or adds a new database to offload some work from the "primary" database.

adi 2026-01-05T14:20:33.470099Z

So one PG installation for "primary" OLTP. And over time, another for OLAP, another for Sessions, another for Job queues, another for metrics, perhaps a vector store, perhaps a NoSQL-like document store. Even search.

seancorfield 2026-01-05T14:34:59.013959Z

We went down that path -- multiple databases for multiple use cases -- and decided the cost and complexity wasn't worth it and migrated everything back to MySQL. That said, we do use Elastic Search and Redis as well. Redis is mostly TTL K/V stuff and our pub/sub hub. Elastic Search is for complex, weighted searches (dating profile matches). But MySQL is a lot more capable than many people seem to think and having a single SQL dialect to deal with reduces the mental burden of working with the system.

💯 1
adi 2026-01-05T15:02:06.989519Z

Oh, sorry, I meant another PG installation for each new use-case.

seancorfield 2026-01-05T15:09:31.340839Z

We have multiple databases inside a single MySQL installation. I would have thought schemas would be the closest in PG? We have primary/secondary w/replication and we run reporting stuff off readonly accounts on the secondary to avoid heavy query load on the primary.

adi 2026-01-05T15:12:00.604729Z

I don't know how MySQL works, but with PG, ideally • Start off with schemas in a single installation • Migrate schema out to dedicated installation once it starts messing with the rest of the needs

adi 2026-01-05T15:15:55.512129Z

But only after beefing up the installation reaches its limits. • Make single box bigger till it doesn't help any more. • Move to leader/follower replication (trade off with eventual consistency of reads). • Beef that up until that doesn't work. • Now, figure out which use case to move off into its own thing (ideally one schema / namespace per use case)

adi 2026-01-05T15:19:03.314129Z

Until that time, be happy merely building billion-dollar business, using Postgres mainly. Case study 1: Wave. https://danluu.com/simple-architectures/ > Wave is a $1.7B company with 70 engineershttps://danluu.com/simple-architectures/#fn:R whose product is a CRUD app that adds and subtracts numbers. In keeping with this, our architecture is a standard CRUD app architecture, a Python monolith on top of Postgres. https://twitter.com/danluu/status/1462607028585525249 where possible has allowed us to scale to this size while engineers mostly focus on work that delivers value to users. Case study 2: Zerodha. Scaling with common sense https://zerodha.tech/blog/scaling-with-common-sense/ > In April, when I wrote our https://zerodha.tech/blog/hello-world, a 7+ million trade day was our highest ever. Since then, we have had multiple 8-9 million trade days with hundreds of thousands of additional concurrent users daily. We have been scaling with simplicity and common sense.

seancorfield 2026-01-05T15:31:53.168259Z

Heh, Case study 3: World Singles Networks -- run 40 dating web sites and iOS/Android around the world on a single primary MySQL server (with a single replicated secondary) 🙂

1
Harold 2026-01-05T16:42:26.233279Z

> Elastic Search is for complex, weighted searches (dating profile matches). 😮 - surprised to read this, have you written about this elsewhere @seancorfield? I'd love to learn more.

seancorfield 2026-01-05T16:43:35.672369Z

Our approach, on top of ES, is kinda proprietary so, no, I haven't written much about it, nor talked about it in any depth on podcasts.

seancorfield 2026-01-05T16:45:14.616979Z

We used to use Transparensee which was a proprietary search engine originally aimed at the real estate market and we were an early adopter outside that market and built our first gen matching algorithm on top of that. But it was super expensive to license, so I figured out a way to replicate it on top of ES for a much lower cost 🙂

Harold 2026-01-05T16:46:09.624159Z

Amazing.

igrishaev 2026-01-05T17:34:39.009139Z

as it seems to get most of its speed from lazily realizing result setsThat's right, pg2 is faster in that direction: it parses messages lazily. I could not beat Java JDBC as it's pretty optimized

igrishaev 2026-01-05T17:39:01.666749Z

But it also allows to process items on the fly without saturating memory