Fork me on GitHub
#sql
<
2020-08-22
>
kulminaator05:08:27

gave a testrun to cockroachdb yesterday with java & postgresql jdbc driver ... worked like a charm ... and since clojure's sql support is built ontop of jdbc it should work fine with clojure too

kulminaator05:08:42

i wish postgresql boys would take a note from the features that these guys are providing and pimp up their db

seancorfield05:08:49

If you hit any weirdness with next.jdbc, LMK (did you see the Snowflake discussions above?)

seancorfield05:08:57

Does CockroachDB support multiple result sets? That seems to be something that the PG maintainers are dragging their heels on implementing. @kulminaator

kulminaator05:08:39

i'm more after their multimaster setup than other exotic features

kulminaator05:08:58

they have a very nicely designed cluster architecture

kulminaator05:08:55

you can upgrade the db node by node in the cluster, you can add new nodes on the run, decomission old nodes

kulminaator05:08:00

like a 21st century db

seancorfield05:08:15

We use Percona's fork of MySQL, partly for the improved ProxySQL/replication/primary/secondary machinery.

seancorfield05:08:03

It's interesting how poor the base open source versions are in this area.

kulminaator05:08:09

i suggest you check out both, cockroachdb and yugabytedb ... they both seem to be on the same track ... just to see what's out there

kulminaator05:08:24

both have free open source variants of their db available with somewhat reduced featureset

seancorfield05:08:33

We're not switching. We have too much invested in Percona at this point.

kulminaator05:08:00

well at least you can think of what you can ask from percona people as next features

seancorfield05:08:14

We have hundreds of millions of rows of data 🙂

seancorfield05:08:41

As maintainer of c.j.j for nine years and now next.jdbc, nearly all of my pain has come from Postgres users 🙂

seancorfield05:08:40

Never from MySQL/Percona users. Some pain from Oracle users (I do feel sorry for them -- I was certified as an Oracle DBA back when I worked at Macromedia)

kulminaator05:08:45

i have been on postgres for 13 years by now

kulminaator05:08:50

but i feel it's time has come

kulminaator05:08:17

the single node db instance design just doesn't cut my needs anymore

kulminaator05:08:45

yes one can shard the data and build exotic master-slave setups for failovers etc.

kulminaator05:08:49

but it's all patches on patches

kulminaator05:08:53

instead of cluster by design

seancorfield05:08:33

We migrated all our data from MS SQL Server to MySQL a decade ago and it's still a painful memory so migrating to a new DB is... not something we look forward to. But a MySQL-compatible cloud DB would probably be our first choice.

kulminaator05:08:42

luckily have been avoiding oracle for 15 years , i dont miss it 😄

kulminaator05:08:03

failures with messages like ORA-43432 just send up a cold shiver in one's back

seancorfield05:08:09

We also went all-in on MongoDB for a while... and that was... well... let's just say we were happy to migrate back to MySQL 🙂

kulminaator05:08:48

i loved the thread on db sides stack overflow with "i upgraded my mongodb, where is my data"

kulminaator05:08:27

i think sql as a design is over all sane, it has civilizing limiting featurset

kulminaator05:08:08

and at least for a fintech company anything without proper transactions should be a no-no

kulminaator05:08:28

developers are extremely bad at building their own simulations of transactions

kulminaator05:08:52

so the nosql crowd advising people to do so is just asking for trouble

seancorfield05:08:29

I got into MongoDB fairly early when they were all "We're not Oracle!" and then once they started getting popular they were like "Hey! We've added all these Enterprise features! You could migrate to us from Oracle!" and I was like "WTF?"

kulminaator05:08:08

but by giving up mongodb you gave up the chance to reinvent a lot if bicycles

seancorfield05:08:21

I've gotten pretty comfortable with my MySQL-cycle.

seancorfield05:08:11

I know how to get it to perform well at high data volumes, I know how to modify large tables with zero downtime... I'm pretty happy with it in production, to be honest.

seancorfield05:08:02

MongoDB promised a lot of ease of use but robustness and scalability were both hard and expensive 😞

kulminaator05:08:37

how does your percona mysql cluster handle the situation where you just yank a power cable from one of the nodes

kulminaator05:08:20

is it up to you to start some kind of failover or does it have some kind of quorum system built in to make such decisions ?

kulminaator05:08:30

with the standard postgres setup & replications it's up to admins to do such things ... and it's partially one of the reasons i feel it's time to move away

seancorfield05:08:10

I don't remember exactly how our DB folks have ProxySQL setup. I know it can be automatic, but I don't remember the details of failover and how replication lag affects that.

seancorfield05:08:58

(I know from experience that yanking the power cord on our secondary DB doesn't affect anything, and replication catches up pretty fast once it is powered back on 🙂 )

kulminaator06:08:05

with mongodb my experience is limited , i just remember doing one disaster prevention (machines running out of disk, spinning up new ones with bigger disks, cleaning the data, moving back to old ones) .... that was not a great weekend

kulminaator06:08:26

but from what i saw on that weekend i don't really want to do anything with it

seancorfield06:08:36

Some of the cloud services are getting pretty impressive at that sort of scaling/migration/upgrading these days. We use Elastic Cloud for our Elastic Search instances and their ops stuff is amazingly painless.

kulminaator06:08:08

well that's the thing

seancorfield06:08:16

Back when we used MongoDB, we used MongoLab and they were fairly good. A lot of stuff was automated, but not as smooth as the Elastic Cloud stuff.

kulminaator06:08:22

cockroachdb and yugabyte both feel like elastic in that experience

kulminaator06:08:34

it acts like a proper cluster and manages itself

seancorfield06:08:37

I wonder how Datomic feels in that sort of area?

kulminaator06:08:45

and i feel databases should get to the same level

seancorfield06:08:47

(Datomic Cloud, specifically)

kulminaator06:08:17

the compromise i don't want to do is ditching sql

seancorfield06:08:25

We talked to Cognitect about our use cases. Some of what we do would work well on Datomic but they said some of it isn't a match (and some of our data volumes are bigger than Datomic was designed for).

kulminaator06:08:53

things start to fall apart surprisingly fast once you reach terabytes of data

seancorfield06:08:55

I think for our member profile data and our billing/transactional data, Datomic would work well, but we have a lot of high-write data and a lot of append-only data with high volumes that they said would not be a good fit. And we don't want to run a multi-DB solution.

kulminaator06:08:58

even with postgres and mysql

seancorfield06:08:17

We have several tables with well over 100M rows at this point.

kulminaator06:08:28

usually it's developers fault though 😄

kulminaator06:08:07

but yeah, funny stuff happens ... initially nobody wants to become a billionaire so they create tables with int32 primary keys ...

kulminaator06:08:30

then they finally understand that hey, we are getting many records here ... so tables migrate to bigint primary keys

kulminaator06:08:52

and then you start seeing tables where you have stored blocked users and those have bigint primary keys

kulminaator06:08:03

and you go like ... why would we block more than a third of the planet ?

kulminaator06:08:35

we have exceeded the int32 line with multiple tables in our systems ...

seancorfield06:08:01

Yup, us too. Can be a giant pain when you suddenly have to rebuild a table on the fly without downtime 🙂

kulminaator06:08:14

and there is no good decision that sql planners can do in those situations unless you do only lookups by indexed field exact matches 😄

seancorfield06:08:07

Recently, we had a lookaside table that assumed a 16-bit unsigned max index and it was fine for years, but then someone else decided to change the strings that could go into that table and BOOM! 65535!

kulminaator06:08:19

but developers are always surprising too ...

seancorfield06:08:26

Took six weeks to completely dig out from under that one.

kulminaator06:08:45

in the sense that you have a table that has 500m records ... and then there's a type field on the records ... and it has 3-4 different values ....

kulminaator06:08:50

and developers put an index on that

kulminaator06:08:03

and then you just sit down and cry a little bit

seancorfield06:08:16

Ah yes, index design is a fine "black art" 🙂

kulminaator06:08:18

especially if they come asking why isnt the index used to make their magnificent query faster

seancorfield06:08:23

Or you have a query for which two or more different indices are a "reasonable" match but the planner picks the slightly less good one... fortunately you can provide a USES hint in the query!

seancorfield06:08:57

Sometimes I'm just amazed that any of this stuff works "at scale"...

kulminaator06:08:13

well recovering from backups kind of doesn't anymore

kulminaator06:08:24

would be unthinkable without failover databases

kulminaator06:08:09

recovering a 2 terabyte db from a dump is way too slow, by any means ...

kulminaator06:08:35

people are creative though, some use zfs snapshots of the database filesystem instead

kulminaator06:08:54

so you don't have to import a dump , you just mount the image and start the db

🙂 3
seancorfield06:08:34

(I'm out for the night)

kulminaator06:08:55

night 🙂

orestis17:08:37

Any resources for designing real-world schemas in sql? Things like blocked users, saving line item price to maintain historical records and all the other messy bits that are needed.

kulminaator13:08:59

for blocked user detection you may want to use a bloom filter & caching , so you dont penalize "good users" with unnecessary lookups

kulminaator13:08:06

not sure if i know overall good books on the overall topic though, most is domain specific