Fork me on GitHub

Are there any blogs, docs, books, etc. on the design and history of jdbc? I'm interested in “why” questions, not the “how.”


Kinda. I was hoping for even more details 🙂


I've been working with it for over a decade at this point and it still manages to surprise me. When I first learned Clojure (2010) I was writing a lot of database-heavy apps and I was surprised that no one in the Clojure world seemed to be using JDBC. clojure.contrib.sql existed but wasn't really maintained (Stephen Gilardi wrote it but then moved onto other stuff). It was stuck in Clojure 1.2's monolithic Contrib lib and didn't look like it would carry over to 1.3 and modular Contrib since no one wanted to maintain it. I needed it so I jumped up and down and eventually the core team said "Fine, if you really want to maintain it, you can" and that's how was born. I was new to JDBC itself at that point -- I'd never had to go that low-level before. OMG! What a journey and what a learning experience! And what a weird, freakin' mess it all was. Some stuff made no sense (and still barely makes sense). So much variation between databases (hello, PostgreSQL!). Several methods that database driver just throw exceptions from "Sorry, not implemented!". Prepared statements? Oh, not here honey! What table was this column from? (Oracle: no clue, not going to tell you) Return the generated keys, please! (PostgreSQL: here, have the entire row of data! Wat? Why? No other DB does that) Hey, I want an index column that is an integer and auto-increments with each insert! (databases: we're all going to do this differently and make you use different syntax... Percona: I'm going to increment the key twice every time because I don't like even numbers). Is "FOO" equal to "foo"? (databases: wait, do we actually have to agree on that?) I'm going to run some stuff in a transaction (some databases: what's a transaction?).

😮 2
👍 1

Sounds like some fun times 🙈


The left hand column links to more details @kenny?


Agh. I think it's hidden on mobile 😞


Frankly, it's amazing any of it works.


Oh, and on the "What table was this column from?" MS SQL Server won't tell you by default but you can persuade it in a very non-intuitive way. How about we try to create an index on an existing table? Pick a database, pick the syntax -- because there's no standard for that at all.


And if you think I'm joking in any way, take a look at the test suite for c.j.j or next.jdbc. Exhibit A:


Yeah. Frankly, it seems like one of those things that might be better suited to db-specific api layer, not a general abstraction. We don't just swap dbs willy nilly. But I want more details on all those decisions, if they exist.


It floors me that MariaDB and MySQL, which purport to be fairly compatible as databases, can't even agree on how to return the generated keys...


& they all have different behaviors in prod. The way you interact with the db can vastly change depending on the underlying impl.


TBH, it's why I've always rolled my eyes at Hibernate (and other ORMs) that advertise that they let you switch databases... because, yeah, we switch databases all the time... not.


… soooo why does jdbc exist? Why not N impls for each db?


Well, JDBC does provide some standardization, and it does allow us to have a library that can say "store this hash map in a table", "read a hash map from a table", "update this set of columns in these rows", and that abstraction does have value.


I think when you wander off into PG JSON territory, then JDBC and standardization no longer matters -- but even there some abstractions still kind of hold.


I wouldn't want to have to use raw JDBC.


I wonder if the standard holds dbs back. i.e., certain innovation doesn't happen because we’re constrained to a particular standard.


You know, if all the DBs actually fully implemented the standard and then added their own stuff, I wouldn't mind so much.


But you get stuff like the PG team refusing to implement multiple result sets for years, and then providing a half-assed implementation, while adding all sorts of weird and wonderful non-standard extensions.


I spent eight years on the ANSI C++ committee so I know how hard it can be to standardize stuff... and you end up with undefined, implementation-defined, and unspecified behaviors because there are some things you can't force on an implementation due to platform issues or whatever.


It's a sort of balance, I suppose.


Do you happen to know why there's an extra layer of indirection to execute* ops? i.e., Connection -> Statement -> Execute. Why add Statement in the middle and not have execute ops operate directly on the Connection?




That protects you from SQL injection attacks. You need to prepare a SQL statement with parameters and then execute that.


Also, remember that it's config -> DataSource -> Connection -> (Prepared)Statement -> execute -> ResultSet.


How does the layer help with parameterization?


You can run multiple statements on a connection, and you can execute a (prepared) statement multiple times with different parameters.




Not sure that's relevant to the abstraction level here?


(and a datasource can give you multiple connections)


Oh, and an execute can give you multiple result sets. So at each layer of the abstraction, you can "get a thing" and then do multiple things to it, or get multiple things from it. Caching, reuse, etc.


i almost just lost my mind thinking jdbc had something called a “Dave Point”


@seancorfield Not sure I follow. If you were to flatten out all method calls on a Statement, moving them to exist on a Connection instead, what is an example that would not work due to the move?


Er... that question doesn't make any sense to me... can you reword or explain what you're asking?

😅 1

Connection -> Statement is 1-1. If you took the API of Statement (i.e., execute*), and put those methods directly on the Connection, what problems would arise? e.g., instead of connection.createStatement(args1).execute(args2) you just connection.execute(args1, args2).


But you execute statements and you can have multiple statements on a connection.


Right, I'm getting radical, haha. That's the way it is now, but why is it that way?


@kenny There is some work done when you create a statement, and that work can be reused if you need to execute it multiple times with different parameters with the current API. That wouldn't be possible with the API you are proposing.


What is the work? Is it client or server side? Why would it not be possible?

Cora (she/her)19:01:32

prepared statements?


@kenny One thing JDBC chooses to do is to rewrite your SQL. For example, using SQL Server, if you send this statement: select something from something where foo = ? , it will get converted to select something from something where foo = @p0 before getting sent. The reason for that is that ? is not valid in TSQL. You may ask why we need to write ? for placeholders using JDBC in the first place. I don't know, you'd have to ask the people who made java JDBC. Maybe it is just that there was no ANSI SQL syntax for variables, and they just picked ?. For prepared statements, some databases also do query parsing / optimization / planning at this stage, which is the main thing. For SQL Server (and probably other modern databases), that is obsolete, since execution plans will be looked up based on the query text automatically. If we really wanted to avoid the need to prepare a statement, it could be cached and looked up by the library, but usually managing cache lifetimes is an application concern, not for a library to decide.


Thanks for the reply @isak. I'm still not clear as to why the nesting is necessary. e.g., connection.execute(connection.preparedStatement())


@kenny is there a particular alternative you have in mind?

Cora (she/her)21:01:36

also having types for things is nice and helps in the context of java


Flat, like the example above. connection.execute(connection.preparedStatement())


What would your alternative to this example look like?

public void updateCoffeeSales(HashMap<String, Integer> salesForWeek) throws SQLException {
    String updateString =
      "update COFFEES set SALES = ? where COF_NAME = ?";
    String updateStatement =
      "update COFFEES set TOTAL = TOTAL + ? where COF_NAME = ?";

    try (PreparedStatement updateSales = con.prepareStatement(updateString);
         PreparedStatement updateTotal = con.prepareStatement(updateStatement))
      for (Map.Entry<String, Integer> e : salesForWeek.entrySet()) {
        updateSales.setInt(1, e.getValue().intValue());
        updateSales.setString(2, e.getKey());

        updateTotal.setInt(1, e.getValue().intValue());
        updateTotal.setString(2, e.getKey());
    } catch (SQLException e) {
      if (con != null) {
        try {
          System.err.print("Transaction is being rolled back");
        } catch (SQLException excep) {


I think you aren't understanding what Sean said above, Connection -> Statement is not limited to 1 to 1


That example is great. To me, it seems like the independent ops have not been decoupled. We've got a lot of things thrown into a single concept: 1) caching of parsed queries 2) parameterization. I'd prefer to make those things very explicit.

public void updateCoffeeSales(HashMap<String, Integer> salesForWeek) throws SQLException {
    String updateString =
            "update COFFEES set SALES = ? where COF_NAME = ?";
    String updateStatement =
            "update COFFEES set TOTAL = TOTAL + ? where COF_NAME = ?";


        for (Map.Entry<String, Integer> e : salesForWeek.entrySet()) {
            con.executeUpdate(updateString, [e.getValue().intValue(), e.getKey()])

            con.executeUpdate(updateStatement, [e.getValue().intValue(), e.getKey()])
    } catch (SQLException e) {
        if (con != null) {
            try {
                System.err.print("Transaction is being rolled back");
            } catch (SQLException excep) {


One thing that is worse with that suggestion though is that the statements are cached for longer than the scope of this method. As for setting the variables via an array that is fine, but that can be built on top of the low-level API, and sometimes you care exactly which database string type is used, for example.