clojure

Bailey Kocin 2025-10-20T15:08:08.479409Z

I have a question related to the java jdbc reducible query (with a little sql/rdb, if I should ask elsewhere let me know) If I use jdbc/reducible-query in these ways

;; version 1 (uses doseq while implicitly handling a transaction)

(jdbc/with-db-transaction [tx connection]
  (doseq [batch (jdbc/reducible-query tx sql {:fetch-size 10000})]
    (write-batch (process-batch batch))))

;; version 2 (transduces from one table to another implicitly handling a transaction)

(jdbc/with-db-transaction [tx connection]
  (transduce
   (comp (partition-all 1000)
         (map process-batch)
         (map write-batch))
   (constantly nil)
   nil
   (jdbc/reducible-query tx sql {:fetch-size 10000})))

;; version 3 (uses doseq without explicitly closing a transaction)

(jdbc/with-db-connection [conn connection]
  (.setAutoCommit (jdbc/get-connection conn) false)
  (doseq [batch (jdbc/reducible-query conn sql {:fetch-size 10000})]
    (write-batch (process-batch batch))))

;; version 4 (transduces from one table to another without explicitly closing a transaction)

(jdbc/with-db-connection [conn connection]
  (.setAutoCommit (jdbc/get-connection conn) false)
  (transduce
   (comp (partition-all 1000)
         (map process-batch)
         (map write-batch))
   (constantly nil)
   nil
   (jdbc/reducible-query conn sql {:fetch-size 10000})))
Questions 1. What are the subtle differences? 2. What are the differences if an error were to occur?

p-himik 2025-10-20T15:31:23.583829Z

I'd say #sql would be more appropriate in general, but some things are universal, so... Versions 1 an 3 should fail outright because IReduce is not seq-able. Version 3 is not different from version 1 and version 4 is not different from version 2 because clojure.java.jdbc uses (.setAutoCommit con false) unconditionally in with-db-transaction. The library commits or rolls back the transaction for you.

p-himik 2025-10-20T15:31:43.321379Z

BTW if it's a new project you should most definitely use next.jdbc instead.

igrishaev 2025-10-20T15:52:57.850379Z

It really depends on the underlying database and its driver. You can easily face a situation when the driver fetches the result at once and saturates memory. Try selecting some crazy amount of data, say select x from generate_series(0, 999999999999) as seq(x) and limit the memory size of JVM, then try each approach