I have a question related to the java jdbc reducible query (with a little sql/rdb, if I should ask elsewhere let me know)
If I use jdbc/reducible-query in these ways
;; version 1 (uses doseq while implicitly handling a transaction)
(jdbc/with-db-transaction [tx connection]
(doseq [batch (jdbc/reducible-query tx sql {:fetch-size 10000})]
(write-batch (process-batch batch))))
;; version 2 (transduces from one table to another implicitly handling a transaction)
(jdbc/with-db-transaction [tx connection]
(transduce
(comp (partition-all 1000)
(map process-batch)
(map write-batch))
(constantly nil)
nil
(jdbc/reducible-query tx sql {:fetch-size 10000})))
;; version 3 (uses doseq without explicitly closing a transaction)
(jdbc/with-db-connection [conn connection]
(.setAutoCommit (jdbc/get-connection conn) false)
(doseq [batch (jdbc/reducible-query conn sql {:fetch-size 10000})]
(write-batch (process-batch batch))))
;; version 4 (transduces from one table to another without explicitly closing a transaction)
(jdbc/with-db-connection [conn connection]
(.setAutoCommit (jdbc/get-connection conn) false)
(transduce
(comp (partition-all 1000)
(map process-batch)
(map write-batch))
(constantly nil)
nil
(jdbc/reducible-query conn sql {:fetch-size 10000})))
Questions
1. What are the subtle differences?
2. What are the differences if an error were to occur?I'd say #sql would be more appropriate in general, but some things are universal, so...
Versions 1 an 3 should fail outright because IReduce is not seq-able.
Version 3 is not different from version 1 and version 4 is not different from version 2 because clojure.java.jdbc uses (.setAutoCommit con false) unconditionally in with-db-transaction. The library commits or rolls back the transaction for you.
BTW if it's a new project you should most definitely use next.jdbc instead.
It really depends on the underlying database and its driver. You can easily face a situation when the driver fetches the result at once and saturates memory. Try selecting some crazy amount of data, say select x from generate_series(0, 999999999999) as seq(x) and limit the memory size of JVM, then try each approach