#sql 2018-11-20
kirill.salykin 16:11:37

Hi all, I’d like to process each of the 12,000,000 records in my table (MySQL InnoDB). Is this code efficient?

(run! (fn [row]
          (async/>!! chan row))
        (jdbc/reducible-query
         db
         ["select * from table where checked_at is NULL order by postcode limit 100000"]
         {:fetch-size   10000
          :max-rows     10000
          :cursors      :hold
          :result-type  :forward-only
          :concurrency  :read-only
          :auto-commit? false}))

kirill.salykin 16:11:12

because right now MySQL spends quite some time processing the query

kirill.salykin 16:11:50

do I need the limit? do I need the cursor? (is the cursor actually used?) every record is first put onto a channel for processing (the channel is limited to 1000)

kirill.salykin 16:11:36

should I use the same limit in the query as the channel size? (1000)
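A minimal sketch of the streaming variant discussed above: it drops the LIMIT and relies on :fetch-size plus a forward-only, read-only cursor to stream rows, closing the channel once the query is exhausted. Here chan and db stand in for the channel and db-spec from the snippet, and the consumer loop is an assumption. Note that MySQL's Connector/J typically only streams when useCursorFetch=true is set on the connection URL (or the fetch size is Integer/MIN_VALUE); treat that detail as something to verify against your driver.

(require '[clojure.core.async :as async]
         '[clojure.java.jdbc :as jdbc])

;; Bounded channel: the producer blocks when consumers fall behind.
(def chan (async/chan 1000))

(defn stream-rows!
  "Stream every unchecked row into chan, then close it.
   No LIMIT; the cursor and :fetch-size keep memory bounded."
  [db]
  (run! #(async/>!! chan %)
        (jdbc/reducible-query
         db
         ["select * from table where checked_at is NULL order by postcode"]
         {:fetch-size   10000
          :cursors      :hold
          :result-type  :forward-only
          :concurrency  :read-only
          :auto-commit? false}))
  (async/close! chan))

;; Consumer side (assumption): drain the channel on a separate thread.
(defn consume! [process-row]
  (async/thread
    (loop []
      (when-some [row (async/<!! chan)]
        (process-row row)
        (recur)))))

With this shape, the limit in the query and the channel buffer don't need to match; the buffer only controls how far the producer can run ahead of the consumers.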

michaellindon 23:11:36

@seancorfield I still have some misunderstandings about performing a batch insert. Consider the following

(def routing-guide (sql/query warp-db [(slurp "src/bayesianic/routing-guide.sql")] {:as-arrays? true}))
(time
 (sql/insert-multi! prem-db :routing_guide (first routing-guide) (rest routing-guide)))
(time 
 (batch-insert prem-db "routing_guide" (first routing-guide) (rest routing-guide)))
where batch-insert is a function that I wrote myself. The timings are as follows:
"Elapsed time: 26476.902023 msecs"
"Elapsed time: 98.909443 msecs"

michaellindon 23:11:20

my batch-insert function looks like this:

(defn batch-insert [db table colnames rows]
  (sql/execute! db [(str
                     "insert into "
                     table
                     (str " (" (clojure.string/join ", " (map name colnames)) ") ")
                     " values "
                     (apply str (clojure.string/join ", "
                                                     (map (fn [x] (str "(" (clojure.string/join ", " x) ")")) rows))))]))

seancorfield 23:11:50

@michaellindon Your function isn't equivalent tho' -- it's assuming it can convert arbitrary Clojure values to legal SQL literal values with str -- whereas insert-multi! creates a PreparedStatement with a parameter for every value and sets all those values as parameters in the JDBC call.

seancorfield 23:11:16

Yours will work for numbers but not strings (for example).
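A hedged sketch of a parameterized version of the same batch insert, using execute! with :multi? true so every value goes through a PreparedStatement parameter instead of str. The name batch-insert-prepared is made up, and the table string is still interpolated (trusted input assumed), as in the original.

(require '[clojure.java.jdbc :as sql]
         '[clojure.string :as string])

(defn batch-insert-prepared
  "One prepared statement with a ? per column; rows are sent as
   parameter groups, so strings, dates, etc. are handled by the driver."
  [db table colnames rows]
  (let [placeholders (str "(" (string/join ", " (repeat (count colnames) "?")) ")")
        sql-str      (str "insert into " table
                          " (" (string/join ", " (map name colnames)) ") values "
                          placeholders)]
    (sql/execute! db (into [sql-str] rows) {:multi? true})))

This is essentially what insert-multi! does when given column names and row vectors, so any remaining speed gap is more likely down to driver settings (for MySQL, batched inserts usually need rewriteBatchedStatements=true on the connection URL to be fast) than to how the SQL string is built; treat that as a guess to verify.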