Fork me on GitHub
#sql
<
2023-03-01
>
teodorlu19:03:35

Hi! I'm using next.jdbc to create an SQLite database based on a JSON file. I*ve got something working, but I suspect that it can be improved. Specifically, I'm wondering: 1. Is it possible to avoid having to name each field three times? (first in SQL create table query, then when extracting from json, then when in the SQL insert query). 2. Will running one sql insert per item in a transaction give me decent performance? Are there other approaches suitable for processing large datasets quickly? Code:

(with-open [conn (jdbc/get-connection {:dbtype "sqlite" :dbname "mydb.sqlite"})]
  (jdbc/with-transaction [tx conn]
    (jdbc/execute! tx [(str "CREATE TABLE IF NOT EXISTS persons"
                            "  (name, age)")])
    (doseq [p [{:name "Alice" :age 33} {:name "Bob" :age 25}]] ; inline data here really comes from JSON
      (jdbc/execute! tx [(str "INSERT INTO persons"
                              "  (name, age)"
                              " VALUES"
                              "  (?, ?)")
                         (:name p) (:age p)]))))
Thanks!

2
1
lukasz19:03:13

SQLite accepts multiple data formats when piping into it - you might bypass Clojure entirely

👍 1
Cam Saul19:03:16

you can do multiple rows in a single statement like (?, ?), (?, ?), ... . That will be a lot faster than doing one statement per row

💯 1
Cam Saul19:03:45

yeah, I'd guess CSV would be the fastest

👍 1
lukasz19:03:21

if you want to go the next.jdbc route - next.jdbc.sql has helpers for formatting data into multi-insert IIRC

💯 1
teodorlu19:03:36

Thanks - these are great ideas! Looks like next.jdbc.sql/insert-multi! is exactly what I was looking for to reduce boilerplate. And I suspect that bypassing the JDBC driver like you suggest might be the a way to get even better performance. 🙌

👍 1
isak19:03:39

Do you have the json1 https://www.sqlite.org/json1.html#jeach enabled? If so I'd try by using json_each. E.g., something like (not tested):

insert into persons(name, age)
select a.name, a.age
from json_each(?) a

👍 1
teodorlu19:03:01

Good point - I could use native JSON support too. Part of why I'm exploring SQLite is to be able to use tools like https://sqlitebrowser.org/ for exploring the data. In that case, I get a nice table view for free.

👍 1
teodorlu20:03:26

In case other people are curious - usage examples for next.jdbc.sql/insert-multi! can be found here: https://cljdoc.org/d/com.github.seancorfield/next.jdbc/1.3.847/doc/getting-started/friendly-sql-functions#insert-multi

(sql/insert-multi! ds :address
  [{:name "Stella", :email "[email protected]"}
   {:name "Waldo", :email "[email protected]"}
   {:name "Aunt Sally", :email "[email protected]"}])

lukasz20:03:37

clojure.pprint/print-table is your friend 😉 Jokes aside, I love sqlite for this sort of stuff - like merging CSV exports from different tools (Stripe, Metabase etc) and generating reports from them on the fly, where it would be otherwise hard/annoying to get the data into a single DB.

👍 1