#clojure-europe
2019-05-22
jasonbell 05:05:59

Very much depends on the DB @thomas

thomas 06:05:23

I found out two things last night... 1) clojure.jdbc doesn't seem to use connection pooling by default, so that is something I need to fix first; and 2) then I can do an insert-multi! instead of loads of insert!s.
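
A minimal sketch of both points, assuming clojure.java.jdbc with the hikari-cp wrapper for pooling; the table and column names here are made up for illustration:

(require '[clojure.java.jdbc :as jdbc]
         '[hikari-cp.core :as hikari])

;; build one pooled datasource and reuse it as the db-spec,
;; rather than opening a fresh connection for every insert!
(def db
  {:datasource (hikari/make-datasource {:adapter       "postgresql"
                                        :server-name   "localhost"
                                        :database-name "mydb"
                                        :username      "me"
                                        :password      "secret"})})

;; one multi-row insert instead of many single insert! calls
(jdbc/insert-multi! db :events
                    [:id :payload]
                    [[1 "a"] [2 "b"] [3 "c"]])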

jasonbell 06:05:14

How many records are you trying to insert?

thomas 07:05:27

I am trying to use with-db-connection but it isn't clear to me what the conn is in the example. :thinking_face:
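
For reference, conn in the clojure.java.jdbc example is just the binding that with-db-connection introduces: it opens one connection from the db-spec and every call in the body reuses it. A small sketch along those lines (db and the table/columns are the hypothetical ones from the previous snippet):

(jdbc/with-db-connection [conn db]
  ;; every statement in the body runs over the same open connection
  (jdbc/insert-multi! conn :events
                      [:id :payload]
                      [[1 "a"] [2 "b"]]))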

thomas 07:05:38

I got it I think...

thomas 07:05:32

yes that worked... with a tiny amount of test data it went from 13 seconds to 5 seconds

jasonbell 07:05:55

If that’s not working out @thomas then it might be worth a look at the tuple list params in HugSQL; they work well. Depending on the source data (files, S3, or another db SELECT), Embulk is worth a look too as it handles failures better.
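
Roughly, a HugSQL tuple list parameter (:t*, a.k.a. :tuple*) looks like the sketch below; the SQL file path, table and keys are assumptions for illustration:

-- resources/sql/events.sql
-- :name insert-events! :! :n
insert into events (id, payload)
values :t*:events

(require '[hugsql.core :as hugsql])

;; generates insert-events! from the SQL file on the classpath
(hugsql/def-db-fns "sql/events.sql")

(insert-events! db {:events [[1 "a"] [2 "b"] [3 "c"]]})

This expands into a single INSERT ... VALUES (...),(...),(...) statement.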

jasonbell 07:05:23

Embulk isn’t Clojure though, it’s a standalone CLI application with YAML config.

thomas 07:05:13

I am looking at insert-multi! at the moment... but partition is splitting my seq into lists... and not vectors...

thomas 07:05:47

I think I have that one solved as well...

jasonbell 07:05:20

user=> (into [] (map #(vec %) (partition 2 x)))
[[1 2] [3 4] [5 6] [7 8]]

jasonbell 07:05:31

Off the top of my head but I’m sure there are better ways

jasonbell 07:05:07

user=> (mapv #(vec %) (partition 2 x))
[[1 2] [3 4] [5 6] [7 8]]

jasonbell 07:05:19

Yup just remembered mapv

mpenet 08:05:03

(into [] (partition-all 2) x) comes to mind

jasonbell 08:05:34

Oh partition-all is new to me, nice one.

Ben Hammond 09:05:27

https://www.hugsql.org/#param-tuple says

BATCH INSERTS: It should be noted that Tuple List Parameter support is only supported for SQL INSERT...VALUES (...),(...),(...) syntax. This is appropriate for small-ish multi-record inserts. However, this is different than large batch support. The underlying JDBC driver for your database has a limit to the size of the SQL and the number of allowed bind parameters. If you are doing large batch inserts, you should map or doseq over your HugSQL-generated insert function within a transaction.
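
So for something like the 600k-row case, the doseq-over-batches approach would look roughly like this (the batch size is arbitrary, and insert-events! is the hypothetical HugSQL-generated function from the earlier sketch):

;; rows is the full seq of [id payload] tuples to load
(jdbc/with-db-transaction [tx db]
  (doseq [batch (partition-all 1000 rows)]
    ;; each statement stays well under the driver's bind-parameter limit
    (insert-events! tx {:events batch})))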

Ben Hammond 09:05:50

do you actually need a transaction, or is it good enough to use a batch?

jasonbell 09:05:54

batches are fine, I’d just be very aware of heap memory bailing out with 600k rows, it’s easily done.

jasonbell 09:05:24

@thomas what’s the DB server?

thomas 09:05:52

it is working now... thank you all for your help.

thomas 09:05:36

I tried the into [] but that didn't quite do what I wanted in my ->>

thomas 09:05:45

and it is a postgres server btw