Fork me on GitHub
#sql
<
2021-05-28
>
practicalli-johnny12:05:07

For each row in a database table, I'd like to clean a specific string field in each row (using re-seq and some regex) and write each record back to the database, ideally in another table so I can compare the results. Would plan be a good approach or is this just a doseq approach again? If I understand correctly, I suspect plan may not be right as the connection would close after the query but before the insert. Happy to be corrected though 🙂 Or would I simply put the insert in the same reducing function that cleans the string field, after its has cleaned it? Time for some repl experiments 🙂 Thanks for any thoughts on this.

practicalli-johnny14:05:06

This feels a little hacky, but seems to work okay on a limited set of records in the database.

(reduce
   (fn [_ row]
     (sql/insert! db-spec :transactions_clean (assoc row :domain/field_name "result-of-clean-name-function")))
   0  ;; accumulator not used
   (jdbc/plan db-spec ["select * from datarows where interesting_rows = ?" 1234554321]))
Now to try it on the whole database 🙂 Doh! out of heap space when I try to do the whole database Oh, a doseq over the distinct customer names with the above query should do it (probably very slowly)

seancorfield15:05:26

Hmm, I’m a bit surprised at the OoM — maybe the select isn’t streaming? (how to do that is DB-dependent but there are examples for a few DBs in the docs)

seancorfield15:05:47

If you have a reduce where the accumulator is not used, you can always use run! instead:

(run! #(sql/insert! db-spec :transaction_clean (assoc % :domain/field_name (clean-up-row %))) (jdbc/plan db-spec ...))

seancorfield15:05:57

run! is a reduce under the hood.

practicalli-johnny16:05:33

Ah good point about run!, thanks The db is your favourite, postgres. So I assume the select not streaming is a postgres thing. I have a workable solution for now. Thanks for the advice.

seancorfield16:05:27

Streaming for PostgreSQL is specifically documented for next.jdbc (see Tips &amp; Tricks section — I think? It’s definitely in there somewhere).

👍 3
Michaël Salihi14:05:32

This is exactly the kind of advice I was expecting!

3
dpsutton19:05:19

how can i achieve this order-by clause with honeysql? ORDER BY date_login IS NOT NULL, date_login DESC

dpsutton19:05:08

i'm needing to control where my dates go in an order in mysql

seancorfield19:05:22

Ah, I was wondering when someone was going to request that 🙂

seancorfield19:05:20

I think you can do {:order-by [[:date_login :is-not-null] [:date_login :desc]]}

dpsutton19:05:28

ha. mysql doesn't like :nulls last and i can't seem to coax it otherwise

seancorfield19:05:35

(if you’re on v2)

seancorfield19:05:01

dev=> (sql/format {:order-by [[:date_login :is-not-null] [:date_login :desc]]})
["ORDER BY date_login IS NOT NULL, date_login DESC"]

dpsutton19:05:01

"1.0.461" 😕

seancorfield19:05:48

You can always bring in v2 just for that one query — you can use v1 and v2 side-by-side since they have different namespaces (and they are at different maven coords).

dpsutton19:05:11

ah, a nice side effect of the security changes on clojars?

seancorfield19:05:16

No, a deliberate choice to help encourage people to migrate to v2 piecemeal 🙂

👍 3
dpsutton19:05:07

we have toucan involved so can't swap out for v2 here unfortunately (or easily i guess). toucan author is coworker and principal for years

seancorfield19:05:27

Well, that’s one of the big disadvantages of layering abstractions upon abstractions I guess… At least if they’re a coworker, you can petition them to migrate Toucan to HoneySQL v2 🙂

👍 3
seancorfield19:05:38

I’m already working with the author of Gungnir to see if I can provide extension points in v2 that will address his needs (Gungnir re-implements a bunch of internal HoneySQL stuff, unfortunately).

dpsutton19:05:07

haven't heard of that one