This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-05-28
Channels
- # aws (1)
- # babashka (113)
- # beginners (41)
- # cider (9)
- # cljdoc (15)
- # cljs-dev (8)
- # cljsrn (5)
- # clojure (113)
- # clojure-australia (4)
- # clojure-europe (40)
- # clojure-nl (8)
- # clojure-taiwan (1)
- # clojure-uk (60)
- # clojurescript (59)
- # code-reviews (2)
- # conjure (1)
- # datahike (12)
- # events (1)
- # jobs (1)
- # kaocha (4)
- # lsp (24)
- # luminus (19)
- # malli (4)
- # off-topic (28)
- # pathom (10)
- # philosophy (1)
- # portal (6)
- # re-frame (3)
- # reagent (9)
- # remote-jobs (1)
- # shadow-cljs (31)
- # sql (24)
- # tools-deps (6)
- # xtdb (14)
For each row in a database table, I'd like to clean a specific string field in each row (using re-seq and some regex) and write each record back to the database, ideally in another table so I can compare the results.
Would plan
be a good approach or is this just a doseq approach again?
If I understand correctly, I suspect plan may not be right as the connection would close after the query but before the insert. Happy to be corrected though 🙂
Or would I simply put the insert in the same reducing function that cleans the string field, after its has cleaned it?
Time for some repl experiments 🙂
Thanks for any thoughts on this.
This feels a little hacky, but seems to work okay on a limited set of records in the database.
(reduce
(fn [_ row]
(sql/insert! db-spec :transactions_clean (assoc row :domain/field_name "result-of-clean-name-function")))
0 ;; accumulator not used
(jdbc/plan db-spec ["select * from datarows where interesting_rows = ?" 1234554321]))
Now to try it on the whole database 🙂
Doh! out of heap space when I try to do the whole database
Oh, a doseq over the distinct customer names with the above query should do it (probably very slowly)Hmm, I’m a bit surprised at the OoM — maybe the select
isn’t streaming? (how to do that is DB-dependent but there are examples for a few DBs in the docs)
If you have a reduce
where the accumulator is not used, you can always use run!
instead:
(run! #(sql/insert! db-spec :transaction_clean (assoc % :domain/field_name (clean-up-row %))) (jdbc/plan db-spec ...))
run!
is a reduce
under the hood.
Ah good point about run!, thanks The db is your favourite, postgres. So I assume the select not streaming is a postgres thing. I have a workable solution for now. Thanks for the advice.
Streaming for PostgreSQL is specifically documented for next.jdbc
(see Tips & Tricks section — I think? It’s definitely in there somewhere).
Thanks @seancorfield for the cond->
tip!
That improve readability so much IMO. 👍
https://github.com/prestancedesign/pingcrm-clojure/blob/main/src/pingcrm/models/users.clj#L16-L35
how can i achieve this order-by clause with honeysql?
ORDER BY date_login IS NOT NULL, date_login DESC
Ah, I was wondering when someone was going to request that 🙂
I think you can do {:order-by [[:date_login :is-not-null] [:date_login :desc]]}
(if you’re on v2)
dev=> (sql/format {:order-by [[:date_login :is-not-null] [:date_login :desc]]})
["ORDER BY date_login IS NOT NULL, date_login DESC"]
You can always bring in v2 just for that one query — you can use v1 and v2 side-by-side since they have different namespaces (and they are at different maven coords).
No, a deliberate choice to help encourage people to migrate to v2 piecemeal 🙂
we have toucan involved so can't swap out for v2 here unfortunately (or easily i guess). toucan author is coworker and principal for years
Well, that’s one of the big disadvantages of layering abstractions upon abstractions I guess… At least if they’re a coworker, you can petition them to migrate Toucan to HoneySQL v2 🙂
I’m already working with the author of Gungnir to see if I can provide extension points in v2 that will address his needs (Gungnir re-implements a bunch of internal HoneySQL stuff, unfortunately).