#sql
2018-09-14
seancorfield00:09:23

Sounds like you want reducible-query? @mattsfrey

seancorfield00:09:40

(built into clojure.java.jdbc)

mattsfrey00:09:45

ah this looks like a good lead, thanks!

seancorfield00:09:12

In particular, pay attention to the "Additional Options" section because getting databases to actually fetch results in batches is sometimes non-trivial.

seancorfield00:09:48

The JDBC world has changed a lot in Clojure in the last few years.

mattsfrey00:09:50

auto-commit? should be false right?

seancorfield00:09:55

Yeah, if your DB needs that.

seancorfield00:09:18

And you may need :result-type :forward-only and probably :fetch-size (set according to however your DB wants it).
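
Putting those options together, a sketch (db-spec and the query here are placeholders, and 1000 is just an example fetch size):

(require '[clojure.java.jdbc :as jdbc])

(jdbc/reducible-query db-spec
                      ["SELECT * FROM events"]      ; hypothetical query
                      {:auto-commit? false          ; PostgreSQL needs this to stream
                       :result-type  :forward-only
                       :fetch-size   1000})         ; rows per round-trip; tune for your DB
;; returns a reducible; nothing runs until it is reduced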

mattsfrey00:09:42

ah I see, thanks

mattsfrey01:09:10

am I correct to make the assumption that

(jdbc/query db-spec expiration-query
            {:row-fn handle-row})
will lazily query over the whole table one row at a time?

seancorfield01:09:55

It depends what you mean by "lazily" there...

seancorfield01:09:43

jdbc/query realizes the entire result, which is independent of whether the database fetches "lazily" (in batches/chunks).

seancorfield01:09:50

In your code above, it would return a (fully-realized) sequence of all the results from calling handle-row on each row fetched from the database.

seancorfield01:09:44

That's the main difference with reducible-query -- it doesn't actually do anything until you reduce it somehow, and you can control what happens with the result accumulation (so you can process a row and essentially "throw away" the result).
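
As a sketch, the query above could become the following, where handle-row runs only for its side effects and each result is discarded, so nothing accumulates in memory (the option values are examples):

(run! handle-row
      (jdbc/reducible-query db-spec expiration-query
                            {:auto-commit? false
                             :result-type  :forward-only
                             :fetch-size   1000}))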

seancorfield01:09:30

The "lazy" in the docs is intended to mean you can process an arbitrarily large result set without bringing it all into memory.

mattsfrey15:09:47

yes that last part is what I mean, I am really just trying to scan over a large db table without loading the entire result set into memory

mpenet15:09:31

you need to use a cursor and do that manually (at least with pg), otherwise the jdbc driver will just send all the results over the wire even if you do not consume them (lazily or not)

mpenet15:09:56

declare cursor foo for select * from bar, then fetch 1000 from foo for each batch, and close foo at the end. You can turn that into a lazy seq with some effort.
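
In java.jdbc terms that might look something like this sketch, using the foo/bar names above (PostgreSQL cursors only exist inside a transaction):

(jdbc/with-db-transaction [tx db-spec]
  (jdbc/execute! tx ["DECLARE foo CURSOR FOR SELECT * FROM bar"])
  (try
    (loop []
      (let [batch (jdbc/query tx ["FETCH 1000 FROM foo"])]
        (when (seq batch)
          (run! handle-row batch)  ; process up to 1000 rows per round-trip
          (recur))))
    (finally
      (jdbc/execute! tx ["CLOSE foo"]))))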

mpenet15:09:44

not sure reducible-query helps with that at all, at least with pg, but I am not familiar with it

seancorfield15:09:01

@mpenet No, you do not need to manually use a cursor for this. Setting auto-commit false on the connection, setting result-type forward-only and a fetch-size is all you should need. Folks have actually verified that works on Postgres with java.jdbc BTW.

mpenet15:09:25

ah, that's new to me

mpenet15:09:09

is it documented somewhere?

seancorfield15:09:10

(I don't use Postgres -- I have to rely on others to test & verify this stuff for me -- but all that stuff in the docs, and the underlying features I added to java.jdbc, is specifically about getting the DB to stream chunks of a result set)

seancorfield15:09:50

If you look through the closed JIRA tickets for java.jdbc you should see where folks were testing and verifying the lazy processing behavior...

mpenet16:09:11

yeah, trying to find something in jira is quite hard 😛

mpenet16:09:08

damnit, can't even link to a search

mpenet16:09:00

anyway... I can't find it. I'll take your word for it. I just remember I had to do the cursor dance (quite some time ago tho)

seancorfield16:09:54

Hmm, looks like the main discussion happened somewhere else and it isn't linked in the tickets. The :auto-commit? false option was definitely part of the fallout from that and was key to making Postgres stream result sets, apparently...

mpenet16:09:11

I am fairly sure I had tried changing auto-commit back then.

seancorfield16:09:41

The auto-commit option on opening a connection is fairly new to java.jdbc.

mpenet16:09:20

I was using hikaricp, so probably setting that there
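
With HikariCP that would be set on the pool config rather than via java.jdbc options, something like this sketch (the JDBC URL is a placeholder):

(import '(com.zaxxer.hikari HikariConfig HikariDataSource))

(def db-spec
  {:datasource (HikariDataSource.
                 (doto (HikariConfig.)
                   (.setJdbcUrl "jdbc:postgresql://localhost/mydb") ; placeholder
                   (.setAutoCommit false)))}) ; pooled connections start with auto-commit off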

mpenet16:09:43

sounds like documenting how to do that might be a good idea