Fork me on GitHub
#sql
<
2018-04-20
>
uwo15:04:29

we’re using java.jdbc to connect to a microsoft sql server. I’ve got a larger query that doesn’t seem to be respecting :fetch-size. any tips?

rymndhng16:04:21

@uwo i'm not familiar with sql server, however there's often a few other knobs to turn on postgres to get cursors working: (wrappg in transaction, setting :result-type :forward-only), do you have to the same for SQL Server?

uwo16:04:32

@rymndhng thanks for the response! I have tried with :result-type :forward-only. Do we need to declare a cursor in our sql query in order to leverage cursors, or does java.jdbc do that under the covers? Also, are there certain types of queries which would would force all results to be sent to the client? I don’t know if our query could be the culprit.

rymndhng16:04:24

it usually does it under the cover -- i wrote a blog post around testing this a while ago: https://rymndhng.github.io/blog/2018/02/24/clojure-jdbc-resultset-cursors/ you should be able to test if a cursor's used by running a query on a large set of data and measuring when the first row is returned to the client

uwo16:04:05

awesome, thanks! I’ll check out the post

uwo17:04:25

I stole your idea to run (new-measurement) as the :row-fn, but curiously I’m never even seeing it fire. Here’s the code I’m using (approximately).

uwo17:04:30

(let [[base-query & params] query
        _ (prn "AutoCommit?" (.getAutoCommit sql-conn))
        stmt (jdbc/prepare-statement sql-conn base-query {:fetch-size 10000
                                                          :row-fn (new-measurement)
                                                          :result-type :forward-only
                                                          :concurrency :read-only})]
    (jdbc/query sql-conn (concat [stmt] params))))

uwo17:04:59

with new-measurement just lifted from your blog post (thanks!)

uwo17:04:38

Sean clarified for me that I should be passing :row-fn to query not prep-statement. all good here

rymndhng22:04:24

cool, did it work? 😁

seancorfield16:04:14

@uwo You might also have to set auto-commit false on the connection you use for that query -- you have to do that for PostgreSQL, as I recall.

seancorfield16:04:17

(you do that by passing :auto-commit? false to whatever you're using to open that connection)

uwo16:04:25

@seancorfield ah, yeah. thanks. I did see that and tried to setAutoCommit false on my connection object. It didn’t have any effect. (I set fetch-size to 1 and then did (bounded-count 1 result-set), and it had to pull hundreds of MBs before evaluating

seancorfield16:04:07

JDBC drivers are funny things. For some definition of "funny" 😞

seancorfield16:04:37

I've learned more than I ever wanted to know about JDBC while maintaining clojure.java.jdbc 😐

😅 16
mpenet17:04:12

If i recall fetch-size dicts how paging will work under the hood but it wont make it "lazy". With pg you can declare a cursor in a tx and use fetch for each command with a size of data to be returned

mpenet17:04:42

Then you can feed your "pages" of rows to a fn/chan or whatever and exert backpressure how you like

uwo17:04:47

@seancorfield hmm. I threw some diagnostic code in row-fn, following rymndhng’s example. But I’m never seeing it called. Anything obvious I might be doing wrong? https://clojurians.slack.com/archives/C1Q164V29/p1524245310000177?thread_ts=1524242544.000061&amp;cid=C1Q164V29

uwo17:04:52

@mpenet Interesting, thanks. That answer is why I was wondering if i had to explicitly create the cursor in the query

mpenet17:04:25

I walked entire (large) tables efficiently like that. The other ways i always had pg send me data continuously, even if i just consumed first

seancorfield17:04:16

That SO Q&A is very, very out of date regarding java.jdbc's features.

seancorfield17:04:42

Back then -- in 0.3.0 -- it didn't have support for cursors/lazy reading etc.

seancorfield17:04:13

@uwo Just to check, you're on the latest version of java.jdbc?

uwo17:04:34

bumping to 0.7.5 now. I have been using 0.7.3

seancorfield17:04:58

0.7.3 should be new enough for this discussion (but there have been performance improvements since then)

mpenet17:04:27

I tried recently with auto-commit false and fetch-size on pg, still had to rely on that trick

seancorfield17:04:18

There's definitely a "trick" to making it work. I thought you had to explicitly set :fetch-size 0 but maybe I don't recall correctly.

uwo17:04:54

same thing on 0.7.5 (just thought I’d try)

uwo17:04:07

curious why the row-fn isn’t firing though. I must be doing something silly

seancorfield17:04:44

The :row-fn should be on the query, not the prepare-statement call.

seancorfield17:04:31

(you should just be able to pass all those options to query and let it create the PreparedStatement for you)

🙏 4
seancorfield17:04:33

I did a lot of work ensuring that options flowed all the way through all the calls -- if you use a high-level function and pass in the options there. You should almost never need to call prepare-statement yourself these days.

seancorfield17:04:46

You may also find that reducible-query works better for this (assuming you plan to process all the rows returned, but you want them processed in smaller chunks).

uwo17:04:52

we do plan to use all rows. I think the only reason we’re not using reducible-query at the moment was I had some difficulty loading it onto a chan with to-chan/onto-chan. could be an easy fix. I’ll look into that again

seancorfield17:04:51

OK. If that ends up looking like a java.jdbc bug, please open a JIRA ticket with repro details. I haven't tried using it with core.async (although I recall having a hard time figuring out how to get a reducible collection into a channel when I was playing with core.async a while back).

uwo18:04:03

that’s the result of a diagnostic :row-fn. It still requires transferring every row in order to take 10 or bounded-count 1. I’ll let you know if I have a break through 🙂

uwo18:04:49

for added context that’s with opts :fetch-size tried 10000, 1 and 0 :cursor :hold and nil :result-type :forward-only :concurrency :read-only

uwo18:04:07

maybe I completely misunderstood the rules for lazily causing the result set to fetched as you go along. I’ve been consuming the return value of query. I’m just now realizing that this section says you need to use :row-fn 😊 (http://clojure-doc.org/articles/ecosystem/java_jdbc/using_sql.html (“Processing each row lazily”)

seancorfield18:04:32

Or use reducible-query? 🙂

uwo18:04:21

interestingly I just got the same result with it reducible-query

uwo18:04:00

like (into [] (take 1) (reducible-query etc

uwo18:04:33

I blame jdts:1.3.1 lol 🤷

seancorfield18:04:28

Have you tried the official MS JDBC driver?

uwo18:04:51

seems like a good idea at this point!

uwo19:04:01

darn. tried with https://www.microsoft.com/en-US/download/details.aspx?id=11774 and (once I figured out the connection string) still same behavior (including reducible query)

seancorfield19:04:29

@uwo The MS driver is available as a regular Maven dependency, and you don't need a connection string if you use {:dbtype "mssql" :dbame "mydb" :user "..." :password "..." ...} for your db spec.

seancorfield19:04:13

See the deps.edn example in the java.jdbc for the dependency

;; Note: Assumes Java 8; there's a .jre7 version as well
                         com.microsoft.sqlserver/mssql-jdbc {:mvn/version "6.2.2.jre8"}}}

seancorfield19:04:56

And the mssql-db example in the java.jdbc test file:

(def mssql-db {:dbtype   "mssql"
               :dbname   mssql-dbname
               :host     mssql-host
               :port     mssql-port
               :user     mssql-user
               :password mssql-pass})

uwo19:04:15

thanks! we had been using that artifact to connect to our mssql backed datomic instance. so I just went with what I had on hand. Also I wasn’t sure how to specify an instanceName alongside the dbname with the map spec

uwo19:04:58

I’ll certainly try with the driver available on Maven. thanks

seancorfield19:04:15

By "instanceName" are you referring to the \... portion of the string? If so, you can specify it as part of the :host:

(def mssql-host   (or (System/getenv "TEST_MSSQL_HOST") "127.0.0.1\\SQLEXPRESS"))

uwo19:04:38

ah yes! that’s the one! so \\instancename will suffice, then