Fork me on GitHub
#sql
<
2018-02-28
>
joost-diepenmaat15:02:15

hey thatโ€™s pretty cool

Will20:02:13

I have this query

(def find-checks
  "SELECT * FROM jnl WHERE sourcebid = 'CH' AND amount IN (?) AND document_no IN (?)")
(j/query dp-spec [queries/find-checks (mapv :check_amount icchecks) (mapv :check_no icchecks)])
When calling the query Iโ€™m getting an invalid datatype conversion Is it possible to pass in collections as query parameters?

seancorfield20:02:08

Each ? is a single argument substitution. You can't just pass a collection for it, you need to expand it to a comma-separated list in the SQL.

seancorfield20:02:38

So you'll have to build the SQL dynamically, based on the length of icchecks.

seancorfield20:02:40

(defn find-checks [size]
  (str "SELECT * FROM jnl WHERE sourcebid = 'CH' AND amount IN ("
      (clojure.string/join "," (repeat "?" size))
      ") AND document_no IN ("
      (clojure.string/join "," (repeat "?" size))
      ")"))
(j/query db-spec (into [(queries/find-checks (count icchecks))] (concat (mapv :check_amount icchecks) (mapv :check_no icchecks))))

seancorfield20:02:20

Note that might match more things than you think.

seancorfield20:02:58

Every check in the range will be a possible match. I don't think that's what you really want?

Will20:02:45

I want to do this because I'm doing a bulk select for all the rows I need at once rather than doing a query one at a time for 10,000 rows. Is there a better approach I should be using?

minikomi05:03:25

HugSQL has "value lists" you can use to generate these kind of queries https://www.hugsql.org/#param-value-list You would create your sql something like:

--:name find-checks
SELECT * FROM jnl WHERE sourcebid = 'CH' AND amount IN (:v:amounts) AND document_no IN (:v:document-nos) )

minikomi05:03:38

And call it like (find-checks {:amounts (mapv ...) :document-nos (mapv ... )}

seancorfield20:02:51

Are the check_no values not globally unique?

seancorfield20:02:46

The SQL you have will match all rows that have a check # in that list and that also have an amount in that list -- if you have check 1234 for $5 and check 3214 for $10 and check 1234 for $10 and you try to match the first two, you'll get the third one as well because 1234 is in the set 1234, 3214 and its amount is in the set $5, $10

Will20:02:47

That's right

Will20:02:35

I would probably have to select individually then huh ๐Ÿ˜ข

Will21:02:31

Is there a way to select where the check number has to match the corresponding amount?

seancorfield21:02:14

My SQL isn't good enough to figure that out ๐Ÿ™‚

tanzoniteblack21:02:57

if you did do it in bulk, it would look like (amount = ? AND document_no = ?) OR (amount = ? AND document_no = ?) OR ...; You could generate this kind of query fairly easily using https://github.com/jkk/honeysql

tanzoniteblack21:02:03

you'll probably want to batch it manually, i.e. do (partition-all 500 amount_docs) and mapcat the results together. Otherwise you risk running into the query size / parameter limit of whatever SQL server you're using

tanzoniteblack21:02:21

at least, that's what I'm reading this conversation as. If I missed something and am terribly, terribly wrong, let me know ๐Ÿ™‚

seancorfield21:02:25

@tanzoniteblack Agreed but I think a SELECT with 10,000 ORs in it is going to have terrible performance...

tanzoniteblack21:02:23

(sql/build :select :*
           :from :jnl
           :where [:and [:= :sourcebid "CH"]
                   (into [:or]
                         (mapv (fn [amount-document]
                                 [:and
                                  [:= :amount (:amount amount-document)]
                                  [:= :document_no (:document amount-document)]])
                               the-documents))])
is approximately what the honeysql would look like, I think

seancorfield21:02:45

(as for the query size / parameter limit -- I know MySQL is "happy" with 10,000 parameters... not sure how it would be with 20,000 which is what this query would become with 10,000 amount/check# pairs)

tanzoniteblack21:02:46

and @seancorfield you'd be surprised. Postgres at least is pretty good at making queries like that efficient; especially if you have a multi-column index

seancorfield21:02:24

Good point... Yes, if you have a composite index on check# + amount it could be fine.

tanzoniteblack21:02:46

but yeah, parameter limit is important to keep in mind. And that honeysql isn't intelligent enough to do that for you (rightfully so?)

tanzoniteblack21:02:56

I wouldn't put this in a production performance critical path though, it's definitely not going to be performant enough for that for my standards ๐Ÿ˜„

Will21:02:19

What would be more efficient? select with 10,000 parameters or calling the select 10,000 times

Will21:02:15

I may need to just look for a different approach

tanzoniteblack21:02:15

most SQL servers won't allow you to do a select with 10,000 parameters; however doing 20 selects of 500 parameters would probably be faster than calling select 10,000 times. however this all depends on how you're doing connection pooling (or doing this all in a single connection), what server you're using, how the server is configured, what kind of a delay do you have from your SQL server to your client running this query, etc. Personally? If you care about speed, then the fewer selects is probably going to be more efficient, but you can probably test that pretty easily. If you're ok with sacrificing a bit of speed, the single select and not involving honeysql would probably be simpler and less error prone

tanzoniteblack21:02:15

by "error prone", I mean writing the code to run the queries is more complicated and therefore more likely to contain a bug; but you can with enough effort get it to work right

benalbrecht22:02:03

if you use postgres (no idea about other engines), you can indeed use an array as parameter (or even a json string). so yes, it is very much possible to use collections ๐Ÿ˜€ iirc, arrays are expected to be java arrays, so you'd have to call into-array on your params and adapt your query to use amount = ANY(?)