Fork me on GitHub
#honeysql
<
2022-03-26
>
seancorfield04:03:47

I'm working on addressing https://github.com/seancorfield/honeysql/issues/398 by adding honey.sql.pg-json as a new namespace that, when required, will register the PostgreSQL JSON/JSONB operators and also provide symbolic names for the four "unwritable" operators that include @ (`@>` - at>, <@ - <at, @? - at?, and @@ - atat). Question: would it be more intuitive for all JSON/JSONB operators to have a symbolic name (even for those that can be written as a literal Clojure keyword or symbol) and, if so, what should those names be? The same as the underlying op keyword perhaps (does that make sense for ->, ->>, ||, and - which are all part of clojure.core already)?

seancorfield05:03:30

As a reminder, those operators are: ->, ->>, #>, #>>, @>, <@, ?, ?|, ?&, ||, -, #-, @?, and @@.

seancorfield05:03:21

(note that || and - are already built-in operators for HoneySQL)

dharrigan10:03:32

Personally, I think having symbolic names for only those operators that are unwriteable and then having things like pg_json/-> for those that are writeable is fine. I know I would only use (or favour really) writing pg_json/-> over a symbolic name each time. Having documentation to show just the cases were a symbolic name is necessary would cover having the mix of both.

seancorfield12:03:45

@dharrigan that would be a symbolic name - since I'd have to define -> in the pg-json namespace for that to work.

seancorfield12:03:22

You'd say :-> directly if there was no symbolic name.

seancorfield12:03:44

The unwritable operators have to have symbolic names (since you can't write them directly).

dharrigan17:03:35

Ah I see, well, I do see merit then in the symbolic name, on the other hand, learning to use :-> is fine also

seancorfield18:03:48

Yeah, maybe it's just better to be consistent and provide vars for all of them, even though only the four at ops really need a name...

seancorfield21:03:30

#>, #>>, and #- also need names -- they can be written just fine as keywords but not as symbols -- so hash instead of # 🙂

dharrigan21:03:46

glad you're using hash, not pound 😄

seancorfield21:03:16

(and this makes me realize that these operators can't be used in the datalog-style DSL for HoneySQL, i.e., as quoted symbolic forms: '(#> a b) is not valid:

dev=> (sql/format '{select ( ( (#> a b) c ) )})
Syntax error reading source at (REPL:1:31).
No reader function for tag >
;; you can fallback to a keyword:
dev=> (sql/format '{select ( ( (:#> a b) c ) )})
["SELECT a #> b AS c"]

dharrigan21:03:09

Ah yes, the reader symbol

seancorfield21:03:48

(you cannot do the same thing with @ since that is turned into a (deref ..) call by the Clojure reader -- so there are a lot of weird corner cases here)

dharrigan21:03:34

btw, I was playing around with over yesterday and I have an example of how to use it when the select columns are dynamic (coming in from a search, i.e., only return such and such columns), does this look about right? If so, then perhaps an inclusion into the docs (as the examples only show how to use over when the columsn are known in advance):

dharrigan21:03:38

(defn search
  [query-parameters]
  (let [{:keys [field1 field2 field3 fields sort-field sort-direction page limit] :or {page 0 limit 100 sort-field :field1 sort-direction :desc}} query-parameters]
    (-> {:select (merge (mapv #(->kebab-case (keyword %)) (or fields [:*]))
                        [[:over [[:count :*] {} :total-rows]]])}
        (from :service)
        (where
         (when field1 [:= :field1 field1])
         (when field2 [:= :field2 field2])
         (when field3 [:= :field3 field3]))
        (order-by [sort-field sort-direction])
        (offset (* limit page))
        (helpers/limit limit)
        sql/format)))

dharrigan21:03:54

natch, ->kebab-case is from csk.

seancorfield21:03:18

I guess that merge is honey.sql.helpers/merge -- kind of weird to see merge on vectors 🙂

dharrigan21:03:31

it's the standard clojure merge

seancorfield21:03:22

That's... an odd usage then...

dharrigan21:03:25

I tried firstly to use the honeysql helpers, i.e, (select), but that just caused it to do something like select field1 as field2 and so on, since the mapv returns a vector, i.e., [:field1 :field2 :field3]`. So had to try using the {} variant

dharrigan21:03:06

Happy to understand if there is a better way if the fields are unknown coming in

dharrigan21:03:33

(obs, it does work, as the sql it spits out is valid)

seancorfield21:03:38

This doesn't seem right:

dev=> (let [fields [:a :b :c] ->kebab-case identity]
 #_=>   {:select (merge (mapv #(->kebab-case (keyword %)) (or fields [:*]))
 #_=>                         [[:over [[:count :*] {} :total-rows]]])})
{:select [:a :b :c [[:over [[:count :*] {} :total-rows]]]]}

seancorfield21:03:05

Maybe I'm just misunderstanding what you're trying to do?

seancorfield21:03:44

merge is meant for hash maps -- that it works on vectors is an accident of its implementation using conj

dharrigan21:03:58

It's a way of returning the total number of rows in a table, whilst also only being interested in a subset of the rows in the actualy query

seancorfield21:03:19

Forget the SQL -- I'm talking about the Clojure data structure.

dharrigan21:03:24

oh sorry 😄

dharrigan21:03:15

It looks quite like the example in the docs

dharrigan21:03:26

(sql/format {:select [:id
                             [[:over
                               [[:avg :salary]
                                {}
                                :Average]
                               [[:max :salary]
                                nil
                                :MaxSalary]]]]
                    :from [:employee]})

seancorfield21:03:13

Not relevant to variable column names though...

dev=> (let [fields [:a :b :c]]
 #_=>   (sql/format (apply select fields)))
["SELECT a, b, c"]
dev=> (let [fields [:a :b :c]]
 #_=>   (sql/format {:select fields}))
["SELECT a, b, c"]

seancorfield21:03:06

I'm confused about what you're trying to do -- the DSL is "just" a data structure so you can construct it however you want with literals, honey.sql.helpers calls, Clojure functions to manipulate data...

dharrigan21:03:44

Well, from the outside world coming in, someone may be doing ?fields=field_foo1,field_bar2 which ends up as {,,,,, :fields ["field_foo1", "field_bar2"]}, (just realised that having the kebab-case is unecessary), but anywhoo, I have to take those fields, could be N, and cconstruct the SQL from that but also end up with a select that included the {:over} clause as well, what I could get to work was {:select ...} above.

dharrigan21:03:15

So, I get this:

dharrigan21:03:19

(search {:fields ["name" "address"]})
;; ["SELECT name, address, COUNT(*) OVER () AS total_rows FROM service ORDER BY field1 DESC LIMIT ? OFFSET ?"
;;  100
;;  0]

dharrigan21:03:27

as an exmple

dharrigan22:03:08

Hmm, could use concat

dharrigan22:03:17

(-> {:select (concat (mapv keyword (or fields [:*])) [[:over [[:count :*] {} :total-rows]]])}

seancorfield22:03:21

Ah, OK, so this is all about taking a collection of strings and wanting to treat them as column names (keywords or symbols)?

dharrigan22:03:35

yes, that is right

dharrigan22:03:19

So, this looks better

dharrigan22:03:21

(defn search2
  [query-parameters]
  (let [{:keys [field1 field2 field3 fields sort-field sort-direction page limit] :or {page 0 limit 100 sort-field :field1 sort-direction :desc}} query-parameters]
    (-> {:select (concat (mapv keyword (or fields [:*])) [[:over [[:count :*] {} :total-rows]]])}
        (from :service)
        (where
         (when field1 [:= :field1 field1])
         (when field2 [:= :field2 field2])
         (when field3 [:= :field3 field3]))
        (order-by [sort-field sort-direction])
        (offset (* limit page))
        (helpers/limit limit)
        sql/format)))

dharrigan22:03:38

(search2 {:field1 "rocking" :fields ["field1" "field2"]})
;; ["SELECT field1, field2, over AS COUNT(*)  total_rows FROM service WHERE field1 = ? ORDER BY field1 DESC LIMIT ? OFFSET ?"
;;  "rocking"
;;  100
;;  0]

dharrigan22:03:23

Actually that's wrong now, the concat doesn't construct the vector to create the over correctly

seancorfield22:03:37

So now all I need to do is invoke your app with ?fields=some,SQL,injection,here and your database is toast 🙂

dharrigan22:03:25

I'm going to try that, looks fun!

seancorfield22:03:53

The simple and obvious injection is detected and blocked:

dev=> (let [fields ["foo; drop user"]]
 #_=>   (sql/format {:select (mapv keyword fields)}))
Execution error (ExceptionInfo) at honey.sql/format-entity (sql.cljc:206).
suspicious character found in entity: foo; drop user
but more sophisticated ones can be constructed fairly easily.

dharrigan22:03:34

Perhaps it's easier then to return all fields, i.e., select *, then filter after that?

dharrigan22:03:53

not sure, since it would be helpful for the db to only return fields that you're interested in, after all, it's good at that 🙂

dharrigan22:03:22

and if you don't know in advance which fields the client will want returned, interesting!

dharrigan22:03:38

cough*graphql*cough

seancorfield22:03:11

Outside the scope of HoneySQL but sanitizing requested field names into column names is important and would be the correct approach here.

dharrigan22:03:04

True, no need to discuss it here.

seancorfield22:03:01

(now I see what you're really asking -- it's something we do in code at work but we validate input against a known list of valid fields that can be exposed... after all, we don't want passwords and credit card info returned in a query where users can specify arbitrary columns to return)

dharrigan22:03:06

Yes, I was starting to ponder that too. I can accept an input of fields from the outside world, and strip away any fields that I don't allow and are not part of the table schema.

dharrigan22:03:37

A nice little problem to ponder on over the next few days. This is for a personal project of mine, so nothing work related 🙂

dharrigan22:03:11

Thanks Sean! I'm offski to bed now.

seancorfield22:03:28

dev=> (let [fields ["password from user union select 1 as password"]]
 #_=>   (sql/format {:select (mapv keyword fields) :from :table}))
["SELECT password from user union select 1 as password FROM table"]
🙂

seancorfield22:03:13

There's a ticket about detecting/blocking this sort of thing: #394

👍 1