honeysql 2021-03-16 | Slack Archive

Here is the latest v2 "puzzle": The SQL I'm trying to generate is:

LEAD(g) OVER (PARTITION BY fid ORDER BY m)

I've tried this:

(sql.helpers/over [[:lead :g]
                   (-> (sql.helpers/partition-by :fid) (order-by :m))])

which generates:

[[:over [[:lead :g] {:partition-by [:fid], :order-by [:m]}]]]

which sql/format chokes on

seancorfield01:03:40

Can you provide a bit more context? This works:

user=> (sql/format (-> (select (over [[:lead :g] (-> (h/partition-by :fid) (order-by :m))]))))
["SELECT LEAD(g) OVER (PARTITION BY fid ORDER BY m ASC)"]

seancorfield01:03:03

Here’s a more complex example from the docs:

user=> (sql/format (-> (select :id
  #_=>                                (over [[:avg :salary] (-> (h/partition-by :department) (order-by :designation)) :Average]
  #_=>                                      [[:max :salary] :w :MaxSalary]))
  #_=>                        (from :employee)
  #_=>                        (window :w (h/partition-by :department))))
["SELECT id, AVG(salary) OVER (PARTITION BY department ORDER BY designation ASC) AS Average, MAX(salary) OVER w AS MaxSalary FROM employee WINDOW w AS (PARTITION BY department)"]

dcj02:03:50

I definitely looked at the docs that I could find in https://github.com/seancorfield/honeysql/blob/v2/doc/clause-reference.md The broader context is that the snipped above is part of a with CTE, here is the raw version of it (that works):

[:E [:raw "(SELECT DISTINCT ON (fid) fid, id AS a, id + 1 AS b, ST_3DDistance(ST_MakeLine(g, LEAD(g) OVER (PARTITION BY fid ORDER BY m)), pca_raw) AS dist FROM D ORDER BY fid, 4)"]]

dcj02:03:16

Here is what I tried:

[:E (-> (sql.helpers/select-distinct-on :fid
                                        :fid
                                        [:id :a]
                                        [[:raw "id + 1"] :b]
                                        [[:ST_3DDistance [:ST_MakeLine :g
                                                                        (sql.helpers/over [[:lead :g]
                                                                                          (-> (sql.helpers/partition-by :fid)
                                                                                              (order-by :m))])] :pca_raw] :dist])
        (from :D)
        (order-by [:fid 4]))]

seancorfield03:03:38

[I was wrong: HoneySQL treats :over as special syntax so it should work in any expression]

seancorfield03:03:23

Oh, I see the issue @dcj :select-distinct-on expects the first argument to be a sequence of column names per the docs: "Similar to :select-distinct above but the first element in the sequence should be a sequence of columns for the DISTINCT ON clause"

seancorfield03:03:43

Also, the order-by should have a direction (`:asc` or :desc) instead of 4 -- can you explain what you're trying to do there?

seancorfield03:03:18

Maybe this is what you're aiming for:

dev=> (sql/format (-> (h/select-distinct-on [:fid]
 #_=>  #_=>                                         :fid
 #_=>  #_=>                                         [:id :a]
 #_=>  #_=>                                         [[:+ :id 1] :b]
 #_=>  #_=>                                         [[:ST_3DDistance [:ST_MakeLine :g
 #_=>  #_=>                                                                         (h/over [[:lead :g]
 #_=>  #_=>                                                                                  (-> (h/partition-by :fid)
 #_=>  #_=>                                                                                      (h/order-by :m))])] :pca_raw] :dist])
 #_=>  #_=>                 (h/from :D) (h/order-by :fid [[:inline 4]])))
["SELECT DISTINCT ON(fid) fid, id AS a, id + ? AS b, ST_3DDISTANCE(ST_MAKELINE(g, (LEAD(g) OVER (PARTITION BY fid ORDER BY m ASC))), pca_raw) AS dist FROM D ORDER BY fid ASC, 4 ASC" 1]

Not sure whether the id + 1 is better for you as an expression rather than raw like that?

dcj16:03:23

1. Thank you for the syntax fix! 2. Clearly I need to read the docs more carefully 3. Is there some way I can run my proposed query DSL through Spec, to veriify it is correct? 4. Thank you for [[:+ :id 1] :b] that is way better

dcj16:03:58

Also, the order-by should have a direction (:asc  or :desc) instead of 4 -- can you explain what you're trying to do there?

The short answer is that I can't explain this, the query I am working on was created by an extremely experienced PostGIS developer, and I don't understand how it works 😞 That being said, I checked the Postgres docs on order-by: https://www.postgresql.org/docs/9.5/queries-order.html Note this:

A sort_expression can also be the column label or number of an output column, as in:

    SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;

So, at this point, here is what I have:

(pprint tqe)

{:select-distinct-on
 [[:fid]
  :fid
  [:id :a]
  [[:raw "id + 1"] :b]
  [[:ST_3DDistance
    [:ST_MakeLine
     :g
     [[:over [[:lead :g] {:partition-by [:fid], :order-by [:m]}]]]]
    :pca_raw]
   :dist]],
 :from [:D],
 :order-by [[[:raw "fid, 4"]]]}

(println (sql/format tqe {:pretty true}))

[
SELECT DISTINCT ON(fid) fid, id AS a, id + 1 AS b, ST_3DDISTANCE(ST_MAKELINE(g, (LEAD(g) OVER (PARTITION BY fid ORDER BY m ASC))), pca_raw) AS dist
FROM D
ORDER BY fid, 4 ASC
]

I haven't tested this yet, need to head to a meeting, back in a couple hours

seancorfield17:03:39

I showed an order-by that would work for :fid, 4: (h/order-by :fid [[:inline 4]]) — you don’t need :raw here.

dcj19:03:37

Yes, this is working, thank you! Now, pne last giant blob of :raw to convert....

seancorfield17:03:13

Re: #3 — there’s an issue on GitHub for a Spec for the DSL but I’m not sure how tractable that really is since it would need to pretty much match the entire recursive descent “formatter”: SQL is extremely complex as a language and Spec isn’t really suited to describing language grammars.

dcj19:03:35

Could each helper fn check its args? Not saying this would enable checking everything.... I will try and give this some thought

seancorfield20:03:59

Not really: the helpers are nearly all generic — they mostly just assoc/merge arguments into the underlying data structure (they’re nearly all just calls to generic under the hood).

seancorfield20:03:41

And the problem is that the syntax of SQL is recursive: SELECT expressions can contain SELECT expressions in all sorts of complex ways.

seancorfield20:03:03

Any Spec that I end up writing is only going to catch the most egregious mistakes.

seancorfield20:03:26

There are a few clauses where the structure required is unique enough that Spec could catch some errors (`:select-distinct-on` is a particular case where it expects a sequence where the first element must be a sequence — but you already got an error from that… just not a very informative one).

seancorfield20:03:22

I can try to add more “specific error checking” in cases where, if you pass in bad data today you get a generic error (either an NPE or a CCE), but many of those would be hard to predict…

seancorfield20:03:33

#Also sent to the channel

There are also two distinct classes of “user error” here: one that blows up the formatter and one that formats successfully but produces bogus SQL. The former I can probably do something about, at least in terms of narrowing down where the error occurred, but the latter is pretty much impossible to detect.

seancorfield20:03:33

replied to a thread:Re: #3 — there’s an issue on GitHub for a Spec for the DSL but I’m not sure how tractable that really is since it would need to pretty much match the entire recursive descent “formatter”: SQL is extremely complex as a language and Spec isn’t really suited to describing language grammars.

dcj20:03:18

Understood! I'm thinking more about the "user error" case....

dcj20:03:49

Today's v2 puzzle, trying to convert this from :raw

FROM
    (SELECT fid, pca, dca FROM P) P
    INNER JOIN
    (SELECT fid, m, ST_3DDistance(g,pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, a FROM E)) A ON P.fid = A.fid
    INNER JOIN
    (SELECT fid, m, ST_3DDistance(g,pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, b FROM E)) B ON P.fid = B.fid

My approach to these tasks is to try and build "up" from an innermost. So in this case, I started trying to write:

(SELECT fid, m, ST_3DDistance(g,pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, b FROM E)) B

The part I can't figure out is WHERE (fid, id) IN (SELECT fid, b FROM E), specifically the IN part...

seancorfield20:03:43

user=> (sql/format {:where [:in [:composite :fid :id] {:select [:fid :b] :from :e}]})
["WHERE (fid, id) IN (SELECT fid, b FROM e)"]

seancorfield20:03:14

(assuming you want the data structure version, not the helper version)

seancorfield20:03:57

user=> (sql/format (where [:in (composite :fid :id) (-> (select :fid :b) (from :e))]))
["WHERE (fid, id) IN (SELECT fid, b FROM e)"]

dcj21:03:14

Nice! Thank you again!

seancorfield21:03:19

It’s probably worth pointing out that this is one of those places where the V2 support for symbols instead of keywords makes for a more readable piece of code

user=> (sql/format '{where [in [composite fid id] {select [fid b] from e}]})
["WHERE (fid, id) IN (SELECT fid, b FROM e)"]

seancorfield21:03:51

(that was not possible in V1 although symbols were accepted in a few places)

seancorfield21:03:17

Obviously you can’t use a quoted structure if you have Clojure variables in the structure but…

seancorfield21:03:40

(although you could use named parameters for that)

enn21:03:38

perfect use case for quasi-quote

dcj22:03:57

Recall my "today's V2 puzzle" above... Here is what I've got now:

(-> (select :fid :pca :dca)
    (from :P)
    (inner-join [(-> (select :fid :m [[:ST_3DDistance :g :pca] :dist])
                     (from :D)
                     (where [:in (composite :fid :id) (-> (select :fid :a) (from :E))]))
                 :A]
                [:= :P.fid :A.fid])
    (inner-join [(-> (select :fid :m [[:ST_3DDistance :g :pca] :dist])
                     (from :D)
                     (where [:in (composite :fid :id) (-> (select :fid :b) (from :E))]))
                 :B]
                [:= :P.fid :B.fid]))

Which results in this:

[
SELECT fid, pca, dca
FROM P
INNER JOIN (SELECT fid, m, ST_3DDISTANCE(g, pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, a FROM E)) AS A ON P.fid = A.fid INNER JOIN (SELECT fid, m, ST_3DDISTANCE(g, pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, b FROM E)) AS B ON P.fid = B.fid
]

This seems really close, but I can't figure out how to get the first SELECT to end with AS P It is supposed to be:

(SELECT fid, pca, dca FROM P) P

seancorfield23:03:59

You don’t need it to be aliased tho’ since the table name is already P, right?

seancorfield23:03:39

I’m not even sure how to get an alias on that without nesting it in a SELECT … SELECT * FROM (SELECT fid, pca, dca FROM P) AS P which would be (-> (select :*) (from [(-> (select :fid :pca :dca) (from :p)) :p]) ...) I think @dcj?

seancorfield23:03:27

user=> (sql/format (-> (select :*) (from [(-> (select :fid :pca :dca) (from :p)) :p])))
["SELECT * FROM (SELECT fid, pca, dca FROM p) AS p"]

seancorfield23:03:40

(SELECT .. FROM ..) AS alias INNER JOIN .. is certainly not legal SQL as far as MySQL is concerned.

dcj23:03:21

AFAICT, it is supposed to be:

(SELECT some-stuff FROM OTHER_P) P
INNER JOIN ...
INNER JOIN ...

AFAIK this is legal Postgres, my raw query above works fine

dcj23:03:48

Had to edit the above 3 times to get it right

seancorfield23:03:23

There’s no way to generate that with HoneySQL — the closest you’ll get is the nested select I showed above.

seancorfield23:03:46

I can’t even think how to write that syntax in the DSL.

seancorfield23:03:09

SELECT some-stuff FROM OTHER_P AS P would be “equivalent” though, wouldn’t it?

seancorfield23:03:21

(and that is expressible in the DSL)

dcj23:03:41

That makes me feel better since I spent the last couple hours trying and failing

dcj23:03:34

I also think your point that "you already have P, so why alias it again" bears some thought, seems like I ought to be able to make my query work without that

seancorfield23:03:05

I can’t see how it’s legal based on the Postgres docs either but, hey, maybe PG accepts syntax that isn’t documented?

dcj23:03:49

Don;t know if it matters, but all this ends up in a from

seancorfield23:03:49

The FROM can alias the result — but it would be the entire result, not just that select/`from` part.

seancorfield23:03:26

Oh, I see what you mean. So it is (-> .. (from [ (-> (select :fid :pca :dca) (from :other_p)) :p ]) (inner-join .. ) )

seancorfield23:03:00

because overall your query is select stuff from (select fid, pca, dca from p) as p ... right?

dcj23:03:46

yes the original P was part of a with/CTE above

dcj23:03:19

but the inner joins apply to the select/from within the larger from

seancorfield23:03:33

I’d have to see the whole raw query to be able to help further at this point.

2021-03-16

Channels