Fork me on GitHub
#sql
<
2023-01-06
>
Cam Saul19:01:10

To people who have switched from Honey SQL 1 to Honey SQL 2: Suppose you're generating a query like

SELECT count(*), (field_b + 2) AS my_expression
FROM my_table
GROUP BY (field_b + 2)
ORDER BY (field_b + 2) ASC
in most SQL databases that's perfecty fine. However, something like
SELECT count(*), (field_b + ?) AS my_expression
FROM my_table
GROUP BY (field_b + ?)
ORDER BY (field_b + ?) ASC

-- params: 2, 2, 2
often is not, because some databases cannot recognize parameterized expressions as being the same thing the way they can with numeric literals. When https://github.com/jkk/honeysql/pull/122 was merged in to Honey SQL 1 we were able to work around issues we ran in to relating to this by doing
(extend-protocol honeysq.format/ToSql
  Number
  (to-sql [x] (str x)))
But in Honey SQL 2, there's no equivalent protocol, right? How do people work around things like this? I suppose I can just walk the Honey SQL maps before calling sql/format and wrapping all numeric literals in :raw or something like that. But that's a lot less elegant than just extending a protocol to tell it how I'd like it to compile a given class. Also before anyone says it, I know that we could also generate the query a little differently, e.g. like
SELECT count(*), my_expression
FROM (
  SELECT (field_b + 2) AS my_expression
) t
GROUP BY t
ORDER BY t ASC
but it's not entirely in my control. At Metabase we have third-party driver authors writing drivers for different databases and I don't want to introduce breaking changes to their code if I can avoid it

seancorfield19:01:49

:inline is likely what you want here?

Cam Saul20:01:34

So, it's not a huge deal to make sure any numbers we're sticking in the Honey SQL we personally generate in this situation is wrapped in :inline, the problem is more that unless our 3rd party database driver authors change all their code to make sure they do it too, switching to Honey SQL 2 is going to introduce a breaking change for them here's a rough example. Suppose we have a method so different databases can tell us how to convert an integer column storing a unix timestamp in milliseconds to something like timestamp or timestamp with time zone so we can use date extract functions on it. One database might have a unix_timestamp_seconds() function, so the driver method would be something like

(defmethod unix-timestamp->timestamp :my-driver
  [expression]
  ;; unix_timestamp_seconds(expression / 1000)
  (hsql/call :unix_timestamp_seconds [:/ expression 1000])) 
They'll have to update that to
(defmethod unix-timestamp->timestamp :my-driver
  [expression]
  [:unix_timestamp_seconds [:/ expression [:inline 1000]]]) 
when we pull the trigger and switch to Honey SQL 2. I'm come to terms with hsql/call => plain vector -- it's a little easier to catch places where you forgot to do this since we can enforce Kondo :discouraged-var rules against hsql/call or just remove the dependency on Honey SQL 1 entirely -- but having to remember to do [:inline 1000] instead of 1000 is harder to remember I think, harder to enforce, and more error-prone. It's a problem because something like
SELECT unix_timestamp_seconds(my_field / ?) -- 1000
FROM table
would work, but
SELECT unix_timestamp_seconds(my_field / ?) -- 1000
FROM table
GROUP BY unix_timestamp_seconds(my_field / ?) -- 1000
probably would not -- it's not as immediately obvious that you made a mistake

Cam Saul20:01:00

I don't want to make people's lives hard and break things for people if I can avoid it

seancorfield20:01:07

You could always stick with 1.x...

seancorfield20:01:27

BTW, hsql/call still exists to make migration easier (it's just not documented - except in one change log entry)

Cam Saul20:01:10

👍 oh, that's great, I think I forgot about sql/call but now that you mention it I do remember seeing it in the differences documentation

seancorfield20:01:51

FYI, I won't be able to give very detailed answers to anything for a week or so. I'm in Hawaii for a friend's memorial so I don't have a computer - just my phone.

Cam Saul20:01:50

👍 Hawaii sounds nice! Thanks for taking the time to answer my questions anyway. I was just wondering if there was some easy way to work around this problem that wasn't obvious to me. I think maybe I'll just hook into where we set parameters for PreparedStatements and if someone tries to set a Number I'll have it error so it will catch things like this. Our test suite would basically be able to enforce usage of :inline for numbers then. We do want to make the switch over to Honey SQL 2, even tho it's going to be a little tricky. I'll figure out how to make it go as smooth as possible

Cam Saul20:01:32

There are some things we're doing in Metabase with Honey SQL that we could probably push upstream at some point -- for example we've added a type system to expressions so you can optionally include type information for a clause. This is important for BigQuery for example, because if you want to add an interval to a timestamp you need to use timestamp_add and if you want to add an interval to a datetime you need to use datetime_add , a date needs date_add, etc. So we can write code that knows and keeps track of a given expression type. Another thing we use it for is avoiding unnecessary casts -- we can say for example we need to wrap some expression in a cast(expr AS timestamp) unless it's already a timestamp or timestamp with time zone .