2024-02-07 sql | Clojure Slack Archive

k3nj1g 2024-02-07T08:01:37.118259Z

I need to process a large set of results. I use jdbc/plan for this. Like this:

(require '[cheshire.core :as json]
         '[next.jdbc :as jdbc]
         '[next.jdbc.result-set :as rs])

(with-open [db-conn (raw-connection ctx)]
  (run! (fn [row]
          ;; select-keys copies only the needed column out of the row
          (let [result (-> (select-keys row [:jsonb-field])
                           ;; parse the JSON text into Clojure data with keyword keys
                           (update :jsonb-field #(-> % str (json/decode keyword))))]
            (some-side-effect! result)))
        (jdbc/plan db-conn query-sql {:builder-fn rs/as-unqualified-lower-maps})))

As you can see, one column is json. So I parse it with cheshire. Is there a more convenient way to get the result set in the form of clojure structures for this kind of scenario?

igrishaev 2024-02-07T08:53:02.664189Z

Is it Postgres? If yes, you need to extend some protocols and types for automatic conversion. See this link: https://github.com/seancorfield/next-jdbc/blob/44b3cc206fe4635702331814da0860e63261e9ec/doc/tips-and-tricks.md?plain=1#L364
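
A minimal sketch of that protocol extension, following the approach in the linked Tips & Tricks page but adapted to cheshire (the library used in the question). It assumes the Postgres driver returns json/jsonb columns as org.postgresql.util.PGobject and that cheshire and the Postgres JDBC driver are on the classpath; the helper name <-pgobject is illustrative.

```clojure
(require '[cheshire.core :as json]
         '[next.jdbc.result-set :as rs])
(import '(org.postgresql.util PGobject))

(defn <-pgobject
  "Parses json/jsonb PGobject values into Clojure data; returns the raw
  string value for any other PGobject type."
  [^PGobject v]
  (let [type  (.getType v)
        value (.getValue v)]
    (if (#{"json" "jsonb"} type)
      (some-> value (json/decode keyword))
      value)))

;; Applies to every PGobject read through next.jdbc once this is loaded.
(extend-protocol rs/ReadableColumn
  PGobject
  (read-column-by-label [^PGobject v _label]
    (<-pgobject v))
  (read-column-by-index [^PGobject v _rsmeta _idx]
    (<-pgobject v)))
```

With an extension like this loaded, the JSON column should already arrive as Clojure data, so the per-row update step that calls json/decode would no longer be needed.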

k3nj1g 2024-02-07T10:13:05.799619Z

It's Postgres, right. Thanks for the tip, I'll take a look.

James Amberger 2024-02-07T17:49:27.328309Z

So—if I’m stuck on Oracle, the nice things next.jdbc does with Datafy/Nav basically don’t work or don’t work out of the box because of the unqualified keywords?

seancorfield 2024-02-07T18:07:55.093269Z

Oracle's JDBC drivers do not provide table name information (as noted in the Oracle-specific Tips & Tricks page). Nothing next.jdbc can do about that, unfortunately.

James Amberger 2024-02-07T18:11:25.607569Z

Yes

James Amberger 2024-02-07T18:12:15.259739Z

and, just reading up on datafy/nav basically for the first time, but it seems to me your implementations of nav would rely on that, wouldn’t they?

seancorfield 2024-02-07T18:14:37.874849Z

It depends on your FK naming conventions, not the qualifiers.

seancorfield 2024-02-07T18:15:10.731149Z

If table foo has bar_id, then datafy/`nav` assumes that will join to table bar column id.

seancorfield 2024-02-07T18:16:22.614339Z

If you need to map table/column pairs in order to define a custom :schema mapping, then no it won't work without qualifiers. But if you have been consistent in your column naming conventions, it should all work.
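
To make the default convention concrete, here is a rough sketch in plain Clojure. It is not next.jdbc's implementation; fk->target is a hypothetical helper that only illustrates the "bar_id joins to table bar, column id" rule described above, and it ignores any qualifier on the column keyword.

```clojure
(require '[clojure.string :as str])

(defn fk->target
  "Hypothetical helper: given a column keyword such as :bar_id, returns the
  table and column it would point to under the <table>_id convention, or
  nil when the column name does not follow that convention."
  [column]
  (let [col (name column)]                       ; drops any qualifier
    (when (and (> (count col) 3)
               (str/ends-with? (str/lower-case col) "_id"))
      {:table  (keyword (subs col 0 (- (count col) 3)))
       :column :id})))

(fk->target :bar_id)       ; a map pointing at table :bar, column :id
(fk->target :foo/bar_id)   ; same target, the qualifier plays no part
(fk->target :name)         ; nil, not a foreign key by this convention
```

Because only the column name is inspected, the rule still applies when the driver cannot supply table qualifiers.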

James Amberger 2024-02-07T18:16:42.867179Z

okay, nice. Thanks @seancorfield

James Amberger 2024-02-07T18:17:24.217249Z

(too bad they are not my conventions 😞 )

seancorfield 2024-02-07T18:34:30.237989Z

What are your conventions? The latest next.jdbc allows you to specify more patterns.

Mark Wardle 2024-02-07T07:53:01.088829Z

Hi all. Using SQLite with next.jdbc, and I'm returning namespace qualified keys based on the table name. However, I'm doing a calculation - and using either a join or a view to get it, but get one key without a namespace.

"select * from uk_composite_imd_2020_mysoc a left join (select lsoa,ntile(4) over(order by UK_IMD_E_rank) as UK_IMD_E_pop_quartile from uk_composite_imd_2020_mysoc) b on a.lsoa = b.lsoa where a.lsoa=?"

{:uk_composite_imd_2020_mysoc/lsoa "95ZZ06W1",
 :uk_composite_imd_2020_mysoc/UK_IMD_E_pop_decile 1,
 :uk_composite_imd_2020_mysoc/UK_IMD_E_pop_quintile 1,
 :uk_composite_imd_2020_mysoc/UK_IMD_E_rank 1,
 :uk_composite_imd_2020_mysoc/UK_IMD_E_score 123.00849692191194,
 :UK_IMD_E_pop_quartile 1}

This same result occurs if I create a view and SELECT on that instead. I presume SQLite is not providing enough information behind the scenes to link a column to a table. Is there a way of providing a default, or am I better just using plan, or a non-namespaced key result set builder, and then change the namespaces afterwards?

igrishaev 2024-02-07T08:50:46.031659Z

The simplest solution would be to skip namespaces by passing this argument:

{:builder-fn next.jdbc.result-set/as-unqualified-maps}

Pay attention that the namespaces are fetched using an extra query. It means, each time you select something, you perform two queries under the hood.

👍 1

Mark Wardle 2024-02-07T09:18:43.506939Z

Thank you

Mark Wardle 2024-02-07T09:19:26.044279Z

I didn't know about the double fetch thing - I assumed it came in the ResultSet internally but didn't dig into this.

seancorfield 2024-02-07T18:10:36.358339Z

> Pay attention that the namespaces are fetched using an extra query

That is false.

seancorfield 2024-02-07T18:11:59.969139Z

Either the ResultSetMetaData provides that information or it doesn't. e.g., Oracle simply doesn't implement that in its drivers. MS SQL Server only provides it if certain settings are present.

seancorfield 2024-02-07T18:12:31.649319Z

Computed columns never provide that information, which is why that column in the OP's query doesn't have a qualifier.

seancorfield 2024-02-07T18:13:36.397509Z

For performance reasons, you are better using plan in all cases since that avoids constructing Clojure hash maps from ResultSet objects.

👍 1

Mark Wardle 2024-02-07T18:16:23.047379Z

Thanks @seancorfield - I appreciate your comments and advice. I've kept it as simple as possible and delegated giving the namespaced keys into my application rather than next.jdbc which is working well. Thank you.
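
One way to do that qualification in application code, as a sketch only: qualify-keys is a hypothetical helper, and it assumes each row is already a plain Clojure map with keyword keys.

```clojure
(defn qualify-keys
  "Hypothetical helper: returns row with every unqualified keyword key
  qualified by table. Keys that already have a namespace are left alone."
  [table row]
  (reduce-kv (fn [m k v]
               (assoc m
                      (if (namespace k)
                        k
                        (keyword (name table) (name k)))
                      v))
             {}
             row))

(qualify-keys :uk_composite_imd_2020_mysoc
              {:lsoa "95ZZ06W1"
               :UK_IMD_E_pop_quartile 1})
```

Leaving already-qualified keys untouched means the same function can be applied to a mixed result such as the one in the original question, where only the computed column lacks a namespace.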

seancorfield 2024-02-07T18:17:32.866049Z

As you can probably tell, I get rather annoyed when people just throw out the recommendation to use the unqualified builders... That's not what people should turn to as a solution.

👍 1

seancorfield 2024-02-07T18:18:11.243859Z

If you use plan for performance, the column names are unqualified, because no builder is used -- which is where the performance boost comes from.

Mark Wardle 2024-02-07T18:18:39.557919Z

Yes I understand that. I think in my case using plan makes sense given what I'm doing and it makes it explicit. Thank you again.

igrishaev 2024-02-07T18:23:53.828849Z

> That is false

I haven't checked this case with other databases, but for Postgres, it's true. Fetching column names from the RSMetadata triggers this query:

igrishaev 2024-02-07T18:25:16.811439Z

igrishaev 2024-02-07T18:26:03.810849Z

Which returns the following result:

igrishaev 2024-02-07T18:26:59.410549Z

One can verify this by running Postgres with the -E flag (log everything).

seancorfield 2024-02-07T18:30:56.849969Z

Is that specifically for the .getTableName() call or for ResultSetMetaData overall? Does it run it every time the main query is run or cache it as part of the plan or other data about a query?

igrishaev 2024-02-07T18:32:35.756559Z

I believe it gets executed once when getTableName is called, and then cached

igrishaev 2024-02-07T18:33:23.243979Z

But in fact, every time we select something with f.q. keys, it doubles the number of queries

seancorfield 2024-02-07T18:33:54.742339Z

Given the focus on performance from the metosin folks when they were helping with the initial implementation of next.jdbc, I'm surprised they didn't call that out as a problem (but we were all probably more focused on plan which is the recommended approach anyway).

igrishaev 2024-02-07T18:42:08.048769Z

Maybe because this query is done under the hood by the driver. To spot it, one should enable logging all the expressions with the -E flag. I always set it to true and observe in the Docker console all the expressions executed. Often, there are some really weird ones produced by the drivers.

seancorfield 2024-02-07T18:44:58.462819Z

Looking at the MySQL driver, it seems that all that information is pulled directly as part of what is needed to support even calling .getValue() on a row in a ResultSet -- it uses the field definition which has the table name and column type information all baked in. So I'm guessing different drivers handle this very differently...

seancorfield 2024-02-07T18:45:38.498069Z

(so if MySQL ends up running multiple queries, it isn't .getTableName() that triggers it as far as I can tell from the source code)

igrishaev 2024-02-07T18:47:40.125399Z

It looks like MySQL passes the names of the columns that participate in the result. But Postgres doesn't. It only passes OIDs: the OID of the type, of the table, and of the column. To resolve the table names, one has to query the pg_catalog tables.

igrishaev 2024-02-07T18:49:21.034899Z

It's out of the scope of the original question, as that was about SQLite, but just in case

seancorfield 2024-02-07T18:50:44.712809Z

Interesting... I'll create an issue to add notes about PG's behavior to the Tips & Tricks docs and maybe to some of the mentions of qualified column names. Definitely not what I would have expected, given how PG prides itself on being so advanced 😉 where other DBs / drivers can handle this without the extra query 🙂 TIL.

igrishaev 2024-02-07T18:53:11.225679Z

yeah... another problem with Postgres is enum types. All the standard types have hardcoded OIDs so they're defined in the code. But when someone defines their own enum, it gets a random OID, and at the driver level it's unclear what it is.

igrishaev 2024-02-07T18:53:43.860479Z

you only have type_oid = 8883, no name.

seancorfield 2024-02-07T18:55:20.657999Z

Ouch. So user-defined enums have a performance overhead there?

seancorfield 2024-02-07T18:55:25.082049Z

https://github.com/seancorfield/next-jdbc/issues/275

seancorfield 2024-02-07T18:55:55.134249Z

PG weirdness is the bane of my life, both with next.jdbc and with HoneySQL! 🙂

igrishaev 2024-02-07T19:04:15.237569Z

No I just meant it's unclear how to parse these unknown types. Having the name of the type would be great, and the client could provide a mapping like <type_name = parsing function>.

igrishaev 2024-02-07T19:04:51.222089Z

but we have integer oids which might differ depending on the server
