2019-04-11
Channels
- # announcements (14)
- # beginners (119)
- # boot (9)
- # calva (7)
- # cider (12)
- # cljdoc (2)
- # cljsrn (28)
- # clojure (127)
- # clojure-dev (2)
- # clojure-europe (3)
- # clojure-italy (2)
- # clojure-losangeles (9)
- # clojure-nl (6)
- # clojure-spec (15)
- # clojure-uk (39)
- # clojurescript (35)
- # community-development (4)
- # cursive (9)
- # datascript (8)
- # datomic (5)
- # duct (3)
- # emacs (10)
- # fulcro (45)
- # graphql (3)
- # jobs (1)
- # kaocha (8)
- # luminus (2)
- # off-topic (121)
- # onyx (3)
- # pathom (15)
- # pedestal (31)
- # planck (5)
- # reagent (25)
- # reitit (3)
- # remote-jobs (1)
- # shadow-cljs (48)
- # slack-help (1)
- # sql (142)
- # tools-deps (78)
Are there any SQL ORMs for Clojure? The frameworks I’ve seen for Clojure (Luminus) encourage you to manage your database more or less with SQL (e.g. HugSQL). This is quite a different approach to what I’ve seen in things like Django or Rails. Is it just an OOP vs FP thing? Is there a broader philosophical reason that is specific to Clojure? Are there good ORMs that I simply haven’t come across?
have you checked? https://github.com/metabase/toucan
I don't think there is any 'real' ORM for Clojure. Probably in large part because Clojure is dynamic, and thus it's fine to work just with maps. I myself find it quite liberating to be able to work with something like just clojure.java.jdbc instead of having to define types for everything.
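A minimal sketch of that map-based workflow; the db-spec and the fruit table below are placeholders for illustration, not anything from the discussion:

```clojure
(require '[clojure.java.jdbc :as jdbc])

;; placeholder connection spec for the sketch
(def db-spec {:dbtype "h2" :dbname "example"})

;; query returns a plain seq of hash maps -- nothing to declare up front
(jdbc/query db-spec ["SELECT * FROM fruit WHERE cost < ?" 100])
;; => ({:id 1, :name "Apple", :cost 59, ...} ...)
```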
Clojure, as a community, is pretty opinionated against ORMs (my first job using Clojure, where I learned Clojure, was working on a Clojure ORM-like thing; it wasn't particularly popular)
when in doubt we prefer plain data over customized Objects, so much of our code is already turning some specialized object type from some Java lib into plain data
with clojure.java.jdbc, we already have a view as data going both ways, and that fits idiomatic clojure dev perfectly
(by plain data I mean hash-maps, lists, vectors, sets, numbers, strings, etc.)
I could imagine that if spec becomes final, a need might arise to work more easily with specced data in a relational database.
@ben606 back to your original question, I think it is an OOP vs. FP thing - we tend to design in terms of maximally abstract data and data transformations, rather than specialized models
even with spec it's about asserting some properties of plain data, not coercing into an opaque model
clojure.java.jdbc or hugsql doesn't need to change its behavior to be spec compatible
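As an illustration of "asserting properties of plain data": a spec can be written directly over the row maps jdbc already returns (the keys here are just examples):

```clojure
(require '[clojure.spec.alpha :as s])

;; spec the plain row map itself -- no coercion into a model object
(s/def ::name string?)
(s/def ::cost pos-int?)
(s/def ::fruit-row (s/keys :req-un [::name ::cost]))

(s/valid? ::fruit-row {:id 1 :name "Apple" :cost 59})
;; => true
```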
No, but I could imagine something that takes a collection of specs and outputs the SQL to update the table to store those.
oh, like ddl via spec?
I guess; not sure how feasible it is though, with maps-in-maps and vectors-in-maps and such. It will probably be pretty hard to support all that's possible, so maybe just a subset.
I've gone the other way, generating specs from a database (using the column metadata you get from jdbc)
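A rough sketch of that database-to-spec direction using the column metadata available through clojure.java.jdbc; the helper name and the type mapping are illustrative, not a real library:

```clojure
(require '[clojure.java.jdbc :as jdbc])

(defn column-predicates
  "Hypothetical helper: returns [column-keyword predicate] pairs for a table,
   based on the TYPE_NAME reported by DatabaseMetaData.getColumns."
  [db-spec table]
  (jdbc/with-db-metadata [md db-spec]
    (->> (jdbc/metadata-result (.getColumns md nil nil table nil))
         (mapv (fn [{:keys [column_name type_name]}]
                 [(keyword column_name)
                  (case type_name
                    ("INTEGER" "BIGINT") int?
                    ("VARCHAR" "TEXT")   string?
                    any?)])))))
```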
also anything about sql schemas and ddl is either very weak or very non-portable
eg. I could imagine a postgres<->spec lib being much more useful than a jdbc<->spec lib
This is a really interesting discussion, thanks. I guess I’m still wondering why something that covers some of the duties of an ORM, while still being idiomatic, hasn’t arisen. For example, it seems to me like being able to define your schema in clojure (I guess integration with spec here would be nice) and generate migrations automatically would both be nice, useful features. Am I still missing a trick?
I dunno, I find sql to be really good at being sql, where most dsls that wrap it do it poorly
> I think it is an OOP vs. FP thing
I’m not entirely sure about what the ML langs do, but it could be an interesting point of comparison
clojure's model is so close to being relational having a mapping also doesn't feel like it buys you much
@ben606 I think the main reason that doesn't exist is that it's simpler to deal with the sql schema as what it is, and turn that into something idiomatic for clojure (lazy-seq for streaming results, vectors for eager, sets for rows), and creating something on top of that to impose a reified data model has a much lower benefit in a data-first language like clojure
I’ve seen people generate Haskell/Elm types from a database before which implies they do something similar to clojure, though
they must generate types to be able to effectively work with the data in their language
yeah - because fp isn't about making a black box for side effects and state, and a lot of what makes an ORM interesting is mapping from static boring data to containers for behaviors and mutation
It’s not as though automatic migration generation wouldn’t be useful, it’s just that you’re kind of approaching the problem from a whole different perspective
right - wanting auto migration generation makes pragmatic sense when you want to pretend a db holds a complex object with black box behaviors and internal state
@seancorfield yes, still think nav should be opt in
@ghadi I did some benchmarks on the difference between just building a hash map vs. also adding metadata and it was about a 10ns difference on an operation that takes about 3 microseconds.
After spending a fair bit of time working a variant of RowBuilder into the code last night, my next step will be breaking apart the low-level execute/reduce parts so I can address the three use case drivers separately and as optimally as possible...
I take it the nav support isn't eagerly reading ResultSetMetaData and trying to be clever
1. reducible! will not realize rows (unless you apply a function that requires a hash map, like assoc, in which case it will)
2. execute-one! will realize just one row from the ResultSet as a hash map (and will not do the reduce thing at all)
3. execute! will default to realizing an entire ResultSet -- and do it via a ResultSetBuilder protocol that has start/add-row/end hooks, much like RowBuilder has start/add-column/end hooks.
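A hypothetical usage sketch of the three cases above; ds stands in for a DataSource, and the exact next.jdbc signatures were still in flux at this point:

```clojure
(require '[next.jdbc :as jdbc])

;; 1. reducible! -- rows are only realized if the reducing fn forces them
(reduce (fn [total row] (+ total (:cost row)))
        0
        (jdbc/reducible! ds ["SELECT * FROM fruit"]))

;; 2. execute-one! -- realizes a single row as a hash map
(jdbc/execute-one! ds ["SELECT * FROM fruit WHERE id = ?" 1])

;; 3. execute! -- realizes the whole ResultSet (vector of hash maps by default)
(jdbc/execute! ds ["SELECT * FROM fruit"])
```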
nav support is lazy -- the only overhead of being datafiable is the with-meta call to add the datafy protocol with a function value. Only if you actually call datafy will it then attach metadata for the nav protocol, and only if you then call nav will it actually do anything.
The caveat there is that datafy/nav has to happen in the same "connectable" context as the original query -- so if you do it from a Connection, that has to stay open for datafy/nav to work (but if you do it from a DataSource, the nav operation will get a new connection from that as needed).
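For context, a minimal sketch (not next.jdbc's actual implementation) of attaching datafy/nav support lazily through metadata; fetch-related is a hypothetical function that would run the follow-up query against the same connectable:

```clojure
(require '[clojure.core.protocols :as p])

(defn datafiable-row [row fetch-related]
  ;; the only up-front cost is this with-meta call
  (with-meta row
    {`p/datafy
     (fn [this]
       ;; only when datafy is called do we pay to attach nav support
       (with-meta this
         {`p/nav (fn [_coll k v] (fetch-related k v))}))}))
```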
My intent is that you only pay for what you imply you're going to need -- so case #1 will delay metadata fetch / column name creation until you actually force a hash map operation; case #2 will take that hit up front, as will case #3. #1 will use a delayed hash map row builder, #2 will use an eager hash map row builder, and #3 will use an eager result set builder.
The result set builder will have variants for vector of hash maps and vector of cols-plus-row-values (`as-arrays`) and I'll add an example in the benchmark tests for vector of records (since @ikitommi seems very interested in that approach).
I think something should be able to provide [cols rows*], but something in user space, not in the library
Hahaha... I know, I know... I have never used it in production code (just grep'd for it!) but quite a few people do based on the feedback around it.
@seancorfield as long as the API doesn't contort itself to support that weird case
It's just a result set materialization strategy 🙂
I am liking the idea of providing as an argument a function that returns a RowBuilder, rather than providing the RowBuilder directly
Like this? https://github.com/metosin/porsas/blob/master/src/porsas/core.clj#L234-L241
function that returns a reified RowCompiler that, given a seq of fields, will return a function of result-set => value
In porsas, the columns can be read at query compile time and the cost to realize a row into a map is close to zero, for both maps & records.
Current sketch of RowBuilder in next.jdbc: https://github.com/seancorfield/next-jdbc/blob/master/src/next/jdbc/result_set.clj#L56-L65 (with an example of a hash map row builder below). Subject to change (lots of change) but that's made several of the benchmark tests noticeably faster -- I haven't yet updated the recorded timings in the test file tho'.
It uses .getObject with a column index. So it needs to know how many columns to iterate over.
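i.e., something along these lines for a single row, with the column count coming from the ResultSet's own metadata (rs is assumed to be a java.sql.ResultSet positioned on a row):

```clojure
;; read one row positionally, column indices are 1-based in JDBC
(let [n (.getColumnCount (.getMetaData ^java.sql.ResultSet rs))]
  (mapv (fn [i] (.getObject ^java.sql.ResultSet rs (int i)))
        (range 1 (inc n))))
```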
This way avoids the user of RowBuilder needing to know anything about the names of columns -- that's all encapsulated in the builder.
we have different things we're conflating: 1) how do you get the names for columns in a resultset? 2) how do you extract a row from the RS? 3) how do you build something based on what is extracted?
Well, yes, ResultSet is horrible. We can all agree on that 🙂
Users of clojure.java.jdbc / next.jdbc in general want control of column name creation strategies. The RowBuilder can get the names from the result set's metadata if it wants them (which is why in next.jdbc right now it is passed the ResultSet when the strategy is created -- so it can do whatever it wants/needs).
So that's the answer to 1.
I do. Read the code 🙂
It's right below the code I linked to above...
Yes, the intent is to allow users to be able to override the column naming just by writing their own RowBuilder.
I am in pain 😐
So, #2 is done by reducing on the row builder: call row to get a fresh row, repeatedly call with-column to add column values, call build to complete the row (yes, horrible names and they'll change when I write ResultSetBuilder).
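A sketch of those hooks as described; the names (row, with-column, build) follow the conversation rather than next.jdbc's eventual API:

```clojure
(defprotocol RowBuilder
  (row [this] "Start a fresh row.")
  (with-column [this r i] "Add the value of column i to the row in progress.")
  (build [this r] "Complete the row."))

;; hash-map builder, given the column keywords up front
(defn map-row-builder [^java.sql.ResultSet rs cols]
  (reify RowBuilder
    (row [_] (transient {}))
    (with-column [_ r i]
      (assoc! r (nth cols (dec i)) (.getObject rs (int i))))
    (build [_ r] (persistent! r))))
```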
Not sure what you're asking with #3?
The execute! (multi-row) function will use the ResultSetBuilder in a similar way, with a call to get a fresh result set (generally a transient vector), a call to add each row (built per the above), and a call to complete the result set (make it persistent, most likely). That way as-arrays can get column names in the "fresh result set" call and return a vector containing a vector of column names, and the row builder portion will just return vectors of column values -- no names, and those will be added to the result set...
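And a matching sketch of the result-set-level start/add-row/end hooks and the as-arrays shape described above (names are illustrative):

```clojure
(defprotocol ResultSetBuilder
  (start-rs [this] "Start a fresh result set.")
  (add-row [this acc row] "Add a built row to the result set.")
  (end-rs [this acc] "Complete the result set."))

;; as-arrays style: first element is the vector of column names,
;; each subsequent element is a vector of column values
(defn array-result-set-builder [cols]
  (reify ResultSetBuilder
    (start-rs [_] (transient [cols]))
    (add-row [_ acc row] (conj! acc row))
    (end-rs [_ acc] (persistent! acc))))
```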
@ghadi here’s how to use that:
```clojure
;; 730ns
(let [query (p/compile
              "SELECT * FROM fruit"
              {:connection connection
               :row (p/rs->map)
               :key (p/unqualified-key str/lower-case)})]
  (title "porsas: compiled map, unqualified")
  (bench! (query connection)))
```
1) extract the resultset metadata at query-compile time. It builds a RowCompiler which is called for all rows. Perf is 10x better than realizing rows with idiomatic clojure code.
it’s the p/compile, you basically prerun the query once to get the metadata to produce an optimal mapper
same that @seancorfield is doing at request-time (first thing in all requests I believe)
```clojure
;; 9200ns
(let [query #(j/query {:connection %} ["SELECT * FROM fruit"])]
  (title "java.jdbc")
  (bench! (query connection)))
```
the results are from both:
```clojure
[{:id 1, :name "Apple", :appearance "red", :cost 59, :grade 87.0}
 {:id 2, :name "Banana", :appearance "yellow", :cost 29, :grade 92.2}
 {:id 3, :name "Peach", :appearance "fuzzy", :cost 139, :grade 90.0}
 {:id 4, :name "Orange", :appearance "juicy", :cost 89, :grade 88.6}]
```
@seancorfield @ikitommi I think optimizing for the case of not going over the network (most SQL dbs are remote), or not consuming a query fully, is folly
I think you'd be very surprised at the overhead of turning a ResultSet into a vector of hash maps for large result sets...
I’m looking for “how fast we can go with Clojure”, kinda pushing the limits for no good reason.
@seancorfield if you're using transients and conj!, and not doing a lot of repeated work in the middle of the loop, it will be good enough
here’s a good example of clojure jdbc perf compared to other langs, in a silly web benchmark: https://www.techempower.com/benchmarks/#section=data-r17&hw=ph&test=fortune
clojure.java.jdbc is "slow" at result set building -- I've benchmarked it against (remote) Percona/MySQL instances to compare against next.jdbc -- the improvements around faster result set building are definitely worthwhile.
for some reason, the fastest clojure version is 4x slower than the fastest ones. the code is kinda the same, just the row mapping is doing a lot more work.
When the speed of row mapping was first brought up, I was very skeptical that it made enough of a difference compared to network/JDBC time spent that it would be worth addressing... but, yeah, it really does.
mapping to maps (qualified or not) is about as fast as records, which can be made as fast as java.
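For comparison, the record version is essentially a positional constructor call per built row, so there is no per-row key hashing; the field names below just mirror the fruit table shown in the results above, and the row function is only a sketch:

```clojure
(defrecord Fruit [id name appearance cost grade])

(defn rs->fruit [^java.sql.ResultSet rs]
  (->Fruit (.getObject rs "id") (.getObject rs "name") (.getObject rs "appearance")
           (.getObject rs "cost") (.getObject rs "grade")))
```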
happy to contribute some of the perf code to whatever will be the next jdbc wrapper.
Next thing planned is to hide the compilation phase inside the query itself, e.g. create the mapper on the first query and store it for subsequent queries. The first call would be 3x slower, the rest fast.
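A hypothetical sketch of that caching idea, keyed by the SQL string; compile-mapper and run-query are stand-ins for porsas internals, not its real API:

```clojure
(declare compile-mapper run-query) ;; hypothetical stand-ins for porsas internals

(def ^:private mapper-cache (atom {}))

(defn cached-query
  "First call for a given SQL string compiles a row mapper from the result
   set metadata; later calls reuse the cached mapper."
  [connection sql]
  (let [mapper (or (get @mapper-cache sql)
                   (let [m (compile-mapper connection sql)]
                     (swap! mapper-cache assoc sql m)
                     m))]
    (run-query connection sql mapper)))
```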