Fork me on GitHub
#sql
<
2019-07-22
>
orestis15:07:59

I came back to writing raw SQL after many many years, and was surprised to see that not only I had forgotten most of the syntax, I wasn’t able to do seemingly “simple” queries without a massive amount of Google and StackOverflow. What books/resources would people suggest so that I can freshen up on those things? Syntax is easy to get from the various references, but data modelling and advanced queries is much more nebulous.

kenj17:07:03

I think for me it was reading the “Learning SQL” Oreilly book, and then just going through the Postgres docs, learning about every language feature that I could. It’s really hard to find good resources on things like recursive CTEs, when to use views vs materialized views, vs temp tables, etc. Once you have an idea of what to look for feature wise, you can google for blog posts about it.

kenj17:07:39

just trying to do performant TZ aware datetime queries was painful as minor little details with how the query is written could have large consequences

orestis15:07:02

My context is things like Business Intelligence — e.g. having a users table and various other tables that relate back to users through a user_id, filter and count things and add the count fields to the user, so then you can filter users based on those foreign counts (and other stuff).

astrashe18:07:34

@orestis I don’t know if this is exactly what you’re asking for, but I liked Joe Celko’s SQL for Smarties, though I haven’t looked at it for a very long time.

astrashe18:07:13

This site is a little cheesey, but it helped me to think about indexing things a little better: https://use-the-index-luke.com/

jmclavijo18:07:43

@seancorfield Great job on next.jdbc I have a question. I’ve tried to use either both sql and prepared function for adding a vector of rows. Using insert-multi! I run into the error of my # of rows being past INT/max, and there doesn’t seem to be any way that I know around that other than maybe trying to partition the vector and pmap a call to connect and insert, but my data is getting lost along the way. The other option is using a execute-batch! which I can set the :large option or :batch-size but then I get an error inserting date values. Where I wasn’t getting that error with insert-multi!. Is there a (set-parameter ) that I should be using? I use the clj-time to properly set the sql time for the values in that column?

seancorfield18:07:56

@jclavijo If you have a very large number of rows to insert, execute-batch! with :batch-size will be the way to go there. If you're getting errors inserting dates, you'll need to extend SettableParameter to whatever type you're using for date values.

seancorfield18:07:53

(I would avoid clj-time these days since Joda Time is deprecated and both the Joda Time project and the clj-time project recommend you migrate to Java Time, or a wrapper library, instead)

jmclavijo18:07:07

@seancorfield I’ll look into clojure.java-time I didn’t know there was a recommendation to use that instead. I’ll see about extending SettableParameter and see how that goes. Thank you

seancorfield18:07:51

clj-time's readme has this to say: > A date and time library for Clojure, wrapping the Joda Time library. The Joda Time website says: > > > Note that from Java SE 8 onwards, users are asked to migrate to java.time (JSR-310) - a core part of the JDK which replaces this project. > > If you are using Java 8 or later, consider using the built-in Java Time instead of Joda Time -- and look at clojure.java-time if you want a Clojure wrapper for that, or cljc.java-time for a thin Clojure(Script) wrapper. See Converting from Joda Time to java.time for more details about the similarities and differences between the two libraries.

4
seancorfield18:07:00

(I pasted it here to remind folks that the Joda Time project wants folks to migrate to Java Time -- it's not just the clj-time maintainers trying to push folks to other libs! 🙂 )

seancorfield18:07:16

We're heavy users of clj-time at work but we're also trying to move away from it. The JDBC interop is one reason to avoid Joda Time, in addition to the deprecation.

jmclavijo21:07:39

@seancorfield Am I doing this right?

(extend-protocol prepare/SettableParameter
     Object
     (set-parameter [v ^PreparedStatement s ^java.time.LocalDate i]
     (.setObject s i v)))

jmclavijo21:07:07

Thanks ahead of time. I don’t see many examples on this.

seancorfield21:07:20

No, you should extend it to the specific type you want.

seancorfield21:07:30

Also i is the integer index.

seancorfield21:07:41

v is the value -- the date in your case.

seancorfield21:07:36

(extend-protocol prepare/SettableParameter
  java.time.LocalDate
  (set-parameter [v ps i] (.setDate ps i (java-date v))))
appropriately type-hinted to avoid reflection and with an appropriate java-date function to convert java.time.LocalDate to java.util.Date (or whatever java.sql.* type you want).

seancorfield21:07:40

Or you may want setTimestamp.

jmclavijo21:07:04

awesome…. Thanks so much

seancorfield21:07:44

Note that java.sql.Date, java.sql.Time, and java.sql.Timestamp all extend java.util.Date -- and if you're doing something DB-specific you may want to convert java.time.LocalDate to a DB-specific type and then use setObject instead... so there are a lot of possibilities and not necessarily any One True Way(tm) unfortunately...

seancorfield21:07:07

I'm toying with the idea of adding a namespace for Java Time conversions but it's not entirely clear what the default behavior should actually be...

seancorfield21:07:15

It's worth noting that https://docs.oracle.com/javase/8/docs/api/java/sql/Timestamp.html has support for converting to/from java.time.LocalDateTime

seancorfield21:07:59

Oh, and one final point about SettableParameter, you can use the new-in-1.10 per-object approach to add metadata to enable an individual parameter to be processed via the protocol...

seancorfield21:07:03

(with-meta obj {'next.jdbc.prepare/set-parameter (fn [v ps i]...)})

gklijs22:07:40

It's not easy. In a new service I now use java.time.ZonedDateTime but for mongo I have to Convert to/from UTC, and for Protobuf another conversion is needed.

seancorfield22:07:18

I've updated the docs to show an example with java.time.LocalDate and java.time.LocalDateTime https://github.com/seancorfield/next-jdbc/blob/master/doc/prepared-statements.md#prepared-statement-parameters

seancorfield22:07:57

(caution: untested at the moment!)

jmclavijo22:07:53

Thanks… That makes more sense. Though how it ties in with a execute-batch! in a vector of rows is unclear. I think it’s adding a (mapcat (fn [group] run! #(.addBatch (set-parameter % ps 2)) group))

seancorfield22:07:07

Not sure what you're talking about there...

seancorfield22:07:23

You don't need any of that code.

seancorfield22:07:01

execute-batch! and set-parameters and set-parameter are all affected by the protocol directly.

jmclavijo22:07:04

error at or near "dates"
  Position: 46  Call getNextException to see other errors in the batch.
	at org.postgresql.jdbc.BatchResultHandler.handleCompletion(BatchResultHandler.java:166)
	at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:778)
	at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1563)
	at next.jdbc.prepare$execute_batch_BANG_$fn__3943.invoke(prepare.clj:172)
	at clojure.core$map$fn__4781$fn__4782.invoke(core.clj:2633)
	at clojure.core.protocols$fn__6755.invokeStatic(protocols.clj:167)
	at clojure.core.protocols$fn__6755.invoke(protocols.clj:124)
	at clojure.core.protocols$fn__6710$G__6705__6719.invoke(protocols.clj:19)
	at clojure.core.protocols$seq_reduce.invokeStatic(protocols.clj:31)
	at clojure.core.protocols$fn__6738.invokeStatic(protocols.clj:75)
	at clojure.core.protocols$fn__6738.invoke(protocols.clj:75)
	at clojure.core.protocols$fn__6684$G__6679__6697.invoke(protocols.clj:13)
	at clojure.core$transduce.invokeStatic(core.clj:6601)
	at clojure.core$into.invokeStatic(core.clj:6614)
	at clojure.core$into.invoke(core.clj:6604)
	at next.jdbc.prepare$execute_batch_BANG_.invokeStatic(prepare.clj:167)
	at next.jdbc.prepare$execute_batch_BANG_.invoke(prepare.clj:140)
	at global_affairs_sparkling.postgres_sink$db_insert.invokeStatic(postgres_sink.clj:63)
	at global_affairs_sparkling.postgres_sink$db_insert.invoke(postgres_sink.clj:52)
	at global_affairs_sparkling.postgres_sink$dne_table_insert.invokeStatic(postgres_sink.clj:70)
	at global_affairs_sparkling.postgres_sink$dne_table_insert.invoke(postgres_sink.clj:69)
	at global_affairs_sparkling.core$insert_partition_to_db.invokeStatic(core.clj:39)
	at global_affairs_sparkling.core$insert_partition_to_db.invoke(core.clj:36)
	at clojure.core$comp$fn__4727.invoke(core.clj:2460)
	at sparkling.function.VoidFunction.call(VoidFunction.java:9)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:219)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:219)
	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near "dates"
  Position: 46
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2477)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2190)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:300)
	at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:774)
	... 35 more

jmclavijo22:07:37

Still getting this…

seancorfield22:07:27

Looks like a syntax error in your SQL code.

seancorfield22:07:18

@jclavijo You'll need to share your code -- most likely you need to quote columns that are reserved SQL words.

jmclavijo22:07:55

wow… I totally missed that, specially since I was also getting other time errors before.. Fixed it. THANK YOU! 😀

seancorfield23:07:34

OK, I updated both the SettableParameter and ReadableColumn docs (on master) to try to make things clearer. I need to figure out a reasonable way to test the protocol extensions (part of the problem is that they change the behavior of all code running in the app, i.e., for all the tests).

seancorfield23:07:03

Ran some tests. Fixed the examples and pushed them.

bananadance 4