2023-11-10
Hi all. I just ran into an issue with `java.sql.Timestamp` and PostgreSQL's `timestamp`. I was following https://cljdoc.org/d/com.github.seancorfield/next.jdbc/1.3.894/doc/getting-started/tips-tricks#times-dates-and-timezones, and I think it pointed me in the wrong direction. Specifically:
> For example, with Postgres we recommend always storing dates in a Postgres TIMESTAMP (without time zone) column, storing all such timestamps in UTC, and applying your time zone logic separately using application logic. The TIMESTAMP WITH TIME ZONE column type in Postgres stores its date in UTC anyhow, and applications that need to deal with time zones typically require richer functionality than simply adjusting the time zone to wherever the database happens to be hosted.
Turns out there is a pretty big issue with this. `Timestamp` interprets `Y-M-D H:m:S` in the client's time zone if there is no zone information available. This caused my timestamps to be 8 hours off because I was querying a PostgreSQL server in UTC from my dev REPL in PST. Even though `TIMESTAMP WITH TIME ZONE` uses UTC, it records that fact, so the conversion to the client's time zone will be correct. With plain `TIMESTAMP`, the UTC timestamp `"2023-11-10 01:50:20.031"` is treated as Pacific Time instead of being treated as UTC and converted to `"2023-11-09T17:50:20.031-08:00"`.
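To see the interpretation problem outside of any database, here is a minimal Java sketch (class and method names are mine, purely illustrative) showing that `Timestamp.valueOf` parses a zone-less string in whatever the JVM's default zone happens to be at the time:

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class TimestampZoneDemo {
    // Parse the same zone-less string under two different JVM default zones
    // and return how far apart the resulting epoch millis are.
    static long skewMillis() {
        String wire = "2023-11-10 01:50:20.031"; // a UTC instant, but with no zone attached
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        long asUtc = Timestamp.valueOf(wire).getTime();
        TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
        long asPst = Timestamp.valueOf(wire).getTime();
        return asPst - asUtc;
    }

    public static void main(String[] args) {
        // On this date America/Los_Angeles is PST (UTC-8), so the two
        // readings of the same string disagree by exactly 8 hours.
        System.out.println(skewMillis()); // 28800000
    }
}
```

The string itself never changes; only the JVM default zone does, and yet the epoch millis shift by 8 hours, which is exactly the skew described above.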
I think the docs should recommend `TIMESTAMP WITH TIME ZONE` whenever possible so JDBC can handle the conversion correctly.
I have also seen a number of recommendations (Stack Overflow) to set the JVM's time zone to be UTC, but that's more fragile than simply having PostgreSQL include the UTC zone information so JDBC can automatically do the "right" thing.
I don't know about you, but I don't have my dev machine in UTC, and I like the convenience of having my JVM use my local zone for my REPL.
Overall, using next.jdbc has been a joy! Thanks @U04V70XH6 for a delightful library!
I didn't write the PG recommendation because I don't use PG at all. However, I do have my local MySQL DB (in Docker) on UTC and I set my local JVM to UTC while doing dev/test work -- and all our QA/prod machines are UTC.
Those PG-specific recommendations came from other PG users here although I don't remember who, specifically, contributed that section of the docs...
Would it be OK for me to write up something different? What's a good path for suggesting alternative text?
@U04V70XH6 Does next.jdbc have a way of specifying the server time zone?
I'm not comfortable with a single person just changing what a group of people agreed was the generally correct approach so I'd suggest discussing it in the channel with other PG users first and making sure there's enough of a consensus. Specifying the server time zone can be done at the connection level I believe so you can specify it as part of the hash map that describes the "db-spec" (and it'll get passed into the connection string).
(that will also depend on whether you're using connection pooling libraries I suspect)
I'm not using a pool right now. I was looking through the next.jdbc docs and I don't see it in the list of options mentioned in `get-datasource`.
You can pass any additional connection string values as keys in the db-spec hash map. For example: `{:dbtype "mysql" :dbname "thedb" :useSSL false}` will pass `useSSL=false` in the connection string (that example is in Getting Started).
But that stuff is generally DB-specific (or driver-specific) so `next.jdbc` can't really document what's possible there... but it could perhaps make it clearer that you can add extra key/value pairs...
Oh, I see. It's DB specific. So next.jdbc can pass it through, but there isn't a "standard" way.
@U04V70XH6 Thank you!
I updated the Getting Started section (develop branch) to explicitly call out that you can put arbitrary keys in the db-spec hash map and they'll just get passed through to the JDBC driver.
@U5FV4MJHG UTC everywhere, all the time - you can configure the JVM to use it, without affecting anything else in your dev environment
@U04V70XH6 Thank you!
@U0JEFEZH6 Yeah. I'm considering setting it for my dev repl.
I suggested the current documentation on timestamps in Postgres to Sean. It’s a complex area with numerous possible approaches so I would welcome contributions from other Postgres users should there be a better way to handle timestamps and time zones in Postgres. For me, saving all time stamps as just time stamps - and storing context-specific timezone data elsewhere - works well, and reduces the cognitive load when it comes to reasoning about time within the database. However there may of course be downsides to this approach that I have not come across or have failed to understand.
Let’s take for example an e-commerce transaction for an American merchant, performed by a UK customer. The time of the transaction is well understood - it happened at a particular instant. For me, merging this value with a timezone can confuse things, as you need to choose which frame of reference is the definitive one. We probably know the timezone of the customer, and the timezone of the merchant - and IMHO it makes more sense to store that context separately rather than complecting it with the timestamp itself.
Having built an event scheduler (+ all the timezone math that comes with it, recurring events, etc.), you never want to store zoned date times in your database. Instants plus a per-user (or fallback) timezone should be stored separately and used for time calculations and displaying local times; otherwise it's a world of pain. I remember that back in the day the Mixpanel API would provide all event data but in... Eastern Time :-) and we didn't discover that until UK customers started wondering why some of the chart data didn't align with other data points we were collecting
Agreed. I built a system one time that ended up being used in practically every country in the world, and time zones get really tricky really quickly. We used Oracle at the time, and I distinctly remember having an hours-long outage for all of our customers in Lebanon due to some weirdness in how that country treats timezones: the way they transitioned from winter to summer time triggered a bug on that particular day that rendered our system unusable. These sorts of issues can always bite you, but having tried both approaches, you'll be less likely to produce such bugs once timestamp and timezone are separate; they are different things, after all. In our current system, all timestamps are read as `java.time.Instant`s, and on the (rare) occasions the timezone becomes relevant, it's simply attached to the instant using `ZonedDateTime/ofInstant` or something similar.
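The separate-storage approach described above can be sketched in plain `java.time` (the "customer zone column" here is my own illustrative assumption, not something from the thread):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class InstantPlusZone {
    // What the database holds: a bare instant, plus a zone ID kept in
    // its own column (e.g. a hypothetical customer_tz column).
    static final Instant STORED = Instant.parse("2023-11-10T01:50:20.031Z");
    static final String CUSTOMER_ZONE = "Europe/London";

    // Attach the zone only at the edge, when a local rendering is needed.
    static ZonedDateTime localView() {
        return ZonedDateTime.ofInstant(STORED, ZoneId.of(CUSTOMER_ZONE));
    }

    public static void main(String[] args) {
        // London is on GMT in November, so the local view matches UTC here.
        System.out.println(localView()); // 2023-11-10T01:50:20.031Z[Europe/London]
    }
}
```

The instant stays unambiguous in storage; the zone is pure presentation context, applied as late as possible.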
Ah yes, which reminds me that updating the TZ database that the JVM uses is an exercise left to the reader, and very few people mention it 🤷
You’ve just got to hope that new timezones are sufficiently esoteric that you won’t encounter them until you can handle them 😱
I don't work on scheduling anymore (or for now 😂) so I'm not that worried, but if you do - you have to subscribe to the IANA mailing list and know when to update the db with a script
I presume they are typically announced some period in advance? Although that’s probably not a good assumption, as countries can introduce whatever they want at any time - again it’s probably a case of inferring that countries that do this stuff on a whim are not going to be hugely commercially important for most developers of commercial software.
yeah usually it's enough time, there were some odd cases in the past but limited to individual countries
@U02HPS0397S @U0JEFEZH6 Thanks for the reply. I also agree that the database should always store the UTC instant. PostgreSQL does indeed do that. Even when you use `TIMESTAMP WITH TIME ZONE`, it just stores the UTC instant. However, due to the issue with the JDBC driver I described above, if you use `TIMESTAMP` (without time zone), it actually becomes impossible to get an accurate epoch millis without setting the JVM default time zone to UTC. By using `TIMESTAMP WITH TIME ZONE`, you can get an accurate epoch millis whether or not the JVM default time zone is UTC. Since `TIMESTAMP WITH TIME ZONE` works in more cases, I think that is a better recommendation.
I do agree that it's fair to recommend using UTC as the default time zone too. Nonetheless, `TIMESTAMP WITH TIME ZONE` works in all cases, so I see that as a separate consideration.
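The reason `TIMESTAMP WITH TIME ZONE` is robust can be shown without a database at all. Under JDBC 4.2, a `timestamptz` column can be read as a `java.time.OffsetDateTime`, and that value carries its own offset, so converting it to an instant never consults the JVM default zone. A small sketch of just the types involved (the parsed value stands in for what a driver round-trip might yield):

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.util.TimeZone;

public class OffsetCarriesZone {
    // Stand-in for a TIMESTAMP WITH TIME ZONE round-trip: the offset
    // travels with the value itself.
    static final OffsetDateTime FROM_DB =
        OffsetDateTime.parse("2023-11-10T01:50:20.031+00:00");

    static Instant asInstant() {
        // Changing the JVM default zone has no effect on this conversion,
        // because the value is self-describing.
        TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
        return FROM_DB.toInstant();
    }

    public static void main(String[] args) {
        System.out.println(asInstant()); // 2023-11-10T01:50:20.031Z
    }
}
```

Contrast this with a zone-less `TIMESTAMP`, where the driver has to guess a zone to build a `java.sql.Timestamp`, and guesses the client's default.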
Yeah, the `timestamp` vs `timestamptz` distinction in Postgres is confusing. In addition to all of the things that were mentioned here, working with time is not easy
Indeed! Let's say I was very surprised when I asked PG for the millis (via psql) and then I got the millis from the java.sql.Timestamp and they did not match! It actually ends up changing the value of the instant!
I try to avoid working on things that need millisecond precision to match between systems 😉
Thanks @U5FV4MJHG for the insight. This is interesting to me, as I rely heavily on my current technique for timestamp storage, and if there is a technical weakness in the approach then it’s something I would want to know about. I have been trying to reproduce the issue that you have outlined, but I must be missing something as I have been unable to do so. I’m in Ireland, and I often interact with servers in different timezones (particularly continental Europe as that’s where our cloud services are located) and I haven’t seen issues with timestamps as you have. I’m trying to reproduce the issue in as simple a case as possible. Here is what I have:
(TimeZone/setDefault (TimeZone/getTimeZone "America/New_York"))
(let [db (ts/pool)
      _ (jdbc/execute! db ["create table tz_test (id serial primary key, no_tz_ts timestamp,
                            tz_ts timestamp with time zone)"])
      result (jdbc/execute! db ["INSERT INTO tz_test (no_tz_ts, tz_ts) VALUES (?, ?)" (Instant/now) (Instant/now)])
      select (jdbc/execute! db ["SELECT * FROM tz_test"])
      drop-table-result (jdbc/execute! db ["DROP table tz_test"])]
  {:insert-result result
   :drop-table-result drop-table-result
   :select-result select})
In this code, I create a simple `tz_test` table, with a `TIMESTAMP WITH TIME ZONE` column and a raw `TIMESTAMP` column. I then insert a row (using `Instant`s), and read the data back to see how the timestamps come out. Right now Ireland’s timezone happens to correspond to UTC, so I update the default timezone to `America/New_York` for the purposes of the test. In each case, the timestamp I have written is returned in UTC, as I would expect. (e.g. the result of a test I just ran in the repl looks like this:
{:insert-result [#:next.jdbc{:update-count 1}],
:drop-table-result [#:next.jdbc{:update-count 0}],
:select-result [#:tz-test{:id 1,
:no-tz-ts #inst"2023-11-14T20:00:12.236541000-00:00",
:tz-ts #inst"2023-11-14T20:00:12.236542000-00:00"}]}
). Any insight on what you might be doing differently that triggers the problem that you are seeing?
@U02HPS0397S Hm... let me put together a way to reproduce it. Which version of the driver are you using? I'm using `org.postgresql/postgresql 42.6.0`. Also, just to be 100% sure there isn't something cached, try launching the JVM with `-Duser.timezone="America/New_York"`. For example:
clojure -J-Duser.timezone="America/New_York" -M:repl
It looks like the driver has a static initializer which caches the default time zone. Mutability! Ugh! https://github.com/pgjdbc/pgjdbc/blob/636a97f68188438ffdb8c10fa2eb5b3f748350be/pgjdbc/src/main/java/org/postgresql/jdbc/TimestampUtils.java#L105-L125
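The caching pattern being pointed at looks roughly like this (a hypothetical distillation for illustration, not the actual pgjdbc source): the default zone is captured once in a static field at class-load time, so a later `TimeZone/setDefault` from the REPL is invisible to it:

```java
import java.util.TimeZone;

public class CachedDefaultZone {
    // Captured once, when the class is first loaded -- the same pattern
    // the linked TimestampUtils code uses for its calendar cache.
    static final TimeZone CAPTURED = TimeZone.getDefault();

    static String capturedId() {
        return CAPTURED.getID();
    }

    public static void main(String[] args) {
        String before = capturedId();             // zone at class-load time
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Tokyo"));
        String after = capturedId();              // unchanged: the field was already set
        System.out.println(before.equals(after)); // true
    }
}
```

This is why `-Duser.timezone=...` at JVM startup behaves differently from calling `TimeZone/setDefault` after the driver class has loaded: the property takes effect before any static initializer runs.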
Hmm. I’ve restarted the REPL with `-J-Duser.timezone="America/New_York"`, and I get the same result
{:insert-result [#:next.jdbc{:update-count 1}],
:drop-table-result [#:next.jdbc{:update-count 0}],
:select-result [#:tz-test{:id 1,
:no-tz-ts #object[java.time.Instant 0x3ce2217b "2023-11-14T20:51:28.190634Z"],
:tz-ts #object[java.time.Instant 0x3d16211d "2023-11-14T20:51:28.190639Z"]}]}
for that code snippet. I’m using `org.postgresql/postgresql 42.5.4`, so a tad behind yours. I’m running Postgres locally for this test, but that should be OK once I change the default JVM TZ, if I understand correctly. There must be something else different between our environments that I’m not getting… I’m using `openjdk-15.0.2`
…and when I run `(next.jdbc.date-time/read-as-local)`, the result looks like:
{:insert-result [#:next.jdbc{:update-count 1}],
:drop-table-result [#:next.jdbc{:update-count 0}],
:select-result [#:tz-test{:id 1,
:no-tz-ts #object[java.time.LocalDateTime 0x2448fe9f "2023-11-14T16:08:48.543563"],
:tz-ts #object[java.time.LocalDateTime 0x6fe4fbb4 "2023-11-14T16:08:48.543564"]}]}
as in, I get both columns in America/New_York time (it’s 21:09 here in Ireland right now, so 5 hours ahead) as I would expect.
Are you using some kind of coercion layer? I get an exception when I run your code:
Execution error (PSQLException) at org.postgresql.jdbc.PgPreparedStatement/setObject (PgPreparedStatement.java:1051).
Can't infer the SQL type to use for an instance of java.time.Instant. Use setObject() with an explicit Types value to specify the type to use.
In my experience, you want to avoid LocalDateTime (and all the Local* classes); ZonedDateTime is much easier to reason about because of the baked-in TZ information
@U02HPS0397S I get this:
{:tz_test/id 1, :tz_test/no_tz_ts #inst "2023-11-15T00:26:01", :tz_test/tz_ts #inst "2023-11-14T21:26:01"}
@U5FV4MJHG do you have any extensions for JDBC to deal with timestamps etc? Here's what I use to always get Instants, and ignore the java.util.Date and sql.Timestamp mess:
(extend-protocol jdbc.result-set/ReadableColumn
PgArray
(read-column-by-index [val _meta _idx]
(vec (.getArray val)))
PGobject
(read-column-by-index [val _meta _idx]
(pgobject-json-to-value val))
java.util.Date
(read-column-by-index [val _meta _idx]
(.toInstant ^java.util.Date val))
java.sql.Timestamp
(read-column-by-index [val _meta _idx]
(.toInstant ^java.sql.Timestamp val))
java.sql.Time
(read-column-by-index [val _meta _idx]
(str val)))
(extend-protocol jdbc.prepare/SettableParameter
Date
(set-parameter [value statement idx]
(.setTimestamp ^PreparedStatement statement idx
(Timestamp/from (.toInstant value))))
;; java.time.ZonedDateTime
;; (set-parameter [value statement idx]
;; (.setTimestamp ^PreparedStatement statement idx
;; (Timestamp/from (.toInstant value))))
IPersistentMap
(set-parameter [value statement idx]
(.setObject ^PreparedStatement statement idx
(value-to-json-pgobject value)))
IPersistentVector
(set-parameter [value statement idx]
(.setObject ^PreparedStatement statement idx
(value-to-json-pgobject value))))
ZonedDateTime was commented out because... things got really confusing really fast, so I'd rather get an exception if I accidentally try to store it, and force the whole system to operate on java.time.Instant, storing the TZ ID somewhere else
@U02HPS0397S Things seem to go haywire when calling `.getTime` on the `java.sql.Timestamp` object.
| :no_tz_ts | :tz_ts |
|----------------------------------------------------+----------------------------------------------------|
| 2023-11-14 13:36:51.534875 | 2023-11-14 13:36:51.534875 |
| 1699997811534 | 1699997811534 |
| 2023-11-14T13:36:51.534-08:00[America/Los_Angeles] | 2023-11-14T13:36:51.534-08:00[America/Los_Angeles] |
From this code:
(let [{:tz_test/keys [no_tz_ts tz_ts]} (first (jdbc/execute! db ["SELECT * FROM tz_test"]))]
(print-table
[{:no_tz_ts no_tz_ts
:tz_ts tz_ts}
{:no_tz_ts (.getTime no_tz_ts)
:tz_ts (.getTime tz_ts)}
{:no_tz_ts (local-dt (.getTime no_tz_ts))
:tz_ts (local-dt (.getTime tz_ts))}]))
Loading that namespace I referred to earlier does change the state in the jvm, so may be worth a REPL restart or two if things get too weird.
The code...
(let [_ (jdbc/execute! db ["create table tz_test (id serial primary key, no_tz_ts timestamp,
tz_ts timestamp with time zone)"])
_ (TimeZone/setDefault (TimeZone/getTimeZone "UTC"))
_ (jdbc/execute! db ["INSERT INTO tz_test (no_tz_ts, tz_ts) VALUES (now(), now())"])
_ (TimeZone/setDefault (TimeZone/getTimeZone "America/Los_Angeles"))
_ (jdbc/execute! db ["INSERT INTO tz_test (no_tz_ts, tz_ts) VALUES (now(), now())"])
result (jdbc/execute! db ["SELECT *, EXTRACT(EPOCH FROM no_tz_ts) AS no_tz_ms, EXTRACT(EPOCH FROM tz_ts) AS tz_ms FROM tz_test"])
_ (jdbc/execute! db ["DROP table tz_test"])]
result)
The results...
[{:tz_test/id 1, :tz_test/no_tz_ts #inst "2023-11-15T06:33:02", :tz_test/tz_ts #inst "2023-11-14T22:33:02", :no_tz_ms 1.700001182493082E9, :tz_ms 1.700001182493082E9}
{:tz_test/id 2, :tz_test/no_tz_ts #inst "2023-11-14T22:33:02", :tz_test/tz_ts #inst "2023-11-14T22:33:02", :no_tz_ms 1.699972382511892E9, :tz_ms 1.700001182511892E9}]
From the PostgreSQL console:
# select * from tz_test;
id | no_tz_ts | tz_ts
----+----------------------------+-------------------------------
1 | 2023-11-14 22:38:41.883737 | 2023-11-14 22:38:41.883737+00
2 | 2023-11-14 14:38:41.899396 | 2023-11-14 22:38:41.899396+00
Ah ha, there it is...
# SELECT *, EXTRACT(EPOCH FROM no_tz_ts) AS no_tz_ms, EXTRACT(EPOCH FROM tz_ts) AS tz_ms, now() FROM tz_test;
id | no_tz_ts | tz_ts | no_tz_ms | tz_ms | now
----+----------------------------+-------------------------------+-------------------+-------------------+-------------------------------
1 | 2023-11-14 22:38:41.883737 | 2023-11-14 22:38:41.883737+00 | 1700001521.883737 | 1700001521.883737 | 2023-11-14 22:43:34.227883+00
2 | 2023-11-14 14:38:41.899396 | 2023-11-14 22:38:41.899396+00 | 1699972721.899396 | 1700001521.899396 | 2023-11-14 22:43:34.227883+00
Here's something more succinct.
(let [; A client that is careful to record the TIMESTAMP WITHOUT TIME ZONE in UTC time.
_ (TimeZone/setDefault (TimeZone/getTimeZone "UTC"))
now (java.sql.Timestamp/from (java.time.Instant/now))
_ (jdbc/execute! db ["create table tz_test (id serial primary key, no_tz_ts timestamp,
tz_ts timestamp with time zone)"])
_ (jdbc/execute! db ["INSERT INTO tz_test (no_tz_ts, tz_ts) VALUES (?, ?)" now now])
; A client that is not in UTC
_ (TimeZone/setDefault (TimeZone/getTimeZone "America/Los_Angeles"))
result (first (jdbc/execute! db ["SELECT *, EXTRACT(EPOCH FROM no_tz_ts) AS no_tz_sec, EXTRACT(EPOCH FROM tz_ts) AS tz_sec FROM tz_test"]))
_ (jdbc/execute! db ["DROP table tz_test"])]
;; The timestamp on the server matches (it's all UTC). The issue is when
;; the client interprets it. Notice how `.getTime` does *not* match the server!
(let [{:tz_test/keys [no_tz_ts tz_ts] :keys [no_tz_sec tz_sec]} result]
(pprint
{"no_tz_ts" no_tz_ts
" tz_ts" tz_ts
"no_tz_client" (.getTime no_tz_ts)
" tz_client" (.getTime tz_ts)
"no_tz_server" (long no_tz_sec)
" tz_server" (long tz_sec)})))
Sample output:
{"no_tz_ts" #inst "2023-11-15T07:34:26.221119000-00:00",
" tz_ts" #inst "2023-11-14T23:34:26.221119000-00:00",
"no_tz_client" 1700033666221,
" tz_client" 1700004866221,
"no_tz_server" 1700004866,
" tz_server" 1700004866}
So, the writer was really careful to write it down in UTC, but it doesn't matter. If the reader doesn't have the time zone set to UTC, the reader can't get an accurate timestamp. There really is no recourse once you get a `java.sql.Timestamp`. The epoch millis will be off, so everything else in the client will be off.
This is all due to how the JDBC driver works. It's interpreting `"2023-11-14 23:38:36.227249"` as being in the client's time zone. It's not using the epoch millis to construct the `java.sql.Timestamp`.
You can see that `java.sql.Timestamp` is immune to time zone changes after being constructed.
(let [_ (TimeZone/setDefault (TimeZone/getTimeZone "UTC"))
now-ms (System/currentTimeMillis)
ts-now (java.sql.Timestamp. now-ms)
ms-before (.getTime ts-now)
_ (TimeZone/setDefault (TimeZone/getTimeZone "America/Los_Angeles"))
ms-after (.getTime ts-now)]
[now-ms ms-before ms-after])
Hopefully that code makes it clear. You should always use `TIMESTAMP WITH TIME ZONE` if you're able to. If, for some reason, you are stuck with `TIMESTAMP`, then you have to make sure that all the clients are using the same time zone (ideally UTC).
So I think the docs page should definitely recommend using `TIMESTAMP WITH TIME ZONE` (aka `timestamptz`) whenever possible.
@U02HPS0397S I have to sign off for today too, but I'll check back in tomorrow.
This code matches the client and server precision. It's a little easier to read the output.
(let [; A client that is careful to record the TIMESTAMP WITHOUT TIME ZONE in UTC time.
_ (TimeZone/setDefault (TimeZone/getTimeZone "UTC"))
now (java.sql.Timestamp/from (java.time.Instant/now))
_ (jdbc/execute! db ["create table tz_test (id serial primary key, no_tz_ts timestamp,
tz_ts timestamp with time zone)"])
_ (jdbc/execute! db ["INSERT INTO tz_test (no_tz_ts, tz_ts) VALUES (?, ?)" now now])
; A client that is not in UTC
_ (TimeZone/setDefault (TimeZone/getTimeZone "America/Los_Angeles"))
result (first (jdbc/execute! db ["SELECT *, EXTRACT(EPOCH FROM no_tz_ts) AS no_tz_raw, EXTRACT(EPOCH FROM tz_ts) AS tz_raw FROM tz_test"]))
_ (jdbc/execute! db ["DROP table tz_test"])]
;; The timestamp on the server matches (it's all UTC). The issue is when
;; the client interprets it. Notice how `.getTime` does *not* match the server!
(let [{:tz_test/keys [no_tz_ts tz_ts] :keys [no_tz_raw tz_raw]} result]
(pprint
{"no_tz_ts" no_tz_ts
" tz_ts" tz_ts
"no_tz_client" (.getTime no_tz_ts)
" tz_client" (.getTime tz_ts)
"no_tz_server" (long (* no_tz_raw 1000))
" tz_server" (long (* tz_raw 1000))})))
For example:
{"no_tz_ts" #inst "2023-11-15T07:53:15.584837000-00:00",
" tz_ts" #inst "2023-11-14T23:53:15.584837000-00:00",
"no_tz_client" 1700034795584,
" tz_client" 1700005995584,
"no_tz_server" 1700005995584,
" tz_server" 1700005995584}
I mean, you have to account for round trips, and for when `NOW()` is actually run in Postgres: depending on whether it's in a transaction or not, it's evaluated at a different time (you can also use the transaction start time). Plus, of course, clock drift on different machines. Working with time is a fascinating thing
Ah ok, I see - and can reproduce using your code. It’s when the TS is inserted using a different TZ from the one the client uses to read it. That’s fair enough, and given what you are saying I think you are right to suggest a change in the documentation. Thanks for the insight, that is useful.
@U02HPS0397S No problem! Thanks for putting together that code snippet so I had a convenient starting point!
@U04V70XH6 Take a look if you have a moment. I'm happy to summarize too. Just let me know what you need.
@U02HPS0397S @U5FV4MJHG If you two want to work on a PR for the PG timestamp hints & tips, that would be great. It seems that having the client JVM in UTC also addresses this, yes? So that part of the recommendation -- everything in UTC, which is many people's general recommendation for all databases -- still holds true (and for DBs without the TZ support it is the only sane way to go).
@U04V70XH6 I'd be happy to work with @U02HPS0397S. Yes, having all JVM clients in UTC is the workaround.
Thanks for the discussion!
Just found this thread because I was starting to pull my hair out in an attempt to figure out what was wrong, when it all boiled down to using a plain `DATE` without a time zone.
FWIW, I have never had any problems with using `WITH TIME ZONE` types, as a zoned timestamp has all the information one needs to construct an instant, whereas a bare instant doesn't have it and must be used within an assumption about the timezone.
Hereby I vow to never use dates/times without a timezone when I can help it. :D Carefully making sure that every single piece in the orchestra is in UTC and stays in UTC is just asking for trouble IMO.