Fork me on GitHub

My understanding of RDS is that you’re still on the hook for some aspects of maintenance/high availability etc. Amazon is pushing Aurora for an even more managed solution.


In this case, we've had to open support tickets with them, and convince them that the issue is on their sides.


But I’d love to just being able to pay someone and they deal with all the ops stuff, giving me the option on which cloud to run. Mongo is doing that with Atlas and they support AWS, Azure, GCP - very nice.

Jakub Holý (HolyJak)06:05:17

With, what is the optimal way to select a count?

      (map :count)
      (plan conn ["SELECT count(*) as count from mytable"))


It depends on how obsessed you are about "optimal"...

Jakub Holý (HolyJak)16:05:10

I guess I shouldn't obsess to much, Clojure time is likely negligible compared to the DB call. What is idiomatic? Execute-one?


(reduce (fn [_ row] (reduced (:count row))) nil (plan conn ["SELECT COUNT(*) AS count FROM mytable"]))
is probably going to "do" the least but it's pretty ugly.


Frankly, I'd probably just do (:count (execute-one! conn ["SELECT COUNT(*) AS count FROM mytable"]))


If folks think that's a really common use case, I can add a convenience function for it, I guess.

Jakub Holý (HolyJak)17:05:20

You mean selecting a single column from a single row? I guess it is common but e ecute-one! is likely sufficient for that use case, it isn't worth increasing the API surface...


It would not be part of the core API. Not sure where I'd put it, but it would be like query but under the hood it would use plan and reduce and apply a function to the row.


(query-one conn :count ["select count(*) as count from mytable"])
something like that (where :count could be any function of a row).


In an early version, I actually did have something like that...

Jakub Holý (HolyJak)20:05:31

Perhaps :count would be unnecessary, using rs.getObject(int columnIndex) to simply grab the first column?


The general case would be to select a specific named column -- or perform some operation on the first row.

👍 4

Otherwise, you're dealing with a really special case of "the first column of the first row of the result set"... and it would have to be a different type of machinery because the result set is already exposed as a lazy hash map by plan.


next.jdbc very deliberately doesn't expose the mutable ResultSet object directly, because of the problems it can cause (with consumer laziness etc). So once you are reducing, you get a thing that "looks like" a hash map but doesn't actually "get" any column values until you ask for them.


You'd probably have to write a very specialized RowBuilder to get around that abstraction like that. Definitely not worth it at that point 🙂

👍 4

I think you’d use execute-one for that? Just browsed the docs once so perhaps I’m wrong.

Jakub Holý (HolyJak)09:05:41

that returns a map {"count" 0}, I need the number, not a map, it is wasteful to create it.


Aren't you creating it in your example above anyway?

Jakub Holý (HolyJak)16:05:02

No, it returns just a number


But you are doing map :count so you are creating a data structure to get the count out of


It’s not creating a map, it’s getting a reducible thing. But I think for one element only the map creation overhead is negligible?

Jakub Holý (HolyJak)16:05:57

Compared to the DB call time it likely is. Good point


you could just (reduce #(:count %2) nil rs-plan) I am not sure transduce is useful here at all. reduce will go through the IReduceInit path too


if you're really crazy about efficiency I guess you can create a rowbuilder (or whatever it's called in that just yields count() value and nothing else, but I guess all of this probably accounts to noise ultimately, I am not sure it's worth doing that

👍 4

Updates on the TFB DB benchmarks: reitit+`immutant-nio`+`jsonista`+`porsas` is now on par with the Java Undertow entry (with Jackson & plain JDBC).


next steps: go async, should double the throughtput.


async at what level (db access, http server ?)


@mpenet both, currently it's synchronous jdbc via hikari-pool at a worker thread pool, will go for a non-blocking postgresql java client, running in a undertow nio-pool.


last time I check the async postgres client was not really good


but maybe that changed


or you're thinking about another client than the one I tried back then


You can use the java client directly


should be about 700k tps, based on vert.x results.


sounds good


but will wrap it in into porsas, so it feels Clojure.


yeah it's not the one I tried, for sure it wasn't from vert.x


the sales pitch from vertx-sql-client sounds really good, eager to see what you come up with


(reduce (fn [_ row] (reduced (:count row))) nil (plan conn ["SELECT COUNT(*) AS count FROM mytable"]))
is probably going to "do" the least but it's pretty ugly.


Any ideas as to why a SQL query that usually takes 0.5s to run would slowly start taking longer and longer to run until it takes 1.5mins to finish? It works its way up to 1.5mins after 20-30 mins of running the query quickly. I can consistently run the query from my REPL in 0.5s.


@kenny Do you mean inside a running application? Could you have a memory leak somewhere and the JVM is running GC all the time, so it just seems like it's the query running more slowly?


Yes - running in an app. Perhaps. It's curious that the only thing that has slowed down is the Postgres query. I would think that other things would start running slower too. The JVM Heap is at only 25% capacity. I would think it'd be higher in the case of a memory leak.


Are you using a connection pool or are you opening/closing a new connection each time?


Connection pool - c3p0


The thread count continues to climb up and was at ~132 at the time of a slow query. Don't think that number of threads is a problem though.


Perhaps you're not closing connections properly and you're forcing c3p0 to do the work of scavenging for connections?


Ooo maybe! Lemme see.


Definitely no with-open around my queries. Given a {:datasource pooled-datasource}, how are you supposed to close the connection after running a query?


If that's your db-spec, and you're just passing that to query for example, it will open and close the connection for you.


At this point, I'd have to see your code to offer any further guidance...


Yeah, that's how I'm using it.


{:datasource pooled-datasource} get passed as the db-spec directly to jdbc/query.


You'll probably want to get some thread dumps over time from startup to slow down and look at where the new threads are coming from.


Could extraneous threads be a cause for a slowdown in queries?


Without a lot of debugging information and access to your application and code, I couldn't say.


I doubt the query is actually running slower -- but you could dial up logging in the database to figure that out.


Agreed - I don't think the query is actually running slower either. Especially given I can run it quickly from the REPL.


Though it is so strange that 95+% of the processing time comes directly from a postgresql.query execution, not from anything in my code. Could be a red herring, I suppose.


Is the query producing larger results over time?


No - always 5000 or less records.


Are you paginating through results? i.e., starting further and further into the results each time?


I'm surfacing the entire result set into memory for processing - no paginating.


Pretty sure it's not memory related given how little memory is actually used.


Time for some thread dumps then 🙂


Never done that before 🙂 How do you typically do that on a remote server? Do you actually ssh onto the box?


I'd have to Bing/Google the command for getting thread dumps of the JVM. And, yes, you'll need to be local to wherever the JVM is running.


jstack - or in a *nix tty Control-\


I have a guess that I don't totally understand yet... The long SQL queries align with the spikes in DataFileRead on this graph. Not entirely sure what DataFileRead means. Perhaps the disk is the bottleneck here.


I forget the signal that Control-\ sends, but you can also send that via kill as an equivalent to jstack


Definition for DataFileRead: > In this wait event, a session is reading data from Aurora Storage. This may be a typical wait event for I/O intensive workloads. SQL statements showing a comparatively large proportion of this wait event compared to other SQL statements may be using an inefficient query plan that requires reading large amounts of data. Perhaps my SQL query is inefficient? The weird thing to me is that I can consistently run the queries from the REPL and they run quickly.


Oooo, this might be it: My dev environment RDS instance is provisioned with a "General Purpose SSD" which has "the ability to burst to 3,000 IOPS for extended periods of time". I'm guessing that after I use up all the available "burst," the inefficient query starts to run way slower.


Ah, you didn't mention that this is up on AWS... 🙂


That would make it much harder to do the sorts of debugging I would normally do. But, yeah, if it's spiking in some sort of DB read on the filesystem, you're running into cloud IOPS limitations.


Definitely worth doing an explain on your query tho', just to make sure it's as efficient as possible. Just because it runs fast at first on AWS, doesn't mean it's an efficient query I guess...


Is there a way to use explain with jdbc?


Yes, query takes an :explain? option.


(it should be in the docs)


Doesn't show up in query docs AFAICT.


Yeah - that's what I see.


Oh, I see it there in the Spec.


Added back in the 0.6.2 Alpha days (which became 0.7.0)


Is there a good way to view the query plan? It comes out unformatted in my REPL. May be because of keywords with spaces in them.


Specify :explain-fn prn and you should get a more readable version?


Or :explain-fn clojure.pprint/pprint perhaps


Is there a doc on what that fn takes? The spec is just fn?.


(I've used this so rarely that I haven't even considered it for next.jdbc -- I think explicitly executing an EXPLAIN SQL statement is probably better and dealing with the result set...)


:explain-fn accepts a function of a single argument (the plan) and does whatever you want with it.


If I was doing that in REBL, I'd probably pass tap> 🙂


Ah, guessing it's slow because of the "Index Scan Backward"... cost=0.69..1415021.08 facepalm


Hmm, I wouldn't expect that to be too bad... but I guess it depends on what your data looks like and what the index actually is...


Anything in the plan that suggests temporary tables or table scans or file reads...?


Nope. This is the full explain:

Limit  (cost=0.69..190586.09 rows=5000 width=110)
  ->  Index Scan Backward using unique_time_slice_60000 on time_slices_60000  (cost=0.69..1415021.08 rows=37123 width=110)
        Index Cond: ((workload_id = '5cc897c9-3bae-4d90-9521-0bcb29188589'::uuid) AND ("timestamp" <= '2019-05-28 14:09:26.662'::timestamp without time zone) AND (statistic = 'avg'::text))
        Filter: ((identifier = '[:time-slice/customer-metric #uuid "5cc897c9-1fc4-47d8-b511-2460bcd85914"]'::text) OR (identifier = '[:time-slice/customer-metric #uuid "5cc897c9-2cab-4167-bf27-53268c4f3de8"]'::text) OR (identifier = '[:time-slice/customer-metric #uuid "5cc897c9-494f-4655-b610-c7d65a28ae0c"]'::text) OR (identifier = '[:time-slice/customer-metric #uuid "5cc897c9-0d57-4914-8854-971be0fc1221"]'::text) OR (identifier = '[:time-slice/customer-metric #uuid "5cc897c9-ee8f-46dd-aa58-b4fef7510def"]'::text))


37123 rows is not that many :thinking_face:


BTW I switched back to a different query (doesn't do exactly what I want) and the slow query problem goes away. So it definitely is an inefficient query.


Wow, that explain looks very different to the MySQL ones I'm used to!


Added a DESC index and got this plan. Not sure if that's better or worse.


The query runs faster in the REPL. It's tough to understand all the tradeoffs it's making.


Those ORs are probably going to kill you... My old DBA used to be death on OR -- he nearly always wanted us to rewrite it using UNION or IN ...


"BitmapOr (cost=2986.99..2986.99 rows=38467 width=0)"


(the sort is expensive too .. much more .. ugh, SQL is hard sometimes!)


Oh really? I'm not super familiar with SQL. I'll read about UNION. All that my query is doing is finding all the rows between two dates and then filtering that by a bunch of =s.


When originally writing this, I either had to write 1 query or N queries and join them together in then end. I figured 1 query would be more efficient ¯\(ツ)


Maybe it'd be better to write N queries and only try combining them if perf becomes a problem.


It seems UNION basically does the N queries and joins them together 🙂


I'll give UNION a go!


UNION increased the cost


Did you add unique?


(if so, I'll warrant that is what increased the cost)


No. Guessing that comes from UNION.


Switching to UNION ALL removes the Unique.


Doing each query individually is significantly faster haha. There's probably a way to improve the perf of the single SQL query but may not be worth it.


Hahaha... yeah, we have some monster queries where the optimization turned out to be: run multiple queries and combine them in Clojure!

✔️ 12