#sql
2022-02-08
plexus13:02:05

We're looking at API pagination for a client, backend is postgres. The old school way is to do this with OFFSET/LIMIT, but that doesn't yield stable results. I was considering keeping ResultSet instances in memory since they can fetch results incrementally. Is this a bad idea? Would this eat up my database connections?
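
to make that concrete, the shape i mean is something like this (illustrative items table, ds being a next.jdbc datasource):

(require '[next.jdbc :as jdbc])

;; page 3 at 25 rows/page; if rows before offset 50 are inserted or
;; deleted between requests, pages can overlap or skip entries
(jdbc/execute! ds
  ["SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?" 25 50])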

plexus13:02:48

doing some googling and reading but not finding a lot of great results so far; maybe folks here know more about these things :)

Lennart Buit13:02:17

Bit of a curve ball: you could look at how GraphQL Relay connections are implemented in Postgres-based systems. They deal with similar problems to the ones you're describing 🙂. (Not quite the direct answer you were looking for, I guess, but it may be easier to google ^^)

plexus13:02:22

from what I gather you would have to keep the connection exclusive to the resultset as long as you plan to use it, since cursors are automatically closed on transaction boundaries https://www.postgresql.org/docs/11/plpgsql-cursors.html
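
a quick sketch of that constraint (made-up items table, next.jdbc style):

(require '[next.jdbc :as jdbc])

;; the cursor lives only inside the transaction, so the connection is
;; pinned for the whole pagination session
(jdbc/with-transaction [tx ds]
  (jdbc/execute! tx ["DECLARE items_cur CURSOR FOR
                      SELECT id, name FROM items ORDER BY id"])
  (jdbc/execute! tx ["FETCH FORWARD 25 FROM items_cur"])   ; page 1
  (jdbc/execute! tx ["FETCH FORWARD 25 FROM items_cur"]))  ; page 2
;; once the transaction ends the cursor is gone (unless declared
;; WITH HOLD, which holds server-side resources instead)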

Lennart Buit13:02:36

Yeah, I got a bit intrigued; definitely not an expert on this. One solution I found is encoding the primary key / sort order in your cursors; then you won’t have limit/offset-style problems. But not super satisfactory yet.
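
e.g. a keyset-style page query (illustrative, made-up table/columns):

(require '[next.jdbc :as jdbc])

;; the client echoes back the last id it saw; inserts/deletes among
;; earlier rows no longer shift page boundaries
(jdbc/execute! ds
  ["SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?"
   last-seen-id 25])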

nbardiuk13:02:31

another concern I have is that a read transaction can block write transactions, depending on the isolation level. Long reads add overhead to writes

mjw13:02:55

Out of curiosity, is the reason the result set is "unstable" that the database is constantly being updated and therefore paging includes previous results?

plexus17:02:00

not sure what the rate of update is in this case @UGTAV6LR2, but yes that is the general issue with limit/offset style pagination, results across pages are not consistent in the face of inserts/deletes

πŸ‘ 1
emccue20:02:08

cursor pagination

isak23:02:37

For GraphQL relay they make the client deal with that. Every node has a stable cursor for the basic query arguments, so that if one is deleted and you get an error while paginating, you can use an adjacent cursor and recover. Storing the whole resultset seems like it would be expensive to me, and I'd avoid it, especially for public APIs.

pesterhazy14:02:25

If I understand you correctly, this would limit your ability to scale horizontally, as you'd need to direct a client to the same physical machine every time. And you'd need to keep a connection open, which is a precious resource. For those reasons alone I would avoid such an approach

pesterhazy14:02:18

Have you considered a continuation token?
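
e.g. an opaque token that just wraps the last-seen sort key (a minimal sketch, made-up names):

;; base64url-encode the last id handed out with a page of results
(defn encode-token [last-id]
  (.encodeToString (java.util.Base64/getUrlEncoder)
                   (.getBytes (str last-id) "UTF-8")))

;; decode it back into a bound for WHERE id > ?
(defn decode-token [token]
  (Long/parseLong
    (String. (.decode (java.util.Base64/getUrlDecoder) token) "UTF-8")))

;; no server-side state and no pinned connection needed between pages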

mccraigmccraig19:02:37

are there any string escaping fns in next.jdbc? (i've got a complex templated query with some dynamic literal tables with some values which need escaping, and i can't see how to construct it as a prepared statement. at least, not without a lot of additional complexity)

seancorfield19:02:59

@mccraigmccraig Can you be a bit more specific? I'm not sure what you're really asking... Maybe if your table names are dynamic, build the SQL with HoneySQL?

mccraigmccraig20:02:15

the VALUES table expressions contain some literal values which need any quotes in them escaped... i can do this myself ofc, but wondered if there are any utility methods around next.jdbc to do it for me

mccraigmccraig20:02:48

the number of rows in the VALUES expressions varies though, and the number of VALUES expressions also varies

seancorfield20:02:42

Sounds like you should be building the SQL with HoneySQL -- it'll lift those VALUES out as parameters so next.jdbc can use a prepared statement.

seancorfield20:02:40

By the time you've built the string, it's too late for next.jdbc to do anything about it. Hence, build the SQL string and its parameters with HoneySQL and then pass the result to next.jdbc.
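
For example (made-up table and data):

(require '[honey.sql :as sql])

(sql/format {:insert-into :my-table
             :values [["abc'def" 42] ["ghi" 99]]})
;; => ["INSERT INTO my_table VALUES (?, ?), (?, ?)" "abc'def" 42 "ghi" 99]
;; the strings travel as parameters, so quoting never touches the SQL text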

mccraigmccraig20:02:06

oh, yeah, i want to escape while i'm building the string, not after the fact

seancorfield20:02:47

You can basically give HoneySQL your data structure as part of the overall DSL and let it figure stuff out for you.

mccraigmccraig20:02:48

and i don't really want to use honeysql because (at least last time i used it) it was hard to get it to generate exactly the sql i wanted

seancorfield20:02:01

Was that with the 1.x version?

mccraigmccraig20:02:39

don't know. it was a while back. it worked ok, but it was quite hard work

seancorfield20:02:02

You'll find the 2.x version is a lot more flexible and a lot more consistent. And it supports a lot more syntax than 1.x used to.

seancorfield20:02:31

I guess I could ask "was I maintaining HoneySQL last time you tried it?" πŸ™‚

mccraigmccraig20:02:20

lol, don't know

mccraigmccraig20:02:55

hmm. looks like honeysql 2.x is the solution i want for the future then... but sadly i don't have time to convert these queries now, so i'll have to hack some string escaping together for now
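
probably just standard quote-doubling, something like this made-up helper:

(require '[clojure.string :as str])

;; escape a sql string literal by doubling embedded single quotes
(defn sql-quote [s]
  (str "'" (str/replace s "'" "''") "'"))

(sql-quote "abc'def")
;; => "'abc''def'"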

seancorfield21:02:06

Maybe you could leverage HoneySQL 2.x just to build the VALUES substring?

dev=> (sql/format {:values [["abc'def" "x" 42 nil]]} {:inline true})
["VALUES ('abc''def', 'x', 42, NULL)"]

mccraigmccraig21:02:56

ooo - now that's a good idea - thanks @U04V70XH6!

mccraigmccraig09:02:28

hmm. this was trivial, until uuids - honeysql formats them without any quotes by default:

(sql/format {:values [["abc'def" (clj-uuid/v1) 42 nil]]} {:inline true})
;; => ["VALUES ('abc''def', c28fa6e0-8986-11ec-8ca3-a50d2de755f7, 42, NULL)"]
but i need them quoted as strings for my postgresql schema. is there any way to get it to quote the uuid values as strings (i couldn't see anything in the manual)? no biggie if not, just minor awkwardness - a price worth paying to avoid manually escaping string values
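
as a stopgap i guess i can stringify them myself before formatting, e.g.:

(sql/format {:values [["abc'def" (str (clj-uuid/v1)) 42 nil]]} {:inline true})
;; the uuid now inlines as a quoted string literal, which postgres
;; casts back to uuid for a uuid column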

seancorfield16:02:34

Create an issue - it's a bug (well, an omission).

seancorfield16:02:25

What's inlined today is a pretty small set of types. I can add support for more.

mccraigmccraig17:02:01

submitted, with a thought on a possible difficulty: https://github.com/seancorfield/honeysql/issues/385

seancorfield18:02:28

fixed πŸ™‚

mccraigmccraig18:02:49

speedy! πŸ˜„