reitit

2024-08-28T12:45:23.245789Z

hi everyone, I'm using ring + reitit + muuntaja and am trying to construct a response where I stream a clojure.java.jdbc query result as json, I was wondering if there's any prior art for that somewhere?

juhoteperi 2024-08-28T12:47:35.405629Z

next.jdbc has nicer streaming support

juhoteperi 2024-08-28T12:48:39.968839Z

Jsonista has three open PRs trying to expose streaming API to write (and read): https://github.com/metosin/jsonista/pulls

2024-08-28T12:49:02.029669Z

sweet - yeah the stack i'm working in uses a lot of older libraries, I need to be judicious in one parts I upgrade because each upgrade can have a wide-ranging impact. would be great to use next.jdbc though

juhoteperi 2024-08-28T12:49:02.477309Z

So right now you pretty much need to just take the object-mapper from Jsonista, and then use Jackson directly

juhoteperi 2024-08-28T12:49:11.373399Z

Muuntaja probably isn't really useful here

juhoteperi 2024-08-28T12:49:56.973639Z

If you need a real streaming response, you might use to get a OStream where you can write the encoded JSON

juhoteperi 2024-08-28T12:50:27.299469Z

So you'll build the ring response map directly in the handler code and Muuntaja isn't really used to encode Clj -> JSON

juhoteperi 2024-08-28T12:51:10.271609Z

I think I've also previously just called JDBC directly, going around clojure.java.jdbc to get better performance in a such case. I don't think I have public example anywhere.

juhoteperi 2024-08-28T12:52:44.194309Z

Next.jdbc has some notes about connection options that are required for streaming result sets (it depends on the DB): https://cljdoc.org/d/com.github.seancorfield/next.jdbc/1.3.939/doc/getting-started/tips-tricks#streaming-result-sets-1 Even if you do everything else, without these options the DB driver will load full result set from DB and you might run out of memory

juhoteperi 2024-08-28T13:01:04.728019Z

Writing streaming JSON looks something like this:

(def ^com.fasterxml.jackson.databind.ObjectMapper mapper (json/object-mapper))

(defn streaming-json-array-response [db _opts]
  (ring-io/piped-input-stream
    (fn [^OutputStream output-stream]
      (let [^com.fasterxml.jackson.databind.ObjectWriter writer (.writerFor mapper Object)
            ^com.fasterxml.jackson.databind.SequenceWriter seq-writer (.writeValuesAsArray writer output-stream)]
        (...
          (.write seq-writer (into {} row)))
        (.close seq-writer)
        ))))


  {:status 200
   :headers {"Content-Type" "application/json;charset=utf-8"}
   :body (streaming-json-array-response db {})}
With next.jdbc you can use plan function to get a Reducible you can go over with reduce and then call .write with each row. And you probably should catch exceptions, because when you are streaming the response, 200 OK is already sent if you get an exception when writing JSON for some item, so you can change the response status. I think we got this working with ~2GB JSON output with -Xmx 100M or something

2024-08-28T13:03:49.853349Z

wow this is perfect, thank you!

juhoteperi 2024-08-28T13:15:36.163089Z

It might be also important to create a transaction for the query, even though you only run one read-only query. Might be specific to Postgres.

2024-08-28T13:19:32.710969Z

I have the pleasure of using oracle 😭

πŸ™ˆ 1
valtteri 2024-08-28T13:55:50.587359Z

It can be done without Jackson quite easily IF you have a lot of JSON documents that you need to stream and the result can be line-delimited JSON (JSON-LD). However, if you need a valid (single) JSON value, then Jackson is the way to go. πŸ™‚

valtteri 2024-08-28T14:04:43.389719Z

Also perhaps worth mentioning: next.jdbc doesn’t conflict with clojure.java.jdbc so you can use both in the same project.

πŸ‘ 1
Ben Sless 2024-08-28T17:26:23.119359Z

Maybe you'll find this useful https://github.com/bsless/jsonista.streams/

2024-08-28T17:33:21.329059Z

that looks useful! thank you. what i'm hoping for, which might not be realistic, is to stream a data structure that looks something like

{:key-1 "some-value"
 :key-2 "some-value"
 :database-records streaming-sequence-of-values-as-array}
I can kinda see how this would be possible using the jackson API, but haven't gotten there yet

Ben Sless 2024-08-28T17:57:43.841219Z

that's possible but hard you can do it manually with the httpkit channel api

Ben Sless 2024-08-28T18:09:57.005239Z

also, this response format would probably be harder for clients to stream process

2024-08-28T18:10:33.275779Z

hm that's a good point πŸ˜•

Ben Sless 2024-08-28T18:11:38.664129Z

It's probably better to return a JSON array or JSON-LD

Ben Sless 2024-08-28T18:15:46.869929Z

you can refer to how analytic databases like Druid or Clickhouse allow users to format the responses

juhoteperi 2024-08-28T18:24:05.801709Z

It can be done with the Jackson API. I might have an example somewhere.

❀️ 1
juhoteperi 2024-08-29T06:58:34.408629Z

On read side you can use .readTree to read JSON object with some large arrays:

(defn read-geojson-features
  "Try reading geojson file without loading full features array to memory"
  [^Reader f]
  (let [^ObjectMapper om json/default-object-mapper
        ^JsonNode tree (.readTree om f)
        ^JsonNode node (.get tree "features")]
    (map (fn [node]
           (.treeToValue om ^JsonNode node ^Class Object))
         node)))

juhoteperi 2024-08-30T09:10:41.125159Z

I don't think switching to StreamableResponses alone is enough to stream JDBC results to the response, without having the full collection in memory

juhoteperi 2024-08-30T09:29:41.317369Z

But you could probably replace piped-output-stream with custom StreamableResponseBody implementation, that does the streaming JDBC query

juhoteperi 2024-08-30T09:30:23.731129Z

Because you need to keep the DB TX open until everything is written to the client, which doesn't work if you just return a lazy seq from the handler

2024-08-30T13:14:50.605879Z

I got something working πŸ™‚ streaming a large data set placed in an arbitrary position in a clojure data structure - will share soon

juhoteperi 2024-08-29T07:01:28.352949Z

I think writing can be done with ObjectWriter, you can get from the object-mapper

juhoteperi 2024-08-29T07:21:32.390899Z

I can't found a example. Maybe I tested it somewhere, but it didn't end up being used. I think it is something similar to that read example, you need to write the top level properties yourself, but you can use jsonista to convert the values of those values. Then you use some calls to stream the large array into the JSON output.

2024-08-29T12:08:53.137899Z

If I put something together I'll be sure to share it πŸ™‚

2024-08-29T12:24:34.646739Z

ooooh!

Ben Sless 2024-08-29T12:25:17.386069Z

your sound effect for this post https://youtu.be/2P5qbcRAXVk

πŸ˜‚ 1