reitit

jussi 2025-03-19T07:17:18.808239Z

Hi, wondering if anyone has done proxying with a Reitit/Ring combo for very large data sets? We have a situation where we would like to just proxy a request and its response, but Jetty dies after a naive clj-http/get call (without decompression, of course) with:

; java.lang.OutOfMemoryError: Required array length 2147483638 + 4484 is too large
So I assume the response size demands streaming. I looked at a very old ring-proxy lib, but didn't really want to adopt it, so I decided to ask whether someone has dealt with a similar issue.

opqdonut 2025-03-19T07:27:11.960179Z

it could just work out if you hand the :body inputstream you get from the ring request to the ring response

opqdonut 2025-03-19T07:27:43.042729Z

anyway, that's how ring output streaming works, just have your response :body be an instance of InputStream

opqdonut 2025-03-19T07:28:28.621519Z

ring.util.io/piped-input-stream can help you construct an inputstream if you need some processing as well

opqdonut 2025-03-19T07:30:01.752729Z

though you might need to use async ring to get all the benefits...

opqdonut 2025-03-19T07:30:16.057879Z

I've done response streaming with sync ring and it's worked out fine, even for multi-gigabyte responses.
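
A minimal sketch of the approach described above, assuming the upstream response is a map with an InputStream under :body (for example what clj-http returns when called with `{:as :stream}`). The function name `stream-response`, the placeholder URL and the choice to forward only Content-Type are assumptions for illustration, not something from this thread.

```clojure
(defn stream-response
  "Builds a Ring response that reuses the upstream InputStream as :body,
  so the payload is never buffered into one big array or string."
  [upstream]
  (let [content-type (get-in upstream [:headers "content-type"])]
    {:status  (:status upstream)
     ;; Only Content-Type is forwarded here; which headers to pass on is a design choice.
     :headers (if content-type {"Content-Type" content-type} {})
     ;; The very same stream object is handed to Ring, which copies it to the client.
     :body    (:body upstream)}))

(comment
  ;; Hypothetical wiring with clj-http (requires the clj-http dependency);
  ;; the URL is a placeholder.
  (require '[clj-http.client :as http])
  (defn proxy-handler [_request]
    (stream-response
      (http/get "http://upstream.example/export"
                {:as :stream :decompress-body false}))))
```

With a synchronous adapter, Ring copies an InputStream body to the client and closes it afterwards, so the handler itself should not wrap the stream in `with-open`.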

fuad 2025-03-19T20:53:29.219689Z

Is there a recommended approach to keep the original ring request body around, perhaps as a byte array, when using the reitit/muuntaja decoding middleware? I need the original body for a signature verification but I can’t assume the body is a string for all cases, https://github.com/metosin/muuntaja/issues/143.

fuad 2025-03-19T20:54:56.221439Z

I guess I could read the body as a string (e.g. using ring.util.request/body-string), so that I get a consistent value, no matter what the original body was (e.g. original body could be string, input stream, etc) and then stash a byte array copy somewhere for the signature verification and replace the :body key with an input stream derived from the string.

fuad 2025-03-19T20:55:08.121389Z

Seems a bit convoluted, so I’m looking for potential alternative approaches.

Patrix 2025-03-20T02:21:41.814489Z

I’ve done it based on some previous advice from here or GitHub. I’ll find it later and post back

2025-03-20T02:40:10.135329Z

I can't speak from a reitit perspective, but turning the body from an input stream into a string and then into a byte array is kind of silly; you should just go right to a byte array.

2025-03-20T03:17:10.606049Z

request/body-string is problematic because it builds a string using the default jvm encoding, not the encoding that may have been specified for the request.
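
A rough sketch of the two points above: read the stream straight into a byte array, and decode it to a string only when needed, with an explicitly chosen charset. The helper names `body->bytes` and `bytes->string`, and the UTF-8 fallback, are assumptions for illustration.

```clojure
(require '[clojure.java.io :as io])
(import '(java.io ByteArrayOutputStream))

(defn body->bytes
  "Reads an InputStream fully into a byte array, with no intermediate string."
  [in]
  (let [out (ByteArrayOutputStream.)]
    (io/copy in out)
    (.toByteArray out)))

(defn bytes->string
  "Decodes bytes using the given charset name. Falls back to UTF-8 when
  charset is nil (the fallback is an assumption, pick what fits your API)."
  [bs charset]
  (let [cs (or charset "UTF-8")]
    (String. ^bytes bs ^String cs)))
```

In a Ring app the charset name could come from the request's Content-Type header, for example via `ring.util.request/character-encoding`.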

Patrix 2025-03-20T13:01:54.779869Z

Ok, so I wasn't using muuntaja or facing the exact same issue as you, but I did need both the raw body and the string/parsed body for signature verification. I achieved it with a middleware. It's similar to your link but does a few more things.

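(require '[clojure.java.io :as io]
         '[clojure.data.json :as json])

;; Buffers the whole body so it can be parsed and also kept raw for signature checks.
(defn wrap-raw-body [handler]
  (fn [request]
    (let [buffered-body (slurp (:body request))  ; slurp decodes as UTF-8 by default
          ;; Re-encode with the same charset so downstream readers get a fresh stream.
          body-stream (io/input-stream (.getBytes buffered-body "UTF-8"))
          parsed-body (json/read-str buffered-body :key-fn keyword)]
      (handler (assoc request :body body-stream :raw-body buffered-body
                      :body-params parsed-body)))))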
You can take inspiration from that to achieve your specific goal. See below for the previous discussion AND for a link to the discussion before that: https://clojurians.slack.com/archives/C7YF1SBT3/p1704477941426189 https://clojurians.slack.com/archives/C0K65B20P/p1696055080976369 (this one was pedestal rather than reitit, but the principle is the same)

fuad 2025-03-20T13:10:55.018549Z

Thanks y’all for the tips! I think my scenario is slightly more convoluted, will document it here for posterity (and also to make sure I understand what I’m doing):
• I have endpoints in my router that expect the body encoded as a string (e.g. utf-8 json string) and others that expect data as binary.
• muuntaja is applied for all incoming requests in all routes. If it detects a string encoding and a known content type, it will:
  ◦ take the value of body as a string (I’m not entirely sure how it does that, I assume there’s a protocol or multimethod somewhere that deals with both strings and streams).
  ◦ parses it accordingly (e.g. json, edn, etc)
  ◦ the original body is not discarded, but if it was a stream, it will be empty now
• I have this one specific route which needs access to the original body value. If I want to preserve the original value, I need a middleware that runs before muuntaja.
• However, muuntaja doesn’t appear to deal with :body being a byte-array. It expects string or stream. So if I want to keep using muuntaja as it is, I need to copy my byte array somewhere else (e.g. a :original-body key) and then create a new stream or string from the byte array and put it back into :body

ring.util.request/body-string is no good indeed also because of the use case for accepting binary data
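
The last bullet above could be sketched roughly like this, assuming a synchronous (one-argument) handler and JDK 9+ for `InputStream.readAllBytes`. The middleware name `wrap-original-body` is hypothetical; the :original-body key is the one suggested in the message.

```clojure
(import '(java.io ByteArrayInputStream InputStream))

(defn wrap-original-body
  "Copies a stream :body into a byte array, stashes it under :original-body,
  and replaces :body with a fresh stream over the same bytes so that later
  middleware can still read it. Non-stream bodies are passed through untouched."
  [handler]
  (fn [request]
    (let [body (:body request)]
      (if (instance? InputStream body)
        (let [raw (.readAllBytes ^InputStream body)] ; JDK 9+ (assumption)
          (handler (assoc request
                          :original-body raw
                          :body (ByteArrayInputStream. raw))))
        (handler request)))))
```

For this to help, the middleware would have to sit before the muuntaja decoding middleware in the chain. It buffers the entire body in memory, so it is probably only reasonable on routes with bounded request sizes.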

fuad 2025-03-20T13:11:58.982899Z

As a stopgap solution I ended up splitting my middleware stack into two and running the special middleware that always converts body to a string only for the specific endpoint that needs access to the original data. This works fine but is not super elegant, so I’m trying to find a solution that harmonizes all use cases in a single middleware stack.