Hi,
wondering if anyone has done proxying with the Reitit/Ring combo for very large data sets? We have a situation where we would like to just proxy a request and its response, but Jetty dies after a naive clj-http/get call (without decompression, ofc) with
; java.lang.OutOfMemoryError: Required array length 2147483638 + 4484 is too large
So, I assume the response size demands streaming. I looked at a very old ring-proxy lib but didn't really want to take it into use, so I decided to ask if someone has dealt with a similar issue.
it could just work out if you hand the :body inputstream you get from the ring request to the ring response
anyway, that's how ring output streaming works, just have your response :body be an instance of InputStream
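A minimal sketch of that idea, not a definitive implementation. `fetch-upstream` and `fake-upstream` are hypothetical names: the first stands for whatever makes the upstream call and returns a map whose :body is an unconsumed InputStream, with headers as plain string keys. The point is only that the handler never reads the stream into memory.

```clojure
(ns example.stream-proxy
  (:import (java.io ByteArrayInputStream)))

(defn make-proxy-handler
  "Returns a synchronous Ring handler that forwards the upstream body as-is.
  `fetch-upstream` is a hypothetical function: request -> {:status :headers :body},
  where :body is an InputStream that has not been consumed yet."
  [fetch-upstream]
  (fn [request]
    (let [{:keys [status headers body]} (fetch-upstream request)]
      {:status  status
       ;; Forward only the headers needed to interpret the payload.
       :headers (select-keys headers ["Content-Type" "Content-Encoding"])
       ;; No slurp, no byte array: Ring copies an InputStream body to the
       ;; client through a buffer and closes it afterwards.
       :body    body})))

;; Stand-in for a real upstream call, for illustration only.
(defn fake-upstream [_request]
  {:status  200
   :headers {"Content-Type" "application/octet-stream"
             "X-Internal"   "secret"}
   :body    (ByteArrayInputStream. (.getBytes "payload" "UTF-8"))})
```

With clj-http, one way to get such a stream is `(clj-http.client/get url {:as :stream :decompress-body false})`; how its header map behaves with `select-keys` is something to check before relying on this.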
ring.util.io/piped-input-stream can help you construct an inputstream if you need some processing as well
though you might need to use async ring to get all the benefits...
I've done response streaming with sync ring and it's worked out fine, even for multi-gigabyte responses.
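For the "some processing as well" case, here is a rough sketch using `ring.util.io/piped-input-stream` (requires ring-core on the classpath). Gzip is just an arbitrary example of processing, and `gzip-stream` is a hypothetical helper name. The function passed to `piped-input-stream` runs on another thread and writes to the pipe while the response is being read.

```clojure
(ns example.piped
  (:require [clojure.java.io :as io]
            [ring.util.io :as ring-io])
  (:import (java.io InputStream)
           (java.util.zip GZIPOutputStream)))

(defn gzip-stream
  "Returns an InputStream that yields the gzip-compressed bytes of `in`.
  Data flows through a bounded pipe, so the full payload is never held
  in memory at once."
  [^InputStream in]
  (ring-io/piped-input-stream
    (fn [out]
      ;; Closing `gz` flushes the gzip trailer into the pipe.
      (with-open [in in
                  gz (GZIPOutputStream. out)]
        (io/copy in gz)))))
```

The result can be used directly as a response :body, e.g. `{:status 200 :headers {"Content-Encoding" "gzip"} :body (gzip-stream upstream-body)}`.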
Is there a recommended approach to keep the original ring request body around, perhaps as a byte array, when using the reitit/muuntaja decoding middleware? I need the original body for a signature verification but I can’t assume the body is a string for all cases, https://github.com/metosin/muuntaja/issues/143.
I guess I could read the body as a string (e.g. using ring.util.request/body-string), so that I get a consistent value, no matter what the original body was (e.g. original body could be string, input stream, etc) and then stash a byte array copy somewhere for the signature verification and replace the :body key with an input stream derived from the string.
Seems a bit convoluted, so I’m looking for potential alternative approaches.
I’ve done it based on some previous advice from here or GitHub. I’ll find it later and post back
I can't speak from a reitit perspective, but turning the body from an input stream into a string and then into a byte array is kind of silly, you should just go right to a byte array.
request/body-string is problematic because it builds a string using the default jvm encoding, not the encoding that may have been specified for the request.
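A small sketch of going straight to bytes and decoding only when a string is actually needed. `read-body-bytes` and `decode-body` are hypothetical helper names; `.readAllBytes` assumes Java 9 or later. The charset name would come from the request (ring's `ring.util.request/character-encoding` is one possible source), with UTF-8 as an assumed fallback.

```clojure
(ns example.body-bytes
  (:import (java.io InputStream)
           (java.nio.charset Charset)))

(defn read-body-bytes
  "Reads the whole stream into a byte array, with no intermediate String."
  ^bytes [^InputStream body]
  (.readAllBytes body))

(defn decode-body
  "Decodes bytes using an explicit charset name; falls back to UTF-8
  (an assumption) when `charset-name` is nil."
  [^bytes bs charset-name]
  (String. bs (Charset/forName (or charset-name "UTF-8"))))
```

Keeping the byte array as the source of truth means the signature check sees exactly what was sent, regardless of the JVM default encoding.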
Ok so I wasn't using muuntaja or facing the exact same issue as you but I did have a need to have both the raw body and the string/parsed body, to use for signature verification. I've achieved it with a middleware. It's similar to your link but does a few more things.
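;; aliases assumed: [clojure.java.io :as io] [clojure.data.json :as json]
(defn wrap-raw-body [handler]
  (fn [request]
    (let [buffered-body (slurp (:body request))   ; slurp decodes as UTF-8 by default
          ;; re-encode with a matching charset instead of the platform default
          body-stream   (io/input-stream (.getBytes buffered-body "UTF-8"))
          parsed-body   (json/read-str buffered-body :key-fn keyword)]
      (handler (assoc request
                      :body body-stream
                      :raw-body buffered-body
                      :body-params parsed-body)))))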
You can inspire yourself from that to achieve your specific goal.
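For the verification step itself, a sketch under loud assumptions: the thread doesn't say which signature scheme is in use, so this assumes an HMAC-SHA256 signature sent as lowercase hex, and all function names here are hypothetical. It works on the raw bytes of the body, not on a decoded string.

```clojure
(ns example.signature
  (:import (java.security MessageDigest)
           (javax.crypto Mac)
           (javax.crypto.spec SecretKeySpec)))

(defn hmac-sha256
  "Computes HMAC-SHA256 of `payload` with `secret` (both byte arrays)."
  ^bytes [^bytes secret ^bytes payload]
  (let [mac (Mac/getInstance "HmacSHA256")]
    (.init mac (SecretKeySpec. secret "HmacSHA256"))
    (.doFinal mac payload)))

(defn bytes->hex
  "Lowercase hex encoding of a byte array."
  [^bytes bs]
  (apply str (map #(format "%02x" (bit-and % 0xff)) bs)))

(defn valid-signature?
  "True when `provided-hex` matches the HMAC of the raw body.
  Assumes (not from the thread) a UTF-8 shared secret and a hex signature."
  [^String secret ^bytes raw-body provided-hex]
  (let [expected (bytes->hex (hmac-sha256 (.getBytes secret "UTF-8") raw-body))]
    ;; MessageDigest/isEqual avoids leaking the mismatch position via timing.
    (MessageDigest/isEqual (.getBytes ^String expected "UTF-8")
                           (.getBytes (str provided-hex) "UTF-8"))))
```

If the stashed raw body is a string, as in the middleware above, it has to be turned back into bytes with the same charset it was decoded with before calling `valid-signature?`.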
See below for previous discussion AND for a link to the discussion before that
https://clojurians.slack.com/archives/C7YF1SBT3/p1704477941426189
https://clojurians.slack.com/archives/C0K65B20P/p1696055080976369 (this one was pedestal rather than reitit but the principle is the same)
Thanks y’all for the tips!
I think my scenario is slightly more convoluted, will document it here for posterity (and also to make sure I understand what I’m doing):
• I have endpoints in my router that expect the body encoded as a string (e.g. utf-8 json string) and others that expect data as binary.
• muuntaja is applied for all incoming requests in all routes. If it detects a string encoding and a known content type, it will:
◦ take the value of body as a string (I’m not entirely sure how it does that, I assume there’s a protocol or multimethod somewhere that deals with both strings and streams).
◦ parse it accordingly (e.g. json, edn, etc.)
◦ the original body is not discarded, but if it was a stream, it will be empty now
• I have this one specific route which needs access to the original body value. If I want to preserve the original value, I need a middleware that runs before muuntaja.
• However, muuntaja doesn’t appear to deal with :body being a byte-array. It expects string or stream. So if I want to keep using muuntaja as it is, I need to copy my byte array somewhere else (e.g. a :original-body key) and then create a new stream or string from the byte array and put it back into :body
• ring.util.request/body-string is indeed no good here, also because some endpoints need to accept binary data
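The stash-and-restore step from the list above could be sketched roughly like this. `wrap-original-body` is a hypothetical name, the :original-body key is the one suggested above, and treating string bodies as UTF-8 is an assumption. It would have to run before the muuntaja middleware, and it buffers the whole body in memory, so it only suits requests of bounded size.

```clojure
(ns example.original-body
  (:import (java.io ByteArrayInputStream InputStream)))

(defn- ->bytes
  "Coerces the body types mentioned above into a byte array."
  ^bytes [body]
  (cond
    (bytes? body)                body
    ;; Assumption: string bodies are treated as UTF-8.
    (string? body)               (.getBytes ^String body "UTF-8")
    (instance? InputStream body) (with-open [in ^InputStream body]
                                   (.readAllBytes in))
    :else (throw (ex-info "Unsupported body type" {:type (type body)}))))

(defn wrap-original-body
  "Stashes the raw body bytes under :original-body and replaces :body with
  a fresh stream over the same bytes, so later middleware can still read it."
  [handler]
  (fn [request]
    (if-some [body (:body request)]
      (let [bs (->bytes body)]
        (handler (assoc request
                        :original-body bs
                        :body (ByteArrayInputStream. bs))))
      ;; No body at all: pass the request through untouched.
      (handler request))))
```

Downstream, the signature check reads :original-body while the decoding middleware consumes the new :body stream.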
As a stopgap solution I ended up splitting my middleware stack into two and running the special middleware that always converts body to a string only for the specific endpoint that needs access to the original data. This works fine but is not super elegant, so I’m trying to find a solution that harmonizes all use cases in a single middleware stack.