2023-09-13
I have successfully got multipart working, at least as long as the submitter uses content type multipart/form-data. Now I have another client that sends data as multipart/mixed. I can see that ring.middleware.multipart-params/multipart-params-request only parses the body if the content type is multipart/form-data, and I can't find any Clojure example of other multipart types such as mixed.
If anybody has a clue how to parse multipart/mixed, that would be wonderful!
Can you open an issue for this on the Ring repository? It sounds like the sort of thing we'd need to patch.
Looking into this further, it looks like multipart/mixed data doesn't necessarily have anything that keys the multipart data, and as far as I can tell the Commons FileUpload library used by Ring for its multipart parsing only supports form-data (unless we use some of the lower-level classes instead). If you want multipart/mixed, you'll probably need to grab a Java multipart library and write some middleware to parse it yourself.
Thanks @U0BKWMG5B! I can hardly find any information about multipart/mixed, but it seems I can convince the sender to encode the data as multipart/form-data instead, so we can use the standard libraries on the backend.
Had a look and found that the header contains a boundary that separates the parts:
Content-Type: multipart/mixed; boundary=3d3edb89-ba76-4ed5-b199-eb3f6f278646
Hence I wrote this middleware to parse it:
(ns xxxx.multipart-mixed
  (:require [cheshire.core :as cheshire]
            [cuerdas.core :as str])
  (:import (java.io InputStream)
           (java.util Arrays)
           (org.apache.commons.io IOUtils)))

(defn array-find
  "Search a byte array for a pattern, return the offset found (or nil)"
  [^bytes data ^bytes pattern & [start-offset]]
  (let [pattern-length (alength pattern)
        length (- (alength data) pattern-length)]
    ;; A pattern exactly as long as the data can still match at offset 0
    (when-not (neg? length)
      (loop [i (or start-offset 0)]
        (when (<= i length)
          (if (Arrays/equals pattern (Arrays/copyOfRange data i (+ i pattern-length)))
            i
            (recur (inc i))))))))

(defn array-split
  "Split a byte array on the pattern (a byte array or a string).
  Empty chunks and anything after the last match are dropped."
  [^bytes data pattern]
  (let [^bytes pattern (if (string? pattern)
                         (.getBytes ^String pattern)
                         pattern)
        pattern-length (alength pattern)]
    (loop [start-offset 0
           acc []]
      (if-let [offset (array-find data pattern start-offset)]
        (recur (+ offset pattern-length)
               (if (= start-offset offset)
                 acc
                 (conj acc (Arrays/copyOfRange data start-offset offset))))
        acc))))

(defn parse-header
  "Turn strings of the form key<separator>value into a map with keyword keys.
  Strings that don't split into exactly two parts are skipped."
  [s separator]
  (into
    {}
    (keep
      #(let [parts (str/split % separator)]
         (when (= 2 (count parts))
           [(keyword (str/lower (str/trim (first parts))))
            (str/trim (second parts) "\n\t\f\r \"")]))
      s)))

(defn get-content-type-info
  "Parse a header value such as \"multipart/mixed; boundary=...\" into a map of
  its parameters, plus the leading value under :content-type"
  [content-type]
  (let [parts (str/split content-type ";")]
    (assoc
      (parse-header parts "=")
      :content-type (first parts))))

(defn wrap-multipart-mixed
  [handler]
  (fn [{{content-type "content-type"} :headers
        :as request}]
    (let [content-type-info (get-content-type-info content-type)]
      (if (= (:content-type content-type-info) "multipart/mixed")
        (let [boundary (str "--" (:boundary content-type-info))
              body (IOUtils/toByteArray ^InputStream (:body request))
              multiparts (array-split body boundary)]
          (handler
            (reduce
              (fn [request multipart]
                ;; NOTE: the last row is treated as the content of the part, so
                ;; content that itself contains \r\n (possible in binary data)
                ;; is cut down to its final fragment
                (let [multipart-rows (array-split multipart "\r\n")
                      multipart-info (parse-header (map #(String. ^bytes %) multipart-rows) ":")
                      content-type-info (get-content-type-info (:content-type multipart-info))
                      content-type (:content-type content-type-info)]
                  (case content-type
                    "application/json" (update-in
                                         request
                                         [:parameters :body]
                                         merge
                                         (cheshire/parse-string (String. ^bytes (last multipart-rows)) true))
                    "image/jpeg" (let [content-disposition-info (get-content-type-info (:content-disposition multipart-info))
                                       k (keyword (:name content-disposition-info))]
                                   (assoc-in
                                     request
                                     [:parameters :body k]
                                     {:content-type content-type
                                      :filename (:filename content-disposition-info)
                                      :bytes (last multipart-rows)}))
                    ;; Parts with any other content type are ignored
                    request)))
              request
              multiparts)))
        ;; Pass every other request through untouched
        (handler request)))))
I have tested it with this data in the body:
--3d3edb89-ba76-4ed5-b199-eb3f6f278646
Content-Type: application/json; charset=utf-8
Content-Length: 27

{"name":"Andreas"}
--3d3edb89-ba76-4ed5-b199-eb3f6f278646
Content-Disposition: form-data; name="image"; filename="image.jpg"
Content-Type: image/jpeg
Content-Length: 1814

<binary data for the image>
--3d3edb89-ba76-4ed5-b199-eb3f6f278646--
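To illustrate how a body like this is laid out, here is a minimal sketch that splits it on the boundary delimiter and then separates the headers of each part from its content at the first empty line. The namespace and the function names `split-parts` and `parse-part` are hypothetical. It works on strings for readability and assumes every part has at least one header; real binary parts would need the same logic applied to byte arrays.

```clojure
(ns example.multipart-sketch
  (:require [clojure.string :as str])
  (:import (java.util.regex Pattern)))

(defn split-parts
  "Split a multipart body (as a string) into its raw parts.
  The preamble before the first delimiter and the closing \"--\" marker
  are dropped, as is the CRLF that surrounds each part."
  [body boundary]
  (let [delimiter (str "--" boundary)]
    (->> (str/split body (re-pattern (Pattern/quote delimiter)))
         rest                                 ; drop the preamble
         (remove #(str/starts-with? % "--"))  ; drop the closing marker
         (map #(-> %
                   (str/replace #"^\r\n" "")
                   (str/replace #"\r\n\z" ""))))))

(defn parse-part
  "Parse one raw part into {:headers {...} :body \"...\"}.
  Headers end at the first empty line; everything after it is content,
  even if the content itself contains line breaks."
  [part]
  (let [[head body] (str/split part #"\r\n\r\n" 2)]
    {:headers (into {}
                    (for [line (str/split-lines head)
                          :let [[k v] (str/split line #":" 2)]
                          :when v]
                      [(str/lower-case (str/trim k)) (str/trim v)]))
     :body body}))
```

Splitting at the first empty line, rather than taking the last row, keeps content intact when it contains `\r\n` sequences of its own.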
I don't think it's generic enough to be in any library yet, but anybody searching for multipart/mixed is welcome to take inspiration from the code.
Now I'm confused. I happened to see the raw input of a multipart/form-data request, which Ring already parses so nicely, and found the content very similar, with the same boundary separators.
And reading the header of org.apache.commons.fileupload, which ring.middleware.multipart-params uses for its parsing, it says:
> This class handles multiple files per single HTML widget, sent using multipart/mixed encoding type, as specified by RFC 1867. Use parseRequest(RequestContext) to acquire a list of FileItems associated with a given HTML widget.
So maybe most of my code isn't necessary at all? I'm going to see what happens if I just change the content type from multipart/mixed to multipart/form-data a bit later on.
Yes! So if I just bypass the check in ring.middleware.multipart-params/multipart-form? by adding another middleware in front of it that changes the content type from multipart/mixed to multipart/form-data, Ring parses out the image as well.
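A minimal sketch of such a relabelling middleware might look like the following. The namespace and the name `wrap-mixed-as-form-data` are hypothetical; it assumes the header is stored under the lower-case string key "content-type", as in Ring request maps, and it only covers synchronous (one-argument) handlers.

```clojure
(ns example.mixed-as-form-data
  (:require [clojure.string :as str]))

(def ^:private mixed-type "multipart/mixed")

(defn wrap-mixed-as-form-data
  "Hypothetical middleware: relabel multipart/mixed requests as
  multipart/form-data, keeping parameters such as the boundary intact.
  All other requests are passed through unchanged."
  [handler]
  (fn [request]
    (let [content-type (get-in request [:headers "content-type"])]
      (if (and content-type
               (str/starts-with? (str/lower-case content-type) mixed-type))
        (handler (assoc-in request
                           [:headers "content-type"]
                           ;; keep everything after the media type, e.g. "; boundary=..."
                           (str "multipart/form-data"
                                (subs content-type (count mixed-type)))))
        (handler request)))))
```

For this to have any effect it has to run before Ring's multipart parsing, so it would be applied outside wrap-multipart-params. Parts that carry no name in their Content-Disposition header still have no key in the resulting parameter map.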
Feels like I've been going around in circles. @U0BKWMG5B, what is your understanding of the differences between the two formats? And as far as I can read in the header of commons.FileUpload, it supports multipart/mixed; where did you read that it doesn't?
My understanding is that the sole difference is that multipart/form-data must set a name attribute in the Content-Disposition header, while multipart/mixed has no such restriction. This means that form data can be loaded into a map of parameters, while mixed data cannot, as it doesn't have a guaranteed key for each multipart section. Mixed data is more generally used for things like email attachments.
The FileItem interface in FileUpload requires a name attribute to be constructed; however, now that I look at the actual parsing code, I see that this name attribute may actually be null. So I believe I was wrong before, and that it's possible to use FileUpload to parse multipart/mixed directly, without dropping down to the lower-level API of the library. You'll get an iterator of FileItem instances where getFieldName may return null.
Of course, this isn't compatible with Ring because it expects each multipart section to be named, so it has keys for the map it produces.
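To make the keying problem concrete, here is a minimal sketch that sorts already-parsed parts into a map when they carry a name and into an ordered vector when they don't. The part maps and the function name `parts->params` are hypothetical stand-ins, not Ring's or FileUpload's actual data structures.

```clojure
(ns example.parts-to-params)

(defn parts->params
  "Group parsed parts: a part with a :name can be stored under that key,
  as multipart/form-data guarantees; a part without one has no key, so it
  can only be kept by position."
  [parts]
  (reduce (fn [acc part]
            (if-let [k (:name part)]
              (assoc-in acc [:named k] part)
              (update acc :unnamed conj part)))
          {:named {} :unnamed []}
          parts))
```

With the test body from earlier in the thread, the image part (name="image") would end up under :named, while the JSON part, which has no Content-Disposition header at all, would end up in :unnamed.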