#yada
2016-03-04
imre12:03:22

I noticed there seems to be a problem with large files in multipart uploads

imre12:03:50

are there any details available?

malcolmsparks13:03:56

Yes. There has been a bug that manifests itself intermittently. It is due to a missing conditional in one part of the multipart code, which means a part is mislabelled as a preamble

malcolmsparks13:03:14

It's been fixed in the latest release this morning.

malcolmsparks13:03:53

We've done some extensive property-based testing (which is how we found the bug in the first place)

malcolmsparks13:03:06

It all looks good now, BUT...

malcolmsparks13:03:19

You MUST set the raw-streams option to true when starting the aleph server. I can't stress this enough. If you don't enable raw streams, the data can arrive with corruptions.

malcolmsparks13:03:14

I have added a yada.yada/server convenience function on master which starts aleph for you with the correct setup
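For reference, starting aleph directly with raw streams on is a one-liner; a minimal sketch, using aleph's documented `:raw-stream?` option (`my-yada-handler` is a placeholder, not a real name from yada):

```clojure
;; Sketch: start aleph with raw streams enabled, as advised above.
;; `my-yada-handler` stands in for a handler built with yada.
(require '[aleph.http :as http])

(def server
  (http/start-server my-yada-handler {:port 3000
                                      :raw-stream? true}))
```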

malcolmsparks13:03:24

@imre tl;dr upgrade to this morning's release. Thank you dear beta programme members!

imre13:03:12

thanks for the explanation :simple_smile:

imre14:03:19

I did the upgrade but missed the raw-streams thing

imre14:03:33

gonna try that now

imre14:03:40

it did the trick, thanks!

imre14:03:41

on a different note just as that got fixed I realized multipart is probably not what I want to use in my case...

imre14:03:36

will yada support plain binary uploads like an "application/pdf" with binary contents in the body?

imre15:03:26

something like

POST /avatars HTTP/1.1
Host: localhost:3000
Content-Type: image/jpeg
Content-Length: 284

raw image content

malcolmsparks15:03:15

It does already

imre16:03:05

(yada/resource
 {:consumes       [{:media-type #{"application/txt"}}]
  :produces       [{:media-type #{"text/html"}
                    :charset    #{"UTF-8"}}]
  :access-control {:allow-origin  "*"
                   :allow-methods "POST"}
  :methods        {:post {:parameters {:body   schema.core/Any
                                       :header #_{(schema.core/optional-key "content-type") schema.core/Any}}
                          :response   (fn [ctx]
                                        (prn "raw:" (get-in ctx [:parameters]))
                                        "Hello")}}
  :responses      {415 {:produces [{:media-type #{"text/html"}
                                    :charset    #{"UTF-8"}}]
                        :response (fn [ctx]
                                    (prn (:request ctx))
                                    (:response ctx))}}})

imre16:03:26

returning me 415 unsupported media type when I post application/txt to it

imre16:03:58

I guess I need to fire up the debugger

imre16:03:00

a-ha, it's the missing process-request-body impl

imre16:03:41

I guess I should probably just go with octet-stream and get metadata like file extension in a header

malcolmsparks16:03:57

I think I need to spend some time polishing off the upload feature - it is implemented and tested, but badly needs some finishing and docs - I'll see what I can do - in the meantime, let me know how you get on and if you manage to figure it out - I can always give pointers if you get stuck

imre16:03:43

thanks a million

imre16:03:55

I have a load of problems right now which really start from me being unsure about what API design approach would be best :simple_smile:

malcolmsparks16:03:50

API design is really hard

imre16:03:22

I can only agree with you on that :simple_smile:

malcolmsparks16:03:01

I keep hearing nightmarish stories about developers struggling with poor API designs - which of course can never be fixed because it's too late and it's all hardcoded and distributed in microservices and whatever

imre16:03:47

yeah, exactly

malcolmsparks16:03:42

I think one of the best reasons to stay true to a RESTful approach is that it helps avoid most of the really horrible pitfalls, because people have thought hard about them

malcolmsparks16:03:54

not saying REST is the answer, just a good default

imre16:03:05

that's one reason I work really hard to keep all the http and domain stuff separate: if I mess up the domain interface I can refactor internally while keeping the same http api

imre16:03:29

and if I mess up the http api I can "easily" express the whole domain as a V2 in another http api

malcolmsparks16:03:35

yes, I worry this generation is creating such a painfully huge legacy for the next

malcolmsparks16:03:05

keeping all that separate is an art - not easy to do, especially for less experienced devs

imre16:03:53

it does result in some duplication but I don't really mind it

imre16:03:43

I think I made up my mind: gonna go with application/octet-stream and the Content-Disposition header

malcolmsparks16:03:13

@imre I think you're just saying 'I try not to complect the domain with the transport' - which is an excellent principle to hold to

imre16:03:48

yeah, that's the aim!

malcolmsparks16:03:32

yeah - good idea, not sure yada handles the Content-Disposition header for you there, but you can do all this by hand, it's not hard (and then you've built a reusable yada resource you can use in other situations!)
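Doing the Content-Disposition part by hand really is only a few lines; a sketch (the regex and the function name are illustrative, not part of yada):

```clojure
;; Sketch: pull the filename parameter out of a Content-Disposition
;; header value by hand. Regex and names are illustrative only.
(defn content-disposition-filename
  "Returns the filename parameter of a Content-Disposition header
  value, or nil if absent."
  [header-value]
  (when header-value
    (second (re-find #"filename=\"?([^\";]+)\"?" header-value))))
```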

imre16:03:20

does yada support any pre-body-parsing validation on an arbitrary header value?

imre16:03:46

so for example I want to only allow certain file types through

imre16:03:09

so I'd write a schema for the Content-Disposition header

imre16:03:34

I guess I could do that with application/foo content types

imre16:03:00

but I think I would prefer to go the other way

imre16:03:23

also, is there a way of specifying a max content-length for a request

imre16:03:41

so that I don't even need to read the body in if I know it's too much

malcolmsparks17:03:49

not currently 😞 but that's a good feature request :simple_smile: - there is an HTTP status code for this: 413
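Until such a feature lands, the check is easy to do by hand before touching the body; a sketch (the limit and the function name are illustrative):

```clojure
;; Sketch: reject over-sized requests up front with a 413, before
;; reading the body. Limit and names are illustrative only.
(def max-content-length (* 10 1024 1024)) ; 10 MiB

(defn content-length-ok?
  "True if the request's Content-Length header is present and within
  the limit; a request failing this check should get a 413."
  [request]
  (when-let [len (get-in request [:headers "content-length"])]
    (<= (Long/parseLong len) max-content-length)))
```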

imre17:03:57

hmm, interesting!

imre17:03:11

413 wasn't in my active http code knowledge

imre18:03:48

a question about octet-stream requests: I haven't found any tests in yada that send an octet-stream in the body and I seem to be slow in figuring out where such a body would be available to the request handler

imre18:03:06

the :parameters map doesn't seem to be receiving it

malcolmsparks18:03:02

You don't specify it in parameters, you just process it and make it available in the request context. Let me look. Not on my laptop right now

imre18:03:49

seems to be working that way, don't worry about it

imre18:03:10

I can (get-in ctx [:request :body]) and get the stream

malcolmsparks18:03:50

will it be a stream? you mean a manifold stream right?

malcolmsparks18:03:11

remember you're using raw-streams true, so you'll get a bunch of netty byte buffers

imre18:03:25

hmm true

imre18:03:34

let me try turning raw streams off

malcolmsparks18:03:50

you don't ever want to do that - it'll cripple the hyperdrive

malcolmsparks18:03:02

let me explain - it's really easy

malcolmsparks18:03:22

you see that yada.interceptors/process-request-body ?

malcolmsparks18:03:10

see

(rb/process-request-body
 ctx
 (stream/map bs/to-byte-array (bs/to-byte-buffers (:body request)))
 (:name content-type))

malcolmsparks18:03:28

that stream/map thing is the most magical line in yada

malcolmsparks18:03:52

it converts all the manifold stuff into normal byte-buffers, and handles all the netty reference counting for you

imre18:03:02

(I'll need to read up on manifold)

malcolmsparks18:03:04

and gives you a (deferred) stream of byte buffers

malcolmsparks18:03:22

so you just need to add a defmethod for your content-type

imre18:03:47

which is what you do for json and edn and all that

imre18:03:44

and application/octet-stream

imre18:03:58

which is what I'm using right now to accept a raw file

malcolmsparks18:03:04

the body-stream here is a manifold stream of byte arrays

malcolmsparks18:03:32

(defmethod process-request-body "application/octet-stream"
  [ctx body-stream media-type & args]
  (d/chain
   (s/reduce (fn [acc buf] (inc acc)) 0 body-stream)
   ;; 1. Get the body buffer receiver from the ctx - a default one will
   ;; be configured for each method, it will be configurable in the
   ;; service options, since it's an infrastructural concern.

   ;; 2. Send each buffer in the reduce to the body buffer receiver

   ;; 3. At the end of the reduce, ask the body buffer receiver to
   ;; provide the context's :body.
   (fn [acc]
     (infof ":default acc is %s" acc)))
  ctx)

malcolmsparks18:03:44

it isn't implemented - it just counts the number of byte-arrays in the stream

malcolmsparks18:03:59

ok I need to fix that

imre18:03:07

that was the strange part yeah :simple_smile:

malcolmsparks18:03:31

it's a placeholder - but the best example currently is in multipart.clj

malcolmsparks18:03:55

ok, looks like i know what i'm doing this weekend 😉

malcolmsparks18:03:10

it's not a big job because multipart.clj already works

imre18:03:05

sorry for breaking your stuff

malcolmsparks18:03:23

the idea is that you have some reducing function declared in your resource model - the reducing function will get a stream of all the byte buffers (as they are realized)

malcolmsparks18:03:31

so it's all still glorious async

malcolmsparks18:03:55

this is a pull model for development - I leave gaps and wait for people to complain

imre18:03:42

acceptance test driven dev

imre18:03:53

no ticket? no feature

malcolmsparks18:03:44

I'll try to explain: the intention is that you declare some function in your resource model which is a reducing function - if all you want to do is delegate the stream to some /tmp storage, you'll get called every time there's a byte-array to deal with, which you could append to that /tmp file - or S3 - or anything you like - it's a reducing function, you could be calculating a hash or anything
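As a sketch of that reducing-function idea, here is what a step that appends each byte-array chunk to a file might look like. How yada would call it is described above but not finalized, so only the function itself is shown, driven here by a plain `reduce` (manifold's `s/reduce` has the same shape):

```clojure
;; Sketch: a reducing function that appends each byte-array chunk to
;; an open output stream as it arrives. All names are illustrative.
(import '[java.io File FileOutputStream])

(defn append-chunk
  "Reducing step: writes one byte-array chunk to the open output
  stream `out` and returns it as the accumulator."
  [^FileOutputStream out ^bytes chunk]
  (.write out chunk)
  out)

;; Usage with a plain reduce over two chunks:
(let [tmp (File/createTempFile "upload" ".bin")]
  (with-open [out (FileOutputStream. tmp)]
    (reduce append-chunk out [(.getBytes "hello ") (.getBytes "world")]))
  (slurp tmp))
```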

malcolmsparks18:03:33

anyway, i'm glad you spotted this, I had something at the back of my mind that there was still something left to do with async uploads - we've been hard at work smoking out a bug in the multipart code this week

malcolmsparks18:03:30

ok, please wait for monday, normal service will be resumed

malcolmsparks18:03:47

can you tell me what you're trying to do ? is it a file upload to store to disk?

imre18:03:53

for now though no one in my org complained about the api being too sync so I guess I'll just go with a crude implementation:

imre18:03:59

(defmethod yada.request-body/process-request-body "application/octet-stream"
  [ctx body-stream & _]
  (assoc-in ctx [:body] (yadabs/to-byte-array body-stream)))

imre18:03:09

that I snatched from the json one

malcolmsparks18:03:21

exactly - the transit code needs inputstreams

malcolmsparks18:03:16

and as you know, you can replace my stub defmethods with your own proper ones

malcolmsparks18:03:34

sorry for the excuses, i'm a bit embarrassed about that application/octet-stream stub now - had a mentally hard few weeks and forgot it wasn't complete!

malcolmsparks18:03:36

but having seen yada hammered hard this week, and standing up to some punishment, I'm confident all this netty/aleph/manifold stuff is fundamentally sound and we'll be ok as we scale up

malcolmsparks18:03:49

(so long as you keep the hyperdrive switched on)

imre18:03:32

file upload exactly

imre18:03:33

attaching documents to a resource

imre18:03:50

sorry I was under Bray Head

imre18:03:59

good to hear that

malcolmsparks18:03:00

yep, i actually understand what that means :simple_smile: