#aws
2016-01-20
cfleming09:01:52

@ricardo: Here are some numbers from the Lambda testing I've been doing:

cfleming09:01:19

Called every 60 seconds

  Interval 1   Interval 2   Interval 3      Total   Duration        RTT
        4596         2678          121       7395       7678       9511
        4936         2737          121       7794       8114      10579
          55          256           61        372        372        457
           6          154           39        199        199        238
         119          319           29        467        467        497
           7          150           59        216        216        266
           7          152           23        182        182        227
           6          103           40        149        149        186
          19          141           40        200        199        228
           7          183           74        264        264        324

Called every 5 mins

  Interval 1   Interval 2   Interval 3      Total   Duration        RTT
        4457         3141          161       7759       8102       9279
        4680         2840          101       7621       8018       9602
         163          200           66        429        429        577
         146          194           33        373        383        497
          18          165           43        226        227        387
           6          156           51        213        213        368
          14          127           37        178        178        320
          24           78          100        202        203        357
          11          129           29        169        169        315
           8          192           30        230        229        419

Called every 15 mins

  Interval 1   Interval 2   Interval 3      Total   Duration        RTT
        4442         2664          120       7226       7517       8781
         103          160           49        312        324        464
        4839         3123           99       8061       8354      10834
          19          162           42        223        223        412
         160          161           36        357        357        514
           6          172           36        214        214        370
          37          131           30        198        199        351
        4720         2959          162       7841       8135       9499
         115          159           24        298        298        434
          15          184           29        228        228        358

cfleming09:01:41

Interval 1 is creating the DynamoDB client and obtaining a table. Interval 2 is reading a single record from it. Interval 3 is updating a record atomically. Total is the total time measured inside the function. Duration is the billed duration from CloudWatch. RTT is the total round trip time. All times are in milliseconds.
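For readers who want to reproduce this kind of breakdown, here is a minimal sketch of how the per-interval timings could be collected inside a function. It is not the code used for the numbers above: `IntervalTimer` and its methods are hypothetical names, and the `Thread.sleep` calls are stand-ins for the DynamoDB steps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the code behind the tables above.
class IntervalTimer {
    private final Map<String, Long> millis = new LinkedHashMap<>();
    private final long start = System.nanoTime();
    private long last = start;

    // Record the time elapsed since the previous mark (or since creation).
    void mark(String label) {
        long now = System.nanoTime();
        millis.put(label, (now - last) / 1_000_000);
        last = now;
    }

    // Time elapsed since the timer was created, in ms (the "Total" column).
    long totalMillis() {
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Recorded intervals in insertion order, in ms.
    Map<String, Long> intervals() {
        return millis;
    }

    public static void main(String[] args) throws InterruptedException {
        IntervalTimer timer = new IntervalTimer();
        Thread.sleep(20); // stand-in for creating the client and obtaining a table
        timer.mark("Interval 1");
        Thread.sleep(10); // stand-in for reading a single record
        timer.mark("Interval 2");
        Thread.sleep(5);  // stand-in for the atomic update
        timer.mark("Interval 3");
        System.out.println(timer.intervals() + " total=" + timer.totalMillis());
    }
}
```

Duration and RTT cannot be measured this way: the first comes from CloudWatch and the second has to be timed on the calling side.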

cfleming09:01:52

The Lambda is invoked directly from an EC2 instance in the same region. It's a 512 MB JVM instance running Java.

cfleming09:01:03

I did a bunch of different runs testing this, and even when calling the Lambda every minute it's not uncommon to see an instance spun down between calls. The first two calls are almost always cold.

cfleming09:01:15

API Gateway has a hard 10 sec timeout. This makes Lambda almost useless as an API Gateway backend if you want to use more AWS services from it, unless the client is prepared to handle regular timeouts and retry. This is pretty consistent with what I've seen in the Cursive licence generation too. Perhaps the instance would be kept warm more reliably under heavier load; I haven't tried that yet.
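As a rough illustration of what "handle regular timeouts and retry" could look like on the client side, here is a sketch using only the JDK. `RetryingCaller` and `callWithRetry` are hypothetical names, the remote call is simulated rather than a real API Gateway request, and the timeout and attempt count are arbitrary assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client-side helper; names and limits are illustrative only.
class RetryingCaller {
    // Runs the call, giving up on any attempt that exceeds timeoutMillis,
    // and tries again until maxAttempts attempts have been made.
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long timeoutMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            TimeoutException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = pool.submit(call);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the slow attempt locally
                    last = e;
                }
            }
            throw last; // every attempt timed out
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated backend: the first two calls are "cold" and far too slow.
        Callable<String> simulated = () -> {
            if (calls.incrementAndGet() <= 2) {
                Thread.sleep(5_000);
            }
            return "ok after " + calls.get() + " calls";
        };
        System.out.println(callWithRetry(simulated, 4, 500));
    }
}
```

Cancelling an attempt here only stops the local wait. A real request that timed out may still complete on the server, so retrying like this is only safe when the operation is idempotent.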

cfleming09:01:30

I had assumed that the instance spin-up time when cold is included in the billed duration, which would mean that the difference between Total and Duration is the instance spin-up time, around 300ms or so. That would mean that the difference between RTT and Duration is the RPC latency, looks like about 150ms avg but I haven't had time to calculate it precisely. However that value is much larger on the cold instances, so perhaps the instance spin-up isn't billed? That makes the spin-up time for a cold instance from 1-2 sec, looks like.

cfleming09:01:52

Even when the instance is warm RPC overhead is around 150ms.

cfleming09:01:15

I guess if the spin-up time isn’t billed, that 300ms might be classloading.
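The two differences discussed above are easy to compute per row. A small sketch, with values copied from the "Called every 5 mins" table (first row cold, the other two warm); the class and method names are made up for illustration.

```java
// Hypothetical helper for the arithmetic discussed above; all values in ms.
class Overheads {
    // Billed duration minus the time measured inside the function.
    static long billedMinusMeasured(long total, long duration) {
        return duration - total;
    }

    // Round trip time minus billed duration: RPC latency plus, on a cold
    // call, whatever part of the spin-up is not billed.
    static long rttMinusBilled(long duration, long rtt) {
        return rtt - duration;
    }

    public static void main(String[] args) {
        // {Total, Duration, RTT} rows from the "Called every 5 mins" table.
        long[][] rows = {
            {7759, 8102, 9279}, // cold
            {429, 429, 577},    // warm
            {213, 213, 368},    // warm
        };
        for (long[] r : rows) {
            System.out.printf("Duration-Total=%d  RTT-Duration=%d%n",
                    billedMinusMeasured(r[0], r[1]),
                    rttMinusBilled(r[1], r[2]));
        }
    }
}
```

For the cold row this gives 343 and 1177, and for the two warm rows 0 and 148, then 0 and 155, which lines up with the rough 300ms, 1-2 sec and 150ms figures quoted in the messages.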

ragge10:01:21

@cfleming: thanks for sharing! was this using clojure or just java?

ricardo10:01:19

@cfleming: Oh, it looks like you’re getting the same behavior I got, where often the first two calls behave as if it was cold.

ricardo10:01:04

I didn’t know what to make of that, so I didn’t mention it on my article. Plus I figured it’s a lambda peculiarity, so it wasn’t directly related to what I was testing.

alandipert14:01:01

@cfleming: thanks for sharing! @ragge iirc, java

curtosis16:01:16

has there been any work on pulling the AWS SDK for JavaScript into the CLJSJS system? (or, alternatively, has anyone successfully integrated it manually w/externs in a ClojureScript app?)

bhagany18:01:36

@curtosis: if you're on node, I use lein-npm to install the SDK, and then just node/require it from cljs

bhagany18:01:40

no externs or anything

bhagany18:01:49

I don't have any experience doing that from the browser though

curtosis18:01:22

thanks.... unfortunately I need to do it from the browser (or write a Lambda block to do it from API Gateway (or directly))

curtosis18:01:07

I think the aws-specific code looks the same, but the extern/compiler bits are still voodoo to me.

cfleming20:01:56

@ragge: This is using Java, I wanted to isolate the JVM effects without muddying the water with Clojure startup time

cfleming20:01:05

@ragge: @ricardo: @alandipert: https://forums.aws.amazon.com/thread.jspa?threadID=199685 This thread reaches the same conclusion - Lambda is basically unusable on the JVM unless you have enough volume to keep your instances hot always, or can handle retrying.

cfleming20:01:23

I’m planning to re-run those tests with a) JVM instance with 1.5G and b) Node

cfleming20:01:30

I’ll report back with numbers

bhagany21:01:29

these lambda tests are really eye-opening

bhagany21:01:51

I'm using it, but I don't care how long they take, so I never systematically timed it

alandipert21:01:43

@cfleming: interesting - still haven't used jvm lambdas in conjunction w/ api proxy myself

alandipert21:01:12

working great for us in a high-volume S3-event role, and also with the 'scheduled' event for running every 30 mins

cfleming21:01:16

@alandipert: I guess with your scheduled event you don’t care if the operation takes 10 seconds, right?

alandipert21:01:35

@cfleming: right - the thing starts an EMR cluster which takes like 15 mins

cfleming21:01:36

Yeah. I hesitate to call Lambda totally useless, but a lot of use cases are ruled out by those times.

alandipert21:01:34

yeah, and certainly the ones that were the most exciting when i first heard about it

alandipert21:01:12

as time goes on we find ourselves less and less attracted to the various "add-on" aws services

alandipert21:01:31

i am marginally for, micha is increasingly against

cfleming21:01:16

And if you stick to just the core ones, IMO AWS gets less appealing when compared to some bare-metal cloud offering, especially for big systems.

micha21:01:49

does anyone have experience with that kind of thing?

alandipert21:01:41

@cfleming yeah, altho we can definitely build anything we need out of some combination of s3, EC2, and dynamo

cfleming21:01:28

@alandipert: Right, but how much it costs you at scale is more what I’m referring to.

cfleming21:01:49

It works well for me because at my pitiful scale it’s all basically free.

alandipert21:01:20

yeah we definitely use it for some things that we should not be, if we wanted to be maximally efficient

alandipert21:01:55

we have discussed going back to having a data center, for the ad servers

alandipert21:01:11

we just don't need the "agility" of cloud for the things we do that are very well defined and rarely change

cfleming21:01:21

Right, and they must have pretty high volume and care about response time.

alandipert21:01:26

yeah. very predictable volume also

alandipert21:01:55

we could easily estimate our hardware needs for a 6 month period, buy the boxes, and still be way cheaper than aws

alandipert21:01:24

but we'd grow sysadmins and multi region would be a little tougher

cfleming22:01:01

@alandipert: What about softlayer or something similar?

alandipert22:01:36

@cfleming: could be an option, but haven't investigated

cfleming22:01:56

It’s probably getting a bit heretical for discussion in here too.

alandipert22:01:08

take it to #softlayer, traitor!!

alandipert22:01:26

but yeah i think ultimately we realized we could save loads just by using AWS smarter

alandipert22:01:40

so, we're starting with that, then maybe down the road take another look at providers