#aws
2020-09-11
orestis09:09:17

An interesting discussion on Twitter about Clojure deployment on AWS https://twitter.com/pesterhazy/status/1304131064835772416?s=20

orestis09:09:03

I’m doing a deployment now and I’m seeing these metrics:
• Env. update is starting -> registering first batch with the load balancer and waiting for it to become healthy: 40s
• First batch passed health checks: 5m
• Deploying to second batch and waiting for health checks: 36s
• Second batch passed health checks: 5m

pesterhazy11:09:57

@orestis thanks for bringing this up here

pesterhazy11:09:15

what deployment config are you using?

orestis11:09:05

Just two instances, JDK 11 / Corretto, Rolling updates, 1 instance at a time.

pesterhazy11:09:07

I've written up my findings and experiments, with some alternatives sketched (but I haven't figured this out yet by any means): https://gist.github.com/pesterhazy/d0030f559f600d0ce1b3a090173c9c9c

pesterhazy11:09:05

Any comments appreciated

orestis11:09:50

“Currently we use Rolling policy with BatchSize=1” -> that means you’re doing the update one-by-one. Have you tried using a percent-based batch size? 25% would do two instances at a time, so it would halve your deployment time.
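
For reference, that batch-size change is just a couple of option settings on the environment. A minimal boto3 sketch (the environment name and values are illustrative, not taken from this thread):

```python
import boto3

eb = boto3.client("elasticbeanstalk")

# Hypothetical environment name; switch to a percent-based Rolling policy
# so that 25% of the instances are updated per batch.
eb.update_environment(
    EnvironmentName="my-env",
    OptionSettings=[
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "DeploymentPolicy", "Value": "Rolling"},
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "BatchSizeType", "Value": "Percentage"},
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "BatchSize", "Value": "25"},
    ],
)
```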

orestis11:09:19

I feel your pain though. We were hosted on Rackspace before, using plain old VMs, and updates took seconds.

orestis11:09:55

There are a few things I mean to try, but it’s low priority for us ATM:
1. There’s a new split-traffic deployment policy: https://aws.amazon.com/about-aws/whats-new/2020/05/aws-elastic-beanstalk-traffic-splitting-deployment-policy/
2. There’s a brand-new (announced literally yesterday) ability to share a non-EB load balancer between different EB environments: https://aws.amazon.com/blogs/containers/amazon-elastic-beanstalk-introduces-support-shared-load-balancers/
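
For option 1, the traffic-splitting policy is selected the same way, via option settings; a rough boto3 sketch assuming the option names from the EB docs (environment name and values are illustrative):

```python
import boto3

eb = boto3.client("elasticbeanstalk")

# Hypothetical environment name; send 10% of traffic to the new version
# and evaluate its health for 5 minutes before shifting the rest.
eb.update_environment(
    EnvironmentName="my-env",
    OptionSettings=[
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "DeploymentPolicy", "Value": "TrafficSplitting"},
        {"Namespace": "aws:elasticbeanstalk:trafficsplitting",
         "OptionName": "NewVersionPercent", "Value": "10"},
        {"Namespace": "aws:elasticbeanstalk:trafficsplitting",
         "OptionName": "EvaluationTime", "Value": "5"},
    ],
)
```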

pesterhazy12:09:44

Do you think these have the potential of speeding up deployments?

orestis13:09:41

The split traffic will probably fix the “mixed results” problem. I’m using session stickiness to overcome it, but I don’t like session stickiness in general.

orestis13:09:34

The shared ELB gives you great flexibility, since you can mix-and-match environments — but I don’t think you can get faster deployments without significant engineering investment in automation, so it’s probably not that relevant.

👍 3
orestis11:09:58

Elastic Beanstalk was nice to get us some peace of mind with minimal ops investment, as we migrated to AWS. But it feels creaky. OTOH, it is Dockerless, and there was some movement lately which makes me hopeful that it’s actively developed and improved.

pesterhazy12:09:07

> have you tried using a percent-based batch size
Yeah, that's definitely something we'll try, along with RollingWithAdditionalBatch. I figured we'd try Immutable first, on the assumption that it'd be faster in principle, because it spins up all 8 instances concurrently. But that doesn't seem to be quite true.

pesterhazy12:09:59

Another thought I've had is this:
• In the normal day-to-day deployment case, 26 min is probably acceptable, so we can keep using Rolling deployments (or Immutable deployments).
• When there's a problem, however, and you need to deploy a hotfix or roll back a change, 26 min is unacceptable. In this case we could manually switch to AllAtOnce before uploading the new app version.

orestis13:09:01

The Immutable policy spins up new instances from scratch, which takes some time.

orestis13:09:28

Don’t you have to do (and wait for) a configuration deployment to specify AllAtOnce?

orestis13:09:01

Ah wait — the configuration policy is different from the new-application-version policy. So you could keep the configuration policy at AllAtOnce indefinitely.

pesterhazy12:09:54

Just tested this. You can manually switch to AllAtOnce in the console. It takes about a minute. Then deploying a new application version takes less than a minute. The downside is that you have downtime, in our case 12 minutes
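
The same two steps can also be scripted instead of clicked through in the console. A rough boto3 sketch (the application, environment, version label, and S3 bundle are placeholders):

```python
import boto3

eb = boto3.client("elasticbeanstalk")

# Step 1: configuration update that flips the deployment policy to AllAtOnce.
eb.update_environment(
    EnvironmentName="my-env",
    OptionSettings=[
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "DeploymentPolicy", "Value": "AllAtOnce"},
    ],
)
# (Wait for the configuration update to finish, e.g. by polling
# eb.describe_environments, before deploying the new version.)

# Step 2: register the hotfix bundle and deploy it as a new app version.
eb.create_application_version(
    ApplicationName="my-app",
    VersionLabel="hotfix-1",
    SourceBundle={"S3Bucket": "my-deploy-bucket", "S3Key": "app-hotfix.zip"},
)
eb.update_environment(EnvironmentName="my-env", VersionLabel="hotfix-1")
```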

pesterhazy12:09:49

This tradeoff may be acceptable when hotfixing a customer-facing bug

ghadi13:09:44

I use Fargate to deploy services

ghadi13:09:51

ask me anything

orestis14:09:25

@ghadi how long does it take for traffic to hit a new version of the code, and how fast is rolling back? Also, does Fargate necessitate Docker? Does it have a nice console to get started? Is it a good fit for “monoliths”? How do you develop locally? What about monitoring things like traffic, memory use, etc.? Do you pick instance sizes, e.g. if you want large machines with lots of memory? What about static assets: do you bundle nginx together with a JVM, or do you make separate containers?

orestis14:09:07

I’m clueless and suspicious about containers, and Beanstalk gave me an entry point where many things are reasonably automated, but we’re outgrowing it :)

ghadi14:09:08

traffic: 2-5 minutes

ghadi14:09:13

yes fargate necessitates docker

ghadi14:09:40

Fargate is "hostless" docker, where AWS manages scheduling your containers magically

ghadi14:09:11

monolith is broad, so I can't assess if it's a good fit, but if you can containerize your app, it's a start

ghadi14:09:56

I use an ALB as a load balancer, connecting to Pedestal/Jetty on the containers

orestis14:09:12

A few more questions: do you use the AWS CLI to deploy new versions? Is a new version a fresh Docker image that you push to a registry, then notify Fargate to run? How does auto scaling work? Is it also suitable for “background” jobs, e.g. if I have some cron jobs, can I keep a Fargate container running forever?

orestis14:09:09

I think I just need to sit down and read the Fargate docs :)

orestis14:09:15

What about the subjective stuff instead, are you happy with it? Would you choose it for a new project?

ghadi14:09:58

I do everything through cloudformation... I haven't taken up the CDK yet.

ghadi14:09:35

An upstream build job makes a Docker image; a downstream deployment job updates the CloudFormation stack.
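
A sketch of what such a deployment step might look like with boto3, assuming the stack template exposes an image-tag parameter (stack and parameter names are illustrative):

```python
import boto3

cfn = boto3.client("cloudformation")

# Point the existing Fargate service stack at the freshly built image tag;
# ECS then rolls the tasks over to the new image.
cfn.update_stack(
    StackName="my-service",
    UsePreviousTemplate=True,
    Parameters=[
        {"ParameterKey": "ImageTag", "ParameterValue": "build-1234"},
        # Other template parameters would need UsePreviousValue=True here.
    ],
    Capabilities=["CAPABILITY_IAM"],
)
cfn.get_waiter("stack_update_complete").wait(StackName="my-service")
```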

ghadi14:09:20

since java is so easy to deploy, sometimes I use a stock container that downloads the jar upon startup

ghadi14:09:32

other times I bake the jar into the container

ghadi14:09:44

yes Fargate containers can run indefinitely

ghadi14:09:00

e.g. we have some queue pollers that are in fargate
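
For context, a long-running Fargate container like that is just an ECS service with a desired count; a minimal boto3 sketch (cluster, task definition, and network values are placeholders):

```python
import boto3

ecs = boto3.client("ecs")

# Keep one copy of a queue-poller task running indefinitely on Fargate;
# ECS replaces it automatically if it exits.
ecs.create_service(
    cluster="my-cluster",
    serviceName="queue-poller",
    taskDefinition="queue-poller:1",
    desiredCount=1,
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
)
```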

ghadi14:09:47

Subjectively, I like Fargate, but I've used it for several years now and have some expertise

ghadi14:09:07

It used to be really expensive compared to EC2, but now it's not as much of a premium

ghadi14:09:30

I despise managing machines, and prefer to only manage my application

ghadi14:09:41

don't have to worry about security updates, or ssh access with a container

ghadi14:09:53

well, not as much

kenny15:09:27

We used to use Fargate, and I quite liked it. We had to switch due to the lack of persistent storage at the time. It's gotten some very nice features since, however: they added capacity providers, so using the spot market is easy, and you can now attach EFS volumes to a task for persistent storage. We use Datadog for metrics, logs, and APM. If you use something similar, all tasks need a sidecar container running, which is a small additional cost per task replica. We deploy everything using Pulumi. Overall the Fargate experience has been fantastic. Would also recommend.
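
On the EFS point, the volume is attached in the task definition; roughly like this (the filesystem ID, image, and role ARN are placeholders, and Fargate needs platform version 1.4.0 or later for EFS):

```python
import boto3

ecs = boto3.client("ecs")

# Task definition mounting an EFS filesystem at /data, so the Fargate task
# keeps its data across task replacements.
ecs.register_task_definition(
    family="app-with-efs",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="512",
    memory="1024",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[{
        "name": "app",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/app:latest",
        "mountPoints": [{"sourceVolume": "data", "containerPath": "/data"}],
    }],
    volumes=[{
        "name": "data",
        "efsVolumeConfiguration": {"fileSystemId": "fs-0123456789abcdef0"},
    }],
)
```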

orestis15:09:16

Ah, right, so different vendors. Interesting POV, I thought that using AWS for everything was the norm but it seems not.

kenny16:09:36

I don't know how folks can possibly stand using CW logs. It's terrible compared to Datadog's offering.

orestis16:09:30

It’s effectively free ;)

kenny16:09:13

People time is usually far more expensive tho

orestis15:09:46

To be honest I don’t want to do anything with this kind of thing so the more hands off the better :)