clojure-uk 2018-09-20 | Slack Archive

I think it's 'cos I've seen friends I've not seen for a long time and they have kids my age but they are home all the time, more than anything if I am honest.

maleghast08:09:27

That and I didn't sleep well last night (despite lying to my AirBnB host about it) and I feel a little out of phase with reality this morning.

maleghast08:09:52

Nothing a nice relaxing weekend in Scotland and a walk in the hills won't fix, I expect 🙂

maleghast08:09:04

(and some time with the family, clearly)

mccraigmccraig08:09:06

it's gotta be quite tough commuting from scotland to london @maleghast...

maleghast08:09:34

It can be, that's for sure. I'll be honest though, @mccraigmccraig, most of the time it feels easier on my system than the daily commute from Sevenoaks used to be. Having to be on trains and Tubes and in the car for upwards of 2 hours each way every day was harder.

😂 4

maleghast08:09:39

I don't miss that at all.

grease08:09:25

JVM, in my case.

claudiu08:09:54

haven't tried. If you're not interested in SSR and just want the fulcro server ("/api") for lambdas, you can use fulcro server or just do your own thing and use https://github.com/wilkerlucio/pathom for parsing the queries.

grease09:09:11

Thanks. Let me try that out

maleghast08:09:15

I could do with having the money to stay a short walk from our London office when I come down, so that it really is only one journey down and one journey home.

maleghast08:09:32

That would be ideal

alexlynham08:09:01

whoa, okay that's an insane commute 😞

alexlynham08:09:15

Sorry you're feeling low dude

mccraigmccraig08:09:14

🤗 @maleghast

maleghast08:09:21

@alex.lynham - Thanks 🙂 In fairness to all concerned, the work-time this trip has been awesome, and I'm not just saying that 'cos some of those people lurk in this channel, but for some reason being away from home has just been harder this week. I am sure it will pass.

Thx @mccraigmccraig

💤

morning

Anyone got any experience with using AWS EC2 to ship in large files (15-25 GB) and then push them to S3? I am noticing that doing this on an ad hoc, manual basis I am being hit with network throttling, which I have overcome in the short-term by just killing one instance and launching a new one, but this is not a scalable solution...

maleghast10:09:49

Worst case scenario I could just acquire the files MUCH more slowly (7-9 hours instead of 15 minutes) after the throttling kicks in (assuming that the throttling is not lifted at some point), but I would rather find out if there is a reliable way to provision infra that can consistently move large files around at speed..?

maleghast10:09:55

Also, I need to make sure that I am not (in the fullness of time) paying network traffic fees to move files from EC2s in my Default VPC to S3 - it has been suggested to me that this might be a sleeping cost waiting to come and bite me if I don't take steps to mitigate it.

firthh10:09:19

What sort of instance types are you using?

maleghast10:09:33

At the moment, 'cos I am doing it manually, spinning one up for >1hr I am just using t2.micro-s

firthh10:09:59

That could be part of the problem, bigger instances get dedicated network

maleghast10:09:07

Yeah, I figured

maleghast10:09:36

The thing is, until I have this side of things automated I don't want to pay the hourly for great big, crunchy instances

firthh10:09:23

Could you look at spot instances?

maleghast10:09:56

I am nearly done with this brief bit of manual data shipping - the last of three files is going to S3 as I type and is nearly there - but I need to start approaching the problem of automating this whole process and I just want to be certain that the instance I deploy to will not get its network connectivity throttled

maleghast10:09:59

when I do it.

mccraigmccraig10:09:32

i've never noticed any network throttling... we don't generally ship files as big as 25GB but 3-5GB is common

maleghast10:09:36

@mccraigmccraig I did one at 20.1GB and then another at 19.5GB and they both shipped at an average network speed of 30MB/s but when I kicked off the third transfer I was looking at speeds of 600kB/s instead

maleghast10:09:53

(third transfer was another 20.1GB file)
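
As a rough sanity check on those figures, here is a minimal Python sketch of the transfer-time arithmetic. It assumes the quoted speeds are in bytes per second (30 MB/s and 600 kB/s), uses decimal units, and ignores protocol overhead; the function name `transfer_seconds` is made up for illustration.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to move size_bytes at a constant rate (no overhead modelled)."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


GB = 1_000_000_000  # decimal gigabyte (assumption)
FILE_SIZE = 20.1 * GB  # size of the file mentioned in the chat

fast = transfer_seconds(FILE_SIZE, 30_000_000)  # assumed 30 MB/s
slow = transfer_seconds(FILE_SIZE, 600_000)     # assumed 600 kB/s

print(f"unthrottled: {fast / 60:.1f} minutes")
print(f"throttled:   {slow / 3600:.1f} hours")
```

Under those assumptions the unthrottled transfer takes about 11 minutes and the throttled one a little over 9 hours, which is in line with the "15 minutes" versus "7-9 hours" mentioned earlier.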

mccraigmccraig10:09:49

i don't think i've ever shipped 60GB of files continuously to S3 though, so i don't think i would have encountered any throttling after 40GB or so

maleghast10:09:54

*nods* Yeah, I realise it's not all that common a use-case

maleghast10:09:01

.grib files are MAHOOSIVE

mccraigmccraig10:09:31

some of our structured log dumps and db snapshots are of that sort of size, but they compress quite well

maleghast10:09:48

I have not tried compressing .grib files, but they are already a compressed binary format file, so I am not expecting that they will compress all that well. Also I am fetching them after they have been created / compiled on a third-party's system (via an API or by hand) and as such I cannot easily control their initial state. I suppose that I could push them through compression so that they are saved to disk in a compressed format, but I cannot request that they be compressed on that third-party's end.

alexlynham11:09:25

by ship in you mean upload?

alexlynham11:09:56

the shuffling/serverless tools for S3 are pretty wild good these days depending on what your vector is...

alexlynham11:09:31

other than that a small jump box that you can spin up & spin down via whatever your build/config tool is can be a useful way of moving stuff about

alexlynham11:09:38

but like @firthh said, if you use bigger instances it will be less painful... plus if you optimise for your bottleneck (cpu/ram) you might find it's quicker & so cost is not so different

maleghast11:09:08

@alex.lynham - Thanks, that's all helpful 🙂

maleghast11:09:23

The problem (although I may be missing something) with serverless and pipelines, is that the third party I am acquiring data from is very strict about the way in which I request data and that they fulfil those requests. I have to make a request, which is queued for an indeterminate period, then it is "run" as a job, which may take hours+, and then finally once the job is complete I am given a URL that lasts for 24 hours (I think). I can't see a way of creating a serverless pipeline to make an HTTP request to that URL once it becomes available.

alexlynham11:09:36

SNS?

maleghast11:09:23

If I am successful in writing an automation (or series of automations) that makes this all hands-free, then knowing that a larger instance with dedicated network should not have the same throttling issues is kinda enough.

alexlynham11:09:33

so notify SNS on completion/availability and then it can invoke a lambda

maleghast11:09:49

@alex.lynham - Yeah, now you mention it I could do that...

maleghast11:09:59

Can I create a Lambda that can take parameters out of the SNS message that invoked it, so that the message could include the URL to the asset and the destination on S3?
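
For what it's worth, an SNS-triggered Lambda receives the published message as a string under `Records[].Sns.Message`, so a JSON payload can carry parameters like these. A minimal Python sketch follows; the field names `source_url`, `bucket` and `key` are assumptions chosen for illustration, and the handler only parses the request rather than performing any transfer.

```python
import json


def parse_transfer_requests(event: dict) -> list:
    """Extract one transfer request per SNS record in a Lambda event."""
    requests = []
    for record in event.get("Records", []):
        # SNS delivers the published payload as a string in Sns.Message.
        message = json.loads(record["Sns"]["Message"])
        requests.append({
            "source_url": message["source_url"],  # hypothetical field name
            "bucket": message["bucket"],          # hypothetical field name
            "key": message["key"],                # hypothetical field name
        })
    return requests


def handler(event, context):
    """Lambda entry point: returns the parsed requests (no download here)."""
    return parse_transfer_requests(event)
```

Whatever publishes to the topic would then send a JSON body with those three fields, and the handler (or whatever it triggers) would know both where the asset is and where it should land in S3.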

maleghast11:09:25

That would remove the need for an intermediary Instance that is downloading the asset, writing it to disk and then pushing it on to S3

maleghast11:09:15

I guess my concern is that the sheer size of the assets may make that approach costly and / or fragile, but I am more than willing to admit that that is sceptical cynicism coming from a position of ignorance. If I could be persuaded that Lambda could manage all of that without hurting, I would be VERY happy to do it that way.

firthh11:09:00

If you don’t trust Lambda to do the actual processing of the large file, you could still use it as the gateway to trigger the download and upload to S3 on infrastructure you trust

👍 4

💯 4

firthh11:09:32

e.g. lambda to start a large spot instance that will do the download of the asset and upload to S3

maleghast11:09:12

@firthh - Yeah, I can see that - the only risk being that the Spot Instance is killed mid-download / upload to S3

firthh11:09:20

But that might be getting overly convoluted at that point

firthh11:09:35

True

maleghast11:09:46

It's not that I don't trust Lambda, I am just not sure what it's actually capable of, and how much doing something like this might actually cost...

firthh11:09:19

Yeah, I think long running lambdas get expensive

firthh11:09:30

I’ve never used it but maybe Data Pipeline could be useful?

firthh11:09:59

Scrap that. It looks like your data has to start somewhere in AWS for that

alexlynham12:09:52

yeah long running lambdas are costly and also don't go over 5min

alexlynham12:09:31

I mean basically can you get the upload point to e.g. sync to S3 directly? so you could have a set of steps - s3 sync - generate url and put in e.g. json file - json file triggers SNS - SNS uses URL

alexlynham12:09:39

or something like that

alexlynham12:09:41

idk 🙂

alexlynham12:09:15

the quickest fix is just to bump the size of the EC2 instances you're currently using and/or look into relative pricing of spot ones

maleghast12:09:20

@firthh - Yeah, that was my understanding on Data Pipeline too, so I had kinda decided it was not the use-case that I wanted.

maleghast12:09:24

@alex.lynham - If Lambdas time out at 5 minutes then they could not handle this workload either. I am going to look at pricing for big, network-capable instances, which was what I suspected that I might have to do, but thanks for the input 🙂

alexlynham13:09:52

the final thing to consider is AWS step functions

alexlynham13:09:04

as iirc those have a longer potential lifespan

alexlynham13:09:24

I know it's a year for the whole pipeline, but I'd need to look at the docs for individual components

mccraigmccraig14:09:42

anyone have any ideas of how i can debug where memory is being used in a docker container ? i've got a container with a 3GB limit running a clojure process which is using 1.7GB... but the container is getting oom-killed and i've no idea where that extra 1.3GB is going... anyone seen anything similar or have any ideas ?

mccraigmccraig14:09:26

here's some logging which makes it all pretty clear: https://gist.github.com/mccraigmccraig/df2295da08a9a1e3c21fbe2871780e3c

firthh14:09:02

Are you setting memory options on the JVM?

firthh14:09:36

I think there is a chance that JVM memory can spike and get killed before that usage makes it into logs anywhere

mccraigmccraig14:09:26

@firthh yeah, i've got -Xms2560m -Xmx2560m -XX:+UseG1GC -server

mccraigmccraig14:09:04

so i should be seeing java OOMs well before the container gets killed

firthh14:09:50

Yeah, I would have expected something in Clojure to start throwing exceptions

Conor14:09:17

Looks like Java is not obeying your settings for some reason? 788633 pages ~= 3GB I think

mccraigmccraig14:09:46

ah - those are pages, not KB ?

Conor14:09:04

Yes AFAIK

mccraigmccraig14:09:21

that would make sense
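
The pages-to-bytes conversion can be checked with a few lines of Python. This is a sketch that assumes a 4 KiB page size (typical on x86-64 Linux, but not confirmed for this host) and reads the 3GB container limit as 3 GiB; `pages_to_gib` is a made-up helper name.

```python
PAGE_SIZE = 4096  # bytes per page; assumed, typical for x86-64 Linux


def pages_to_gib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count into GiB."""
    return pages * page_size / 2**30


rss_pages = 788633                     # figure quoted in the chat
limit_pages = 3 * 2**30 // PAGE_SIZE   # 3 GiB limit expressed in pages

print(f"{pages_to_gib(rss_pages):.2f} GiB resident")
print(f"limit is {limit_pages} pages")
```

With those assumptions 788633 pages is about 3.01 GiB, just over the 786432 pages that make up a 3 GiB limit.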

mccraigmccraig14:09:53

hmm. possible 💡 - i've got the yourkit agent installed on those processes ... i wonder if that's doing something nuts and logging stuff off-heap

reborg14:09:30

try setting -XX:MaxDirectMemorySize and see if it stops OOM

thomas14:09:48

and if possible do a heap dump when you get an OOM. (and mount the volume from the docker container on the host, so you don't lose it)

otfrom15:09:53

@reborg I didn't know about MaxDirectMemorySize. Does that cope with all the different types of memory a JVM can consume?

reborg15:09:57

@otfrom off-heap memory gets the same as -Xmx if nothing else is specified. So assuming something is allocating off-heap, setting -XX:MaxDirectMemorySize to a reasonably low value would OOM without crashing the container (my guess)
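
To put numbers on that, here is a back-of-the-envelope Python sketch of the memory budget. It takes the statement above at face value (direct memory may grow up to the `-Xmx` value when no cap is set), reads the 3GB container limit as 3 GiB, and uses a 256 MiB cap purely as an example value; it leaves out metaspace, thread stacks, code cache and any agent overhead.

```python
def worst_case_mib(heap_mib: int, direct_mib: int, other_mib: int = 0) -> int:
    """Upper bound on JVM memory: heap + direct buffers + anything else."""
    return heap_mib + direct_mib + other_mib


CONTAINER_LIMIT_MIB = 3 * 1024  # assuming the 3GB limit means 3 GiB
HEAP_MIB = 2560                 # from -Xms2560m -Xmx2560m

# No cap set: direct memory is assumed to be allowed up to the heap size.
default_case = worst_case_mib(HEAP_MIB, direct_mib=HEAP_MIB)

# Hypothetical cap, e.g. -XX:MaxDirectMemorySize=256m (example value only).
capped_case = worst_case_mib(HEAP_MIB, direct_mib=256)

print(f"uncapped worst case: {default_case} MiB of {CONTAINER_LIMIT_MIB}")
print(f"capped worst case:   {capped_case} MiB of {CONTAINER_LIMIT_MIB}")
```

The uncapped worst case (5120 MiB) is well over the container limit, while the capped one (2816 MiB) fits, so with a cap the JVM should report an OutOfMemoryError itself before the kernel steps in, provided direct buffers are what is actually growing.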
