#aws
2022-07-28
Sathiya 10:07:18

Hi. I am using https://github.com/mcohen01/amazonica to connect to S3 to upload and download files. My requirements call for downloading a large number of files in parallel. We are running into "Unable to execute HTTP request: Timeout waiting for connection from pool". We would like to increase the max connections and reduce the max idle time, but we are not sure how to set this configuration. We tried setting properties in defclientconfig, but we are not sure we are setting them correctly; we get null pointer exceptions after that. Can someone help with this configuration change? Thanks in advance.

jumar 11:07:48

How many files are you downloading in parallel? And how exactly are you doing that?

Sathiya 11:07:58

We have multiple users downloading or uploading, so there is a constant flow of requests. There is also an option to select multiple files and download them; in that case, we spawn multiple threads and download in parallel.

jumar 11:07:33

We use claypoole's pmap like this and I haven't seen this problem yet:

(cp/pmap 100 #(s3/get-object bucket-name %) keys)

jumar 11:07:54

I guess your files are probably much bigger than ours (ours are quite small, usually a few kilobytes).

jumar 12:07:21

Do you know how many of them are running in parallel when the problem happens? If it's more than 100, I would think about introducing a queue and delaying new requests, because I guess you won't get better throughput by launching that many connections from a single machine. https://aws.amazon.com/blogs/developer/enabling-metrics-with-the-aws-sdk-for-java/ might be useful as they recommend https://github.com/aws/aws-sdk-java/issues/2663.
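
A minimal sketch of the "queue and delay new requests" idea, assuming a fixed-size thread pool is an acceptable way to cap concurrency. The names bounded-pmap and fake-download are hypothetical and only for illustration; fake-download stands in for a real call such as s3/get-object so the snippet runs without AWS credentials. With a pool of n threads, at most n downloads hold a connection at once, and the rest wait in the executor's queue instead of timing out on the connection pool.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

(defn bounded-pmap
  "Hypothetical helper: applies f to each item of coll using at most n
  threads. Extra items wait in the executor's queue. Returns results in
  the same order as coll; rethrows if any task failed."
  [n f coll]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n)]
    (try
      (let [tasks   (mapv (fn [x] (fn [] (f x))) coll) ; Clojure fns are Callables
            futures (.invokeAll pool tasks)]           ; blocks until all tasks finish
        (mapv (fn [^Future fut] (.get fut)) futures))
      (finally
        (.shutdown pool)))))

;; Counters used only to observe how many calls overlap.
(def in-flight (atom 0))
(def peak (atom 0))

(defn fake-download
  "Stand-in for a real download call; sleeps briefly to simulate I/O."
  [k]
  (let [n (swap! in-flight inc)]
    (swap! peak max n)
    (Thread/sleep 20)
    (swap! in-flight dec)
    (str "body-of-" k)))

;; At most 4 of the 20 simulated downloads run at the same time.
(def results (bounded-pmap 4 fake-download (range 20)))
```

Keeping n at or below the HTTP client's maximum connection count is the point of the exercise; the right value depends on file sizes and bandwidth, so it is worth measuring rather than guessing.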

Sathiya 12:07:09

Yeah, most of our files are over 10 MB, and some may be around 500 MB. Also, it's not a one-off: there will be constant hits in this size range. That's why we are considering increasing the connection pool size. Any ideas on how to achieve that with amazonica?

jumar 12:07:25

What I'm saying is that increasing the pool size might not be the best idea because it can already be quite big. Anyway, did you look at this? https://github.com/mcohen01/amazonica#client-configuration -> https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html

Sathiya 17:07:57

Yeah, but I'm not sure how to add it to the configuration and use it when calling s3/get-object.