I've got a very odd problem with the latest version of aws-api, presumably introduced by the change from Cognitect's HTTP client to the native Java HTTP client: https://github.com/cognitect-labs/aws-api/blob/main/UPGRADE.md#08711--2024-12-03 The problem is that AWS API calls hang; in our case it's License Manager's list-received-licenses API. The strange thing is that it works fine on localhost (when using profile credentials), but it hangs on an EC2 instance where we rely on the default credentials provider. I would be super grateful for any ideas on what might be wrong. More details in the thread.
@jumar we believe this issue is now fixed in v 0.8.774 /cc @marcobiscaro2112 https://clojurians.slack.com/archives/C015AL9QYH1/p1756838674496359
This is roughly the code we have in place
(require '[cognitect.aws.client.api :as aws])
(require '[cognitect.aws.credentials :as credentials])
;; http-client and aws-profile are defined elsewhere in our code
(def license-manager-api
  (aws/client {:api :license-manager
               :credentials-provider (credentials/default-credentials-provider http-client)
               #_(credentials/profile-credentials-provider aws-profile)}))
(aws/invoke license-manager-api {:op :ListReceivedLicenses})
This never completes. A thread dump suggests that another thread is blocked while reading EC2 metadata, and I think that's why the main operation stays pending forever:
"async-thread-macro-1" #58 daemon prio=5 os_prio=0 cpu=1.90ms elapsed=1013.14s tid=0x0000ffff600582f0 nid=0x169 waiting on condition [0x0000ffff19dfd000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@17.0.14/Native Method)
- parking to wait for <0x00000000d0ec73a8> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(java.base@17.0.14/LockSupport.java:211)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.14/AbstractQueuedSynchronizer.java:715)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@17.0.14/AbstractQueuedSynchronizer.java:1047)
at java.util.concurrent.CountDownLatch.await(java.base@17.0.14/CountDownLatch.java:230)
at clojure.core$promise$reify__8621.deref(core.clj:7257)
at clojure.core$deref.invokeStatic(core.clj:2337)
at clojure.core$deref.invoke(core.clj:2323)
at clojure.core.async$fn__43145.invokeStatic(async.clj:138)
at clojure.core.async$fn__43145.invoke(async.clj:127)
at cognitect.aws.ec2_metadata_utils$get_response_data.invokeStatic(ec2_metadata_utils.clj:63)
at cognitect.aws.ec2_metadata_utils$get_response_data.invoke(ec2_metadata_utils.clj:62)
at cognitect.aws.ec2_metadata_utils$IMDSv2_token.invokeStatic(ec2_metadata_utils.clj:157)
at cognitect.aws.ec2_metadata_utils$IMDSv2_token.invoke(ec2_metadata_utils.clj:148)
at cognitect.aws.region$instance_region_IMDS_v2_provider$reify__49949.fetch(region.clj:112)
at cognitect.aws.region$fn__49916$G__49912__49918.invoke(region.clj:24)
at cognitect.aws.region$fn__49916$G__49911__49921.invoke(region.clj:24)
at clojure.core$some.invokeStatic(core.clj:2718)
at clojure.core$some.invoke(core.clj:2709)
at cognitect.aws.region$chain_region_provider$reify__49927.fetch(region.clj:37)
at cognitect.aws.region$fn__49916$G__49912__49918.invoke(region.clj:24)
at cognitect.aws.region$fn__49916$G__49911__49921.invoke(region.clj:24)
at cognitect.aws.util$fetch_async$fn__49627$fn__49628.invoke(util.clj:297)
- locked <0x00000000d20cf868> (a cognitect.aws.region$chain_region_provider$reify__49927)
at cognitect.aws.util$fetch_async$fn__49627.invoke(util.clj:296)
at clojure.core.async$thread_call$fn__43264.invoke(async.clj:487)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.14/ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.14/ThreadPoolExecutor.java:635)
at java.lang.Thread.run(java.base@17.0.14/Thread.java:840)
Previously, we used these dependencies:
com.cognitect.aws/api {:mvn/version "0.8.692"}
com.cognitect.aws/endpoints {:mvn/version "1.1.12.718"}
com.cognitect.aws/license-manager {:mvn/version "847.2.1365.0"}
And I'm 100% sure it worked with those a few months ago.
For completeness, here are the versions we use right now:
com.cognitect.aws/api {:mvn/version "0.8.730-beta01"}
com.cognitect.aws/endpoints {:mvn/version "871.2.30.11"}
com.cognitect.aws/license-manager {:mvn/version "871.2.29.35"}
(It's quite difficult for me to roll back and confirm the old behavior because there were lots of changes in the meantime.)
UPDATE: I managed to revert to the old aws-api version, "0.8.692", and that fixes the problem. Of course, I had to switch back to Cognitect's http-client, but I suppose the Java HTTP client isn't the problem on its own, because Cognitect's client also blocks when used with the latest aws-api version.
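For anyone hitting the same hang, here is a rough, untested sketch of two things that might be worth trying while debugging. `invoke-with-timeout` is a hypothetical helper (not part of aws-api) that bounds how long a call may block, so the process can at least log a timeout instead of waiting forever. The commented-out usage also passes an explicit `:region`, on the assumption that this keeps the client from consulting the region provider chain that shows up in the thread dump. The region value is a placeholder, and this may not help if the credentials lookup blocks on EC2 metadata as well.

```clojure
;; Sketch only: invoke-with-timeout is a hypothetical helper, not an aws-api function.
(defn invoke-with-timeout
  "Runs the zero-arg function f on another thread and waits at most
  timeout-ms for its result. Returns ::timeout if f has not finished."
  [f timeout-ms]
  (let [fut    (future (f))
        result (deref fut timeout-ms ::timeout)]
    (when (= result ::timeout)
      ;; Best effort: interrupt the worker thread so it does not linger.
      (future-cancel fut))
    result))

(comment
  (require '[cognitect.aws.client.api :as aws])

  ;; Assumption: with an explicit :region the client does not need the
  ;; default region provider chain. "us-east-1" is a placeholder.
  (def license-manager-api
    (aws/client {:api    :license-manager
                 :region "us-east-1"}))

  ;; Gives up after 30 seconds instead of blocking indefinitely.
  (invoke-with-timeout
    #(aws/invoke license-manager-api {:op :ListReceivedLicenses})
    30000))
```

A `::timeout` result here would only confirm that the call is stuck; the thread dump is still needed to see where.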
We saw similar issues but did not investigate thoroughly; we just rolled back.
Great to hear that, thanks!
I created a github issue: https://github.com/cognitect-labs/aws-api/issues/267