This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-05-10
I’m getting this error and having trouble authenticating to access dynamoDB 🧵
{:__type com.amazon.coral.service#UnrecognizedClientException, :message The security token included in the request is invalid., :cognitect.anomalies/category :cognitect.anomalies/incorrect}
I have a role with STS and DynamoDB permissions, as well as an `AWS_WEB_IDENTITY_TOKEN_FILE`. I’m trying to use those to fetch credentials.
Here is some code that shows how it’s fetching credentials:
(defn- default-credentials-provider []
  (let [provider (DefaultAWSCredentialsProviderChain.)
        credentials ^AWSCredentials (.getCredentials provider)
        access-key-id (.getAWSAccessKeyId credentials)
        secret-access-key (.getAWSSecretKey credentials)]
    (credentials/basic-credentials-provider
     {:access-key-id access-key-id
      :secret-access-key secret-access-key})))
(defn assumed-role-credentials-provider
  "make a credentials provider that can assume a role"
  [role-arn web-identity-token]
  (let [sts (aws/client {:api :sts
                         :credentials-provider (default-credentials-provider)})]
    (credentials/cached-credentials-with-auto-refresh
     (reify credentials/CredentialsProvider
       (fetch [_]
         (when-let [creds (:Credentials
                           (aws/invoke sts
                                       {:op :AssumeRoleWithWebIdentity
                                        :request {:RoleArn role-arn
                                                  :WebIdentityToken web-identity-token
                                                  :RoleSessionName (str (gensym "some-session-"))}}))]
           {:aws/access-key-id (:AccessKeyId creds)
            :aws/secret-access-key (:SecretAccessKey creds)
            :aws/session-token (:SessionToken creds)
            ::credentials/ttl (credentials/calculate-ttl creds)}))))))
(defn create-provider
  []
  (assumed-role-credentials-provider (System/getenv "AWS_ROLE_ARN")
                                     (slurp (System/getenv "AWS_WEB_IDENTITY_TOKEN_FILE"))))
This is very similar to the assume_role_example in the docs, https://github.com/cognitect-labs/aws-api/blob/main/examples/assume_role_example.clj, but I’m going the `AssumeRoleWithWebIdentity` route.
Then when trying to access dynamodb:
(def dynamodb-client
  (aws/client {:api :dynamodb
               :region "us-east-1"
               :credentials-provider (create-provider)}))
with a simple list table command
(defn list-tables
  []
  (aws/invoke dynamodb-client {:op :ListTables}))
I get the error in the original post. It seems like it has fetched credentials, but the token is invalid. I’ve also seen the error `Unable to fetch credentials. See log for more details.`, so it’s a step above that at least.
This may not be a Clojure issue but a deployment issue, where there may be a disconnect or something that is preventing me from authenticating to AWS ¯\_(ツ)_/¯ But if someone has experienced this, let me know.
Which version of aws-api, and which version of data.xml?
I am using `[com.cognitect.aws/api "0.8.656"]` and had `[org.clojure/data.xml "0.0.8"]`, but I just removed the data.xml dependency, which I wasn’t really using directly.
So you're all set at this point?
It’s authenticating now, but I’m having trouble with one of the operations, `TransactGetItems`:
(def c (aws/client {:api :dynamodb
                    :region "us-east-1"}))
(aws/invoke c
            {:op :TransactGetItems
             :request {:TransactItems [{:Get {:TableName "my-table"
                                              :Key {:pk {:S "some-id"}}}}]}})
I’m pretty sure I have the shape correct in the request, but I’m still getting this error.
`pk` is the name of the column for my partition key, so that `:Key` value seems right.
{:__type "com.amazonaws.dynamodb.v20120810#TransactionCanceledException", :CancellationReasons [{:Code "ValidationError", :Message "The provided key element does not match the schema"}], :Message "Transaction cancelled, please refer cancellation reasons for specific reasons [ValidationError]", :cognitect.anomalies/category :cognitect.anomalies/incorrect}
Looks like with this op, you need the sk as well… I think what I want here is `BatchGetItem`, so disregard the above.
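The `ValidationError` in the cancellation reasons is what DynamoDB reports when `:Key` does not name every attribute of the table's primary key. A minimal sketch of the same request with both key attributes, assuming the sort key attribute is literally named `sk` and holds a string (the thread confirms neither); `transact-get-request` is a hypothetical helper, not part of aws-api:

```clojure
(defn transact-get-request
  "Builds the op map passed to aws/invoke. Hypothetical helper; assumes the
  table's key schema is a string partition key `pk` plus a string sort key `sk`."
  [table-name pk sk]
  {:op      :TransactGetItems
   :request {:TransactItems
             [{:Get {:TableName table-name
                     :Key       {:pk {:S pk}     ; partition key
                                 :sk {:S sk}}}}]}}) ; sort key (assumed name)

;; The resulting map would be passed as (aws/invoke c <map>).
(transact-get-request "my-table" "some-id" "some-sort-key")
```

Note that `GetItem` and `BatchGetItem` also expect the full primary key; when only the partition key is known, `Query` is the operation that accepts it on its own.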
While I got this to work on EKS, I’m now seeing `The security token included in the request is invalid` after long-running pods, ~24h. It seems that maybe the way I’m fetching the credentials provider is incorrect? What can cause this to stop working after some time? I’ve verified that the `AWS_WEB_IDENTITY_TOKEN_FILE` got refreshed and has a new token…
Take a look at this gist to see if it's helpful to you, I wrote it a couple years ago to solve a similar problem: https://gist.github.com/gws/130ad8bfec5495c25c3dbc0ed2a69d42
Hey gordon, thanks for the reply! I will definitely take a look. What is baffling is that after 2d7h of the pods running, I just hit my API expecting it to fail with the error, but it didn’t… I’m scratching my head over here.
Took a look at the gist and I’m basically doing the same thing, except I’m explicitly using the `AssumeRoleWithWebIdentity` API call instead of running through the chain-credentials-provider. Would it be beneficial to have it go through the different credentials provider options, knowing that the web token refreshes correctly and is a valid form of obtaining temporary creds to access AWS services? I definitely can try that, as that is my last option left. It did fail again btw 😢 It’s just odd that it fails after running for a long time, which tells me something is caching and not refreshing.
The only thing I can think of is the `cached-credentials-with-auto-refresh` that may not be refreshing?
I'm pretty sure you're pulling the token out of `AWS_WEB_IDENTITY_TOKEN_FILE` once when you construct the provider.
Hmm ok, interesting, I’ll change it up similarly to how it’s done in the gist and hope this one works. 🤞
Sorry to be late to this thread. @UPFES57NE did @U1GEY70F5’s suggestion help? (Thanks @U1GEY70F5!)
Hey scott, thanks for replying. So far, so good. My pods have been running for ~36h and are still accessing AWS services. The real test will be tomorrow, to see if it’s still working.
Curious, though, why the place where the `slurp` sits in gordon’s code versus the code snippet I posted makes a difference.
@UPFES57NE
> I’ve verified that the `AWS_WEB_IDENTITY_TOKEN_FILE` got refreshed and has a new token…
So the key thing, then, is that the `AWS_WEB_IDENTITY_TOKEN_FILE` file has been updated. But your program snippet is only `slurp`ing the `AWS_WEB_IDENTITY_TOKEN_FILE` once, at the start, whereas @U1GEY70F5’s example re-`slurp`s the `AWS_WEB_IDENTITY_TOKEN_FILE` every time the periodic refresh triggers a new fetch of credentials from the STS service. That fetch of a new temporary credential from STS is going to need the up-to-date web identity token.
Yea that makes sense, I would’ve thought that the `slurp` at the top-level function, when it gets called, should suffice.
Like when accessing a service like dynamo, when creating the client, it calls the `create-provider` fn where the `AWS_WEB_IDENTITY_TOKEN_FILE` gets `slurp`ed:
(defn create-provider
  []
  (assumed-role-credentials-provider (System/getenv "AWS_ROLE_ARN")
                                     (slurp (System/getenv "AWS_WEB_IDENTITY_TOKEN_FILE"))))
(def dynamodb-client
  (aws/client {:api :dynamodb
               :region "us-east-1"
               :credentials-provider (create-provider)}))
It may be that your dynamodb client just hasn't happened to fail yet; I think the same situation applies. The `slurp` results in a copy of the `AWS_WEB_IDENTITY_TOKEN_FILE` being brought into memory, like a snapshot of the file at that moment. But if that file is later being updated (and I'm assuming that's happening by some mechanism outside the scope of your program that you've posted), then the in-memory value is out of date and another `slurp` needs to happen.
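The snapshot behaviour described here can be seen with plain Clojure and no AWS involved. This is only an illustrative sketch: a temp file stands in for `AWS_WEB_IDENTITY_TOKEN_FILE`, and the names `token-file`, `snapshot`, and `current-token` are made up for the example.

```clojure
;; A temp file stands in for AWS_WEB_IDENTITY_TOKEN_FILE (assumption for the demo).
(def token-file
  (doto (java.io.File/createTempFile "web-identity" ".token")
    (.deleteOnExit)))

(spit token-file "token-v1")

;; Snapshot: the file is read once, right here. The string never changes afterwards.
(def snapshot (slurp token-file))

;; Re-read: the file is read every time this function is called.
(defn current-token [] (slurp token-file))

;; Something outside the program rotates the token.
(spit token-file "token-v2")

(println "snapshot:" snapshot)        ; still the contents from before the rotation
(println "re-read: " (current-token)) ; the contents after the rotation
```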
I guess I’m assuming that every request that comes in would call `create-provider` and re-`slurp` the token file, regardless of where `slurp` is storing the token. It may seem like that’s not the case if @U1GEY70F5’s code still works tomorrow.
So in a way, having the `slurp` at the entry function that creates a provider is almost like caching the token for all incoming requests when the app initializes. And having the `slurp` at the low-level operation, where it’s fetching creds, will actually read the token file…
What is also weird is that in the early stages of developing this application, I had a `delay` in the clients:
(def dynamodb-client
  (delay
    (aws/client {:api :dynamodb
                 :region "us-east-1"
                 :credentials-provider (create-provider)})))
And I would notice that it would fail between 48h and 55h, which tells me that it `slurp`ed the new token, and the invocation/operation would be like:
(defn query
  []
  (aws/invoke @dynamodb-client
              {:op :Query
               :request {…}}))
I had it working and kept developing the application, which would re-deploy new changes; it wasn’t until I had a stable version that the security token issue arose.
> I guess I’m assuming that every request that comes in, would call the create-provider ...
The `create-provider` function is only called once, at the time the client is created. That function returns an object that reifies the `CredentialsProvider` protocol, and it's the `fetch` function of that protocol that gets invoked with every request.
In other words, the credential provider is an object that repeatedly fetches new credentials...but the provider itself was created once and initialized with data (the `role-arn` and the `web-identity-token`) whose values do not change (in the current implementation).
The workaround that @U1GEY70F5 has suggested is, instead of passing around a web-identity-token value, to pass around a java.io.File reference to the `AWS_WEB_IDENTITY_TOKEN_FILE` and re-read that file (via `slurp`) whenever `fetch` is invoked, ensuring your program always has the most recent contents of that file.
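A rough sketch of that workaround, reworking the `assumed-role-credentials-provider` posted earlier in the thread. It is not the code from the gist, which is not reproduced here. The function name `web-identity-credentials-provider` and the namespace `example.web-identity` are made up, and the STS client is assumed to be built elsewhere and passed in as `sts`. Only aws-api calls that already appear in the thread are used.

```clojure
(ns example.web-identity
  (:require [cognitect.aws.client.api :as aws]
            [cognitect.aws.credentials :as credentials]))

(defn web-identity-credentials-provider
  "Sketch: takes the token file (path or java.io.File) rather than its
  contents, and reads it inside fetch. `sts` is an existing STS client."
  [sts role-arn token-file]
  (credentials/cached-credentials-with-auto-refresh
   (reify credentials/CredentialsProvider
     (fetch [_]
       ;; Re-read the token on every fetch so a rotated file is picked up.
       (let [token (slurp token-file)
             resp  (aws/invoke sts
                               {:op      :AssumeRoleWithWebIdentity
                                :request {:RoleArn          role-arn
                                          :WebIdentityToken token
                                          :RoleSessionName  (str (gensym "some-session-"))}})]
         (when-let [creds (:Credentials resp)]
           {:aws/access-key-id     (:AccessKeyId creds)
            :aws/secret-access-key (:SecretAccessKey creds)
            :aws/session-token     (:SessionToken creds)
            ::credentials/ttl      (credentials/calculate-ttl creds)}))))))

(defn create-provider
  "Passes the file path, not the slurped token, to the provider."
  [sts]
  (web-identity-credentials-provider
   sts
   (System/getenv "AWS_ROLE_ARN")
   (System/getenv "AWS_WEB_IDENTITY_TOKEN_FILE")))
```

`create-provider` is still called only once per client. The difference is that the value it captures is now a path whose contents are read at fetch time, instead of a token string read at construction time.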
Gotcha! That makes total sense! Man, I’ve been banging my head trying to figure out where it’s being “cached”, cause that’s what it seemed like it was doing.
I think the use of `delay` would only postpone the problem, the problem being that sooner or later a `fetch` is attempted with a web identity token value that no longer matches what's in the `AWS_WEB_IDENTITY_TOKEN_FILE` file.
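The reason `delay` only postpones things is that its body runs once, on the first deref, and the result is cached from then on. A small sketch in plain Clojure, where the names `constructions` and `client` and the placeholder map are invented for the illustration:

```clojure
;; Counts how many times the delayed body actually runs.
(def constructions (atom 0))

(def client
  (delay
    (swap! constructions inc)
    ;; Stands in for (aws/client ...) and the token read at construction time.
    {:token-snapshot "token-v1"}))

;; Nothing has run yet: delay defers the body until the first deref.
(println "constructions before deref:" @constructions)

@client
@client

;; The body ran exactly once; the second deref returned the cached value.
(println "constructions after two derefs:" @constructions)
```

So with `delay`, the token would be read at the first request instead of at load time, but it is still read only once.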
Just want to update/maybe close the loop on this: my pods have been running for 2d13h, and they’re nice and healthy. Thank you @U1GEY70F5 and @U07PUGBA6 for chiming in on this issue.