An AWS IAM Identity Center vulnerability
The short version: AWS IAM Identity Center exchanges third-party OIDC tokens for Identity Center-issued tokens. Identity Center relies on the jti claim in the third-party tokens to prevent replay attacks. However, Identity Center maintained a cache of previously seen jti values for only a fixed period (24 hours) and didn’t enforce that the third-party tokens had expiry claims. This meant that a token with a jti claim and without an exp claim could be replayed once more than 24 hours had passed.
AWS now enforces that these third-party OIDC tokens include an exp claim.
I feel this was a relatively minor issue, because a) few legitimate IdPs issue tokens that never expire and b) the affected CreateTokenWithIAM API also required AWS IAM authorization (unlike the public CreateToken API).
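To make the mechanism concrete, here is a toy model of the behaviour described above. It is only a sketch: the class, function and field names are hypothetical, and nothing here reflects how Identity Center is actually implemented. The only facts taken from this post are the 24-hour cache of jti values, the missing exp enforcement, and the fix of requiring an exp claim.

```python
from dataclasses import dataclass, field

CACHE_TTL_SECONDS = 24 * 60 * 60  # the fixed retention period described in the post


@dataclass
class TokenExchange:
    """Toy model (hypothetical) of a token exchange with jti-based replay prevention."""

    require_exp: bool  # False models the old behaviour, True the fixed one
    seen_jtis: dict = field(default_factory=dict)  # jti -> time first seen

    def exchange(self, claims: dict, now: float) -> bool:
        """Return True if the third-party token would be accepted at time `now`."""
        # Evict cache entries older than the fixed TTL.
        self.seen_jtis = {
            jti: seen_at
            for jti, seen_at in self.seen_jtis.items()
            if now - seen_at < CACHE_TTL_SECONDS
        }
        exp = claims.get("exp")
        if exp is None and self.require_exp:
            return False  # fixed behaviour: tokens must carry an expiry
        if exp is not None and now >= exp:
            return False  # expired tokens are not accepted
        jti = claims.get("jti")
        if jti is None or jti in self.seen_jtis:
            return False  # missing jti, or a replay within the cache window
        self.seen_jtis[jti] = now
        return True


def replay_outcomes(require_exp: bool) -> list:
    """Present the same exp-less token at t=0, t=1h and t=25h."""
    service = TokenExchange(require_exp=require_exp)
    token = {"sub": "alice", "jti": "j1"}  # note: no exp claim
    hour = 3600
    return [service.exchange(token, now=t) for t in (0, 1 * hour, 25 * hour)]


if __name__ == "__main__":
    print("old behaviour:", replay_outcomes(require_exp=False))
    print("exp enforced: ", replay_outcomes(require_exp=True))
```

In the old-behaviour run the replay at one hour is rejected, but the replay at 25 hours is accepted because the jti has already been evicted from the cache. With exp enforced, the exp-less token is rejected every time.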
Story
Around AWS re:Invent 2023, Amazon launched a handful of features relating to making it easier for customers to access their cloud data in a scalable and auditable way. These features span IAM Identity Center (formerly AWS SSO), S3, Athena, Glue, Lake Formation and other services.
The APIs, SDKs and documentation were released over a period of a couple of weeks. I dived in straight away, because I’m interested in any changes to authentication, authorisation or audit logs in AWS. Documentation was a little sparse in the beginning, so I had to learn how to use these APIs through trial and error. Interestingly, had I waited a few days, the documentation would have been more complete, so I wouldn’t have needed to experiment and likely wouldn’t have found this issue.
Specifically, IAM Identity Center’s trusted identity propagation docs at the time didn’t yet specify that the jti claim in the OIDC token was required.
I had created a hand-written OIDC IdP to trial the functionality - that felt easier than trying to sign up and learn how to use Okta/Microsoft Entra/other IdPs. Through a process of trial and error[1] I learned that my tokens needed a jti claim and that it had to be unique each time. I also learned that they didn’t need iat (issued-at time), nbf (not-before time) or exp (expiry time) claims.
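For illustration, a token of the shape described above (a unique jti, and no iat, nbf or exp) could be minted roughly like this. This is a minimal sketch, not the IdP used in this post: the function names and the issuer, subject and audience values are made up, and HS256 with a shared secret is used only so that the example needs nothing beyond the standard library. A real OIDC IdP would most likely sign with an asymmetric key (for example RS256) and publish the public key for verification.

```python
import base64
import hashlib
import hmac
import json
import uuid


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_token(issuer: str, subject: str, audience: str, secret: bytes) -> str:
    """Build a compact JWT with a fresh jti and no iat/nbf/exp claims."""
    header = {"alg": "HS256", "typ": "JWT"}  # assumption: HS256 as a stand-in
    claims = {
        "iss": issuer,
        "sub": subject,
        "aud": audience,
        "jti": str(uuid.uuid4()),  # must be unique for every token
        # deliberately no iat, nbf or exp claims
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def decode_claims(token: str) -> dict:
    """Decode the payload segment without verifying the signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


if __name__ == "__main__":
    # All values below are placeholders.
    token = mint_token("https://idp.example.com", "alice", "example-audience", b"not-a-real-secret")
    print(decode_claims(token))
```

Each call produces a new jti, which is what the uniqueness requirement needs; re-sending a previously minted token re-uses its jti and is what replay prevention is meant to catch.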
I got distracted by a different issue[2] (wherein the AWS docs provided an incorrect ARN) and chased down a suspicion that there was a security issue with that. That theory went nowhere, so it took a few days (and chatting with peers) to realise that the combination of relying on unique jti claims and not requiring expiry claims could only work if IAM Identity Center maintained a cache of every jti it had ever seen, forever. So I created a token, exchanged it and saved it to disk. Two days later I tried it again and it worked again. The docs said there was replay prevention, so this should have failed: bug confirmed.
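The experiment above can be sketched as follows. This is an assumption-laden outline rather than the script actually used: the helper names, the file name and the application ARN placeholder are hypothetical. The only real API referenced is the boto3 sso-oidc client’s create_token_with_iam call with the JWT bearer grant type. A stub client that accepts everything stands in for AWS so the sketch runs without credentials; it does not model replay prevention.

```python
import tempfile
from pathlib import Path

JWT_BEARER_GRANT = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def exchange(client, application_arn: str, assertion: str) -> bool:
    """Return True if the token exchange is accepted, False if it is rejected."""
    try:
        client.create_token_with_iam(
            clientId=application_arn,
            grantType=JWT_BEARER_GRANT,
            assertion=assertion,
        )
    except Exception as error:  # with boto3 this would be a botocore ClientError
        print(f"rejected: {error}")
        return False
    return True


def first_exchange(client, application_arn: str, token: str, path: Path) -> bool:
    """Save the token to disk for a later replay, then exchange it once."""
    path.write_text(token)
    return exchange(client, application_arn, token)


def replay(client, application_arn: str, path: Path) -> bool:
    """Re-submit the saved token; working replay prevention should reject it."""
    return exchange(client, application_arn, path.read_text())


class StubClient:
    """Hypothetical stand-in for the AWS client; it accepts every request."""

    def __init__(self):
        self.calls = []

    def create_token_with_iam(self, **kwargs):
        self.calls.append(kwargs)
        return {"accessToken": "stub"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = Path(tmp) / "token.jwt"
        client = StubClient()  # against real AWS: boto3.client("sso-oidc")
        arn = "EXAMPLE_APPLICATION_ARN"  # placeholder, not a real ARN
        print("first exchange accepted:", first_exchange(client, arn, "header.payload.sig", saved))
        # In the real experiment, two days passed between these two calls.
        print("replay accepted:", replay(client, arn, saved))
```

Against the real service, a replay that returns True after the cache window has passed is the behaviour reported in this post.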
If you like sequence diagrams (who doesn’t?), here’s one that represents the flow.
The problem was that the OIDC JWT token named j1 could be replayed after 24 hours:
Remediation
I emailed the AWS Security team with this finding on December 1st, while re:Invent was still on. They got back to me less than 12 hours later with an initial (human) confirmation of my email. I got a follow-up email on December 9th to confirm that they were still actively investigating the issue. On December 15th, I got an update to say that fixes for this issue had been deployed to all regions. That’s a very quick turnaround for a small issue reported during re:Invent.
Thanks
Thanks to Ian Mckay, Nathan Glover and Brandon Sherman for sanity-checking this blog post. It was even harder to read before their helpful advice.
Footnotes
1: My tokens kept getting rejected, so I was randomly trying all kinds of permutations: tokens with aud claims and without sub claims, tokens with both aud and sub claims, tokens with a sub claim and no exp claim, etc.
It was slow work. I eventually tried adding jti and nbf claims - and it worked! I then tried without an nbf claim - it failed. It took a while before I realised that it had failed because I had re-used the jti, not because of the missing timestamp claims.
2: I also solved this issue through trial and error by trying all plausible ARNs. I eventually stumbled on one that worked and tweeted about it. On November 29th I received an email from the AWS Security Outreach team to say that they’d seen my tweet, fixed all the docs and that the correct ARN was in fact something different. How many giant corporations can ever react that quickly to a documentation issue, let alone during their annual conference? Super impressive.