Issue with Invalidating/Revoking external access tokens

We are using an external Identity provider to manage authentication and generation of access tokens. I have referred to https://docs.apigee.com/api-platform/security/oauth/use-third-party-oauth-system

So far I have been able to:
1. Generate the access token and store it in Apigee

2. Verify the access token in Apigee
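For reference, step 2 uses an OAuthV2 policy with the VerifyAccessToken operation; a minimal sketch (the policy name is illustrative) looks like:

```xml
<!-- Minimal verification policy; the name is illustrative -->
<OAuthV2 async="false" continueOnError="false" enabled="true" name="OAuthV2-Verify-Token">
    <Operation>VerifyAccessToken</Operation>
</OAuthV2>
```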

However, we have a use case where, if the user profile in the identity provider is updated, we need to revoke the existing access token stored in Apigee.

I have exposed a resource in my API proxy to receive profile-update events and perform the InvalidateToken operation. This is my OAuthV2 policy configuration:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<OAuthV2 async="false" continueOnError="false" enabled="true" name="OAuthV2-Revoke-Token">
    <DisplayName>OAuthV2-Revoke-Token</DisplayName>
    <Properties/>
    <Attributes/>
    <Operation>InvalidateToken</Operation>
    <Tokens>
        <Token type="accesstoken" cascade="true">request.header.access_token</Token>
    </Tokens>
    <SupportedGrantTypes/>
    <GenerateResponse enabled="true"/>
</OAuthV2>

I see an HTTP 200 response for the invalidate call (no errors or suspicious messages in the trace session). However, if I verify the access token after revoking it, verification still succeeds. Is there something I am missing here?

I have verified in the trace session that the access_token header has the right value.

I referred to the docs and a few other posts on the community, but I am not able to figure out what could be wrong.

https://community.apigee.com/questions/23500/invalidate-external-oauth-access-token.html


You may be experiencing a cache issue. Revoking the token removes it from the token store, but if the token has been used recently, it will still be present in the caches of the various message processors (MPs). A revocation does not synchronously notify all message processors in the network about the specific revocation event. Instead, the token is revoked in the backing store, and the cached token state in the various MPs remains until cache expiry, which is 180 seconds.

This 180s cache TTL is a "window of vulnerability" that is inherent to the distributed system. If you cannot tolerate it, you need to enforce a cache-less distributed consensus, which will slow your system down considerably at scale.

@Dino-at-Google, first of all, thanks for providing these intricate details, which are not explicitly mentioned in the docs.

Second, by "cache-less distributed consensus", do you mean that the token will not be stored/cached in Apigee, and that each time the token needs to be verified, the external auth provider's API must be called?

You're welcome, I'm glad to help. I think the reason these details are not in the docs is that we do not wish people to depend on the 180s, although I think it is appropriate to say that "token state may be cached".

Regarding the distributed store: there are multiple ways to implement a synchronized check. One would be to store a record in a database when the token is invalidated, and then check that database prior to validating any token. This obviously has performance implications, because you cannot tolerate dirty reads. I would not recommend it if you can avoid it.
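One way to sketch that synchronized check (the revocation-service URL and policy name are assumptions, not part of this answer) is a ServiceCallout that consults the database-backed service before the VerifyAccessToken step:

```xml
<!-- Hypothetical: check a revocation service before verifying the token.
     The URL and header handling are assumptions for illustration. -->
<ServiceCallout name="SC-Check-Revocation-DB">
    <Request variable="revocationCheckRequest">
        <Set>
            <Verb>GET</Verb>
            <Headers>
                <!-- Pass the incoming token to the (assumed) revocation service -->
                <Header name="access_token">{request.header.access_token}</Header>
            </Headers>
        </Set>
    </Request>
    <Response>revocationCheckResponse</Response>
    <HTTPTargetConnection>
        <URL>https://revocation-service.example.com/check</URL>
    </HTTPTargetConnection>
</ServiceCallout>
```

A RaiseFault policy conditioned on the callout response would then reject requests carrying revoked tokens before verification runs; the extra synchronous call on every request is the performance cost mentioned above.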

@Dino-at-Google Thanks again.

Do you think it is a good idea to have a custom cache resource in Apigee that can be looked up? Would that help? What I mean is: when the invalidate call is made, use the PopulateCache policy with a TTL of, say, 180s, adding the token as the cache key and value. Then, when performing the verify-token operation, use the LookupCache policy to fetch the entry and apply a conditional check based on it.

OR does the MP synchronization come into play for this as well?
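As a sketch of that idea (the policy names, cache resource name, and key prefix are assumptions), the revocation flow could populate a distributed cache entry and the verify flow could look it up:

```xml
<!-- On the revocation flow: record the invalidated token in a distributed
     cache. Names and the 180s TTL are illustrative. -->
<PopulateCache name="PC-Mark-Token-Revoked">
    <CacheResource>revoked-tokens</CacheResource>
    <Scope>Exclusive</Scope>
    <CacheKey>
        <Prefix>revoked</Prefix>
        <KeyFragment ref="request.header.access_token"/>
    </CacheKey>
    <ExpirySettings>
        <TimeoutInSec>180</TimeoutInSec>
    </ExpirySettings>
    <Source>request.header.access_token</Source>
</PopulateCache>

<!-- On the verify flow: look up the token; a cache hit means it was revoked. -->
<LookupCache name="LC-Check-Token-Revoked">
    <CacheResource>revoked-tokens</CacheResource>
    <Scope>Exclusive</Scope>
    <CacheKey>
        <Prefix>revoked</Prefix>
        <KeyFragment ref="request.header.access_token"/>
    </CacheKey>
    <AssignTo>revokedTokenEntry</AssignTo>
</LookupCache>
```

A RaiseFault policy with a condition on the looked-up variable (e.g. `revokedTokenEntry != null`) would then reject the request before the VerifyAccessToken policy runs.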

That would work. The cache in Apigee Edge is distributed, and multiple MPs in the same datacenter (AZ) will see cached items "almost immediately."

So, why not use that?

The information will propagate through the clusters of MPs more slowly. If you have Apigee SaaS provisioned across multiple AZs (as every SaaS org is) or multiple regions, the cache synchronization goes through a data persistence layer. But the cached item will still propagate, in O(milliseconds).

So using a cache does narrow the window of vulnerability significantly.

If you have a PCI or HIPAA organization, this approach will not work globally; you would need some other mechanism to propagate the information. With PCI or HIPAA orgs, the inter-datacenter cache propagation via the data persistence layer is disabled.