Intermittent failure - Lookup cache - specific to Production environment only

Hi All,

A similar question may already exist here:
https://community.apigee.com/questions/29179/intermittent-issues-with-lookup-cache-in-productio.html

But I am raising another one to provide more details.

On Prem : Version 4.18.01.00

2 Message Transaction Processors

I have written an API built around a transient resource that is meant to last just a few minutes.

I don't care about each and every interaction with the API - just the final state, which I send to a backend API that persists the resource.

To manage this transient resource I am using an application-level cache.

  1. On POST /resource, I create a unique cache key using uuid in a Python script policy, then PopulateCache stores some JSON against that key.
    RESPONSE :
    Header -> Location /resource/{key}
  2. The user makes a few updates using PATCH /resource/{key}: LookupCache, apply the change, then PopulateCache to write the updated value back to the cache (existing JSON + patch changes).
  3. Once all the changes are complete, I send the JSON to the backend, then InvalidateCache (key).
  4. Any further call to /resource/{key} returns a 404.
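
The four steps above can be sketched roughly as follows. This is only an illustration of the intended lifecycle, not the proxy itself: a plain Python dict stands in for the Apigee cache, the function names and the 200 status codes are my own placeholders, and `send_to_backend` is a hypothetical callable representing the backend call.

```python
import json
import uuid

# Stand-in for the application-level cache. In the real proxy this role is
# played by the PopulateCache / LookupCache / InvalidateCache policies.
cache = {}


def create_resource(body):
    """Step 1 (POST /resource): store the JSON under a fresh uuid key."""
    key = str(uuid.uuid4())
    cache[key] = json.dumps(body)              # PopulateCache
    return {"Location": "/resource/" + key}, key


def patch_resource(key, changes):
    """Step 2 (PATCH /resource/{key}): lookup, apply change, write back."""
    cached = cache.get(key)                    # LookupCache
    if cached is None:
        return 404, None                       # cache miss
    merged = json.loads(cached)
    merged.update(changes)                     # existing JSON + patch changes
    cache[key] = json.dumps(merged)            # PopulateCache
    return 200, merged                         # 200 is an assumed status


def confirm_resource(key, send_to_backend):
    """Step 3: send the final state to the backend, then invalidate the key."""
    cached = cache.get(key)                    # LookupCache
    if cached is None:
        return 404, None
    final_state = json.loads(cached)
    send_to_backend(final_state)               # hypothetical backend call
    del cache[key]                             # InvalidateCache
    return 200, final_state
```

With a single shared dict, step 4 falls out naturally: after `confirm_resource`, any further `patch_resource` call for the same key returns 404, and no earlier call ever misses.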

(Use case scenario: think of a shopping cart keyed by a session ID.)

I don't care how many things a user puts in or takes out of their basket, or how many times.

But when they say "confirm", I just need to know the final state of that basket.

Unlike a shopping cart, though, the whole interaction should last about 2-3 minutes at most.

The whole thing works fine in a pre-prod environment.

In production, though, I get cache misses between requests 1, 2, 3 and 4.

The API returns a 404 because the LookupCache policy reports that it couldn't find the entry in the cache.

But if the same request is replayed - it works!

I try it again - can't find it, 404.

Try again - yup, found it.

It's very intermittent and unstable.
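
To put a number on how intermittent it is, something like the sketch below could replay the same request and tally the status codes. `send_request` is a hypothetical callable that wraps whatever HTTP client is in use and returns the status code; nothing here is specific to Apigee.

```python
from collections import Counter


def tally_replays(send_request, attempts=20):
    """Replay the same request `attempts` times and count each status code.

    `send_request` is a hypothetical zero-argument callable that issues the
    request and returns its HTTP status code as an int.
    """
    counts = Counter()
    for _ in range(attempts):
        counts[send_request()] += 1
    return counts
```

A roughly even split between 404 and 200 would suggest the result depends on which processor serves the request, whereas a random spread would point elsewhere.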

In pre-production I've got no issues whatsoever - it works just fine.

It's not a load issue either - the production instance is in a controlled alpha test, so we are talking about just one transaction at a time that we are trying to get through.

I think there is something amiss in the way it's configured, but I'm not sure where to start looking.

Based on https://docs.apigee.com/api-platform/cache/cache-internals.html - it is my understanding that regardless of which Message Transaction Processor receives the request - the cache should still be available.

Is there a service or something else that needs to be running to keep the cache in sync?

Any recommendations on what steps to follow when troubleshooting?

1 REPLY

I'd recommend you open a ticket with support to get their assistance in diagnosing and troubleshooting this. It's possible they'll be able to guide you on how to turn on debug logging in the MP to see and verify the behavior of the cache, etc. and that may clarify what's happening behind the scenes, whether it is a bug, and so on.

Also, if you can, update to 4.19.01.