LookupCache retrieves inconsistent values on multiple requests.

harrycho
Participant I

I have a fairly simple PopulateCache policy in one of my proxies, as follows. It simply caches the current timestamp for a given user, specified in the KeyFragment.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PopulateCache async="false" continueOnError="false" enabled="true" name="blacklist-user-PC">
    <DisplayName>blacklist-user-PC</DisplayName>
    <Properties/>
    <CacheKey>
        <KeyFragment ref="blacklist.user_id"/>
    </CacheKey>
    <CacheResource>user_blacklist</CacheResource>
    <Scope>Global</Scope>
    <ExpirySettings>
        <TimeoutInSec>86400</TimeoutInSec>
    </ExpirySettings>
    <Source>blacklist.timestampInSec</Source>
</PopulateCache>

Then in another proxy, I have a LookupCache policy that retrieves the cached value where the KeyFragment is also the user.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LookupCache async="false" continueOnError="false" enabled="true" name="lookup-cache-debug-LC">
    <DisplayName>lookup-cache-debug-LC</DisplayName>
    <Properties/>
    <CacheKey>
        <KeyFragment ref="request.formparam.user_id"/>
    </CacheKey>
    <CacheResource>user_blacklist</CacheResource>
    <Scope>Global</Scope>
    <AssignTo>flowVar</AssignTo>
</LookupCache>

When I hit the PopulateCache proxy 3 times with the same key fragment, I get ascending timestamp values, and the last one should be stored, replacing the older values.

However, when I call the LookupCache proxy multiple times, I get differing values rather than the last stored value.

I initially thought this was because of availability over consistency in the CAP theorem, but it never seems to become eventually consistent.

I confirmed that LookupCache always uses the same "cachekey" and "cachename" in the trace UI.

For example:

store A 1 
store A 2
store A 3

// (order may differ)

get A -> 2 (expecting 3)
get A -> 1 (expecting 3)
get A -> 3


invalidate A

// (order may differ)

get A -> 1 (expecting cache miss)
get A -> (cache miss)
get A -> 2 (expecting cache miss)
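
One way to observe the mismatch above without opening the trace UI is to echo the lookup result back in response headers. The following is only a sketch: the policy name `debug-cache-headers-AM` and the `X-Cache-*` header names are made up for illustration, and it assumes the policy is attached in the response flow after `lookup-cache-debug-LC` has run. The `lookupcache.{policy-name}.cachehit` and `.cachekey` flow variables are the ones the LookupCache policy populates.

```xml
<!-- Hypothetical debug policy: name and header names are assumptions -->
<AssignMessage name="debug-cache-headers-AM">
    <Set>
        <Headers>
            <!-- true or false, depending on whether the lookup found an entry -->
            <Header name="X-Cache-Hit">{lookupcache.lookup-cache-debug-LC.cachehit}</Header>
            <!-- the full key the lookup actually used -->
            <Header name="X-Cache-Key">{lookupcache.lookup-cache-debug-LC.cachekey}</Header>
            <!-- flowVar is the AssignTo variable from the LookupCache policy above -->
            <Header name="X-Cache-Value">{flowVar}</Header>
        </Headers>
    </Set>
    <!-- on a cache miss flowVar is unset, so do not fail on it -->
    <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
    <AssignTo createNew="false" type="response"/>
</AssignMessage>
```

With this in place, repeated "get" calls show directly whether each response was a hit, which key was used, and which value came back.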

Any idea what's going on here?

Thanks

9 REPLIES

I don't have an explanation for what you're reporting. I've never seen that.

Can you give further details? Apigee SaaS or OPDK? What version if OPDK?

Do you have a simple pushbutton test?

a script that would load in an API proxy bundle and then invoke it and demonstrate the problem?

Thanks for getting back to me.

This is on Apigee SaaS evaluation account (harrycho-eval). This behavior doesn't seem to be reproducible on a paid account. In addition, I found another community post that describes this behavior (https://community.apigee.com/questions/26920/cache-update-time.html).

I have simple API endpoints that call PopulateCache and LookupCache.

This proxy basically gets the current Apigee timestamp and saves it in a cache with a provided key (cachekey).

curl -X POST \
  'http://harrycho-eval-test.apigee.net/test_cache/set?cachekey=a' \
  -H 'content-length: 0'

Then, to get the cached value:

curl -X GET 'http://harrycho-eval-test.apigee.net/test_cache/get?cachekey=a'

Here's also a zip file for the proxy described above.

test-cache-rev1-2019-09-30-1.zip

Thanks.

Hi Harry

Are you saying that it happens consistently for you?

Thanks for the proxy, let me try it out.

Oops - it appears to have been deleted or something? I can no longer download that zip file. Can you re-upload it?

Yeah, it's consistent on my evaluation account but not on the paid account. Unfortunately, I couldn't figure out why the two differ in caching behavior.

ok I looked at your proxy and it looks pretty good. Nothing I would change.

It's unfortunate that you're seeing this behavior. It might be due to a stale entry.

Have you tried these:

  1. waiting a day, to let the cache expire?
  2. or ... forcing the overwrite by Populating the cache with a new value that expires in 10 seconds?
  3. Or you could include a prefix in the cache key. That would avoid the problem.

Thanks for confirming my proxy. At least that relieves my worry that I was doing something odd.

As for your suggestions, unfortunately my use case is that I put a key-value pair in cache and the value would be replaced depending on some event. If no events were triggered, then let the cache expire.

I tried calling InvalidateCache just before re-populating, and the behavior seems to be the same; now some of the "get" requests return no value and some return previous values.
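
For reference, an InvalidateCache policy for the cache in the original question might look like the sketch below. The policy name `invalidate-user-IC` is hypothetical; the cache resource, scope, and key fragment are copied from the PopulateCache policy at the top of the thread, on the assumption that the invalidation has to resolve to exactly the same key as the entry it is meant to remove.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Hypothetical policy name; key, resource, and scope mirror blacklist-user-PC -->
<InvalidateCache async="false" continueOnError="false" enabled="true" name="invalidate-user-IC">
    <DisplayName>invalidate-user-IC</DisplayName>
    <CacheKey>
        <!-- must match the KeyFragment used when the entry was populated -->
        <KeyFragment ref="blacklist.user_id"/>
    </CacheKey>
    <CacheResource>user_blacklist</CacheResource>
    <Scope>Global</Scope>
</InvalidateCache>
```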

I also think it's something related to stale entries. As noted in the other post, I think it's an inconsistency in the L1 cache across the underlying message processors (MPs). One of the suggestions there was to manually restart the MPs if Apigee is OPDK, but no solution was provided for SaaS.

I really appreciate your time taking a look into this issue.

"As for your suggestions, unfortunately my use case is that I put a key-value pair in cache and the value would be replaced depending on some event. If no events were triggered, then let the cache expire."

Yes, I figured. That's a standard cache use case, I guess.

The ideas I offered do not preclude that usage.

Idea #1 is to just wait, and then re-test. Maybe you already tried that. I noticed the entry expiry is 86400 now; that's 24 hours. If you previously used a very long expiry, then the cache entry may not expire within a day. Just because the expiry in the policy currently says "24 hours", don't assume the entry that is IN the cache is set to expire in 24 hours.

Idea #2 is forcing an overwrite by temporarily overwriting the value.

Eg, run this ONCE

<PopulateCache name='Cache-Populate-1'>
  <CacheResource>cache1</CacheResource>
  <Source>request.queryparam.value</Source>
  <Scope>Application</Scope>
  <CacheKey>
    <!--  <Prefix>fixedPrefix</Prefix> -->
    <KeyFragment ref='request.queryparam.cachekey' />
  </CacheKey>
  <ExpirySettings>
    <TimeoutInSec>10</TimeoutInSec>
  </ExpirySettings>
</PopulateCache>

And then revert that policy and use 86400 again. (BTW during development I usually keep the expiry much lower than 1 day, so I can actually verify expiry in a reasonable interval.)

Idea #3 is to just use a different key.

Eg

<PopulateCache name='Cache-Populate-1'>
  <CacheResource>cache1</CacheResource>
  <Source>request.queryparam.value</Source>
  <Scope>Application</Scope>
  <CacheKey>
    <!--  <Prefix>fixedPrefix</Prefix> -->
    <KeyFragment ref='apiproxy.name' />
    <KeyFragment ref='request.queryparam.cachekey' />
  </CacheKey>
  <ExpirySettings>
    <TimeoutInSec>86400</TimeoutInSec>
  </ExpirySettings>
</PopulateCache>
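
If the key is changed this way, the lookup side has to build the same key, or every lookup will miss. A matching LookupCache policy might look like the sketch below; the policy name `Cache-Lookup-1` and the variable `cached.value` are assumptions for illustration. Note also that `apiproxy.name` resolves to the name of the proxy that is executing, so this particular fragment only lines up when the populate and lookup policies run in the same proxy. For the two-proxy setup in the original question, a fixed `Prefix` or a literal fragment would be the safer choice.

```xml
<!-- Hypothetical lookup counterpart: same resource, scope, and key fragments -->
<LookupCache name='Cache-Lookup-1'>
  <CacheResource>cache1</CacheResource>
  <Scope>Application</Scope>
  <CacheKey>
    <!-- fragments must appear in the same order as in Cache-Populate-1 -->
    <KeyFragment ref='apiproxy.name' />
    <KeyFragment ref='request.queryparam.cachekey' />
  </CacheKey>
  <!-- assumed variable name for the retrieved value -->
  <AssignTo>cached.value</AssignTo>
</LookupCache>
```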

None of these prevent you from doing what you hope to do.

I tried my own test and it works as expected.

See attached.

test-this.zip

If you have a cache behaving incorrectly on a supported (paid) org, then you can contact support.

If you observe a cache behaving incorrectly on an evaluation organization, we can look into it if you have a reproducible test case.