Quota Asynchronous Behavior

// Quota types

If we want the quota counter to start when the first request reaches the API, we can use rolling or flexi. With 6 MPs (message processors), for example, the counter then starts at a different time for each MP. If we use calendar instead, all the MP counters start at the same time, from the start time you configure (the start time can be in the past as well). So with calendar we know that all MPs start the clock/counter at the same time.
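For reference, a calendar quota with an explicit start time might be declared roughly as follows. This is only a sketch: the policy name and the StartTime value are placeholders, and the timestamp format is assumed to follow the Edge Quota policy documentation.

```xml
<!-- Sketch only: the name and StartTime value are placeholders. -->
<Quota name="quota-calendar-sketch" type="calendar">
    <Allow count="10"/>
    <Interval>1</Interval>
    <TimeUnit>minute</TimeUnit>
    <!-- Assumed format: yyyy-MM-dd HH:mm:ss. A past date is allowed. -->
    <StartTime>2017-01-01 00:00:00</StartTime>
    <Identifier ref="request.header.clientId"/>
</Quota>
```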

//AsynchronousConfiguration

When we enable asynchronous mode, even though all the MPs start their counters at the same time, the MPs do not sync with (look up on) Cassandra at the same time. Each MP syncs every 10 seconds (the default), counted from the point the policy is deployed to that MP, and the deployment time can differ from MP to MP. So there is a good chance that the MPs will not all go to Cassandra at the same time, and you might not see the expected results. Even the connection strategy (round robin or least connections) at the load balancer can add to the inconsistent behaviour of the quota.

Considering all the above factors that can change the expected behavior of the quota, what is the recommended best practice for using AsynchronousConfiguration?

cc @argo any tips?

5 REPLIES

For rolling and flexi, the counter starts differently for each identifier (which is usually an app, but doesn't have to be), not for each MP. In fact, this config has nothing to do with MPs. Is it possible that this is the root cause of your confusion?

Also, the main advantage of async quota is that the total number of requests to C* (Cassandra) over a period will be much lower than with sync quota. Sync will go to C* on every request to update the counters (AFAIK), whereas async will go once every configured period. Every MP will obviously make its C* connections individually.
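To make the sync versus async distinction concrete, here is a rough sketch of the synchronous variant. The policy name is a placeholder; the identifier and limits are borrowed from elsewhere in this thread. Setting Synchronous to false and adding an AsynchronousConfiguration block would give the async variant instead.

```xml
<!-- Sketch only: the policy name is a placeholder. -->
<Quota name="quota-sync-sketch" type="flexi">
    <Allow count="10"/>
    <Interval>1</Interval>
    <TimeUnit>minute</TimeUnit>
    <Identifier ref="request.header.clientId"/>
    <!-- Share one counter across all MPs through the central store (C*). -->
    <Distributed>true</Distributed>
    <!-- Update the shared counter as part of each request. -->
    <Synchronous>true</Synchronous>
</Quota>
```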

@Ozan Seymen, I posted about the types based on my previous tests, but after your reply I did a quick test and saw that the types are per identifier, not per MP, as you said. Thanks for confirming.

That was not a real problem for me, but it is good to know that it is working as expected.

Coming back to the problem that I have: below is my policy, and I have 5 MPs.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Quota async="false" continueOnError="true" enabled="true" name="quotawork" type="calendar">
    <DisplayName>Quota-1</DisplayName>
    <FaultRules/>
    <Properties/>
    <Allow count="10"/>
    <Identifier ref="request.header.clientId"/> 
    <Interval>1</Interval>
    <Distributed>true</Distributed>
    <Synchronous>false</Synchronous>
    <TimeUnit>minute</TimeUnit>
    <AsynchronousConfiguration>
        <SyncIntervalInSeconds>10</SyncIntervalInSeconds>    
    </AsynchronousConfiguration>
</Quota> 

With the above config I could not identify any pattern, and I see intermittent 403s in between the 200s, something like the output in the txt file.

I think the problem lies in the minute quota being configured without the PreciseAtSecondsLevel setting.

From the documentation:

"Set to true to have the quota record to a precision of, or at intervals of, one second.

For example, use this setting when you have a quota with the TimeUnit element set to minutes and you need to ensure that the quota is counted and enforced by the second.

Set to false to have the quota record to a precision of, or at intervals of, one minute."

I would generally recommend setting this to true if you are doing a minute quota. But also beware of the amount of data that the MPs will need to write to C*.

As a quick test, can you just change minute to hour and repeat your test?
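As a sketch, the policy posted earlier in this thread with the setting switched on might look like the following. Everything except the PreciseAtSecondsLevel line is carried over from that policy, and the placement of the element as a direct child of Quota is assumed from the documentation quoted above.

```xml
<!-- Sketch only: based on the policy posted earlier in this thread. -->
<Quota continueOnError="true" enabled="true" name="quotawork" type="calendar">
    <Allow count="10"/>
    <Identifier ref="request.header.clientId"/>
    <Interval>1</Interval>
    <TimeUnit>minute</TimeUnit>
    <Distributed>true</Distributed>
    <Synchronous>false</Synchronous>
    <!-- Record the quota at one-second precision instead of one minute. -->
    <PreciseAtSecondsLevel>true</PreciseAtSecondsLevel>
    <AsynchronousConfiguration>
        <SyncIntervalInSeconds>10</SyncIntervalInSeconds>
    </AsynchronousConfiguration>
</Quota>
```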

Yes, I want to use PreciseAtSecondsLevel and always go with flexi, as it will start counting from the point the client sends its first request, unlike calendar. Will check that and let you know. Thanks again.

@Ozan Seymen, I spent some time on this today, and below is what I observed. We have an LB in front of the routers sending traffic based on least connections, so we really don't know in advance which MP will be serving the traffic.

(Load in case of synchronous) > (load in case of async based on message count) > (load in case of async based on time)

(Upper limit allowed + no inconsistency) < (upper limit allowed + inconsistency) < (upper limit allowed + inconsistency)

Load = load on Cassandra. Upper limit = number of requests allowed beyond the configured allow count. Inconsistency = number of 403s in between 200s before all calls start returning 403.

With the quota set to 30 TPM, I observed the following for 6 MPs:

Asynchronous, 10 sec: 52 calls, with 15 intermediate 403s before all calls started returning 403.

Asynchronous, 5 sec: 44 calls, with 3 intermediate 403s.

Asynchronous, 5 req: 44 calls, no intermittent 403s.

I want to go with async based on message count. With the time-based setting, the last 10 seconds before all the MPs sync with C* can allow more calls, whereas with async based on messages we know beforehand how many calls will be allowed in the worst case.
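A minimal sketch of the message-count-based setup from the "5 req" test above, assuming the SyncMessageCount element of AsynchronousConfiguration is what drives it. The policy name is a placeholder; the limit of 30 per minute and the count of 5 come from the test, and flexi is used as discussed earlier in the thread.

```xml
<!-- Sketch only: the policy name is a placeholder. -->
<Quota name="quota-async-msgcount-sketch" type="flexi">
    <Allow count="30"/>
    <Interval>1</Interval>
    <TimeUnit>minute</TimeUnit>
    <Identifier ref="request.header.clientId"/>
    <Distributed>true</Distributed>
    <Synchronous>false</Synchronous>
    <AsynchronousConfiguration>
        <!-- Each MP syncs with C* after this many messages, not on a timer. -->
        <SyncMessageCount>5</SyncMessageCount>
    </AsynchronousConfiguration>
</Quota>
```

With this shape, the number of requests an MP can accept between syncs is capped by the message count, which is what makes the worst case predictable.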