Edge Microgateway Quota Plugin

williamssean · 04-05-2017

What is a quota and why does it matter? A quota restricts the number of requests an application can make to the Microgateway. For example, you could create a quota to allow one application to make 100 requests per month, and then create a second quota to allow another application to make 200 requests per month. Quotas are set up when you create an Edge product, and that quota is enforced locally on the Microgateway with the quota plugin.

In this article I will explain how the Edge Microgateway quota plugin works. The quota plugin is actually distributed across Microgateway instances (parent processes), because the quota is created in Edge and updated asynchronously for every request to the Microgateway. The best explanation for the distributed counter is found in our documentation for the quota policy.

To understand how the Quota policy works, you need to understand what an Edge Message Processor (MP) is. The MP is the process that executes all the policies in Edge. Typically you have two or more within your Edge organization. Assume for now that you only have two message processors. The Edge Quota policy has a distributed setting that can be set to true or false. If the value is set to false, then the quota counter is NOT distributed, meaning that each MP has its own counter. So if you set your quota limit to 10 requests, then the developer will be able to make 20 requests in total: 10 requests on one MP and 10 requests on the other.

<Distributed>true</Distributed>
or

<Distributed>false</Distributed>

However, if the distributed flag is set to true, then there is only one counter shared across both MPs. Each developer should receive about 10 requests. I say "should" and "about" because there is some latency between counter updates. It is not an exact counter. The quota policy for Edge Microgateway has the distributed flag set to true.
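The difference between the two settings can be illustrated with a small simulation. This is only a sketch of the counting logic described above, not Edge's implementation: the `Counter` class, the `allowed_requests` function, and the round-robin spread of requests across MPs are assumptions made for illustration, and the sketch ignores the update latency that makes the real distributed counter approximate.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """A plain request counter with a fixed limit (hypothetical helper)."""
    limit: int
    used: int = 0

    def try_consume(self) -> bool:
        # Allow the request only while the counter is below the limit.
        if self.used < self.limit:
            self.used += 1
            return True
        return False


def allowed_requests(limit: int, mp_count: int, distributed: bool, attempts: int) -> int:
    """Count how many of `attempts` requests pass when spread round-robin over MPs."""
    if distributed:
        shared = Counter(limit)
        counters = [shared] * mp_count  # every MP consults the same counter
    else:
        counters = [Counter(limit) for _ in range(mp_count)]  # one counter per MP
    return sum(counters[i % mp_count].try_consume() for i in range(attempts))


print(allowed_requests(limit=10, mp_count=2, distributed=False, attempts=30))  # 20
print(allowed_requests(limit=10, mp_count=2, distributed=True, attempts=30))   # 10
```

With the flag set to false, the two MPs together let 20 requests through against a limit of 10; with a shared counter the total stays at 10.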

If you are a Google managed cloud customer, then you cannot change the distributed flag. If you manage Edge in your private/public cloud, then you can change this value. However, I do not recommend it, as this would further complicate the quota plugin operation. If you were to set the distributed flag to false, then each MP would get its own quota counter; each Microgateway child process also has a quota counter. Ultimately, this would lead to significantly exceeding your quota limit.

Now I'll discuss how the Edge Microgateway works. Microgateway starts in cluster mode by default, which means that each parent process will start two or more child processes depending on the number of cores on your system. Every time you start the Microgateway, each child process effectively has its own quota counter. This is the reason the quota counter is stored in Edge: to "gently" enforce the quota across all Microgateway child processes. It is not quite exact, so there is some leeway between the actual quota set in Edge and how that quota is applied to each request locally on the Microgateway.

Imagine that you are running the Microgateway in Pivotal Cloud Foundry or Cloud Foundry, its open source counterpart. Cloud Foundry is a microservices platform that enables companies to quickly build and scale applications. It works by loading your app into a container, which can be scaled independently of other Cloud Foundry apps. You can run the Microgateway within each container of the target service or as a Cloud Foundry app via the Apigee Service Broker. Depending on the scenario, you could have dozens of Microgateway instances (parent processes) running within Cloud Foundry. Remember, each parent process spawns two or more child processes.

One would expect that if:

you are using the same quota identifier, typically the application ID, across all requests AND
the quota type is flexi AND
the quota distributed flag is set to true (so all MPs share the quota counter)

then the quota would be consistently enforced across all requests to Microgateway, regardless of how many instances are running. That's what you would expect! But that is not the case!

Why?

part of the reason is that Edge Microgateway sends the request asynchronously to Edge
there are multiple child processes and each child process gets its own counter
Pivotal Cloud Foundry containers could crash, losing the local counter
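To see why asynchronous updates let extra requests through, here is a rough simulation. It is a sketch under stated assumptions, not the plugin's code: the `simulate` function is hypothetical, requests are dispatched round-robin to child processes, every request (allowed or not) is reported to Edge, and each child process only learns the global count from its own responses, which arrive `lag` requests later.

```python
from collections import deque


def simulate(limit: int, workers: int, lag: int, requests: int) -> int:
    """Return how many requests are allowed when Edge responses arrive `lag` steps late."""
    edge_used = 0                # authoritative counter held by Edge
    known_used = [0] * workers   # each child process's last-seen value of that counter
    pending = deque()            # (arrival_step, worker, value) responses still in flight
    allowed = 0
    for step in range(requests):
        # Deliver the responses whose simulated delay has elapsed.
        while pending and pending[0][0] <= step:
            _, w, value = pending.popleft()
            known_used[w] = max(known_used[w], value)
        worker = step % workers  # round-robin dispatch (assumption)
        # The child process decides using only what it currently knows.
        if known_used[worker] < limit:
            allowed += 1
        # The update goes to Edge out-of-band; the answer comes back later.
        edge_used += 1
        pending.append((step + lag, worker, edge_used))
    return allowed


print(simulate(limit=10, workers=1, lag=0, requests=30))  # 10
print(simulate(limit=10, workers=2, lag=4, requests=30))  # 13
```

With a single child process and no delay the limit is enforced exactly; with two child processes and a delay of four requests, three extra requests slip through before the local views catch up.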

What is a "flexi" Quota?

Edge Microgateway's quota uses the flexi type. This type of quota starts a counter when the first request is received and resets the counter based on the interval specified. For example, if you set up a product quota for 10 requests per minute, then a quota of type flexi will start counting when the first request is received and reset the counter every minute. Quotas also require an identifier to track the number of requests; typically this is the client ID. However, the Microgateway uses the Edge application ID.
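A minimal sketch of the flexi behavior, assuming the window opens on the first request and a new window opens on the first request after the interval has elapsed. The `FlexiQuota` class is hypothetical and the exact reset rule inside Edge may differ; the clock is passed in explicitly so the example is deterministic.

```python
class FlexiQuota:
    """Hypothetical flexi-style counter: the window starts with the first request."""

    def __init__(self, allow: int, interval_seconds: float):
        self.allow = allow
        self.interval = interval_seconds
        self.window_start = None  # no window until the first request arrives
        self.used = 0

    def apply(self, now: float) -> bool:
        # Open a window on the first request, or a fresh one once the old one expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.used = 0
        self.used += 1
        return self.used <= self.allow


quota = FlexiQuota(allow=10, interval_seconds=60)
results = [quota.apply(now=100 + i) for i in range(12)]  # 12 requests, one per second
print(results.count(True), results.count(False))  # 10 2
print(quota.apply(now=160))  # True: 60 seconds after the first request, the counter resets
```

Note that the window is anchored to the first request (t=100 here), not to the top of the calendar minute.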

If you are a Google managed cloud customer, then you cannot change the quota type. If you manage Edge in your private/public cloud, then you can change the quota type. If you decide to change the type, then I would strongly encourage you to load test it to ensure it behaves according to your expectations.

Testing the Quota Plugin

While preparing this article, I performed a few tests on the Microgateway - the first one running two Microgateway instances on my local machine, the second running Microgateway within PCF Dev.

For the first test,

I started two Microgateway instances
each instance has 2 child processes
one instance listening on port 8000 and the other listening on port 8001
set the quota limit to 1 request every minute
all my requests use the same JWT, which means that the same application ID was used for both Microgateway instances

I sent one request to port 8000 and one request to port 8001 and both requests returned 200 response codes. When I sent the second request to 8000 and 8001 respectively, both requests succeeded, but the third request to both instances failed.

What this shows is that each child process has its own counter even though there is a global counter stored in Edge. This is true when I execute my test one at a time in a very controlled environment. What happens if I use Gatling to generate my requests and run the Microgateway in Cloud Foundry? Well, that was my second test.
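The outcome of this first test can be reproduced with a toy model of one Microgateway instance. This is only an illustration under assumptions: the `run_instance` function is hypothetical, the parent process is assumed to hand requests to its child processes in round-robin order, the Edge counter is ignored because it had not yet caught up in such a short test, and 403 is used for the rejection because that is the status the Gatling runs later in this article report for quota violations.

```python
def run_instance(limit: int, child_processes: int, requests: int) -> list:
    """Return the status code of each request sent to one Microgateway instance."""
    counts = [0] * child_processes  # one local quota counter per child process
    outcomes = []
    for i in range(requests):
        child = i % child_processes  # round-robin dispatch (assumption)
        counts[child] += 1
        outcomes.append(200 if counts[child] <= limit else 403)
    return outcomes


# Quota of 1 request per minute, 2 child processes, 3 requests to the same port.
print(run_instance(limit=1, child_processes=2, requests=3))  # [200, 200, 403]
```

Each child process allows its first request, so the instance as a whole allows two before the third lands on a child process whose counter is already used up.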

Each application in Cloud Foundry runs in a separate container, and we have an edgemicro-decorator that will start the Microgateway in the same container as the target application. Each Microgateway still has two child processes running.

I started a single container, with one Microgateway instance
that instance has 2 child processes
Microgateway listens on port 8080
target service listens on port 8090
set the quota limit to 1 request every minute

I used Gatling to generate 1 request every second, for 60 seconds. All the requests use the same application ID, so the Edge quota counter should be applied across all Microgateway instances. The results are below. They show that out of the 60 requests sent to the Greeting resource, 48 failed due to a quota violation. This implies that even though each child process has its own counter, they all rely on the quota response from Edge. It also demonstrates the imprecise nature of the quota plugin.

---- Requests ------------------------------------------------------------------
> Global                                                   (OK=72     KO=48    )
> Authenticate                                             (OK=60     KO=0     )
> Greeting                                                 (OK=12     KO=48    )
---- Errors --------------------------------------------------------------------
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b     48 (100.0%)
ut actually found 403

Here are the results from the second time I ran it. This time 17 requests were allowed through the Microgateway and 31 were rejected by the quota. The 502s and 404s occurred because my CF application crashed and was restarted during the run.

2017-04-06 14:27:31                                          62s elapsed
---- Requests ------------------------------------------------------------------
> Global                                                   (OK=77     KO=43    )
> Authenticate                                             (OK=60     KO=0     )
> Greeting                                                 (OK=17     KO=43    )
---- Errors --------------------------------------------------------------------
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b     31 (72.09%)
ut actually found 403
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b      8 (18.60%)
ut actually found 404
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b      4 ( 9.30%)
ut actually found 502

Next, I scaled Edge Microgateway to 2 instances in Cloud Foundry and sent 1 request every second for 60 seconds. This time only 4 requests were allowed through both instances of the Microgateway.

2017-04-06 14:32:29                                          62s elapsed
---- Requests ------------------------------------------------------------------
> Global                                                   (OK=64     KO=56    )
> Authenticate                                             (OK=60     KO=0     )
> Greeting                                                 (OK=4      KO=56    )
---- Errors --------------------------------------------------------------------
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b     56 (100.0%)
ut actually found 403

Next, I doubled the amount of traffic to Microgateway: 2 requests every second for 60 seconds. This time 13 requests were allowed through the Microgateways. The majority of the other requests were rejected with a quota violation (403); the 502 response codes occurred because one of my Cloud Foundry containers failed and was restarted.

---- Requests ------------------------------------------------------------------
> Global                                                   (OK=133    KO=107   )
> Authenticate                                             (OK=120    KO=0     )
> Greeting                                                 (OK=13     KO=107   )
---- Errors --------------------------------------------------------------------
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b    105 (98.13%)
ut actually found 403
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), b      2 ( 1.87%)
ut actually found 502

Quota Plugin Request/Response

See our docs for more specific instructions on how to install and configure the Microgateway. This section provides a high-level summary of the quota plugin request to Edge and its subsequent response.

For the quota plugin to work correctly, the oauth plugin must be enabled and the quota plugin must be placed after the oauth plugin.

plugins:
    sequence:
      - oauth
      - quota

You have to obtain a JWT from Edge first.

curl -X POST "http://api.enterprise.apigee.com:9001/edgemicro-auth/token" -H "Content-type: application/json" -d '{"client_id":"","client_secret":"","grant_type":"client_credentials"}'

The quota is based on the Edge application identifier, which is stored in the JWT. After you add the quota plugin to the sequence, you need to restart the Microgateway so that it loads the quota plugin and executes it as part of the request flow. Alternatively, you could wait 10 minutes for the Microgateway to automatically reload the config file.

When the Microgateway receives the first request, it creates a new quota in Edge. The Microgateway sends a POST request to the Edge resource below to update and return the quota counter. This request is sent asynchronously to Edge and the Microgateway child process counter is updated out-of-band.

POST /quotas/organization/demo/environment/prod/v2/quotas/apply

{"identifier":"e711deac-4d17-4f0f-9f4b-e072b0505c0b","weight":2,"interval":1,"allow":60,"timeUnit":"minute","quotaType":"flexi"}

The response is shown below. You can see that it contains the total number of requests allowed and what has been used so far.

{
    "allowed": 60,
    "used": 34,
    "exceeded": 0,
    "available": 26,
    "expiryTime": 1491422767427,
    "timestamp": 1491422740656
}
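The request and response above can be read programmatically as follows. This is a sketch, not the plugin's code: the `build_apply_payload` and `summarize` helpers are hypothetical, and treating `expiryTime` and `timestamp` as epoch milliseconds whose difference is the time left in the current window is an assumption based on the sample values.

```python
import json


def build_apply_payload(app_id: str, allow: int, interval: int,
                        time_unit: str, weight: int = 1) -> dict:
    """Assemble a body shaped like the sample quota request."""
    return {"identifier": app_id, "weight": weight, "interval": interval,
            "allow": allow, "timeUnit": time_unit, "quotaType": "flexi"}


def summarize(response: dict) -> dict:
    """Pull the useful numbers out of a quota response."""
    remaining_ms = response["expiryTime"] - response["timestamp"]  # assumed epoch millis
    return {
        "available": response["available"],
        "consistent": response["allowed"] - response["used"] == response["available"],
        "seconds_until_reset": remaining_ms / 1000,
    }


payload = build_apply_payload("e711deac-4d17-4f0f-9f4b-e072b0505c0b",
                              allow=60, interval=1, time_unit="minute", weight=2)
print(json.dumps(payload))

sample_response = json.loads("""
{"allowed": 60, "used": 34, "exceeded": 0, "available": 26,
 "expiryTime": 1491422767427, "timestamp": 1491422740656}
""")
print(summarize(sample_response))
# {'available': 26, 'consistent': True, 'seconds_until_reset': 26.771}
```

In the sample, 34 of the 60 allowed requests have been used, leaving 26, and under the stated assumption the window has roughly 27 seconds left.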

Since the quota plugin is dependent on Edge to store the quota, it is important that the connection between Edge and the Microgateway is fairly stable. What happens if the internet connection is down? Microgateway will rely on its cached configuration so it will continue to operate; however, the quota will not be applied because it will not receive a response from the POST request. This means that all requests to Edge Microgateway will be allowed.
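This "fail open" behavior can be summarized in a few lines. The sketch below is an assumption-laden illustration rather than the plugin's logic: the `should_allow` function is hypothetical, a missing response is modeled as `None`, and comparing `used` with `allowed` is my reading of the response fields shown earlier.

```python
from typing import Optional


def should_allow(edge_response: Optional[dict]) -> bool:
    """Decide whether to let a request through based on the latest answer from Edge."""
    if edge_response is None:
        # No answer from Edge (for example, the connection is down): the quota
        # cannot be applied, so the request is allowed.
        return True
    # Assumption: the request is within quota while `used` has not passed `allowed`.
    return edge_response["used"] <= edge_response["allowed"]


print(should_allow({"allowed": 60, "used": 34}))  # True: within quota
print(should_allow({"allowed": 60, "used": 61}))  # False: quota exceeded
print(should_allow(None))                         # True: Edge unreachable, fail open
```

If you need requests to be rejected when Edge is unreachable, that has to be handled elsewhere; the quota plugin described here does not do it.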

Summary

The quota plugin can be used to generally limit the number of requests per application flowing through the Microgateway. However, keep in mind the following points:

Microgateway starts in cluster mode by default and each running instance will have 2 or more child processes running, depending on the number of cores in your machine
each child process effectively gets its own counter
although the quota is defined in Edge, the quota plugin does not enforce it exactly; so don't depend on it to strictly limit all application requests to your defined quota
Microgateway uses a "flexi" Edge quota, so the counter starts when the first request is received and resets based on the interval specified
the Edge quota is distributed, which means the counter is shared across all of your MPs
a consistent connection to Edge is required for the Quota plugin to work correctly; if the connection is unavailable, then all requests will be allowed through the Microgateway