Limit for concurrent API calls in Apigee Edge

Hello everyone. 

We have a specific use case: implementing throttling of API calls in one of our proxies in Apigee Edge. The requirement is to allow only 10 concurrent API calls at a time, with any additional calls placed in a queue.

We tried the Quota and Spike Arrest policies, but encountered issues with both. For Quota, using second as the time unit is not allowed (or at least not recommended), and for Spike Arrest, we observed inconsistent 429 errors during load testing.

For example, we tried setting a rate of 10 per second, which per the documentation should be enforced as 1 request every 100 ms. But we are seeing 2 or more successful API calls within a single 100 ms window. We suspect this is because of the distribution across the message processors, but we would like to confirm that as well.

Are there any better workarounds to accomplish this use case? In addition, is queueing of requests possible in Apigee Edge?

Thanks.

The SpikeArrest in Apigee Edge is not suitable for that task. 

There is a SpikeArrest in Apigee X which can enforce per-second rate limits. However, it is not designed to limit concurrent calls, and you would need to upgrade to Apigee X to get that behavior.

One Q: WHY do you need to limit it to 10 concurrent calls? 

Thank you for the response @dchiesa1 

We need to set this specific limit on the consumer calls because, as part of the process, we are calling a third-party target URL which involves cost.

May we ask what's the best way that you can recommend to handle this use case in Apigee Edge? Our organization is not going to upgrade to Apigee X. 

Thanks.

May we ask what's the best way that you can recommend to handle this use case in Apigee Edge?

You can route the request from Apigee Edge out to the eventual target via a WAF or load balancer that does the kind of rate limiting you prefer. For example, Google Cloud Armor can perform rate limiting in this way. It can be applied to an external Application Load Balancer (which can accept inbound requests from "the internet", including your own Apigee Edge system), and that load balancer can use an external system, something running on the internet, as its upstream. (Instructions for that load balancer scenario here.)
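
As a rough sketch of what that could look like (the policy name, rule priority, and thresholds below are placeholders, not something from this thread), a Cloud Armor throttle rule limiting clients to roughly 10 requests per second might be created like this:

    # hypothetical names; attach the policy to the external ALB's backend service
    gcloud compute security-policies rules create 1000 \
        --security-policy=throttle-upstream \
        --src-ip-ranges="0.0.0.0/0" \
        --action=throttle \
        --rate-limit-threshold-count=10 \
        --rate-limit-threshold-interval-sec=1 \
        --conform-action=allow \
        --exceed-action=deny-429 \
        --enforce-on-key=ALL

Note this is rate limiting, not a strict cap on concurrent in-flight calls; over-limit requests get a 429 rather than being queued.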

There are other managed load balancers or WAFs from third parties that enforce rate limiting; Akamai, for example. I don't know much about those products.

Or you could use a manage-it-yourself approach with something like YARP or nginx or Envoy or similar. If you manage it yourself, you will need to worry about redundancy, cross-region failover, resiliency, and consistency, so it's not a simple problem, but you could try it. If you don't need multi-region or >99% availability, you can greatly simplify the problem by storing the counters in memory: no consistency issue, and monitoring is a much lower burden.
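
To make the nginx option concrete, here is a minimal sketch (the zone name, port, and upstream host are placeholders). nginx's limit_conn directive caps concurrent in-flight requests, which maps directly to the 10-concurrent-calls requirement, with the counters held in memory as described above:

    # inside the http {} block; names and addresses are placeholders
    limit_conn_zone $server_name zone=upstream_conc:1m;  # one shared counter

    server {
        listen 8080;
        location / {
            limit_conn upstream_conc 10;   # at most 10 concurrent requests
            limit_conn_status 429;         # reply 429 (default is 503) when over
            proxy_pass https://third-party.example.com;  # the costly target
        }
    }

Note that nginx rejects over-limit requests rather than queueing them.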

Before you go this route, make sure you have your SpikeArrest policy configured properly: UseEffectiveCount set to true, and so on. With UseEffectiveCount set to true, the configured rate is divided across the active message processors; when it is false, each message processor applies the full rate independently, which can produce exactly the multiple-successes-per-window behavior you observed.
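
For reference, a minimal Spike Arrest configuration along those lines might look like this (the policy name is a placeholder):

    <SpikeArrest name="SA-LimitUpstream">
        <!-- ~10 requests per second, smoothed to ~1 request per 100ms -->
        <Rate>10ps</Rate>
        <!-- divide the rate across the active message processors, so the
             aggregate enforcement approximates the configured rate -->
        <UseEffectiveCount>true</UseEffectiveCount>
    </SpikeArrest>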

Thanks @dchiesa1 

We will look further into these. But just one more item to confirm: part of our use case is putting all other requests in a queue once the limit of 10 concurrent calls is reached. Is this possible in Apigee Edge, and what would the workaround be?

Apigee Edge doesn't do queueing of requests. It's a proxy; it serves *right now*. It either proxies to the upstream or sends back a response (429 over quota, etc.). Right now.

If you want to queue requests, I can suggest that you modify the semantics of your API to just accept the request. Store the request in a queue (like GCP Pub/Sub, a CloudSQL record, or similar), then send back a confirmation to the client, maybe an HTTP 202 with a reference URI to indicate "ok, got the request, you'll have to check back later for status". To make this happen you need a back-end system that supports this architecture.
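
As a sketch of the 202 response in Apigee (the policy name, the jobid flow variable, and the status path are all hypothetical; jobid would be populated earlier in the flow, after enqueueing the request):

    <AssignMessage name="AM-Accepted">
        <Set>
            <StatusCode>202</StatusCode>
            <!-- @jobid# references a hypothetical flow variable -->
            <Payload contentType="application/json"
                     variablePrefix="@" variableSuffix="#">
                {"status": "accepted", "statusUri": "/jobs/@jobid#"}
            </Payload>
        </Set>
        <AssignTo createNew="false" transport="http" type="response"/>
    </AssignMessage>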

One way to implement this is to use Application Integration. Suppose you have a long-running operation, and you want to accept requests, but you don't want the system to hold the connection open until the work is done. So, configure your Apigee proxy to trigger an Integration, then immediately respond to the client with a 202 status. The client can call back later to check status; in the meantime, the integration executes asynchronously with respect to the HTTP request.

You could do something similar with Google Cloud Workflows, though the connector options there are more limited.

Of course, there are other, DIY ways of implementing this. For example, have Apigee call a Cloud Function, which inserts a record into a CloudSQL database and then responds with 202. Some other system, maybe a Cloud Run job, runs on a schedule and processes all of the records in the SQL table. Maybe the job runs every 3 minutes, or every hour. Whatever. And again, the client needs to call back into the API to retrieve the status of the process.
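
A sketch of that Cloud Function might look like the following (the function name, status path, and insertJob helper are hypothetical; insertJob stands in for whatever writes the record to CloudSQL):

    // accept the request, persist it, and defer the real work
    import * as functions from '@google-cloud/functions-framework';
    import { randomUUID } from 'crypto';
    import { insertJob } from './db'; // hypothetical CloudSQL helper

    functions.http('acceptJob', async (req, res) => {
      const jobId = randomUUID();
      await insertJob(jobId, req.body);   // row for the scheduled job to process
      res.status(202).json({              // tell the client to check back later
        status: 'accepted',
        statusUri: `/jobs/${jobId}`,
      });
    });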