We're trying to protect our backends with SpikeArrest policies; Quota isn't precise enough for our needs. We will have 4+ MPs (message processors) in each of 2 datacenters, but our traffic can be swung to a single datacenter at any time. We also have several cases where multiple proxies hit the same backend. We don't want this backend overloaded, but at the same time, we don't want to scale this backend significantly larger than needed on average. Since SpikeArrest doesn't count across MPs or across proxies, how can we protect this backend?
We were looking at using weights and dynamically scaling them based on the current MP count, but I can't find a way to easily get the MP count in this environment.
Is this even a valid use case for SpikeArrest? It feels like there are too many variables (multiple MPs, multiple proxies, multiple client channels). Is there something better? Is there some way to make something reusable across all of our proxies to handle this?
First, have you seen this documentation page?
It discusses and compares the SpikeArrest, Quota, and ConcurrentRateLimit policies.
The approach of using Quota and scaling weights based on the current MP count sounds messy and fragile. You shouldn't need to do that. What was your goal there? I don't understand why you'd want to change the quota weights based on MP count. Quota is distributed; it will work across numerous MPs.
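To illustrate, a distributed Quota policy might look like the sketch below. The policy name, count, and interval here are hypothetical placeholders, not values from this thread:

```xml
<!-- Hypothetical sketch: with Distributed and Synchronous set to true,
     the quota counter is shared across all MPs, so no per-MP weight
     scaling is needed. -->
<Quota name="Quota-ProtectSharedBackend">
    <Allow count="6000"/>
    <Interval>1</Interval>
    <TimeUnit>minute</TimeUnit>
    <!-- Count requests across all message processors -->
    <Distributed>true</Distributed>
    <!-- Update the shared counter on every request rather than asynchronously -->
    <Synchronous>true</Synchronous>
</Quota>
```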
Can you explain this statement:
we don't want to scale this backend significantly larger than needed on average.
Does the backend itself get auto-scaled based on inbound load? If that's the case, why are you limiting requests into it? If it is elastic, why not just enjoy the elasticity?
If the backend is inelastically configured, in other words if its capacity to serve requests does not dynamically expand with the volume of requests it is handling, then... why not just let the backend return 503s when it is too busy? Apigee Edge will receive the 503s and simply relay them to the clients.
If the backend is statically configured and is returning 503s, but continued inbound load from Apigee Edge proxies causes detrimental effects, maybe you can use ConcurrentRateLimit to set a limit across all your proxies. Guess at a limit, and gradually adjust it over the course of days, until consistent load at that limit drives the backend to a reasonable utilization, like 70-75% of CPU. Maybe the backend is not primarily constrained by CPU; maybe it's memory or I/O or network. In that case, use that metric instead.
ConcurrentRateLimit works on the name of the TargetEndpoint, not on the URL of the backend. If you have multiple proxy bundles, use a common name for the TargetEndpoint across them, if the backend server is the same.
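As a sketch of this approach (the connection count, TTL, and names below are hypothetical, and would need tuning as described above), a ConcurrentRatelimit policy attached to the TargetEndpoint might look like:

```xml
<!-- Hypothetical sketch: cap concurrent connections to a shared backend.
     Attach this policy to the TargetEndpoint request, response, and
     DefaultFaultRule flows so the counter is decremented on every outcome. -->
<ConcurrentRatelimit name="CRL-SharedBackend">
    <!-- Start with a guess; tune over days toward ~70-75% backend utilization -->
    <AllowConnections count="50" ttl="5"/>
    <Distributed>true</Distributed>
    <!-- Use the same TargetEndpoint name in every proxy bundle that
         hits this backend, so the limit is shared across proxies -->
    <TargetIdentifier name="SharedBackendTarget"/>
</ConcurrentRatelimit>
```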
----------
There are some other potential options.
Good stuff! But is it a good pattern to create one single proxy per backend, so that all calls to that backend must go through this proxy (proxy chaining)? In that case you could add SpikeArrest or ConcurrentRateLimit on that single proxy. It would also provide good analytics across proxies for that backend, and full control. What do you think? Is it good practice to do that for each backend?
Thank you