SpikeArrest to protect single backend with multiple MPs and proxies


We're trying to protect our backends with SpikeArrest policies; Quota isn't precise enough for our needs. We will have 4+ MPs in each of 2 datacenters, but our traffic can be swung to a single datacenter at any time. We also have several cases where multiple proxies hit the same backend. We don't want this backend overloaded, but at the same time, we don't want to scale this backend significantly larger than needed on average. Since SpikeArrest doesn't count across MPs or across proxies, how can we protect this backend?

We were looking at using weights and dynamically scaling them based on the current MP count, but I can't find a way to easily get the MP count in this environment.

Is this even a valid use case for SpikeArrest? It feels like there are too many variables (multiple MPs, multiple proxies, multiple client channels). Is there something better? Is there some way to make something reusable across all of our proxies to handle this?

1 ACCEPTED SOLUTION

First, Have you seen this documentation page?

It discusses SpikeArrest, Quota, and the ConcurrentRateLimit policies, and compares them.

  • ConcurrentRateLimit is designed to limit the number of concurrent connections OUT from Apigee Edge to a specific backend target. This policy keeps a distributed counter.
  • Quota is designed to limit the number of requests IN to Apigee Edge, by app, or by user. This policy keeps a distributed counter.
  • SpikeArrest is designed to limit the number of INBOUND requests to Apigee Edge on a per second basis. This policy never keeps a distributed counter.
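
For concreteness, here is a minimal SpikeArrest sketch (the policy name and rate are illustrative). Because there is no distributed counter, each MP enforces the rate on its own, which is exactly why the aggregate rate grows with the MP count:

    <SpikeArrest name="SA-ProtectBackend">
        <!-- Illustrative rate. Each MP enforces this independently, so with
             N MPs the effective aggregate rate is roughly N x 100/sec. -->
        <Rate>100ps</Rate>
    </SpikeArrest>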

The approach of using Quota and scaling weights based on the current MP count sounds messy and fragile. You shouldn't need to do that. What was your goal there? I don't understand why you'd want to change the quota weights based on MP count: Quota keeps a distributed counter, so it works correctly across any number of MPs.

Can you explain this statement:

we don't want to scale this backend significantly larger than needed on average.

Does the backend itself get auto-scaled based on inbound load? If that's the case, why are you limiting requests into it? If it is elastic, why not just enjoy the elasticity?

If the backend is inelastically configured, in other words if its capacity to serve requests does not dynamically expand with the volume of requests it is handling, then... why not just let the backend return 503s when it is too busy? Apigee Edge will receive the 503s and relay them to the clients.

If the backend is statically configured and is returning 503s, but continued inbound load from the Apigee Edge proxies causes detrimental effects, maybe you can use ConcurrentRateLimit to set a limit across all your proxies. Guess at a limit, and gradually adjust it over the course of days, until consistent load at that limit drives the backend to a reasonable utilization, like 70-75% of CPU. Maybe the backend is not primarily constrained by CPU; maybe it's memory or I/O or network. In that case, use that metric instead.

ConcurrentRateLimit works on the name of the TargetEndpoint, not on the URL of the backend. If you have multiple proxy bundles, use a common name for the TargetEndpoint across them, if the backend server is the same.
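
As a rough sketch, and assuming the policy is attached on the target request (with the usual companion attachments on the target response and DefaultFaultRule so the counter is decremented), a ConcurrentRatelimit policy might look like this; the name, count, and ttl are illustrative:

    <ConcurrentRatelimit name="CRL-SharedBackend">
        <!-- Allow at most 30 concurrent connections from Edge to this
             target; ttl (seconds) expires stale counter slots. -->
        <AllowConnections count="30" ttl="5"/>
        <Distributed>true</Distributed>
        <!-- The counter is keyed by this identifier; give the
             TargetEndpoint the same name in every proxy bundle that
             fronts this backend, as described above. -->
        <TargetIdentifier name="shared-backend"/>
    </ConcurrentRatelimit>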

----------

There are some other potential options.

  1. Use a Quota policy, specifying an identifier corresponding to the backend server, and attach it in the Target Request flow. This enforces a quota on Apigee Edge itself, rather than on a developer app. (See the first sketch after this list.)
  2. Use TargetServers, and set up a health-monitoring endpoint on the backend service that reports "healthy" when it is OK, and "not healthy" when it wants to receive no more load. In that case Edge will take the target server out of rotation and send it no more requests until it begins to report healthy again. (Edge polls the health-monitoring endpoint periodically; see the second sketch after this list.)
  3. Use proxy chaining: set up a passthrough proxy in front of each backend service, and embed a Quota policy in its proxy request flow. In this case, the path is app -> proxy1 -> proxy2 -> backend. The chained proxy can be very fast, and the call to it is local.
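
For option 1, a minimal sketch, assuming the target.name flow variable as the quota key and an allowance you'd tune for your backend (all numbers illustrative). Keyed this way, every proxy that uses the same TargetEndpoint name draws from the same distributed counter:

    <Quota name="Q-SharedBackend">
        <!-- Key the counter by TargetEndpoint name rather than by app,
             so all proxies sharing this target share one counter. -->
        <Identifier ref="target.name"/>
        <Allow count="6000"/>
        <Interval>1</Interval>
        <TimeUnit>minute</TimeUnit>
        <Distributed>true</Distributed>
        <Synchronous>false</Synchronous>
    </Quota>

Attach this in the Target Request flow, as noted above.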
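
For option 2, a rough sketch of a TargetEndpoint using named TargetServers with an HTTPMonitor-based HealthMonitor; the server names, polling interval, and /healthz path are assumptions for illustration:

    <TargetEndpoint name="shared-backend">
        <HTTPTargetConnection>
            <LoadBalancer>
                <Server name="backend-1"/>
                <Server name="backend-2"/>
            </LoadBalancer>
            <Path>/api</Path>
            <HealthMonitor>
                <IsEnabled>true</IsEnabled>
                <!-- Edge polls each server at this interval. -->
                <IntervalInSec>5</IntervalInSec>
                <HTTPMonitor>
                    <Request>
                        <Verb>GET</Verb>
                        <Path>/healthz</Path>
                    </Request>
                    <!-- A non-200 takes the server out of rotation until
                         it reports healthy again. -->
                    <SuccessResponse>
                        <ResponseCode>200</ResponseCode>
                    </SuccessResponse>
                </HTTPMonitor>
            </HealthMonitor>
        </HTTPTargetConnection>
    </TargetEndpoint>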

----------

Good stuff! But is it a good pattern to create a single proxy per backend, so that all calls to that backend have to go through this proxy (proxy chaining)? In this case you could add SpikeArrest or ConcurrentRateLimit on that single proxy. It would also give you good cross-proxy analytics for that backend, and full control. What do you think? Is it good practice to do that for each backend?

Thank you