Is SpikeArrest global to ALL environments, or is i...

itsechimera · 11-30-2018 11:09 AM

I have 6 environments within my Apigee implementation that point to various back-end proxies within my network so that we can maintain our test environments etc. We have 4 message processors (MPs).

I have set a spike arrest on our OAuth proxy of 15,000 per minute, which is pretty high.

A user in ONE of our environments (test) hit the spike arrest, but according to the traffic I see in our logs, that environment received nowhere near enough requests to hit the spike arrest on its own.

This leads me to believe that the spike arrest is aggregating traffic from all environments in its algorithm, and that the arrest applies to the proxy itself regardless of which environment the traffic hit.

Is this the case? If so, what is recommended? We periodically run performance testing and large automated test suites in the lower environments and can't have this traffic affecting our Prod environment. I feel like raising the arrest rate and configuring it to behave differently across MPs isn't a good solution; I would prefer my Prod environment be totally isolated.

Second question: how would I get this data in a report or via an API call? I can't find these 429 responses anywhere in my Apigee dashboards.

dknezic

Please read the documentation on how the spike arrest works.

https://docs.apigee.com/api-platform/reference/policies/spike-arrest-policy

In this instance, the spike arrest will not wait for 15,000 requests but instead will raise a fault if it receives more than 1 request in a 0.004-second (4 ms) window.
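The arithmetic behind that 0.004-second figure can be sketched as follows. This is only the division described above (60 seconds divided by the per-minute rate), not Apigee's actual implementation; the function name is made up for illustration.

```python
def smoothed_interval_seconds(rate_per_minute: int) -> float:
    """Minimum spacing between allowed requests for a per-minute rate.

    Hypothetical helper: it only reproduces the arithmetic in the reply,
    i.e. 60 seconds divided by the configured per-minute rate.
    """
    return 60.0 / rate_per_minute


interval = smoothed_interval_seconds(15_000)
print(f"{interval * 1000:.0f} ms between requests")
```

So under this reading, two requests that arrive within 4 ms of each other can trigger a 429 even when the total for the minute is far below 15,000.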

itsechimera

I have read the documentation and I find no reference to "environments" in it, which is what I am trying to ask about.

I understand the smoothing. Even taking that into account, the successful traffic I see in my environment does not support the theory that two calls came in nearly simultaneously like this. I guess it's possible, but Apigee provides me with basically no way to determine this. I have also scoured the logs using their APIs for this time frame and found nothing to support the theory that we really did hit the smoothed spike arrest at the sub-second level.

Again, my theory is that the spike arrest is global to the entire proxy regardless of which target environment hits the limit. The Test environment gets a call at 12:00:000 and Prod gets one at 12:00:003, then Prod gets blocked. This would make much more sense given what I am seeing: the policy aggregating all traffic across the 6 environments that each proxy can route to in order to determine when to limit requests.
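To make the two possibilities concrete, here is a toy model of the scenario above (Test at 0 ms, Prod at 3 ms). It is an assumption-laden sketch, not Apigee's algorithm: it simply rejects any request that arrives less than 4 ms after the last allowed request for the same key, where the key is either one shared "proxy" bucket or the environment name. All names are hypothetical.

```python
from typing import Dict, List, Tuple

INTERVAL = 60.0 / 15_000  # 0.004 s, from the 15,000-per-minute rate


def simulate(requests: List[Tuple[float, str]], shared: bool) -> List[bool]:
    """Return allow/deny per request under a toy smoothing model.

    shared=True  -> one bucket for the whole proxy (the theory in this post)
    shared=False -> one bucket per environment
    """
    last_allowed: Dict[str, float] = {}
    results = []
    for ts, env in sorted(requests):
        key = "proxy" if shared else env
        prev = last_allowed.get(key)
        allowed = prev is None or ts - prev >= INTERVAL
        if allowed:
            last_allowed[key] = ts
        results.append(allowed)
    return results


requests = [(0.000, "test"), (0.003, "prod")]
print(simulate(requests, shared=True))   # [True, False]
print(simulate(requests, shared=False))  # [True, True]
```

If the bucket were shared, the Prod call would be rejected; with per-environment buckets, both calls pass. Which behavior a given deployment shows can only be settled by testing against it.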

dknezic

The spike arrest rate isn't global. You may want to test this with a smaller rate.

Also where are you getting the timestamps from?

itsechimera

Is there any way to pull logs from Apigee that would elucidate the issue? I am only able to see 1 of these kinds of responses by querying Apigee logs via their API, and that isn't the one I am asking about (the one I can see is legitimate, on a target with a lower arrest rate that is commonly used asynchronously, so the arrest makes sense there).

I know the date and roughly what time a client reported hitting the spike arrest. I have transactional logging on my applications where I can see the traffic, which is where I am getting what timestamps I can. A 429 would never have gotten that far, so all I can do is assess the successful traffic I see there and try to correlate it with logging from Apigee, but as I have said, I can't find any useful logging on the event from Apigee.
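One avenue that might surface the 429s per environment is the Edge analytics metrics resource (`/stats/{dimension}`), grouping message counts by `response_status_code`. The sketch below only builds the request URL; the base URL, organization and environment names, and time range are placeholders, and whether this resource is available and records spike arrest faults in a given installation is an assumption to verify.

```python
from urllib.parse import quote, urlencode


def build_status_code_stats_url(base_url: str, org: str, env: str,
                                time_range: str) -> str:
    """Build a metrics query that groups message counts by response status.

    All arguments are placeholders for illustration. time_range is assumed
    to use the "MM/DD/YYYY HH:MM~MM/DD/YYYY HH:MM" form.
    """
    query = urlencode(
        {
            "select": "sum(message_count)",
            "timeRange": time_range,
            "timeUnit": "minute",
        },
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return (f"{base_url}/v1/organizations/{org}/environments/{env}"
            f"/stats/response_status_code?{query}")


url = build_status_code_stats_url(
    "https://api.enterprise.apigee.com",  # assumed host; differs on-premises
    "my-org", "test", "11/30/2018 00:00~12/01/2018 00:00",
)
print(url)
```

Running the same query once per environment and comparing the per-minute 429 counts would show directly whether the rejections line up with traffic in another environment.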

Is SpikeArrest global to ALL environments, or is it environment specific?