What do percentile queries (99th, 95th) return?

rymonroe
Participant II

To elaborate on my confusing subject line, does a query like...

/stats/apis?t=agg_percentile&select=percentile(total_response_time,99)&timeRange=4/10/2020+00:48:26~4/11/2020+00:48:26&timeUnit=century&limit=1440

return the average 99th percentile response time over the entire time frame or does it return some sort of max?

When I run the above query, it seems to return some sort of max 99th percentile from one slice of the original time range rather than the average over the entire time range. Thanks.
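For anyone who wants to try the same query with a different timeUnit, here is a rough Python sketch that rebuilds the query string above. The function name `build_percentile_query` and its parameters are made up for illustration; only the path and query parameters come from the query in this post, and the host and org/environment prefix are left out on purpose.

```python
from urllib.parse import urlencode

def build_percentile_query(pct, time_range, time_unit, limit=1440):
    """Return the /stats/apis path and query string for a percentile query.

    Hypothetical helper: it only assembles the string shown in the post.
    """
    params = {
        "t": "agg_percentile",
        "select": f"percentile(total_response_time,{pct})",
        "timeRange": time_range,
        "timeUnit": time_unit,
        "limit": limit,
    }
    # Keep '/', ':', '~', '(', ')' and ',' unescaped so the result matches
    # the hand-written query; spaces are encoded as '+'.
    return "/stats/apis?" + urlencode(params, safe="/:~(),")

query = build_percentile_query(
    99, "4/10/2020 00:48:26~4/11/2020 00:48:26", "century"
)
print(query)
```

Swapping `"century"` for `"minute"` or `"hour"` gives the per-interval variants discussed in the replies.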

5 REPLIES

99th percentile

I don't know what you're thinking when you suggest "some sort of max".

You might be clear on this, but let's just state it for completeness:

The 99th percentile is the time threshold under which 99% of calls return.

Said another way, 1% of calls take longer than the 99th percentile.


99th pct is not the same as a maximum. For a given interval, the maximum response time might be 55000ms (55s), while the 99th percentile could be 2786ms.
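To make the difference concrete, here is a small Python sketch using the nearest-rank definition of a percentile. That definition is an assumption; this thread does not say which method Apigee uses. The sample data is invented so that the numbers line up with the 2786ms and 55000ms figures above.

```python
import math

def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct% of the samples are <= it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Invented sample of 1000 calls: mostly fast, a few slow, one extreme outlier.
response_times_ms = [120] * 989 + [2786] * 10 + [55000]

p99 = percentile_nearest_rank(response_times_ms, 99)
maximum = max(response_times_ms)
print(p99, maximum)  # 2786 55000
```

One very slow call sets the maximum, but it takes roughly 1% of calls being slow to move the 99th percentile.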

As for your timeUnit of century, I'm not sure what that would do. I don't know if the system computes percentiles over such a time unit. It may? I can't imagine it would be of much use.

Most people use the percentile to look at a minute-by-minute or hour-by-hour variation.

I guess daily variation might also be interesting.

But... by century?

If you want an average, then you probably want percentile(total_response_time,50)

50th percentile, not 99th.

50th percentile is the median, not the mean.
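A quick illustration of median versus mean in Python, with made-up response times. A single slow call pulls the mean up but leaves the median alone:

```python
from statistics import mean, median

# Invented response times in ms; one slow call skews the mean.
response_times_ms = [100, 110, 120, 130, 5000]

p50 = median(response_times_ms)
avg = mean(response_times_ms)
print(p50, avg)  # 120 1092
```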

Thanks for the response. I'm using a separate query for avg response time that works just fine.

I'm using century because I only want 1 data point returned; my time range covers only 24 hours but spans 2 calendar days. I guess I could have used week instead of century. When I execute a query similar to the one above but with timeUnit set to minute, it works just fine.

This will be used as a daily report to spot potential issues with backend services before the customer complains.

I understand the difference between max and percentiles. What I expect is that if, for example, a proxy received 10,000 calls in the last 24 hours, Apigee would look at the slowest 1% (100) of calls across the entire time range and return the response time at that cutoff.

What seems to be happening is that Apigee splits the time range into 15-minute slices and computes the 99th percentile in each slice. It then returns the max of these per-slice 99th percentiles, which is way higher than the actual value.

Hmmm...

It's possible Apigee may compute the 99th percentile numbers over a fixed time window; you may not get to choose. In other words, it's possible that it is always calculated over 15 minutes.

Let me see if I can get someone to confirm this.

Thanks. I'm curious what that max is.

What I ended up doing is running the same query but with the time unit set to minute and just averaging the values. Works just fine.
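A minimal sketch of that workaround in Python, assuming the per-minute values have already been pulled out of the response into a plain list. The numbers are invented, and response parsing is left out because the response shape isn't shown in this thread.

```python
from statistics import fmean

# Hypothetical per-minute p99 values in ms, one entry per returned data point.
per_minute_p99_ms = [210, 190, 205, 2786, 198, 202]

daily_figure = fmean(per_minute_p99_ms)  # the averaged number for the report
worst_minute = max(per_minute_p99_ms)    # the spike that averaging dilutes
print(round(daily_figure, 1), worst_minute)  # 631.8 2786
```

An average of per-minute percentiles is a summary of those data points, not the 99th percentile of all calls in the day, so for a daily report it may be worth printing the worst minute next to it.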

Hi Dino/Ryan,

We compute the percentiles (p50, p95, p99) over a 5-minute time window. When the user queries for a time range with a time unit parameter (usually minute), we select the max value within each time bucket defined by that time unit.

Please let me know if you have further questions. Thanks.
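The behavior described in this reply can be sketched in Python as below. This is only an illustration under assumptions: the nearest-rank percentile method, the function names, and the traffic data are all made up, and the real pipeline may differ in its details. It shows why a single large bucket (such as timeUnit=century over 24 hours) can report a value far above the 99th percentile of all calls in the range.

```python
import math
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # 5-minute window, as stated in the reply above

def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct% of the samples are <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def max_of_window_percentiles(samples, pct, window_seconds=WINDOW_SECONDS):
    """samples: iterable of (epoch_seconds, response_time_ms) pairs.

    Computes the percentile inside each fixed window, then returns the
    largest of those per-window values for the whole bucket.
    """
    windows = defaultdict(list)
    for ts, latency_ms in samples:
        windows[ts // window_seconds].append(latency_ms)
    return max(percentile_nearest_rank(v, pct) for v in windows.values())

# Invented day of traffic: 288 windows of 100 calls at 100 ms each,
# except one window in which 5 of the 100 calls take 3000 ms.
samples = []
for window in range(288):
    for call in range(100):
        ts = window * WINDOW_SECONDS + call
        slow = window == 42 and call < 5
        samples.append((ts, 3000 if slow else 100))

overall_p99 = percentile_nearest_rank([ms for _, ms in samples], 99)
reported = max_of_window_percentiles(samples, 99)
print(overall_p99, reported)  # 100 3000
```

Five slow calls out of 28,800 barely register across the whole day, but they dominate the one 5-minute window they fall in, and the max picks that window.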