WTF, why don't my stats queries return the same counts?

This recently came up in an internal Hangouts chat discussion among the technical team at Apigee. One of the team, @Miren, asked: hey, why are my stats queries returning different numbers if I sum by minute, versus by hour (or day)?

This illustrates the problem:

$ orgname=myorg
$ apiname=myapi
$ curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=minute&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0" | jq '[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | reduce .[] as $item (0; . + $item)'
31493
$ curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=hour&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0" | jq '[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | reduce .[] as $item (0; . + $item)'
37853

What's going on there? Well, first, the main command is a curl command against the Admin API for Apigee Edge. (Notice that the curl command uses -n, not -u. Good hygiene!) Basically it's querying the API call volume for a given API proxy, summing by minute. The curl command returns a big JSON payload, which gets piped through the jq utility, which is "like sed for JSON". If you use JSON and curl, having jq in your toolbox is probably a really good idea.
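
If you haven't used -n before: it tells curl to read credentials from your ~/.netrc file, rather than passing them on the command line where they would end up in your shell history. A minimal .netrc entry for the Edge management server looks like this (the login and password here are made-up placeholders):

machine api.enterprise.apigee.com
  login me@example.com
  password Secret123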

jq accepts its own query language, and in this case, with this query:

'[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | reduce .[] as $item (0; . + $item)'

... we're basically telling jq to sum all the values that Apigee Edge returned, and print the result.
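
Breaking that filter down, with jq's # comments (this is the same program as above, just annotated):

[.environments[0].dimensions[0].metrics[0].values[].value  # each bucket's count, as a string
  | tonumber]                                              # convert to numbers, collect into an array
| reduce .[] as $item (0; . + $item)                       # fold the array into a running sum, starting at 0

Incidentally, jq has a built-in add filter that sums an array, so the reduce could be shortened to just ... | add. Same answer either way.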

And the results show that when we ask Apigee Edge to first give us the counts by minute, and then sum THAT, we get a different result than if we ask Apigee Edge to give us the counts by hour, and sum those buckets.

The first command asks for the API call volume count, bucketed by _minute_. There are ... let's see, 24 × 60 = 1440 minutes in a day. So we would expect 1440 items in the array from that API call. Let's check that; jq has a length filter for that.

$ curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=minute&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0" | jq '[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | length'
1000

Whoa. What's up with that?

Well, it turns out that the stats query applies a default limit of 1000 records. So if you query for a full day, by minute, you will not get all the data; the query results are silently truncated. But according to the documentation, you can specify a limit query param. So if I specify 2000, what happens?

$ curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=minute&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0&limit=2000" | jq '[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | length'
1438

Hmmm, why 1438 and not 1440? Because there were two 1-minute buckets during that day in which no inbound requests were received.
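
If you want to verify that, each entry in the values array also carries a timestamp (epoch milliseconds, at least in the responses I've seen), and jq can render those as human-readable times so you can eyeball the gaps:

$ curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=minute&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0&limit=2000" | jq '.environments[0].dimensions[0].metrics[0].values[].timestamp / 1000 | floor | todate'

That prints one ISO-8601 timestamp per returned bucket; the two missing minutes are the ones that never show up.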

OK, so then what happens if I use the limit param in the command that uses jq to sum, rather than just count? As expected, we get consistent results:

$ curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=minute&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0&limit=2000" | jq '[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | reduce .[] as $item (0; . + $item)'
37853

Cool!
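
For the record, here's how I might wrap this up in a little shell function so the limit always gets passed explicitly. (sum_stats is just a name I made up; it assumes orgname and apiname are set as above, and uses jq's add in place of the reduce.)

sum_stats() {
  # $1 is the timeUnit: minute, hour, or day.
  # Always pass an explicit limit so the results are never silently truncated.
  local unit=$1
  curl -s -n "https://api.enterprise.apigee.com/v1/o/$orgname/environments/test/stats/apiproxy?select=sum(message_count)&timeRange=4/4/2017+00:00~4/5/2017+00:00&timeUnit=$unit&filter=(apiproxy+eq+%27$apiname%27)+and+(response_status_code+eq+200)&accuracy=0&limit=2000" |
    jq '[.environments[0].dimensions[0].metrics[0].values[].value | tonumber] | add'
}

With that in place, sum_stats minute and sum_stats hour should both report the same 37853 we saw above.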

Thanks to Sriram Padmanabhan for the clue about the limit. Also, I filed a doc bug asking that the text be amended to state that the default limit is 1000.
