Querying analytics APIs in Apigee Edge


As you probably already know, once you have set up your APIs in Apigee Edge, we collect information about your traffic and API usage.

You can explore this data in our UI with the out-of-the-box dashboards and custom reports, but sometimes you want to feed it into a customized pipeline of your own.

This is a very short tutorial on how to use our analytics APIs to get your data and use it in your own processes.

Let's start with an imaginary organization in Apigee Edge called “palanqueta”. This means my API calls will look similar to this one:

https://api.enterprise.apigee.com/v1/organizations/palanqueta/apiproducts

Some context first. When you query analytics data, the starting point is not the object you want data about, such as a product or a developer, but the environment where the calls were handled. All analytics data in your organization is scoped by environment, and that is where everything begins.

When building the URL for your query, you will always start with a string like this:

api.enterprise.apigee.com/v1/organizations/{org}/environments/{env}

In my case, I want to use my production environment called “prod”, so the starting point will always be:

api.enterprise.apigee.com/v1/organizations/palanqueta/environments/prod

Analytics data can be seen as a resource on your environments, a resource called “stats”:

api.enterprise.apigee.com/v1/organizations/palanqueta/environments/prod/stats

Next comes the list of dimensions I want to use:

api.enterprise.apigee.com/v1/organizations/palanqueta/environments/prod/stats/developer,developer_app

Dimensions work very similarly to a GROUP BY in an SQL query. In fact, in the response, you will see that your data is grouped by the values of your dimensions.
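To make the URL structure described so far concrete, here is a minimal Python sketch that builds the stats path from an organization, an environment and a list of dimensions. The function name stats_url and the API_ROOT constant are my own names for illustration and are not part of any Apigee client library.

```python
# Base of the management API, as used in the URLs in this article.
API_ROOT = "https://api.enterprise.apigee.com/v1"


def stats_url(org, env, dimensions):
    """Return the stats resource URL for one environment.

    The dimensions are joined with commas; they play the role of a
    GROUP BY list for the query.
    """
    dimension_list = ",".join(dimensions)
    return f"{API_ROOT}/organizations/{org}/environments/{env}/stats/{dimension_list}"


print(stats_url("palanqueta", "prod", ["developer", "developer_app"]))
```

Keeping the dimensions as a Python list makes it easy to add or remove a grouping without editing the URL by hand.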

Now I will add the metrics. This is done in the query string using the “select” parameter:

…/palanqueta/environments/prod/stats/developer,developer_app?select=sum(message_count)

The “select” parameter works very much like a SELECT in SQL. Note that you don’t need to include the fields already listed as dimensions: “select” is used only for metrics, not dimensional data.

Metric fields are aggregated by the groups defined in the dimension list. The API supports the following aggregate functions: SUM, AVG, MAX, MIN.
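As a small illustration, a helper can build the value of “select” and reject any aggregate function outside the four listed above. The names metric and select_param are hypothetical, and joining several metrics with commas is an assumption on my part, since the example in this article uses a single metric.

```python
# The four aggregate functions named in the article.
AGGREGATES = {"sum", "avg", "max", "min"}


def metric(function, field):
    """Return an aggregate expression such as 'sum(message_count)'."""
    fn = function.lower()
    if fn not in AGGREGATES:
        raise ValueError(f"unsupported aggregate function: {function}")
    return f"{fn}({field})"


def select_param(*metrics):
    """Join metric expressions into one value for 'select'.

    Assumption: several metrics are separated by commas.
    """
    return ",".join(metrics)


print(select_param(metric("SUM", "message_count")))
```

Validating the function name locally gives a clear error before any request is sent.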

I like to add a “sortby” parameter, similar to an ORDER BY in SQL, together with a “sort” parameter set to DESC or ASC. Both are optional, but they are quite useful. Any field you use in “sortby” must also appear in the “select” parameter. Also note that “sortby” takes metrics, not dimensions:

…/stats/developer,developer_app?select=sum(message_count)&sort=DESC&sortby=sum(message_count)
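The rule that the sort field must also be selected is easy to check in code before sending the request. Below is a sketch using Python's standard urllib.parse; the function name sorted_query is hypothetical, and splitting “select” on commas assumes that is how multiple metrics are separated.

```python
from urllib.parse import urlencode


def sorted_query(select, sortby, sort="DESC"):
    """Build the select/sort/sortby part of the query string."""
    if sort not in ("ASC", "DESC"):
        raise ValueError("sort must be ASC or DESC")
    # The field used for sorting has to be one of the selected metrics.
    if sortby not in select.split(","):
        raise ValueError(f"{sortby} is not in the select list")
    params = {"select": select, "sort": sort, "sortby": sortby}
    # Keep parentheses and commas readable instead of percent-encoding them.
    return urlencode(params, safe="(),")


print(sorted_query("sum(message_count)", "sum(message_count)"))
```

The safe argument only affects readability here: it keeps the output in the same form as the URLs shown in this article.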

The query is almost done. Analytics data is time-series based, so I need to specify the time range I want to use:

…?select=sum(message_count)&sort=DESC&sortby=sum(message_count)&timeRange=03%2F21%2F2015+22:00:00~03%2F23%2F2015+22:00:00

I also specify the time unit used to aggregate the data, which works as a window size:

…?select=sum(message_count)&sort=DESC&sortby=sum(message_count)&timeRange=03%2F21%2F2015+22:00:00~03%2F23%2F2015+22:00:00&timeUnit=hour
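Writing the encoded time range by hand is error-prone, so here is a sketch that produces it from two datetime values. The MM/DD/YYYY HH:MM:SS format is inferred from the example above, the function name time_range is hypothetical, and how the API interprets the time zone of these values is not covered in this article, so the datetimes are left naive.

```python
from datetime import datetime
from urllib.parse import quote_plus

# Format inferred from the example in the article (month/day/year).
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def time_range(start, end):
    """Return the encoded value for the timeRange parameter."""
    if end <= start:
        raise ValueError("end must be later than start")
    raw = f"{start.strftime(TIME_FORMAT)}~{end.strftime(TIME_FORMAT)}"
    # Slashes become %2F and spaces become '+'; colons and '~' are kept.
    return quote_plus(raw, safe=":~")


start = datetime(2015, 3, 21, 22, 0, 0)
end = datetime(2015, 3, 23, 22, 0, 0)
print(f"timeRange={time_range(start, end)}&timeUnit=hour")
```

With the dates above, the output matches the timeRange and timeUnit portion of the query shown earlier.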

If I want to filter my results, I use a “filter” parameter, similar to a WHERE in SQL:

…&filter=(api_product+eq+'premium')

This restricts the query to calls made in the context of the “premium” product.
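The filter expression can be built and encoded the same way. This sketch covers only the “eq” comparison shown above; the helper names eq_filter and encode_filter are hypothetical, and since this article does not say how a quote inside a value should be escaped, the sketch simply refuses such values.

```python
from urllib.parse import quote_plus


def eq_filter(dimension, value):
    """Return an equality filter such as (api_product eq 'premium')."""
    if "'" in value:
        # Escaping rules are not covered here, so refuse instead of guessing.
        raise ValueError("values containing a single quote are not supported")
    return f"({dimension} eq '{value}')"


def encode_filter(expression):
    """Encode spaces as '+' while keeping parentheses and plain quotes."""
    return quote_plus(expression, safe="()'")


print("filter=" + encode_filter(eq_filter("api_product", "premium")))
```

Note that the value is wrapped in plain ASCII single quotes; typographic quotes pasted from a formatted page are different characters and would be percent-encoded by this code.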

Finally, I like to add some extra parameters to make sure the data and formats are as I want them:

…&tsAscending=true&_optimized=js&limit=14400

tsAscending forces timestamps to be returned in ascending order. If you are using the “sortby” parameter with sort=DESC, forcing the timestamps to be ascending is a good idea.

_optimized makes the JSON in the response more compact and less verbose.

limit acts like an SQL LIMIT internally. It affects the internal number of rows in the query, not necessarily the final result. I recommend using a number between 10000 and 20000.
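Putting the parameters from this walkthrough together, a complete query URL could be assembled as in the sketch below. The function name build_stats_query and the parameter order are my own choices for illustration, so the result is not necessarily character-for-character identical to any other query in this article.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.enterprise.apigee.com/v1"
# Characters left unencoded so the URL stays readable.
SAFE = "(),:~'"


def build_stats_query(org, env, dimensions, params):
    """Return a full stats URL from dimensions and query parameters."""
    path = f"{API_ROOT}/organizations/{org}/environments/{env}/stats/{','.join(dimensions)}"
    return f"{path}?{urlencode(params, safe=SAFE)}"


params = {
    "select": "sum(message_count)",
    "sort": "DESC",
    "sortby": "sum(message_count)",
    "timeRange": "03/21/2015 22:00:00~03/23/2015 22:00:00",
    "timeUnit": "hour",
    "filter": "(api_product eq 'premium')",
    "tsAscending": "true",   # timestamps in ascending order
    "_optimized": "js",      # more compact JSON
    "limit": 14400,          # internal row limit
}

url = build_stats_query("palanqueta", "prod", ["developer", "developer_app"], params)
print(url)
```

Keeping the parameters in a dictionary means a pipeline can change the time range or the filter per run without touching the rest of the query.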

The final query will be:

https://api.enterprise.apigee.com/v1/organizations/palanqueta/environments/prod/stats/developer,deve...

And that’s it, enjoy your data!

Questions and comments are more than welcome.

Comments
Not applicable

It works OK from a web browser but 303s to the /login page when I curl it with Basic authorization in the header.

What authorization mechanism do I have to use to curl such a query?

Not applicable

I'm working on an on-prem stack.

Not applicable

/environments/prod/stats/api_product?limit=14400&select=avg(total_response_time)&sort=ASC&sortby=avg(total_response_time)&timeRange=06%2F24%2F2017+00:00:00~06%2F24%2F2017+16:00:00&timeUnit=minute

Can the data that the stats spit out be more granular than on the hour? Every minute??

I've tried changing the timeRange to less than 24 hours and timeUnit to "minute". It still returns data on the hour.

baselalrefaie
Bronze 3

Hi,

What will the results look like, and how can we use them in another tool?

Last update: 04-08-2015 08:02 PM