Forming an API Monitoring Strategy - Where to Start  

Dom Couldwell created · Feb 04, 2016 at 05:52 PM · 4k Views · edited · Aug 10, 2016 at 03:49 PM

Monitoring your API's health is key to maintaining a trusted, reliable, and robust API program, and to quickly identifying and resolving issues. You can monitor both the proxy and underlying target endpoints.

When designing your API, consider how to monitor it in a lightweight and maintainable fashion. Also, think about what role the API may take in monitoring the health of the underlying targets.

This article outlines the approach that the Customer Success team at Apigee takes when helping customers form a monitoring strategy.

Starting out

Ask yourself the following questions when you start to think about API Monitoring:

  • What are the requirements for monitoring API health?
    • Is a simple ping enough?
    • Are there certain resources that are critical?
    • How deep does the monitoring need to be?
  • Are there requirements for monitoring the target endpoint health through the API?
    • Are you looking to monitor both proxy and target health? Differentiating between proxy health vs target health can be key when diagnosing issues in production.
  • Which environments need monitoring in place?
    • Production is the obvious one, but it can be just as important to monitor alpha, beta, and dev integration environments.

Requirements

The main objectives of a monitoring strategy are:

  • Defining various request/response patterns that touch as many components as possible to test the health of the overall system.
  • Defining an external system that can execute these requests reliably and consistently. Ideally, this system should be able to run requests from multiple data centres around the world.

A general best practice consists of:

  • Designing various specialised cheap-to-execute requests that monitor the health of target components and connectivity between the proxy and the target endpoint.
  • Using a selection of real API resources to assess the health of individual proxy components.

Common patterns

The following are some examples of resources we commonly use to fulfill the above requirements. The patterns described in this article are:

  • Ping sub resource -- A specialised sub resource exposed by the proxy to test proxy network connectivity and proxy deployment status.
  • Status Resource -- A specialised resource to test proxy-to-target network connectivity and assess target API health.
  • Using Real Requests -- Using the existing API resources to check the health of the system.

Ping sub resource

This is a specialised sub resource exposed by the proxy to test proxy network connectivity and proxy deployment status. The proxy does not hit any target APIs in this scenario.

Although it could be implemented as a first-class resource, we recommend implementing it as a sub resource. That way, each API proxy bundle is instrumented with its own independent monitoring capability.

Example

Here is an example implementation:

Example Request

GET /customer/v1/ping
Accept: application/json

Example Response

HTTP/1.1 200 OK
Content-Type: application/json

{
    "environment": "prod",
    "clientIp": "100.10.1.0",
    "api": "customer-v1",
    "verb": "GET",
    "responseTime": 20,
    "message": "pong"
}
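
On the monitoring side, a response like the one above can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original approach: the function name `evaluate_ping`, its parameters, and the assumption that `responseTime` is in milliseconds are all illustrative.

```python
import json


def evaluate_ping(status_code, body_text, expected_api, expected_environment,
                  max_response_ms=1000):
    """Return a list of problems found in a ping response (empty means healthy).

    Hypothetical helper: the field names follow the example response, and
    responseTime is assumed to be in milliseconds.
    """
    if status_code != 200:
        return [f"unexpected HTTP status {status_code}"]
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return ["response body is not valid JSON"]
    if not isinstance(body, dict):
        return ["response body is not a JSON object"]

    problems = []
    if body.get("message") != "pong":
        problems.append("message is not 'pong'")
    if body.get("api") != expected_api:
        problems.append(f"unexpected api: {body.get('api')!r}")
    if body.get("environment") != expected_environment:
        problems.append(f"unexpected environment: {body.get('environment')!r}")
    response_time = body.get("responseTime")
    if not isinstance(response_time, (int, float)) or response_time > max_response_ms:
        problems.append(f"responseTime missing or above {max_response_ms} ms")
    return problems
```

Checking the `api` and `environment` fields as well as the status code helps catch the case where the request reached a proxy, but not the deployment you meant to test.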

Status resource

This is a specialised resource to test proxy-to-target network connectivity and assess target API health. It is exposed by both the proxy and target APIs, as follows:

  1. A client request hits a proxy /status endpoint.
  2. In turn, the proxy hits the status (or health) endpoints exposed by each target API -- see below.

    Status endpoints for target APIs and components will need to do all internal testing necessary to report the health of that component.

  3. Apigee responds in keeping with target responses, as follows:
    • If all targets respond with success, Apigee responds with 200 OK. The response includes an array of objects containing health and timing information for each target system.
    • If at least one target returns failure, Apigee responds with 500 Internal Server Error. The response includes an array of JSON objects containing health and timing information for target systems. It is important for the status resource to respond as soon as it understands that a particular target system is failing. In other words, if one system is failing, it shouldn't wait until all systems respond.
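
The aggregation rules above can be sketched in Python as follows. This is an illustration of the logic only, not how an Apigee proxy implements it: the names `check_target` and `aggregate_status` are hypothetical, each target check is assumed to be a callable that raises an exception on failure, and the sketch needs Python 3.9 or later for `cancel_futures`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def check_target(name, component, probe):
    """Run one probe and return health and timing information for it."""
    started = time.monotonic()
    try:
        probe()  # assumption: the probe raises an exception on failure
        status, detail = "ok", ""
    except Exception as exc:
        status, detail = "failure", str(exc)
    elapsed_ms = int((time.monotonic() - started) * 1000)
    return {
        "name": name,
        "component": component,
        "targetResponseTime": elapsed_ms,
        "status": status,
        "response": detail,
    }


def aggregate_status(api_name, probes):
    """Check all targets concurrently and return (http_status, entries).

    probes maps a component name to its probe callable. The function
    returns 500 as soon as one target reports failure, without waiting
    for the remaining targets.
    """
    executor = ThreadPoolExecutor(max_workers=max(1, len(probes)))
    futures = [
        executor.submit(check_target, api_name, component, probe)
        for component, probe in probes.items()
    ]
    entries = []
    try:
        for future in as_completed(futures):
            entry = future.result()
            entries.append(entry)
            if entry["status"] != "ok":
                return 500, entries  # fail fast on the first failure
        return 200, entries
    finally:
        # Do not block on probes that are still running.
        executor.shutdown(wait=False, cancel_futures=True)
```

Because of the fail-fast rule, the list returned with a 500 only contains the targets that had answered by the time the first failure was seen.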

Example

Example request

GET /customer/v1/status
Accept: application/json

Example success response

HTTP/1.1 200 OK
Content-Type: application/json

[
    {
        "name": "customer-v1",
        "component" : "crm",
        "targetResponseTime": 350,
        "status": "ok",
        "response": ""
    },
    {
        "name": "customer-v1",
        "component" : "loyalty",
        "targetResponseTime": 500,
        "status": "ok",
        "response": ""
    }
]

Example failure response

HTTP/1.1 500 Internal Server Error
Content-Type: application/json

[
    {
        "name": "customer-v1",
        "component" : "crm",
        "targetResponseTime": 600,
        "status": "failure",
        "response": "unable to connect to customer database"
    },
    {
        "name": "customer-v1",
        "component" : "loyalty",
        "targetResponseTime": 500,
        "status": "ok",
        "response": ""
    }
]

Implementing this resource will teach you the quickest and cheapest way to check each target system's health.

Target considerations

  • If a target API already exposes a status/health endpoint, use that.
  • If a target API cannot expose a status endpoint (for example, because the API is external and not in the team's control), you can use a simple (and cheap) GET request.
  • If the target system is not an API (for example, it is a database), the target system will need to expose commands specific to the system to monitor general connectivity between Apigee and this component. For example, MongoDB has a db.serverStatus() command that returns quickly and does not impact MongoDB performance. The proxy /status endpoint can execute db.serverStatus() on MongoDB to report its status.
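
For a target without a status endpoint, the "simple (and cheap) GET request" could be wrapped roughly as follows. This is a Python sketch using only the standard library. The function name `probe_http_target`, the two-second default timeout, and the idea of reporting only a generic detail string are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def probe_http_target(name, component, url, timeout_seconds=2.0):
    """Issue one GET against a target and report its health and timing.

    Hypothetical helper: any 2xx response counts as healthy. The detail
    string is kept generic so that internal hostnames do not leak.
    """
    started = time.monotonic()
    try:
        # urlopen raises HTTPError for 4xx/5xx responses, so reaching the
        # body of the with-block means the target answered successfully.
        with urllib.request.urlopen(url, timeout=timeout_seconds):
            status, detail = "ok", ""
    except urllib.error.HTTPError as exc:
        status, detail = "failure", f"unexpected status {exc.code}"
    except OSError:
        # Covers connection errors, DNS failures, and timeouts.
        status, detail = "failure", "unable to connect"
    elapsed_ms = int((time.monotonic() - started) * 1000)
    return {
        "name": name,
        "component": component,
        "targetResponseTime": elapsed_ms,
        "status": status,
        "response": detail,
    }
```

A short timeout matters here: a status resource that hangs while a target is down is of little use to the system calling it.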

Security considerations

  • Protect the status resource if the response contains confidential data.
  • Consider masking error details so that unnecessary or internal information, such as database names in error strings, does not leak from endpoints.
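
As a rough illustration of masking, the Python sketch below redacts IPv4 addresses and a configured list of internal names from an error string before it is returned to the caller. The function name `mask_error_detail`, the `[redacted]` placeholder, and the choice of what to redact are assumptions. A real deployment would decide its own rules.

```python
import re

# Matches dotted-quad IPv4 addresses such as 10.0.0.12.
IPV4_PATTERN = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def mask_error_detail(message, internal_names):
    """Replace IP addresses and known internal names with a placeholder.

    internal_names is a hypothetical, team-maintained list of strings
    (database names, hostnames) that must not appear in responses.
    """
    masked = IPV4_PATTERN.sub("[redacted]", message)
    # Replace longer names first so that a short name cannot split a long one.
    for name in sorted(internal_names, key=len, reverse=True):
        if name:  # an empty string would otherwise match everywhere
            masked = re.sub(re.escape(name), "[redacted]", masked,
                            flags=re.IGNORECASE)
    return masked
```

For example, `mask_error_detail("unable to connect to customer_db at 10.0.0.12", ["customer_db"])` returns `"unable to connect to [redacted] at [redacted]"`.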

Using real requests

This approach uses the existing API resources to check the health of the system. Because the tests run against a production environment, be careful when choosing resources for this. Ideally, the data used by these resources will be isolated from all other system data. For example, in a hotel API, a dummy hotel can be created within the system so that monitoring can make reservations and cancellations without affecting real hotel availability.

Analytics

If you are using real requests for monitoring, and if APIs are protected by API keys or OAuth, create a new separate application for monitoring. That way, requests can be identified in analytics.

Regardless of the monitoring approach you take, the requests will still appear in any analytics report, so consider adding something to the requests that makes them easy to filter out of reporting.
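
One way to picture the filtering is shown in the Python sketch below. It assumes analytics records are available as dictionaries and that monitoring traffic is marked either by a dedicated application name or by a custom header. The record shape, the application name `api-monitoring`, and the header `x-monitoring-probe` are all hypothetical.

```python
MONITORING_APP_NAME = "api-monitoring"    # hypothetical application name
MONITORING_HEADER = "x-monitoring-probe"  # hypothetical marker header


def is_monitoring_request(record):
    """Return True if an analytics record came from the monitoring system."""
    if record.get("developer_app") == MONITORING_APP_NAME:
        return True
    # Header names are case-insensitive, so normalise before comparing.
    headers = {key.lower(): value
               for key, value in record.get("headers", {}).items()}
    return headers.get(MONITORING_HEADER) == "true"


def exclude_monitoring(records):
    """Return only the records that represent real consumer traffic."""
    return [record for record in records if not is_monitoring_request(record)]
```

Marking requests by application covers APIs protected by API keys or OAuth. The header covers unprotected resources such as a ping sub resource.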

Tools

There are a number of tools out there to help you monitor your API. Here are some of the tools we have used:

  • Apigee Health - https://health.apigee.com
  • Librato - https://www.librato.com/
  • Pingdom - https://www.pingdom.com/
  • Runscope - https://www.runscope.com/
  • API Metrics - http://apimetrics.io/
  • Uptime - https://github.com/fzaninotto/uptime

Summary

Think about what you're trying to monitor and why. Think about the cost of monitoring. Don't forget about the security of the resources you are exposing.

Tags: monitoring, best practices, business, rapid launch, accelerator methodology
jonesfloyd ♦♦ · Feb 04, 2016 at 07:25 PM

@Dom Couldwell, ping @docs here when you're finished with Steve's comments. Thanks!

Ben Rodriguez · Jan 22, 2017 at 08:21 PM

Thanks for posting, Dom. My org is going through this process now. We are looking at these tools and at integrating with legacy tools we have in place, like ServiceNow and Zendesk.
