SRE / product owners - top analytics queries


Hi all - This article captures the top queries/reports for a production-grade Apigee infrastructure. It assumes you have ingested the logs into a log aggregator (Splunk, Datadog, ELK, ...) and have enabled APM monitoring. For alerts, instrument per your needs (PagerDuty, New Relic, ...).

Apigee Edge / Business metrics

  • Measure global 2xx rate
  • Measure 2xx rate by virtual host
  • Total traffic counts by developer-app (time windows: rolling 4 hours, daily, weekly)
  • Measure 2xx by product
  • Measure 2xx, 5xx by product
  • Measure 2xx rate of your target-servers
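
Most of the metrics above reduce to a grouped status-class ratio plus a windowed count. The following is a minimal sketch in Python, assuming the logs have already been exported from your aggregator as dict records. The field names (`status`, `virtual_host`, `api_product`, `developer_app`, `timestamp`) are hypothetical and should be mapped to whatever your log format actually emits.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def status_rates(records, group_key=None):
    """Return {group: {"total": n, "2xx": rate, "5xx": rate}}.

    group_key=None puts every record into a single "global" group.
    Field names are assumptions, not an Apigee schema.
    """
    counts = defaultdict(lambda: {"total": 0, "2xx": 0, "5xx": 0})
    for rec in records:
        group = rec.get(group_key, "unknown") if group_key else "global"
        bucket = counts[group]
        bucket["total"] += 1
        status_class = f"{int(rec['status']) // 100}xx"
        if status_class in ("2xx", "5xx"):
            bucket[status_class] += 1
    return {
        group: {
            "total": c["total"],
            "2xx": c["2xx"] / c["total"],
            "5xx": c["5xx"] / c["total"],
        }
        for group, c in counts.items()
    }


def traffic_by_app(records, now, window):
    """Count requests per developer app inside the window ending at `now`."""
    cutoff = now - window
    return Counter(
        rec["developer_app"]
        for rec in records
        if cutoff < rec["timestamp"] <= now
    )


# Made-up sample records, for illustration only.
now = datetime(2019, 8, 11, 12, 0)
sample = [
    {"status": 200, "virtual_host": "secure", "api_product": "payments",
     "developer_app": "mobile", "timestamp": now - timedelta(hours=1)},
    {"status": 201, "virtual_host": "secure", "api_product": "payments",
     "developer_app": "mobile", "timestamp": now - timedelta(hours=3)},
    {"status": 503, "virtual_host": "default", "api_product": "catalog",
     "developer_app": "web", "timestamp": now - timedelta(hours=5)},
    {"status": 404, "virtual_host": "default", "api_product": "catalog",
     "developer_app": "web", "timestamp": now - timedelta(days=2)},
]

global_rates = status_rates(sample)
by_vhost = status_rates(sample, "virtual_host")
by_product = status_rates(sample, "api_product")
rolling_4h = traffic_by_app(sample, now, timedelta(hours=4))
daily = traffic_by_app(sample, now, timedelta(days=1))
weekly = traffic_by_app(sample, now, timedelta(weeks=1))
```

In practice the same grouping would usually be written in your aggregator's own query language; the sketch is only meant to pin down what each ratio and window means.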

System metrics:

  • Measure tcp-opens / close-waits on routers and message-processors
  • Measure connection-counts on your front-end load balancers
  • C* (Cassandra) - monitor the token keyspace (an important health metric; problems here can reach a point of no return)
  • ZK (ZooKeeper) - measure sync-rate to RMPs (routers and message-processors). Slow sync has been seen to affect deploy times in multi-region datacenter setups
  • OS specifics - RHEL vs. Ubuntu vs. Amazon-AMI (compare against benchmarks)
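
For the tcp-opens / close-waits item, one possible approach on Linux hosts is to count socket states from `/proc/net/tcp`. This is a minimal sketch in Python, assuming a Linux router or message-processor node. The function names are illustrative, and shipping the counts to your APM or log aggregator is left out.

```python
from collections import Counter
from pathlib import Path

# Hex state codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        state = fields[3].upper()  # 4th column holds the state code
        counts[TCP_STATES.get(state, state)] += 1
    return counts


def local_tcp_states():
    """Aggregate IPv4 and IPv6 socket states on the current host."""
    counts = Counter()
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        p = Path(path)
        if p.exists():
            counts += count_tcp_states(p.read_text())
    return counts


if __name__ == "__main__":
    states = local_tcp_states()
    print("ESTABLISHED:", states["ESTABLISHED"])
    print("CLOSE_WAIT:", states["CLOSE_WAIT"])
```

Run periodically, a steadily growing CLOSE_WAIT count is the kind of trend worth alerting on.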

Last update: 08-11-2019 05:51 PM