Private cloud Monitoring options

Hi All,

We have an on-premises deployment of Apigee Edge 4.17.01 and are looking for monitoring options. The Apigee product documentation gives a fair idea, but I would love to hear from other customers about their implementation experience.

Server Health

We are considering plugging the Edge JMX interface into Zenoss, our enterprise monitoring tool.

Does the out-of-the-box Edge JMX interface usually fulfill server monitoring needs? Can we create custom MBeans with Edge?

API Health

As Apigee Test is not available for private cloud, what other options are recommended for monitoring API proxies? Our APIs are mostly internal/private.

Alerts

This link gives a short intro to alerts on Edge. Where can I find more detailed information on configuring alerts?

Thank you!

1 ACCEPTED SOLUTION

Hi,

JMX is available. You can learn more about how to configure and how to enable JMX auth in the Operations Guide:

http://docs.apigee.com/private-cloud/latest/monitoring-best-practices

Now, I suggest you expand your monitoring strategy beyond the items you mentioned above. To truly monitor Edge, you need a holistic view of the different layers, the individual components, and the interactions between components.

Summary

1. In all nodes, collect metrics regarding:

CPU

RAM

Disk

2. On every Router, monitor component status by executing:

curl -v http://localhost:8081/v1/servers/self/up

Expected response: HTTP 200 OK

3. On every Message Processor, monitor component status by executing:

curl -v http://localhost:8082/v1/servers/self/up

Expected response: HTTP 200 OK
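
The status checks in steps 2 and 3 can be wrapped in a small script that a tool such as Zenoss calls. This is only a minimal sketch, assuming curl is installed and the management ports shown above (8081 for the Router, 8082 for the Message Processor). The function names is_up_status and check_up are hypothetical and not part of Edge:

```shell
# Succeed (exit status 0) only when the HTTP status code is exactly 200.
is_up_status() {
  [ "$1" = "200" ]
}

# Probe a /v1/servers/self/up URL and print "up" or "down".
# curl reports 000 as the status code when it cannot connect at all.
check_up() {
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1")
  if is_up_status "$code"; then
    echo "up"
  else
    echo "down (HTTP $code)"
    return 1
  fi
}
```

For example, run check_up http://localhost:8081/v1/servers/self/up on a Router and check_up http://localhost:8082/v1/servers/self/up on a Message Processor. The non-zero return status on failure lets the calling monitor raise an alert.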

4. On every Zookeeper node:

Check component status:

echo ruok | nc <host> 2181

Expected output: imok

Check the presence of a leader:

echo stat | nc <host> 2181 | grep Mode

Expected output: Mode: follower or Mode: leader. The check is to ensure that there is a leader in the ensemble at all times.
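
Checking each node's mode in isolation does not prove that the ensemble has a leader, so it can help to collect the modes from all ZooKeeper nodes and count the leaders. A rough sketch, assuming nc and awk are available; the function names zk_mode, count_leaders, and zk_modes are hypothetical:

```shell
# Read the output of "echo stat | nc <host> 2181" on stdin and
# print the role from its "Mode:" line, e.g. "leader" or "follower".
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Read one role per line on stdin and print how many are "leader".
count_leaders() {
  grep -c '^leader$'
}

# Print the role of every ZooKeeper host given as an argument, one per line.
zk_modes() {
  for host in "$@"; do
    echo stat | nc "$host" 2181 | zk_mode
  done
}
```

With three nodes, zk_modes zk1 zk2 zk3 | count_leaders should print 1 (zk1, zk2, and zk3 are placeholder hostnames). Any other value suggests that the ensemble currently has no leader or that a node did not answer.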

5. On every Cassandra node, check statusthrift:

/opt/apigee/apigee-cassandra/bin/nodetool -h <host> statusthrift

Expected output: running
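
To turn this check into something a monitor can alert on, the nodetool output can be compared against the expected string. A minimal sketch that uses the nodetool path shown above; the function names thrift_running and check_thrift are hypothetical:

```shell
# Succeed only when the statusthrift output, passed as the first
# argument, is exactly "running" once whitespace is stripped.
thrift_running() {
  [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "running" ]
}

# Query one Cassandra node and report the result.
check_thrift() {
  out=$(/opt/apigee/apigee-cassandra/bin/nodetool -h "$1" statusthrift 2>&1)
  if thrift_running "$out"; then
    echo "thrift running on $1"
  else
    echo "thrift NOT running on $1: $out"
    return 1
  fi
}
```

Anything other than running, including a connection error from nodetool, is treated as a failure here, which errs on the side of alerting.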

6. Introduce synthetic transactions using a heartbeat API.

You could implement a simple proxy with three operations to test API traffic within Edge independently of the backend systems or real APIs.

Example:

A. rmp is a loopback on the Message Processor: a proxy with no target that returns HTTP 200 OK. It helps you test that traffic can flow through the Router and the Message Processor.

/v1/heartbeat/rmp

B. rmpcs is a no-target operation similar to rmp that additionally executes one policy that hits Cassandra. A Quota policy could be used for this.

/v1/heartbeat/rmpcs

C. rmpbackend is a passthrough that hits a mock on the backend, to test that the Message Processors are able to connect to a given backend system.

/v1/heartbeat/rmpbackend
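
Because the three operations exercise progressively more layers, their results can be read in order to narrow down where a failure sits. A sketch of that reasoning, assuming curl is installed and that each operation returns HTTP 200 when healthy; the function names http_code, diagnose_heartbeat, and probe_heartbeat and the layer labels are hypothetical:

```shell
# Print the HTTP status code returned by a URL (000 if unreachable).
http_code() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Given the status codes of rmp, rmpcs, and rmpbackend (in that order),
# print the first layer that looks unhealthy, or "ok".
diagnose_heartbeat() {
  if [ "$1" != "200" ]; then
    echo "router-or-message-processor"
  elif [ "$2" != "200" ]; then
    echo "cassandra"
  elif [ "$3" != "200" ]; then
    echo "backend-connectivity"
  else
    echo "ok"
  fi
}

# Probe the three operations below a base URL and print the diagnosis.
probe_heartbeat() {
  base=$1
  rmp=$(http_code "$base/v1/heartbeat/rmp")
  rmpcs=$(http_code "$base/v1/heartbeat/rmpcs")
  rmpbackend=$(http_code "$base/v1/heartbeat/rmpbackend")
  diagnose_heartbeat "$rmp" "$rmpcs" "$rmpbackend"
}
```

Call probe_heartbeat with the base URL of the environment under test. The diagnosis is only a first hint: a failing rmpcs could also point to the policy configuration rather than to Cassandra itself.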

7. Install the Apigee Monitoring Console (optional but a good idea).

http://docs.apigee.com/private-cloud/latest/apigee-monitoring-dashboard-overview

8. Monitor a representative portion of your real APIs.

Monitoring your APIs is important. This monitoring should be contextual to the SLAs each API needs to meet.

9. Complement your monitoring view with analytics data.

Edge Analytics will allow you to take a deeper look at trends and patterns. Combine monitoring events with analytics data to expand your understanding of traffic, APIs, targets, and platform behavior.

10. Prioritize

JMX, the Edge Metrics API, and other interfaces we offer can give you a lot more information. The list above represents a good set of priorities to cover before exploring the more detailed metrics provided by those interfaces.

3 REPLIES 3

Thank you @Maudrit for the detailed response. It is quite helpful.

Does the Apigee Monitoring Console execute the calls/commands mentioned in points 2, 3, 4, and 5 above?

The overview at http://docs.apigee.com/private-cloud/latest/apigee-monitoring-dashboard-overview describes what the dashboard shows:

On this screen, you can see information about the:

  • Router: status, traffic, errors, load, and more.
  • Message Processor: status and health, traffic, target latency, target response codes, and more.
  • Node (Cassandra and ZK) metrics: CPU usage, disk space, heap usage, and more metrics.