Health Monitor removing target servers

Hi

We have a load balancer set up with the health monitor enabled. After some time, we start getting the error "no active target servers". When we disabled the health monitor and redeployed, everything worked seamlessly.

The target servers point to our App Gateway in Azure and are load balanced in Apigee.

Can anyone help me with the questions below?

  • We have whitelisted the Apigee IPs at our App Gateway on Azure. Do health monitor requests come from the same IPs as the Apigee servers?
  • Is there a way to look at the health monitor logs?

Regards,

Venkat


cleisommais

Hi,

There are at least two ways to define the target server.

The first is to define it directly within the target endpoint XML.

The second is to define it under Admin > Environments > Target Servers and reference it from the target endpoint in the proxy.

I suppose you are using the second option. With this option you can enable and disable the target server per environment. Your error looks like it could be related to that.

My code example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<TargetEndpoint name="default">
    <PreFlow name="PreFlow">
        <Request/>
        <Response/>
    </PreFlow>
    <Flows/>
    <PostFlow name="PostFlow">
        <Request/>
        <Response/>
    </PostFlow>
    <HTTPTargetConnection>
        <Properties/>
        <SSLInfo>
            <Enabled>true</Enabled>
        </SSLInfo>
        <LoadBalancer>
            <Server name="TS-Product"/>
        </LoadBalancer>
        <Path>/db</Path>
    </HTTPTargetConnection>
</TargetEndpoint>

The server name refers to a target server created in the Target Servers setup.
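For reference, a target server definition that the name above could point to might look roughly like the sketch below. The host backend.example.com, the port, and the TLS setting are placeholders assumed for illustration; they are not taken from any real environment.

```xml
<!-- Hypothetical TargetServer definition; host and port are placeholders -->
<TargetServer name="TS-Product">
    <!-- Assumed backend host, replace with the real one -->
    <Host>backend.example.com</Host>
    <Port>443</Port>
    <!-- This is the per-environment enable/disable switch mentioned above -->
    <IsEnabled>true</IsEnabled>
    <SSLInfo>
        <Enabled>true</Enabled>
    </SSLInfo>
</TargetServer>
```

If IsEnabled is false for every server listed in the load balancer, there is nothing left to route to, which is why it is worth checking first.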

I hope it can help you.

Hi

Apologies for the delay in responding. My configuration is below.

With the health monitor disabled, everything works fine, but when it is enabled, the health monitor removes the target servers once the maximum number of failures is reached.

I am not sure why adding the health monitor causes this issue.

<HTTPTargetConnection>
    <LoadBalancer>
        <Algorithm>RoundRobin</Algorithm>
        <Server name="XXXXX-XXXXX-XXXXX-XXXXX"/>
        <Server name="XXXXX-XXXXX-XXXXX-XXXXX"/>
        <MaxFailures>5</MaxFailures>
        <RetryEnabled>true</RetryEnabled>
    </LoadBalancer>
    <Properties/>
    <Path>/XXXXX</Path>
    <HealthMonitor>
        <IsEnabled>true</IsEnabled>
        <IntervalInSec>60</IntervalInSec>
        <HTTPMonitor>
            <Request>
                <ConnectTimeoutInSec>10</ConnectTimeoutInSec>
                <SocketReadTimeoutInSec>30</SocketReadTimeoutInSec>
                <Port>443</Port>
                <Verb>GET</Verb>
                <Path>/XXXXX/_status</Path>
                <Header name="TraceId">XXXXX</Header>
            </Request>
            <SuccessResponse>
                <ResponseCode>200</ResponseCode>
            </SuccessResponse>
        </HTTPMonitor>
    </HealthMonitor>
</HTTPTargetConnection>

Regards

Venkat

@Venkat Tummala,

In your target endpoint configuration, you have configured MaxFailures along with HealthMonitor. As per the documentation:

If you configure MaxFailures > 0, the TargetServer is removed from rotation when the target fails the number of times you indicate.

With your HealthMonitor configuration, Apigee Edge performs the following checks:

  1. Polls the target servers by sending a GET request to the endpoint /XXXXX/_status on port 443 once every 60 seconds, expecting a 200 response as per your configuration
  2. Checks whether Apigee Edge can connect to the target server within 10 seconds
  3. Checks whether Apigee Edge can read the response from the target server within 30 seconds

If any of these checks fails, the failure count is incremented by 1. Once the failure count for a target server reaches MaxFailures (5 in your case), that target server is removed from rotation. If both target servers are removed, all subsequent requests are returned with a 503 response code and the fault code NoActiveTargets.
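To tie these checks back to the configuration posted earlier, here is the same fragment with comments added. The values are copied from that configuration and the XXXXX placeholders are left as posted; the comments only restate the behaviour described above and are not an authoritative description of Apigee internals.

```xml
<HTTPTargetConnection>
    <LoadBalancer>
        <Algorithm>RoundRobin</Algorithm>
        <Server name="XXXXX-XXXXX-XXXXX-XXXXX"/>
        <Server name="XXXXX-XXXXX-XXXXX-XXXXX"/>
        <!-- A server is removed from rotation after this many failures -->
        <MaxFailures>5</MaxFailures>
        <RetryEnabled>true</RetryEnabled>
    </LoadBalancer>
    <Path>/XXXXX</Path>
    <HealthMonitor>
        <IsEnabled>true</IsEnabled>
        <!-- Check 1: each target server is polled once every 60 seconds -->
        <IntervalInSec>60</IntervalInSec>
        <HTTPMonitor>
            <Request>
                <!-- Check 2: the connection must be established within 10 seconds -->
                <ConnectTimeoutInSec>10</ConnectTimeoutInSec>
                <!-- Check 3: the response must be read within 30 seconds -->
                <SocketReadTimeoutInSec>30</SocketReadTimeoutInSec>
                <Port>443</Port>
                <Verb>GET</Verb>
                <Path>/XXXXX/_status</Path>
                <Header name="TraceId">XXXXX</Header>
            </Request>
            <SuccessResponse>
                <!-- Any status other than 200 counts as a failed check -->
                <ResponseCode>200</ResponseCode>
            </SuccessResponse>
        </HTTPMonitor>
    </HealthMonitor>
</HTTPTargetConnection>
```

A practical first step is to send the same GET request to /XXXXX/_status on port 443 by hand and confirm that it returns 200 within the timeouts. If the status path answers with a redirect, an authentication error, or anything other than 200, every poll counts as a failure even though normal API traffic succeeds.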

If you are on Private Cloud, you can check the Message Processor logs (/opt/apigee/var/log/edge-message-processor/logs/system.log).

If you are on Public Cloud, please raise a case with Apigee Support Team.