Health Monitor load balancing: is it possible to stop calling the /health endpoint once the server is known to be OK?

We are trying to set up load balancing across a number of target servers, and we are having problems with the number of calls the HealthMonitor makes to the /health endpoint.

What works so far, thanks to the health monitor calling the /health endpoint:

1. We are able to take a target server out of rotation.

2. We are able to put the server back into rotation.

Our problem

The target servers saw a lot of requests in one day (around 600,000), even though we have the health check set up as 1 request every 15 seconds.

The question

Is it possible for the health monitor to stop spamming /health calls once it recognizes that the server is OK? (Pricing is a concern.)

Something like MaxSuccesses = 2, and then not calling /health anymore.

Are we missing something?

Thanks for the advice.

ACCEPTED SOLUTION

In my testing I've noticed that each Message Processor (MP) makes its own health check call: my org has 4 MPs, and I can see 4 calls being made every IntervalInSec. This implies that each MP keeps track of target server (TS) health independently, so the number of /health requests grows with the number of MPs.

To see how many MPs you currently have deployed, check the deployments of a particular proxy and count the servers with type "message-processor" for each environment.

curl -n {{MGMTSVR}}/v1/o/{{ORG}}/apis/yourproxy-name/deployments
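
If each MP really does probe independently, as the testing above suggests, the expected request volume can be estimated with simple arithmetic. The sketch below is only an illustration of that assumption: the function name and parameters are made up for this example, and the formula (MPs x target servers x probes per interval) is inferred from the observation above, not taken from the Apigee documentation.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def expected_health_checks(num_mps, interval_sec, window_sec, num_target_servers=1):
    """Estimate health check requests sent during a time window.

    Assumption (inferred from observed behaviour, not documented):
    every Message Processor probes every target server once per interval.
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    probes_per_mp_per_server = window_sec // interval_sec
    return num_mps * num_target_servers * probes_per_mp_per_server


if __name__ == "__main__":
    # Illustrative numbers: 4 MPs, one target server, IntervalInSec = 15, one day.
    print(expected_health_checks(4, 15, SECONDS_PER_DAY))  # 23040
```

Comparing the estimate against what the target servers actually log, for the MP count reported by the deployments call, is one way to check whether the per-MP explanation accounts for the traffic.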


Hi @Esteban Lartigue,

The HealthMonitor / HTTPMonitor has two purposes:

  1. take the server out of rotation upon non-success response
  2. bring it back into rotation upon success response

HTTPMonitor needs to make health check calls to bring the server back into rotation.

See docs: https://docs.apigee.com/api-platform/deploy/load-balancing-across-backend-servers#settingloadbalance...

You can also use ServerUnhealthyResponse to take a server out of rotation, but you'll still need the HTTPMonitor to bring it back in once the failure count is reached.

So given that, you could make the polling IntervalInSec longer, depending on how fast you want the server back in rotation. The upside is that ServerUnhealthyResponse will take a failing server out of rotation sooner than waiting for the next polling interval would.

Hope that helps.

Thank you. We have successfully tested that target servers are taken out of and put back into rotation by the health monitor.

Our problem is the number of requests hitting the target servers' /health endpoint.

We are seeing 9x to 10x more requests than we would expect, and I need to justify the high number of requests to the business.

Do you have any insight into the formula behind this high number of requests? We increased the interval to 120 seconds and expected 120 requests on the target system, but it got 1,000 (9x more).

This is our current configuration:

<LoadBalancer>
    <Algorithm>LeastConnections</Algorithm>
    <Server name="lr-cms-na-server1"/>
    <Server name="lr-cms-na-server2"/>
    <Server name="lr-cms-na-server3"/>
    <Server name="lr-cms-na-server4"/>
    <Server name="lr-cms-na-server5"/>
    <Server name="lr-cms-na-server6"/>
    <Server name="lr-cms-na-server7"/>
    <Server name="lr-cms-na-server8"/>
    <Server name="lr-cms-na-server9"/>
    <Server name="lr-cms-na-server10"/>
    <MaxFailures>3</MaxFailures>
</LoadBalancer>
<Path>{cmsUri}</Path>
<HealthMonitor>
    <IsEnabled>true</IsEnabled>
    <IntervalInSec>120</IntervalInSec>
    <HTTPMonitor>
        <Request>
            <Verb>GET</Verb>
            <Path>/api/health</Path>
        </Request>
        <SuccessResponse>
            <ResponseCode>200</ResponseCode>
        </SuccessResponse>
    </HTTPMonitor>
</HealthMonitor>
