nginx: (de)registration of message-processor causes 504-errors (outages)

Summary

When a new message-processor is registered, a notification is sent to the routers to reload nginx (virtual host update). Unfortunately, this notification reaches all routers at exactly the same time, so they all reload at once. For a short period the ELB has no connected routers and returns 504 GATEWAY TIMEOUT responses to incoming requests.

Question

Is there a way to spread the nginx reload (virtual host update) over a few seconds, so that there is always at least one router available to handle requests?

Details

When a new 'edge-message-processor' is registered the following message is sent to 'edge-router':

2017-03-13 12:06:25,585  Apigee-Timer-9 INFO  LOAD-BALANCER - LoadBalancingManagementServiceImpl.serversAdded() : Servers [ServerBean{uuid=817e3600-e145-4b42-9ff2-3c52f17dab14, externalIPAddress='10.3.1.29', internalIPAddress='10.3.1.29', externalHostName='localhost', internalHostName='localhost', types=[message-processor], isReachable=true, isUp=false, tags={dp.color=green, jmx.rmi.port=1101, http.port=8998, http.management.port=8082, started.at=1489406777951, http.ssl.flag=false, rpc.port=4528, Profile=MessageProcessor, websocket.port=8999}, pod=Pod{name='gateway', region='dc-1'}, buildInfo=null}] added to scope ServerScope{organization='<org>', environment='<env>'}

which results in the following sequence of actions on the router:

- NginxUtil.checkConfig() : Testing config of Nginx on this machine using command /opt/nginx/scripts/apigee-nginx configtest and file <configuration-file>
- NginxUtil.checkConfig() : exit code of the command 0
- NginxUtil.refreshConfiguration() : Refreshing nginx configuration using command /opt/nginx/scripts/apigee-nginx reload

The same issue occurs when a message-processor is deregistered.
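
To illustrate the kind of staggering being asked about, here is a minimal sketch in Python. It is not an Apigee feature or API: it assumes the reload could be driven by an external wrapper on each router, and the names `stagger_delay`, `staggered_reload`, and the 5-second window are hypothetical. Only the `/opt/nginx/scripts/apigee-nginx configtest` and `reload` commands come from the log entries above. Each router derives a stable offset from its own identifier, tests the configuration, waits for its offset, and only then reloads.

```python
import hashlib
import subprocess
import time

# Script path and subcommands as they appear in the router log entries.
NGINX_CTL = "/opt/nginx/scripts/apigee-nginx"


def stagger_delay(router_id: str, window_seconds: float = 5.0) -> float:
    """Map a router identifier to a stable delay in [0, window_seconds).

    Hypothetical helper: hashing the identifier gives each router a
    repeatable offset without any coordination between routers.
    """
    digest = hashlib.sha256(router_id.encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:4], "big") / 2**32
    return fraction * window_seconds


def staggered_reload(router_id, window_seconds=5.0,
                     run=subprocess.run, sleep=time.sleep):
    """Test the config, wait this router's offset, then reload nginx.

    Returns False without reloading if the config test fails.
    `run` and `sleep` are injectable so the flow can be dry-run.
    """
    result = run([NGINX_CTL, "configtest"])
    if result.returncode != 0:
        return False
    sleep(stagger_delay(router_id, window_seconds))
    run([NGINX_CTL, "reload"])
    return True
```

Hashing does not guarantee that two routers get different offsets, only that simultaneous reloads become unlikely. With a small, fixed set of routers, assigning explicit offsets per router (for example 0 s and 3 s) would be more predictable.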
