First, check the management server log:
CuratorFramework-0-EventThread ERROR o.a.c.ConnectionState - ConnectionState.checkState() : Authentication failed
main INFO ZOOKEEPER - ZooKeeperServiceImpl.exists() : Retry path existence path:/featureflag, reason: KeeperErrorCode = ConnectionLoss for /featureflag
CuratorFramework-0 WARN o.a.c.ConnectionState - ConnectionState.checkTimeouts():
The log above shows that the management server cannot connect to ZooKeeper, which suggests that the ZooKeeper nodes are unhealthy.
Then check the ZooKeeper log:
Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
[myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client <Ip>:44486 (no session established for client)
If the ZooKeeper nodes are logging errors like the ones above, the leader node is likely down and the remaining nodes have not elected a new leader.
To quickly check that the ZooKeeper software is running, follow these steps:
1) Log in to each ZooKeeper machine and run the command:
   echo ruok | nc `hostname -i` 2181
   or run:
   echo srvr | nc `hostname -i` 2181
2) Confirm that you get the following response to ruok from each ZooKeeper instance:
   imok
   (srvr returns server details, including the node's Mode, instead of imok.)
   Note: If you get no response, or a 'broken pipe' error, the ZooKeeper instance is not serving requests or ZooKeeper is not running.
3) Obtain more information about the status of ZooKeeper by logging in to each ZooKeeper machine and running the command:
   echo stat | nc `hostname -i` 2181
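The per-machine check above can also be run from one place. The following is a minimal sketch, assuming the hosts are reachable on port 2181 from where you run it and that your nc supports the -w timeout option; the function names classify_ruok and check_zk are made up for this example. On newer ZooKeeper versions the four-letter commands may need to be enabled on the server before they answer.

```shell
#!/usr/bin/env bash
# Sketch: send "ruok" to each ZooKeeper host given as an argument.
# classify_ruok and check_zk are hypothetical helper names.

# Map a raw "ruok" reply to a short verdict.
classify_ruok() {
  if [ "$1" = "imok" ]; then
    echo "running"
  else
    echo "not responding"
  fi
}

# Query one host; an empty reply means no answer or a failed connection.
check_zk() {
  local host="$1" port="${2:-2181}" reply
  reply="$(echo ruok | nc -w 3 "$host" "$port" 2>/dev/null)"
  echo "$host: $(classify_ruok "$reply")"
}

# Check every host passed on the command line.
for host in "$@"; do
  check_zk "$host"
done
```

Saved under a name of your choice, it would be called with the ZooKeeper host names as arguments, one per node.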
Check conf_zookeeper_connection.string on the Management Server, Message Processor, and Router to validate ZooKeeper connectivity.
/opt/apigee/token/application/message-processor.properties, router.properties
/opt/apigee/customer/application/management-server.properties
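To compare the setting across components, you could print the property from each file. This is a rough sketch: zk_conn_string is a hypothetical helper, and the files to inspect are passed as arguments rather than hard-coded, since the exact paths depend on your installation.

```shell
#!/usr/bin/env bash
# Sketch: print conf_zookeeper_connection.string from each properties file given.
# zk_conn_string is a hypothetical helper name.

# Print the value of the last matching property line in the file.
zk_conn_string() {
  grep -E '^[[:space:]]*conf_zookeeper_connection\.string=' "$1" \
    | tail -n 1 \
    | cut -d= -f2-
}

# Show the value for every file passed on the command line.
for f in "$@"; do
  printf '%s: %s\n' "$f" "$(zk_conn_string "$f")"
done
```

If the values differ between the Management Server, Message Processor, and Router, that mismatch is worth resolving before looking further.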
List the ZooKeeper nodes in the connection string in the order shown below, with the leader first. If the leader node has a problem or is not in service, stop that node so that a new leader is elected from the remaining nodes. Note: the leader node, or whichever node is listed first in the connection string, should always be working.
conf_zookeeper_connection.string=<leader-hostname>:2181,<follower-hostname>:2181,<follower-hostname>:2181
{not in quorum, not serving requests}
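To find out which node currently holds the leader role, one option is to read the Mode line from the srvr output of every node in the connection string. The sketch below assumes nc supports the -w timeout option; zk_mode and split_conn_string are hypothetical helper names. A node that is not serving requests prints no Mode line, so it is reported as unknown.

```shell
#!/usr/bin/env bash
# Sketch: report the mode (leader/follower) of each node in a connection string.
# zk_mode and split_conn_string are hypothetical helper names.

# Extract the "Mode:" value from srvr output read on stdin.
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Split "host1:2181,host2:2181" into one host:port entry per line.
split_conn_string() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Usage: pass the connection string value as the first argument.
if [ -n "${1:-}" ]; then
  split_conn_string "$1" | while IFS=: read -r host port; do
    mode="$(echo srvr | nc -w 3 "$host" "$port" 2>/dev/null | zk_mode)"
    echo "$host:$port mode=${mode:-unknown}"
  done
fi
```

The node reported as leader is the one that would go first in conf_zookeeper_connection.string under the ordering described above.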