1. Check ZK stat
echo stat|nc <ZKHOST> 2181
You should see the following output (for a follower):
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Clients:
 /a.b.c.d:xxxx[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/0
Received: 1
Sent: 0
Connections: 1
Outstanding: 0
Zxid: 0xc00000044
Mode: follower
Node count: 653
2. Check the zkServer.sh config for any non-standard changes.
3. Check that zkCli works by running the following script:
./zkCli.sh
4. Confirm that the ZK tree contains data by running the following two commands in the zkCli shell:
ls /organizations
ls /
5. Confirm you can telnet to ports 3888 and 2181.
6. Check that ZK is listening on the correct ports (3888 and 2181):
netstat -an|grep LISTEN
You should see the following output:
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:601 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:32000 0.0.0.0:* LISTEN
tcp 0 0 :::50382 :::* LISTEN
tcp 0 0 :::3888 :::* LISTEN
tcp 0 0 :::22 :::* LISTEN
tcp 0 0 :::2181 :::* LISTEN
unix 2 [ ACC ] STREAM LISTENING 776 @/com/ubuntu/upstart
unix 2 [ ACC ] STREAM LISTENING 56398322 /tmp/ssh-JOJuYA9V11/agent.30541
unix 2 [ ACC ] SEQPACKET LISTENING 843 @/org/kernel/udev/udevd
unix 2 [ ACC ] STREAM LISTENING 20629889 /dev/log
unix 2 [ ACC ] STREAM LISTENING 20629895 /var/lib/syslog-ng/syslog-ng.ctl
unix 2 [ ACC ] STREAM LISTENING 6911 /var/run/dbus/system_bus_socket
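Steps 1 and 6 can be scripted so the same checks are easy to repeat on every node. The following is a minimal sketch, not an official ZooKeeper tool: the function names zk_mode and zk_ports_listening are hypothetical, and the sketch assumes nc, netstat, and awk are installed and that the output formats match the samples shown above.

```shell
#!/bin/sh
# Hypothetical helpers; these names are illustrative and not part of ZooKeeper.

# Print the value of the "Mode:" line from `stat` output read on stdin
# (for example "follower", "leader", or "standalone").
zk_mode() {
    awk -F': ' '/^Mode:/ { print $2 }'
}

# Succeed only if every port given as an argument appears as the local
# port on a LISTEN line of `netstat -an` output read on stdin.
zk_ports_listening() {
    listing=$(cat)
    for port in "$@"; do
        printf '%s\n' "$listing" | awk -v p="$port" '
            /LISTEN/ && $4 ~ (":" p "$") { found = 1 }
            END { exit !found }' || return 1
    done
}

# Usage (<ZKHOST> is a placeholder, as in step 1):
#   echo stat | nc <ZKHOST> 2181 | zk_mode
#   netstat -an | zk_ports_listening 2181 3888 && echo "ports OK"
```

Both helpers read from stdin rather than running the commands themselves, so they can also be pointed at saved output collected from a node you cannot log in to directly.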
Clustered ZooKeeper:
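ZK clusters are sized as 2F+1 nodes to tolerate F failures, because a majority of nodes must stay up to keep quorum. This means that a 3-node cluster can afford to have 1 node fail, a 5-node cluster can afford 2 nodes to fail, etc. A 6-node cluster still tolerates only 2 failures, so an even node count adds no extra fault tolerance.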
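The failure tolerance for a given cluster size follows from the majority rule and can be worked out with integer arithmetic. This is a small sketch for illustration only; the function name zk_tolerated_failures is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of node failures a cluster of N servers can
# tolerate while a majority (quorum) of nodes is still running.
zk_tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Print the tolerance for a few common cluster sizes.
for n in 3 4 5 6 7; do
    echo "$n nodes: tolerates $(zk_tolerated_failures "$n") failure(s)"
done
```

Because the division rounds down, 3 and 4 nodes both tolerate 1 failure, and 5 and 6 nodes both tolerate 2.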
Only the ZK leader node processes writes. All other nodes serve reads and forward any write requests they receive to the leader.
If you have confirmed the previous steps on all nodes and are still unable to get all components to start up, confirm that all nodes are trying to talk to the leader node. To do this, take a TCP dump on one of the components (e.g. MP, PG) while it is starting up and filter on the ZK port numbers (3888 and 2181). Check the IP addresses the traffic is flowing to and confirm that they match the IP of the leader node.
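One way to take the capture described above is with tcpdump, sketched below. The helper name zk_port_filter is hypothetical. The sketch assumes tcpdump is installed on the component host, that you have root privileges, and that the host runs Linux (the "any" pseudo-interface is Linux-specific).

```shell
#!/bin/sh
# Hypothetical helper: build a tcpdump filter expression that matches
# TCP traffic on any of the ports given as arguments.
zk_port_filter() {
    filter=""
    for port in "$@"; do
        if [ -z "$filter" ]; then
            filter="tcp port $port"
        else
            filter="$filter or tcp port $port"
        fi
    done
    printf '%s\n' "$filter"
}

# Usage (run as root while the component starts up):
#   tcpdump -nn -i any "$(zk_port_filter 2181 3888)"
# -nn prints numeric addresses and ports, which makes it easier to compare
# the destination IPs against the leader's IP.
```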