Edge AutoStart Setup in Linux

giridharj
Participant I

I am a bit confused about setting up AutoStart as described in the documentation.

https://docs.apigee.com/private-cloud/v4.18.01/setting-server-autostart

//First start ZooKeeper, Cassandra, LDAP (OpenLDAP)

If ZooKeeper and Cassandra are installed as cluster, the complete cluster must be up and running before starting any other Apigee component//

1.) So, what happens if ZK/Cassandra/OpenLDAP on one node starts before ZK/Cassandra on the other nodes have come up?

2.) If all ZK/Cassandra nodes in the cluster need to be up before the others start, what happens when one ZK/Cassandra node has issues, say hardware issues? Would that cause problems starting the other services? If so, this would create a lot of failure points. Please advise.

3.) What if the Management UI/OpenLDAP comes online before the ZooKeeper node? When all Linux machines are rebooted, we cannot control which ones come online first.

Will the Apigee scripts check the peer machines hosting other Apigee components so that everything starts in the correct sequence?


rmishra
Participant V

1) The implementation differs for each product. ZooKeeper works on the concept of leader election, while OpenLDAP works on the notion of peers: if a peer is not up, the node waits and throws errors during replication until the peer comes up. Not all OpenLDAP nodes are impacted, only the LDAP node whose peer is down, because each OpenLDAP node knows about only one peer. If the ZooKeeper leader is not up, the other ZooKeeper nodes wait until the lease period expires and then elect a new leader.
I am speaking in very broad strokes because the implementation of each is very different.

2) Assume the machine hosting ZooKeeper is down. Assuming you have planned for high availability, so that a majority of the ensemble is still running, the remaining ZooKeeper nodes will elect a new leader. The same goes for Cassandra. When the hardware issue is resolved and you bring the machine back into the ZooKeeper ensemble, it joins the cluster as a follower and is synced up. Cassandra behaves in a similar way, but with different semantics.
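
To make the high-availability assumption concrete: ZooKeeper can only elect a leader while a strict majority of the ensemble is running. The small sketch below works out how many node failures an ensemble of a given size can absorb. The function names are made up for illustration and are not part of any Apigee or ZooKeeper tooling.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be down while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


# Print the quorum and failure tolerance for a few common ensemble sizes.
for n in (1, 3, 4, 5):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} down")
```

So a three-node ensemble keeps working with one node down, while a four-node ensemble tolerates no more failures than a three-node one, which is why odd ensemble sizes are the usual choice.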

3) When Linux machines are rebooted, you need to control how and when they are rebooted. You would typically have some kind of centralized software or scheduler (talk to your OS administration team) that reboots them and controls the reboot sequence. I wish I could be more specific, but this varies greatly depending on your underlying OS and the practices your team follows. Bottom line: when a node hosts multiple Apigee components, the Apigee service startup/shutdown scripts control the sequence in which the components on that node are started. For components hosted across nodes, you need to control the sequence in which those nodes are started.
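
If you have no central scheduler, one way to approximate cross-node ordering is a small gate script that runs on each non-datastore node at boot and only lets the remaining components start once the ZooKeeper and Cassandra peers accept connections. This is a rough sketch, not something the Apigee scripts provide: the function names are hypothetical, and an open port only shows the process is listening, not that the cluster is healthy.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_endpoints(endpoints, deadline_seconds=300.0, poll_seconds=5.0) -> bool:
    """Poll until every (host, port) has accepted a connection once.

    Returns True when all endpoints have been reached, False if the
    deadline passes first. An endpoint is dropped from the pending set
    as soon as it has been seen open and is not re-checked.
    """
    deadline = time.monotonic() + deadline_seconds
    pending = set(endpoints)
    while pending:
        pending = {ep for ep in pending if not port_is_open(*ep)}
        if not pending:
            break
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True
```

You would call it with your own peer list, for example `wait_for_endpoints([("zk1.example.internal", 2181), ("cass1.example.internal", 9042)])`, and start the other components only when it returns True. The hostnames here are placeholders, and 2181 and 9042 are the stock ZooKeeper client and Cassandra CQL ports, so check which ports your installation actually uses.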