Adding new ZooKeeper/Cassandra nodes to an existing 13-node cluster

Not applicable

Dear All,

We have a 13-node cluster set up in a single availability zone, e.g. US-EAST-1a.

We now need to add a few more components, i.e. Cassandra/ZooKeeper/RMP (Router + Message Processor).

I went through the operations guide but could not find how to add a new Cassandra/ZooKeeper node. How can I add new ZooKeeper/Cassandra nodes to the existing cluster?

Also, if I use a second availability zone, e.g. US-EAST-1b, will I face any issues? The new ZooKeeper/Cassandra/RMP components would be in US-EAST-1b and would have to communicate with the US-EAST-1a components.

ACCEPTED SOLUTION

Not applicable

So last year I took our 18-server installations (2 data centers each) and expanded them to 27 servers (3 data centers each).

I got some input from Apigee about this, and labbed everything out so as to avoid issues.

Our expansion included adding a new RMP stack in each environment, plus additional Cassandra and ZooKeeper services.

In a nutshell the process looked like this:

  1. Back everything up.
  2. Make sure you actually have the backup and can restore from it.
  3. Prep your new servers (you will need their IP addresses).
  4. Update the existing Cassandra and ZooKeeper configs and topology (I have a script I used to handle this that is a bit of a mess).
  5. Restart those services.
  6. Install the new Cassandra/ZooKeeper services on the new nodes.
  7. Change the Cassandra replication factor (see the sketch after this list).
  8. Repair all instances.
  9. Validate.
  10. We did more here, but you only asked about Cassandra and ZooKeeper.
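
To make steps 7 and 8 a little more concrete, here is a rough sketch of the kind of commands involved, written as Python using the cassandra-driver package with nodetool on the PATH. The host IPs, keyspace names, data-center name and target replication factor below are placeholders, not what any particular Edge release ships with; list your real keyspaces with DESCRIBE KEYSPACES and match the DC name to your Cassandra topology files before running anything like this.

```python
# Rough sketch only -- hosts, keyspaces, DC name and RF are placeholders.
import subprocess
from cassandra.cluster import Cluster  # pip install cassandra-driver

CASS_HOSTS = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]  # existing + new Cassandra nodes
KEYSPACES = ["kms", "cache", "counter"]               # placeholders; use your real list
DC_NAME, NEW_RF = "dc-1", 3                           # must match your topology config

session = Cluster(CASS_HOSTS).connect()

# Step 7: raise the replication factor on each keyspace
for ks in KEYSPACES:
    session.execute(
        "ALTER KEYSPACE %s WITH REPLICATION = "
        "{'class': 'NetworkTopologyStrategy', '%s': %d}" % (ks, DC_NAME, NEW_RF)
    )

# Step 8: primary-range repair on every node so the new replicas actually get data
for host in CASS_HOSTS:
    for ks in KEYSPACES:
        subprocess.check_call(["nodetool", "-h", host, "repair", "-pr", ks])
```

Repair is expensive on a busy ring, so run it one node at a time and watch disk and compaction load while it goes.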

There is an architect inside Apigee who fed me the seed info I needed to make this expansion work. Barring that, if you can figure out how to get someone at Apigee to put you in touch with me, I can share my notes.


13 REPLIES

Not applicable

Curious: is Cassandra/ZooKeeper a bottleneck in your cluster?

I have played around with adding a Message Processor to an existing cluster. In my tests it was enough to install it, run setup, and associate the new UUID with the appropriate environment(s).
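
For reference, the associate step in my tests boiled down to one management-server API call per environment. A rough sketch, with a placeholder management-server host, org, credentials and UUID; double-check the exact path and parameters against the operations guide for your version:

```python
# Rough sketch -- host, credentials, org, environments and UUID are all placeholders.
import requests

MS = "http://mgmt-server.example.com:8080"        # management server (placeholder)
AUTH = ("sysadmin@example.com", "secret")          # placeholder sysadmin credentials
ORG, ENVS = "myorg", ["test", "prod"]              # placeholder org / environments
MP_UUID = "11111111-2222-3333-4444-555555555555"   # UUID reported by the new MP's setup

for env in ENVS:
    resp = requests.post(
        "%s/v1/o/%s/e/%s/servers" % (MS, ORG, env),
        data={"action": "add", "uuid": MP_UUID},
        auth=AUTH,
    )
    resp.raise_for_status()
    print("associated Message Processor with", env)
```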

Never tried extending Cassandra/ZooKeeper, though.

Also, I am curious about taking a single-DC cluster and extending it to multi-DC without a complete rebuild. Hopefully, an Apigee engineer is preparing a response. 😉

We are actually expanding our current setup to get high availability. We are on 13 nodes and are planning to expand to a minimum of 18 nodes.

And yeah, RMP addition is part of the operations guide... you'll also find the part where you can replace the existing ZooKeeper and Cassandra IPs 🙂


@Benjamin Goldman do you think it would be possible/appropriate to share the notes with the broader community too? Possibly as an article?

Not sure. I would think about it, but putting it in a forum like this without spending a lot of time scrubbing it is kind of dangerous. Also, it was written for 15.01, so some things are now different. That would have to be worked out by the next person.

Thanks a lot for your answer, Benjamin. I did some checks on my end and, as you mentioned, some configuration files need to be modified and the token distribution also needs to be taken care of. Can you please share the steps you have? I can use them as a reference, try them out in my test environment, and share the results here later on.
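
On the token-distribution part, my understanding is that on a pre-vnode ring each node carries an explicit initial_token, and growing the ring means recomputing evenly spaced tokens for the new node count. A tiny illustration of the calculation, assuming Murmur3Partitioner and a placeholder node count; check cassandra.yaml first, since RandomPartitioner uses a different token range:

```python
# Evenly spaced initial_token values for an N-node single-token ring.
# Assumes Murmur3Partitioner (token range -2**63 .. 2**63 - 1);
# RandomPartitioner uses 0 .. 2**127 - 1 instead, so adjust accordingly.
def murmur3_tokens(node_count):
    return [-(2**63) + i * (2**64 // node_count) for i in range(node_count)]

if __name__ == "__main__":
    for i, tok in enumerate(murmur3_tokens(6)):  # e.g. a ring grown to 6 Cassandra nodes
        print("node %d  initial_token: %d" % (i + 1, tok))
```

Existing nodes would then have to be moved onto their new tokens (nodetool move) and cleaned up afterwards (nodetool cleanup), followed by the repairs you mentioned.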

Had lots of family over this last weekend (surprise! here are a bunch of little kids that want to play with their uncle!). I will try to sit down with a bottle of wine and push this out this week. I may email you directly @Birute Awasthi so you can help me post the final product in a less frustrating fashion.

Hi Benjamin,

Can you please keep me in the loop as well when you share the steps?

Best Regards,

Kris


Which version are you working on? It might be moot.

I am installing 15.07 on a 13-node cluster. It would be great if you could share the steps you have so I can use them as a reference on 15.07. Also, one additional query: do you know if I can add a new slave to an existing Postgres setup? My dev environment is currently running on one master only (it was originally a slave that was promoted to master when the old master was terminated).

I'll try to find time to write something up this weekend. It will NOT apply to the updated Cassandra database that happened in 15.07, only to 15.01. I would have to think about how to modify what we did to simplify it for 15.07...