How Apigee Cassandra works across datacenters


Dear All,

I was going through the Apigee documentation and have some doubts about cross-datacenter Cassandra functionality. I assume that "NetworkTopologyStrategy" is used across datacenters, and that data replication between them is asynchronous.

As per the installation guide, there is a 12-node DR setup in which we provide the Cassandra IPs along with the DCs involved.

1. Can someone please explain in detail how the nodes interact and replicate data with each other?

2. What kind of ring/token structure is used? What happens if one DC goes down completely?

3. The Apigee operations document describes steps for backup and recovery. Are those steps also valid for a multi-region DR setup? Is there any other backup/recovery strategy for a multi-region setup when a whole region is down?

If possible, can someone point me to a good link on the ring/token structure? I went through a lot of web links but still have doubts. I have gone through the document below:

http://docs.datastax.com/en/archived/cassandra/1.1/docs/cluster_architecture/partitioning.html


To answer your questions:

2) The Cassandra nodes in the entire Apigee deployment (spanning DCs) form a single Cassandra cluster, and the Cassandra nodes within each datacenter form a local ring. During Apigee deployment configuration, local quorum is configured, so clients call only the Cassandra nodes within their own DC (the same ring).

3) Since the rings are part of the same cluster, the steps apply to the multi-DC or multi-region scenario as well.

1) I hope the answers above also cover your first question.

This could be a good reference: http://www.datastax.com/wp-content/uploads/2012/08/WP-IntrotoCassandra.pdf. There are also several other resources on the web that explain this in detail.
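
To make the ring/token idea a bit more concrete, here is a rough Python sketch of how a key could be mapped to replicas in each DC. It is only an illustration: the node names, DC labels, one-token-per-node layout, and MD5-based token function are all made up for the example, and it ignores racks and virtual nodes, which the real NetworkTopologyStrategy takes into account.

```python
import hashlib
from bisect import bisect_left

# Hypothetical 2-DC cluster; node names and DC labels are invented for this sketch.
NODES = {
    "dc1-c1": "dc-1", "dc1-c2": "dc-1", "dc1-c3": "dc-1",
    "dc2-c1": "dc-2", "dc2-c2": "dc-2", "dc2-c3": "dc-2",
}
RING_SIZE = 2**32  # arbitrary ring size, chosen only for illustration


def token_for(value: str) -> int:
    """Map a string to a position on the ring (MD5 used only for a stable hash)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE


# One token per node, sorted so that iterating the list walks the ring clockwise.
RING = sorted((token_for(name), name) for name in NODES)


def replicas(key: str, rf_per_dc: dict) -> dict:
    """Walk the ring clockwise from the key's token and pick nodes
    for each DC until that DC has its configured number of replicas."""
    tokens = [t for t, _ in RING]
    # First node whose token is >= the key's token, wrapping around at the end.
    start = bisect_left(tokens, token_for(key)) % len(RING)
    chosen = {dc: [] for dc in rf_per_dc}
    for i in range(len(RING)):
        _, node = RING[(start + i) % len(RING)]
        dc = NODES[node]
        if dc in chosen and len(chosen[dc]) < rf_per_dc[dc]:
            chosen[dc].append(node)
    return chosen


# Every key ends up with replicas in both DCs, which is why each DC holds a full copy.
print(replicas("some-key", {"dc-1": 2, "dc-2": 2}))
```

The point of the sketch is that all nodes share one ring, but the replica count is satisfied per DC, so each DC ends up with its own replicas of every key.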

I'm looking at a scenario where, if the Cassandra nodes in a DC are down, data has to be fetched from the Cassandra nodes of another DC. Could you please help with the configuration changes needed so that Cassandra does not use local quorum?

While it is theoretically possible to configure this, it would have a huge performance impact, especially in the Gateway use case.

I presume you are looking at a DR use case; it is recommended to configure your DR strategy at the DC level.
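
As a rough illustration of the trade-off, the sketch below works out how many replicas have to respond under the standard Cassandra consistency levels LOCAL_QUORUM, QUORUM, and EACH_QUORUM. The replication factor of 3 per DC and the DC names are assumptions made for the example, not values taken from the Apigee guide, and the helper functions are hypothetical. It only models the Cassandra arithmetic, not how Apigee components react to an outage.

```python
def quorum(rf: int) -> int:
    """Quorum size for a given replication factor: floor(rf / 2) + 1."""
    return rf // 2 + 1


def can_serve(level: str, rf_per_dc: dict, live_per_dc: dict, local_dc: str) -> bool:
    """Return True if enough replicas of a key are alive to satisfy the level.

    rf_per_dc:   replication factor configured for each DC
    live_per_dc: number of live replicas of the key in each DC
    local_dc:    DC of the coordinator handling the request
    """
    if level == "LOCAL_QUORUM":
        # Only replicas in the coordinator's own DC count.
        return live_per_dc[local_dc] >= quorum(rf_per_dc[local_dc])
    if level == "QUORUM":
        # Quorum over the sum of replicas in all DCs.
        return sum(live_per_dc.values()) >= quorum(sum(rf_per_dc.values()))
    if level == "EACH_QUORUM":
        # A quorum is required in every DC.
        return all(live_per_dc[dc] >= quorum(rf) for dc, rf in rf_per_dc.items())
    raise ValueError(f"unsupported level: {level}")


rf = {"dc-1": 3, "dc-2": 3}          # assumed replication factors
dc1_down = {"dc-1": 0, "dc-2": 3}    # every replica in dc-1 is unavailable

print(can_serve("LOCAL_QUORUM", rf, dc1_down, local_dc="dc-2"))  # True
print(can_serve("LOCAL_QUORUM", rf, dc1_down, local_dc="dc-1"))  # False
print(can_serve("QUORUM", rf, dc1_down, local_dc="dc-2"))        # False (3 live, 4 needed)
print(can_serve("EACH_QUORUM", rf, dc1_down, local_dc="dc-2"))   # False
```

Under these assumptions, requests coordinated in the surviving DC still meet LOCAL_QUORUM, whereas cross-DC levels such as QUORUM or EACH_QUORUM depend on replicas in the failed DC and also add inter-DC latency to every request.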

How can we set up DR at the DC level if all C* nodes are in one cluster, where local quorum will always be applied? Even if we make sure one DC always stays up when the other DC goes down, the local quorum will have an impact on the whole cluster, causing datastore, KVM, and deployment errors. What is the best way to tackle this? How can Apigee, as a vendor, keep its SaaS environment up all the time? Any insight would be really helpful.