Cassandra Token distribution for Multi Datacenter Deployments

Cassandra Token distribution

Current Apigee deployments do a hard token assignment for each Cassandra node. The methodology we follow is to have a token offset of 100 for each token range for each datacenter.

For example, DC-1 has a token offset of 0 for each token range, DC-2 has a token offset of 100, and DC-3 has an offset of 200.

In the following example, the "nodetool ring" command returns the token range ownership of the Cassandra cluster. Nodetool displays datacenter us-east as the primary token range owner. This has no reference to the data stored on each node.

Data representation on each node is controlled by the Replication factor, which is set at keyspace level. Each keyspace can have different replication factors and strategies.

Apigee implements NetworkToplogyStrategy as a default to make it easy for expansion to multiple data centers in the future.

Example Configuration :

us-east : 3 Nodes
us-west : 3 Nodes

Token ownership info

$ nodetool ring
Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: us-west
==========
Address        Rack        Status State   Load            Owns               Token
                                                                             113427455640312821154458202477256070484
192.168.0.18  2a          Up     Normal  8.98 GB         33.33%              0
192.168.1.14  2b          Up     Normal  5.92 GB         33.33%              56713727820156410577229101238628035242
192.168.2.13  2c          Up     Normal  5.81 GB         33.33%              113427455640312821154458202477256070484

Datacenter: us-east
==========
Address        Rack        Status State   Load            Owns               Token
                                                                             113427455640312821154458202477256070584
192.168.3.32  1b          Up     Normal  7.78 GB         0.00%               100
192.168.4.25  1c          Up     Normal  5.92 GB         0.00%               56713727820156410577229101238628035342
192.168.5.45  1d          Up     Normal  6.05 GB         0.00%               113427455640312821154458202477256070584


Data Replication information per keyspace.

Example :-

Keyspace : kms
Replication Factor : 3 per datacenter
Replication Strategy : NetworkToplogy Strategy
[default@unknown] use kms;
Authenticated to keyspace: kms
[default@kms] show schema;

create keyspace kms
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {us-west : 3, us-east : 3}
  and durable_writes = true;

When specifying a particular keyspace, the "nodetool ring" command displays the data/replica ownership per node for the keyspace.

In the following example, the "kms" keyspace has 3 replicas in both datacenters. Because the number of nodes is also 3 per datacenter, it displays 1 single node with 100% of 1 replica of the data.

$ nodetool ring kms
Datacenter: us-west
==========
Address Rack Status State Load Owns Token
113427455640312821154458202477256070484
192.168.0.18 2a Up Normal 8.98 GB 100.00% 0
192.168.1.14 2b Up Normal 5.92 GB 100.00% 56713727820156410577229101238628035242
192.168.2.13 2c Up Normal 5.81 GB 100.00% 113427455640312821154458202477256070484
Datacenter: us-east
==========
Address Rack Status State Load Owns Token
113427455640312821154458202477256070584
192.168.3.32 1b Up Normal 12.78 GB 100.00% 100
192.168.4.25 1c Up Normal 5.92 GB 100.00% 56713727820156410577229101238628035342
192.168.5.45 1d Up Normal 6.05 GB 100.00% 113427455640312821154458202477256070584
$

Another example is where there are 6 nodes per datacenter. Here, replication factor is 3, but there are 6 nodes to share the number of replicas, so each node will have 50% of each replica set. Essentially 1 node has 50% of 1 copy of the data.

Example Configuration :

us-east : 6 Nodes

us-west : 6 Nodes
$ nodetool ring kms
Datacenter: us-east
==========
Address Rack Status State Load Owns Token
141784319550391026443072753096570088100
192.168.1.11 1d Up Normal 102.45 GB 50.00% 0
192.168.2.12 1c Up Normal 104.04 GB 50.00% 28356863910078205288614550619314017620
192.168.3.13 1b Up Normal 105.31 GB 50.00% 56713727820156410577229101238628035242
192.168.1.14 1d Up Normal 104.44 GB 50.00% 85070591730234615865843651857942052860
192.168.2.15 1c Up Normal 107.15 GB 50.00% 113427455640312821154458202477256070480
192.168.3.16 1b Up Normal 104.19 GB 50.00% 141784319550391026443072753096570088100
Datacenter: us-west
==========
Address Rack Status State Load Owns Token
141784319550391026443072753096570088205
192.168.4.18 2a Up Normal 102.36 GB 50.00% 100
192.168.5.17 2b Up Normal 107.76 GB 50.00% 28356863910078205288614550619314017721
192.168.6.16 2c Up Normal 108.39 GB 50.00% 56713727820156410577229101238628035342
192.168.4.13 2a Up Normal 109.08 MB 50.00% 85070591730234615865843651857942052963
192.168.5.19 2b Up Normal 109.59 GB 50.00% 113427455640312821154458202477256070584
192.168.6.17 2c Up Normal 106.89 GB 50.00% 141784319550391026443072753096570088205
$

@brindalovely @Akash Prabhashankar @Dave Newman @Janice Hunt

Version history
Last update:
‎11-17-2015 01:21 PM
Updated by: