Scaling out in Cassandra
1. Scaling out vs. scaling up
Scaling up
Scaling up refers to moving existing infrastructure to better or more robust hardware.
This could mean adding storage capacity, increasing memory, or moving to newer machines with more cores.
Scaling out
Scaling out simply means adding more machines that roughly match the specifications of the existing machines.
Because Cassandra's peer-to-peer architecture scales linearly as nodes are added, scaling out is often the more desirable option.
2. Adding nodes without vnodes
If possible, run repairs first to ensure all nodes contain the most recent data.
Make sure Cassandra is installed on the new node, but do not start the process. If you use a package manager, be aware that Cassandra will start automatically, so you will need to stop the process before proceeding.
Edit the cassandra.yaml file as follows:
sudo vim /etc/cassandra/cassandra.yaml
Set the seeds, initial_token, rpc_address, endpoint_snitch, and auto_bootstrap values.
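Before starting the node, it can help to confirm what the file actually contains. The following is a minimal sketch, assuming the relevant keys are named seeds, initial_token, rpc_address, endpoint_snitch, and auto_bootstrap; the helper name show_bootstrap_settings is hypothetical. Note that auto_bootstrap is typically absent from the stock file, in which case nothing is printed for it.

```shell
# Hypothetical helper: print the uncommented settings a new node depends on.
show_bootstrap_settings() {
    # $1: path to cassandra.yaml (defaults to the path used in this guide)
    conf="${1:-/etc/cassandra/cassandra.yaml}"
    # Match keys at any indentation; "- seeds:" is nested under seed_provider.
    # Commented-out lines (starting with '#') do not match.
    grep -E '^[[:space:]]*(- )?(seeds|initial_token|rpc_address|endpoint_snitch|auto_bootstrap):' "$conf"
}

# Usage on the new node, before starting Cassandra:
#   show_bootstrap_settings /etc/cassandra/cassandra.yaml
```

Any setting that is missing from the output is either commented out or not present, so Cassandra would fall back to its built-in default for it.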
Start the Cassandra daemon on the new node as follows:
sudo service cassandra start
Check the status of the bootstrap process as follows:
sudo nodetool netstats
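Alongside netstats, the output of nodetool status shows whether the new node is still joining (UJ, Up/Joining) or has finished (UN, Up/Normal). The sketch below is one way to pull out that state for a single node. The helper name node_state is hypothetical, and it assumes the usual layout in which the state is the first column and the address the second.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and print the
# two-letter state (e.g. UJ or UN) of the node with the given address.
node_state() {
    # $1: IP address of the node to look up
    awk -v addr="$1" '$2 == addr { print $1 }'
}

# Usage (the address is a placeholder):
#   nodetool status | node_state 10.0.0.5
```

An empty result means the address does not appear in the output at all, which usually indicates the node has not yet contacted the cluster.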
3. Adding nodes with vnodes
The procedure is the same as adding nodes without vnodes, except that instead of initial_token you set num_tokens. Use a greater value than the existing nodes only if the new node has proportionally more capacity.
Update the appropriate properties files:
If you're using the GossipingPropertyFileSnitch, add the cassandra-rackdc.properties file to each new node. If you have chosen the PropertyFileSnitch, you will need to update cassandra-topology.properties on ALL nodes (a restart is not required on existing nodes).
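For the GossipingPropertyFileSnitch case, the file on each new node only needs to name that node's data center and rack through the dc and rack keys. A minimal sketch follows; the helper name write_rackdc, the values DC1 and RAC1, and the target path are all placeholders rather than values taken from any real cluster.

```shell
# Hypothetical helper: write a minimal cassandra-rackdc.properties file.
write_rackdc() {
    # $1: data center name, $2: rack name, $3: file to write
    printf 'dc=%s\nrack=%s\n' "$1" "$2" > "$3"
}

# Usage on a new node (path and names are assumptions, adjust to your install):
#   write_rackdc DC1 RAC1 /etc/cassandra/cassandra-rackdc.properties
```

With the PropertyFileSnitch, the equivalent information lives in cassandra-topology.properties as one entry per node in the form address=DC:RACK, which is why that file has to be updated on every node.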
When choosing num_tokens, larger values represent proportionally larger nodes in your cluster, with 256 being the default. If all your nodes are the same size, this default should be sufficient.
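The proportional rule can be worked through with simple integer arithmetic. The sketch below assumes you can express each node's capacity as a single number (for example, disk size in terabytes). The helper name scaled_num_tokens is hypothetical, and the result is only a starting point, not a tuned value.

```shell
# Hypothetical helper: scale num_tokens by a node's capacity relative to a
# baseline node. Uses integer arithmetic, so fractions are truncated.
scaled_num_tokens() {
    # $1: capacity of the new node
    # $2: capacity of a baseline (existing) node
    # $3: num_tokens of the baseline node (defaults to 256)
    echo $(( ${3:-256} * $1 / $2 ))
}

# A node with twice the capacity of a default-sized node:
#   scaled_num_tokens 2 1
```

When the two capacities are equal the helper returns the baseline value unchanged, which matches the advice to keep the default for same-sized nodes.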
Keep track of the bootstrapping process as follows:
sudo nodetool netstats