Graceful Removal of a Node
A node may need to be removed from a cluster for a variety of reasons: the capacity might no longer be needed, the node may need hardware maintenance such as having more memory added, or the node might be down due to hardware failure.
Designed to be fault-tolerant, Cassandra handles node removal gracefully.
The nodetool decommission command handles the planned removal of a live node, whereas the nodetool removenode command removes a dead node.
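The choice between the two commands can be sketched as a small shell helper. The function name suggest_removal is hypothetical, and the sketch assumes the two-letter state code printed by nodetool status (for example UN or DN) is passed in by hand. It only prints a suggestion and never runs nodetool itself.

```shell
# Hypothetical helper: map a `nodetool status` state code to a removal command.
suggest_removal() {
    case "$1" in
        UN) echo "decommission" ;;   # node is up and normal: planned removal
        DN) echo "removenode" ;;     # node is down: remove it from a live node
        *)  echo "wait" ;;           # leaving, joining, or moving: let it settle first
    esac
}
```

Calling suggest_removal DN prints removenode, and any code other than UN or DN prints wait.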
Set up a 4-node cluster using SaltStack
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 108.61 KiB 256 49.3% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 117.93 KiB 256 46.2% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 74.88 KiB 256 55.6% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1
UN 139.59.72.230 75 KiB 256 48.9% 88438a1f-1040-4f03-b152-a47dbf114504 rack1
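The first column of each node line combines the status (U or D) with the state (N, L, J, or M), as the legend in the output explains. A minimal sketch for pulling just the address and that code out of the output follows; node_states is a hypothetical name, and the sketch assumes the column layout shown above.

```shell
# Hypothetical helper: print "address state" for every node line read from stdin.
node_states() {
    # Node lines start with a two-letter code: U or D, then N, L, J, or M.
    # Header, legend, and datacenter lines do not match and are skipped.
    awk '$1 ~ /^[UD][NLJM]$/ { print $2, $1 }'
}

# Intended use (requires a running cluster): nodetool status | node_states
```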
Decommissioning a Node
Decommissioning a node means deliberately taking a live node out of service.
The decommission command assigns the token ranges that the node was responsible for to other nodes, and then streams the data from the node being decommissioned to the other nodes.
Decommissioning a node does not remove data from the decommissioned node. It simply copies data to the nodes that are now responsible for it.
nodetool -h 139.59.72.230 -p 7199 decommission
The node is now leaving, as indicated by UL in the nodetool status output:
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 108.61 KiB 256 49.3% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 114.5 KiB 256 46.2% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 74.88 KiB 256 55.6% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1
UL 139.59.72.230 79.95 KiB 256 48.9% 88438a1f-1040-4f03-b152-a47dbf114504 rack1
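Rather than rerunning nodetool status by hand, the wait can be scripted. The following is only a sketch: wait_until_gone is a hypothetical helper, the retry count and delay defaults are arbitrary assumptions, and the status command is passed as a parameter so the loop can be tried without a cluster.

```shell
# Hypothetical helper: poll until ADDRESS no longer appears in the status output.
# Usage: wait_until_gone ADDRESS [STATUS_CMD] [TRIES] [DELAY_SECONDS]
wait_until_gone() {
    wug_addr=$1
    wug_cmd=${2:-nodetool status}
    wug_tries=${3:-60}
    wug_delay=${4:-10}
    while [ "$wug_tries" -gt 0 ]; do
        # awk exits 0 when a line has ADDRESS in its second column, 1 otherwise.
        if ! $wug_cmd | awk -v a="$wug_addr" '$2 == a { found = 1 } END { exit !found }'; then
            return 0    # address is gone from the ring
        fi
        wug_tries=$((wug_tries - 1))
        sleep "$wug_delay"
    done
    return 1            # still listed after all attempts
}
```

On a live cluster, wait_until_gone 139.59.72.230 would poll with the defaults and return 0 once the decommissioned node has dropped out of the listing.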
After a while, see that the node is gone and that its load has been assigned to the remaining nodes:
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 123.24 KiB 256 66.9% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 129.17 KiB 256 60.8% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 99.62 KiB 256 72.3% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1
Putting a Node Back into Service
Decommissioning copies a node's data to the other nodes but does not delete it from the decommissioned node. If the node has been down for any length of time, it is therefore best to clear its data before putting it back into service.
In general, it is faster to have the node join as a clean one (with no data), rather than have it join with old data that then needs to be repaired.
Clearing Data From a Node
service cassandra stop
rm -r /var/lib/cassandra
Put a Node Back into Service
mkdir -p /var/lib/cassandra/data
mkdir -p /var/lib/cassandra/commitlog
mkdir -p /var/lib/cassandra/saved_caches
mkdir -p /var/lib/cassandra/hints
chmod a+w /var/lib/cassandra/data
chmod a+w /var/lib/cassandra/commitlog
chmod a+w /var/lib/cassandra/saved_caches
chmod a+w /var/lib/cassandra/hints
service cassandra start
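The directory steps above can be condensed into a loop. This is a sketch rather than a recommended procedure: recreate_cassandra_dirs is a hypothetical name, the base path is a parameter so the function can be tried on a scratch directory, and the chmod a+w simply mirrors the commands above.

```shell
# Hypothetical helper: recreate the four directories under BASE and make them writable.
# Usage: recreate_cassandra_dirs [BASE]   (BASE defaults to /var/lib/cassandra)
recreate_cassandra_dirs() {
    rcd_base=${1:-/var/lib/cassandra}
    for rcd_sub in data commitlog saved_caches hints; do
        mkdir -p "$rcd_base/$rcd_sub" || return 1
        chmod a+w "$rcd_base/$rcd_sub" || return 1
    done
}
```

Stopping at the first failed mkdir or chmod keeps the service from being started against a half-built directory tree.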
See the node joining the cluster:
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 123.24 KiB 256 66.9% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 129.17 KiB 256 60.8% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 99.62 KiB 256 72.3% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1
UJ 139.59.72.230 193.88 KiB 256 ? 596bb656-3709-4ecb-85af-9a75196d1c1e rack1
Removing a Dead Node
Removing a dead node reassigns the token ranges it was responsible for to the other nodes in the cluster. Because the dead node cannot stream anything itself, those nodes are populated with its data from the surviving replicas.
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 118.28 KiB 256 49.3% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 129.17 KiB 256 49.2% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 99.62 KiB 256 54.0% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1
DN 139.59.72.230 69.92 KiB 256 47.5% 596bb656-3709-4ecb-85af-9a75196d1c1e rack1
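The removenode command takes the Host ID of the dead node, which is the second-to-last column of its status line. A minimal sketch for extracting it follows; down_host_ids is a hypothetical helper and assumes the column layout shown above.

```shell
# Hypothetical helper: print the Host ID of every node whose status is D (down).
down_host_ids() {
    # The Host ID is the second-to-last field; counting from the end keeps this
    # working when the Load or Owns column holds a "?" instead of a value.
    awk '$1 ~ /^D[NLJM]$/ { print $(NF - 1) }'
}

# Intended use (requires a running cluster): nodetool status | down_host_ids
```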
nodetool removenode 596bb656-3709-4ecb-85af-9a75196d1c1e
nodetool removenode status
RemovalStatus: Removing token (-9219401015247577737). Waiting for replication confirmation from [/159.89.168.195,/139.59.95.39,/159.89.164.69].
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 118.28 KiB 256 49.3% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 129.17 KiB 256 49.2% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 99.62 KiB 256 54.0% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1
DL 139.59.72.230 69.92 KiB 256 47.5% 596bb656-3709-4ecb-85af-9a75196d1c1e rack1
nodetool removenode status
RemovalStatus: No token removals in process.
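The two RemovalStatus messages shown above can be told apart in a script. The following sketch uses a hypothetical helper, removal_in_progress, and assumes the idle message reads exactly as in the sample output; other Cassandra versions may word it differently.

```shell
# Hypothetical helper: exit 0 while a removal is running, 1 when idle or no status was read.
removal_in_progress() {
    awk '
        /^RemovalStatus:/ {
            seen = 1
            if ($0 ~ /No token removals in process/) idle = 1
        }
        END { exit !(seen && !idle) }
    '
}

# Intended use (requires a running cluster): nodetool removenode status | removal_in_progress
```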
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.164.69 123.41 KiB 256 66.9% 7e42de58-915f-457f-afa6-10613737037f rack1
UN 139.59.95.39 134.3 KiB 256 60.8% 8f3617e8-dccf-4a0e-96e2-a05a15dc7da3 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 159.89.168.195 104.75 KiB 256 72.3% a7d3b15b-8c80-4ac0-913a-29c0a4653091 rack1