Commonly used properties

Properties most frequently used when configuring Cassandra.

Before starting a node for the first time, you should carefully evaluate your requirements.

Common initialization properties

Note: Be sure to set the properties in the Quick start section as well.

commit_failure_policy

(Default: stop)

Policy for commit disk failures:

  • die: Shut down gossip and Thrift and kill the JVM, so the node can be replaced.
  • stop: Shut down gossip and Thrift, leaving the node effectively dead, available for inspection using JMX.
  • stop_commit: Shut down the commit log, letting writes collect but continuing to service reads (as in pre-2.0.5 Cassandra).
  • ignore: Ignore fatal errors and let the batches fail.

disk_optimization_strategy

(Default: ssd)

The strategy for optimizing disk reads. Possible values: ssd or spinning.

disk_failure_policy

(Default: stop)

Sets how Cassandra responds to disk failure. Recommended settings: stop or best_effort. Valid values:

  • die: Shut down gossip and Thrift and kill the JVM for any file system errors or single SSTable errors, so the node can be replaced.
  • stop_paranoid: Shut down gossip and Thrift even for single SSTable errors.
  • stop: Shut down gossip and Thrift, leaving the node effectively dead, but available for inspection using JMX.
  • best_effort: Stop using the failed disk and respond to requests based on the remaining available SSTables. This can return obsolete data at consistency level ONE.
  • ignore: Ignore fatal errors and let the requests fail; all file system errors are logged but otherwise ignored. Cassandra acts as in versions prior to 1.2.
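
As a rough illustration, the two failure policies might appear together in cassandra.yaml as shown here. The values are only one plausible combination drawn from the options listed above, not a recommendation for every cluster.

```yaml
# Sketch of a cassandra.yaml fragment; the chosen values are illustrative.

# If the commit log disk fails, shut down gossip and Thrift but keep
# the node available for inspection over JMX.
commit_failure_policy: stop

# If a data disk fails, stop using it and keep serving from the
# remaining SSTables. Reads at consistency level ONE may then
# return obsolete data.
disk_failure_policy: best_effort
```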

endpoint_snitch

(Default: org.apache.cassandra.locator.SimpleSnitch)

Set to a class that implements the IEndpointSnitch interface. Cassandra uses the snitch to locate nodes and route requests.

  • SimpleSnitch

Use for single-datacenter deployment or single-zone deployment in public clouds. Does not recognize datacenter or rack information. Treats strategy order as proximity, which can improve cache locality when you disable read repair.

  • GossipingPropertyFileSnitch

Recommended for production. Reads the rack and datacenter of the local node from the cassandra-rackdc.properties file and propagates these values to other nodes via gossip. To support migration from the PropertyFileSnitch, it uses the cassandra-topology.properties file if that file is present.

  • PropertyFileSnitch

Determines proximity by rack and datacenter, which are explicitly configured in the cassandra-topology.properties file.

  • Ec2Snitch

For EC2 deployments in a single region. Loads region and availability zone information from the Amazon EC2 API. Treats the region as the datacenter and the availability zone as the rack. Because it uses only private IP addresses, it does not work across multiple regions.

  • Ec2MultiRegionSnitch

Uses the public IP as the broadcast_address to allow cross-region connectivity. This means you must also set seed addresses to the public IP and open the storage_port or ssl_storage_port on the public IP firewall. For intra-region traffic, Cassandra switches to the private IP after establishing a connection.

  • RackInferringSnitch

Proximity is determined by rack and datacenter, which are assumed to correspond to the 3rd and 2nd octet of each node's IP address, respectively. Best used as an example for writing a custom snitch class (unless this happens to match your deployment conventions).

  • GoogleCloudSnitch

Use for Cassandra deployments on Google Cloud Platform across one or more regions. The region is treated as a datacenter and the availability zones are treated as racks within the datacenter. All communication occurs over private IP addresses within the same logical network.

  • CloudstackSnitch

Use the CloudstackSnitch for Apache Cloudstack environments.
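
A minimal sketch of selecting the snitch recommended for production might look like this. The datacenter and rack names in the comments (DC1, RAC1) are hypothetical placeholders.

```yaml
# Sketch of a cassandra.yaml fragment; illustrative only.
# The short class name resolves within org.apache.cassandra.locator.
endpoint_snitch: GossipingPropertyFileSnitch

# With this snitch, each node declares its own location in
# cassandra-rackdc.properties (a separate file, not YAML), e.g.:
#   dc=DC1
#   rack=RAC1
# DC1 and RAC1 are placeholder names.
```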

rpc_address

(Default: localhost)

The listen address for client connections (Thrift RPC service and native transport). Valid values:

  • unset:

Resolves the address using the hostname configuration of the node. If left unset, the hostname resolves to the IP address of this node using /etc/hostname, /etc/hosts, or DNS.

  • 0.0.0.0:

Listens on all configured interfaces. You must set the broadcast_rpc_address to a value other than 0.0.0.0.

  • IP address
  • hostname
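
For the 0.0.0.0 case described above, a fragment along these lines could be used. The broadcast address shown is a placeholder; substitute an address that clients can actually reach.

```yaml
# Sketch of a cassandra.yaml fragment; the IP address is a placeholder.

# Listen for client connections on all configured interfaces.
rpc_address: 0.0.0.0

# Required when rpc_address is 0.0.0.0: the address advertised to
# clients. It must not itself be 0.0.0.0.
broadcast_rpc_address: 10.0.1.10
```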

rpc_interface

(Default: eth1)

The listen address for client connections. The interface must correspond to a single address; IP aliasing is not supported.

rpc_interface_prefer_ipv6

(Default: false)

If an interface has both an IPv4 and an IPv6 address, Cassandra uses the first IPv4 address by default. If set to true, Cassandra uses the first IPv6 address.

seed_provider

The addresses of hosts designated as contact points in the cluster. A joining node contacts one of the nodes in the seeds list to learn the topology of the ring.

  • class_name (Default: org.apache.cassandra.locator.SimpleSeedProvider)

The class within Cassandra that handles the seed logic. It can be customized, but this is typically not required.

  • - seeds (Default: 127.0.0.1)

A comma-delimited list of IP addresses used by gossip for bootstrapping new nodes joining a cluster. If your cluster includes multiple nodes, you must change the list from the default value to the IP address of one of the nodes.

Attention: In multi-datacenter clusters, include at least one node from each datacenter (replication group) in the seed list. Designating more than a single seed node per datacenter is recommended for fault tolerance. Otherwise, gossip has to communicate with another datacenter when bootstrapping a node.

Making every node a seed node is not recommended because of increased maintenance and reduced gossip performance. Gossip optimization is not critical, but it is recommended to use a small seed list (approximately three nodes per datacenter).
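
The seed list is nested under seed_provider in cassandra.yaml. The following sketch assumes a hypothetical two-datacenter cluster with two seeds per datacenter; all IP addresses are placeholders.

```yaml
# Sketch of a cassandra.yaml fragment; all addresses are placeholders.
seed_provider:
    # Default seed provider class; customizing it is rarely needed.
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # Comma-delimited list: two seeds from each of two datacenters.
          - seeds: "10.0.1.10,10.0.1.11,10.0.2.10,10.0.2.11"
```

Using the same seed list on every node keeps the configuration simple to maintain.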

enable_user_defined_functions

(Default: false)

User-defined functions (UDFs) present a security risk because they are executed on the server side. In Cassandra 3.0 and later, UDFs are executed in a sandbox to contain the execution of malicious code. They are disabled by default.

enable_scripted_user_defined_functions

(Default: false)

Java UDFs are always enabled when enable_user_defined_functions is true. Enable this option to use UDFs written in JavaScript or in any language with a custom JSR-223 provider. This option has no effect if enable_user_defined_functions is false.
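
As an illustration of how the two UDF options interact, the following fragment would allow Java UDFs while keeping scripted UDFs off. Whether to enable either option depends on your security requirements.

```yaml
# Sketch of a cassandra.yaml fragment; illustrative only.

# Allow UDFs. Java UDFs become available as soon as this is true.
enable_user_defined_functions: true

# Keep JavaScript and other JSR-223 scripted UDFs disabled.
# This option is ignored when the option above is false.
enable_scripted_user_defined_functions: false
```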

Common compaction settings

compaction_throughput_mb_per_sec

(Default: 16)

Throttles compaction to the specified MB/second across the instance. The faster Cassandra inserts data, the faster the system must compact in order to keep the SSTable count down. The recommended value is 16 to 32 times the rate of write throughput (in MB/second). Setting the value to 0 disables compaction throttling.
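
To make the sizing rule concrete, assume a hypothetical node that sustains about 2 MB/second of writes. The 16 to 32 times guideline then gives a range of 32 to 64; the fragment below picks the lower bound.

```yaml
# Sketch of a cassandra.yaml fragment.
# Assumption: sustained write throughput of about 2 MB/second.
#   16 x 2 = 32 (lower bound), 32 x 2 = 64 (upper bound)
compaction_throughput_mb_per_sec: 32
```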

compaction_large_partition_warning_threshold_mb

(Default: 100)

Cassandra logs a warning when compacting partitions larger than the set value.

Common memtable settings

memtable_heap_space_in_mb

(Default: 1/4 of heap size)

The amount of on-heap memory allocated for memtables. Cassandra uses the total of this amount and the value of memtable_offheap_space_in_mb to set a threshold for automatic memtable flush. For details, see memtable_cleanup_threshold.

memtable_offheap_space_in_mb

(Default: 1/4 of heap size)

Sets the total amount of off-heap memory allocated for memtables. Cassandra uses the total of this amount and the value of memtable_heap_space_in_mb to set a threshold for automatic memtable flush. For details, see memtable_cleanup_threshold.
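
If you prefer to set the memtable limits explicitly rather than rely on the defaults, a fragment like the following could be used. It assumes a hypothetical 8 GB (8192 MB) heap, for which one quarter is 2048 MB.

```yaml
# Sketch of a cassandra.yaml fragment.
# Assumption: JVM heap of 8192 MB, so 1/4 of the heap is 2048 MB.
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
# The sum of the two values (4096 MB here) feeds the automatic
# flush threshold; see memtable_cleanup_threshold.
```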

Common disk settings

concurrent_reads

(Default: 32)

Workloads with more data than can fit in memory encounter a bottleneck in fetching data from disk during reads. Setting concurrent_reads to (16 × number_of_drives) allows operations to queue low enough in the stack so that the OS and drives can reorder them. The default setting applies to both logical volume managed (LVM) and RAID drives.

concurrent_writes

(Default: 32)

Writes in Cassandra are rarely I/O bound, so the ideal number of concurrent writes depends on the number of CPU cores on the node. The recommended value is 8 × number_of_cpu_cores.

concurrent_counter_writes

(Default: 32)

Counter writes read the current values before incrementing and writing them back. The recommended value is (16 × number_of_drives).

concurrent_batchlog_writes

(Default: 32)

Limit on the number of concurrent batchlog writes, similar to concurrent_writes.

concurrent_materialized_view_writes

(Default: 32)

Limit on the number of concurrent materialized view writes. Set this to the lesser of concurrent_reads and concurrent_writes, because there is a read involved in each materialized view write.
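
The sizing formulas in this section can be worked through for a hypothetical node with 2 data drives and 8 CPU cores. The hardware figures are assumptions chosen only to make the arithmetic concrete.

```yaml
# Sketch of a cassandra.yaml fragment.
# Assumptions: 2 data drives, 8 CPU cores.

# 16 x number_of_drives = 16 x 2
concurrent_reads: 32

# 8 x number_of_cpu_cores = 8 x 8
concurrent_writes: 64

# 16 x number_of_drives = 16 x 2
concurrent_counter_writes: 32

# Left at the default.
concurrent_batchlog_writes: 32

# Lesser of concurrent_reads (32) and concurrent_writes (64)
concurrent_materialized_view_writes: 32
```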

Common automatic backup settings

incremental_backups

(Default: false)

Backs up data updated since the last snapshot was taken. When enabled, Cassandra creates a hard link to each SSTable flushed or streamed locally in a backups subdirectory of the keyspace data. Removing these links is the operator's responsibility.

snapshot_before_compaction

(Default: false)

Enables or disables taking a snapshot before each compaction. A snapshot is useful to back up data when there is a data format change. Be careful using this option: Cassandra does not clean up older snapshots automatically.
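
A sketch of a backup configuration that enables incremental backups but leaves pre-compaction snapshots off might look like this. With either option enabled, cleaning up the resulting hard links or snapshots remains the operator's job.

```yaml
# Sketch of a cassandra.yaml fragment; illustrative only.

# Hard-link each flushed or streamed SSTable into a backups
# subdirectory. These links are not removed automatically.
incremental_backups: true

# Do not snapshot before every compaction; old snapshots would
# otherwise accumulate because Cassandra does not clean them up.
snapshot_before_compaction: false
```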

Common fault detection setting

phi_convict_threshold

(Default: 8)

Adjusts the sensitivity of the failure detector on an exponential scale. Generally this setting does not need adjusting.
