Revision as of 16:48, March 15, 2016 by Bonniem (talk | contribs) (Configuring Cassandra)

Configuring Cassandra

Cassandra can be configured prior to installation by editing the files located in %CASSANDRA_HOME%\conf. The conf directory contains cassandra.yaml, logback.xml, and other files that can be edited to tune Cassandra's performance, customize the cluster settings, or change the logging settings.

Basic Configuration

Prior to creating a Cassandra cluster, it is important to first modify a few core settings in cassandra.yaml:

cluster_name:
Name of the Cassandra cluster. Must be identical for all nodes in the cluster.

num_tokens:
Leave the default value, unless the cluster is being migrated from a version 1.1.x cluster and its data must be preserved. Refer to the comments in cassandra.yaml for more information.

initial_token:
Leave the default value.

data_file_directories:
commitlog_directory:
saved_caches_directory:
Ensure that the above are all pointing to valid directories.

seeds: (default: "127.0.0.1")
Specifies a comma-delimited list of IP addresses. New nodes will contact the seed nodes to determine the ring topology and to obtain gossip information about other nodes in the cluster. Every node should have the same list of seeds.

start_native_transport: false (default is true)

start_rpc: true (default is false)

listen_address: (default: localhost)
The IP address that other Cassandra nodes will use to connect to this node. If left blank, uses the hostname configuration of the node.

rpc_address: (default: localhost)
The listen address for remote procedure calls. To listen on all interfaces, set to 0.0.0.0. If left blank, uses the hostname configuration of the node.

rpc_port: (default: 9160)
The port for remote procedure calls and the Thrift service.

NOTE: Orchestration requires the Thrift interface. Ensure that start_native_transport is set to false and that start_rpc is set to true.

storage_port: (default: 7000)
The port for inter-node communication.

endpoint_snitch: (default: SimpleSnitch)
Determines how Cassandra views the cluster topology. Use SimpleSnitch for a single data center cluster. For a multiple data center cluster, use PropertyFileSnitch or one of the other snitches described in cassandra.yaml.

Note that PropertyFileSnitch requires the cassandra-topology.properties file to describe the multiple data center cluster. For example, that file must contain entries such as the following:

        # Cassandra Node IP=Data Center:Rack
        135.225.58.81=DC1:RAC1

        135.225.58.82=DC1:RAC2

        135.225.58.83=DC2:RAC1

        135.225.58.90=DC2:RAC2
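
To make the file format concrete, the following Python sketch parses entries of the form shown above into a mapping from node IP to data center and rack. It is only an illustration: the function names parse_topology and nodes_per_data_center are hypothetical, and the sketch assumes plain IPv4 entries with no escaped characters.

```python
SAMPLE = """
# Cassandra Node IP=Data Center:Rack
135.225.58.81=DC1:RAC1
135.225.58.82=DC1:RAC2
135.225.58.83=DC2:RAC1
135.225.58.90=DC2:RAC2
"""


def parse_topology(text):
    """Map each node IP to a (data_center, rack) tuple.

    Hypothetical helper: assumes one 'IP=DataCenter:Rack' entry per line.
    """
    topology = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        ip, _, location = line.partition("=")
        data_center, _, rack = location.partition(":")
        topology[ip.strip()] = (data_center.strip(), rack.strip())
    return topology


def nodes_per_data_center(topology):
    """Count how many nodes are assigned to each data center."""
    counts = {}
    for data_center, _rack in topology.values():
        counts[data_center] = counts.get(data_center, 0) + 1
    return counts


if __name__ == "__main__":
    print(nodes_per_data_center(parse_topology(SAMPLE)))
```

A count like this can help when choosing per-data-center replica numbers, since a data center cannot usefully hold more replicas than it has nodes.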

Next, modify %CASSANDRA_HOME%\bin\cassandra.bat (if Windows), or %CASSANDRA_HOME%/conf/cassandra-env.sh (if Unix-based) to configure the JVM. It is important to verify that the JMX port does not conflict with other configured services:

-Dcassandra.jmx.local.port=7199 (in cassandra.bat)
JMX_PORT="7199" (in cassandra-env.sh)

Note that remote access via the JMX port is not recommended due to the possibility of unintended access to that port which could disrupt Cassandra operation.
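
The settings in this section can be sanity-checked before the cluster is started. The sketch below is a minimal example, assuming the cassandra.yaml values have already been loaded into a Python dictionary by some YAML parser. The function name check_settings and the dictionary layout are hypothetical. It only compares Cassandra's own ports with each other, so conflicts with other services on the host still need to be checked separately.

```python
def check_settings(settings, jmx_port):
    """Return a list of problems found; an empty list means none were found.

    settings: dict of cassandra.yaml values (hypothetical input format).
    jmx_port: the JMX port configured in cassandra.bat or cassandra-env.sh.
    """
    problems = []

    # Orchestration requires the Thrift interface.
    if settings.get("start_rpc") is not True:
        problems.append("start_rpc must be true")
    if settings.get("start_native_transport") is not False:
        problems.append("start_native_transport must be false")

    # Fall back to the documented defaults when a port is not set.
    ports = {
        "rpc_port": settings.get("rpc_port", 9160),
        "storage_port": settings.get("storage_port", 7000),
        "jmx_port": jmx_port,
    }
    seen = {}
    for name, port in ports.items():
        if port in seen:
            problems.append(f"{name} and {seen[port]} both use port {port}")
        else:
            seen[port] = name
    return problems


if __name__ == "__main__":
    example = {"start_rpc": True, "start_native_transport": False}
    print(check_settings(example, jmx_port=7199))
```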


Storage Schema

Orchestration creates the schema on startup if it has not already been created manually. In that case, ensure that the Cassandra cluster is started first, then start one Orchestration instance. The schema is created and propagated to all Cassandra instances. The schema can also be created manually with the Cassandra CLI. Note that cassandra-cli is not available in Cassandra versions after 2.1.x; see the Useful Tools section for more details. A sample schema for Orchestration on Cassandra 2.x is shown below (it conforms to the cassandra-cli syntax; again, see the Useful Tools section). The replication factor is set to 1 in this sample, which is the only allowed value for a single-node deployment.

For a multiple-node Cassandra cluster, the replication factor should be increased to improve availability. Refer to the following web site to determine the replication factor required for your deployment, noting that Orchestration performs all operations at a consistency level of ONE.

http://www.ecyrd.com/cassandracalculator/

For more discussion of consistency and the replication factor (RF), refer to http://www.datastax.com/docs/1.1/dml/data_consistency.
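
The trade-off between replication factor and consistency level can be illustrated with a little arithmetic. The Python sketch below uses the commonly cited rules: a quorum is RF // 2 + 1 replicas, an operation tolerates RF minus the number of required replicas being down, and reads are guaranteed to see the latest write only when read replicas plus write replicas exceed RF. The function names are hypothetical, and the sketch does not replace the calculator referenced above.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum for the given RF."""
    return replication_factor // 2 + 1


def replicas_that_can_fail(replication_factor, replicas_required):
    """Replicas that may be down while an operation can still succeed."""
    return replication_factor - replicas_required


def is_strongly_consistent(replication_factor, read_replicas, write_replicas):
    """True when every read overlaps the replicas of the latest write."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    # Orchestration reads and writes at consistency level ONE (1 replica).
    for rf in (1, 2, 3):
        print(
            "RF", rf,
            "tolerated failures at ONE:", replicas_that_can_fail(rf, 1),
            "strongly consistent at ONE/ONE:", is_strongly_consistent(rf, 1, 1),
        )
```

With a consistency level of ONE, raising the replication factor improves availability, but reads and writes are only guaranteed to overlap when the replication factor is 1.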

Sample Orchestration Schema for Cassandra 2.2.x and Orchestration versions prior to 8.1.3


/*This file contains an example of the Orchestration keyspace, which should be tailored to the deployed cassandra instance capabilities.  
This file should be copied to the cassandra install conf directory.
The schema can be loaded using the cassandra-cli command line interface from the cassandra root install directory as follows:
        ./bin/cassandra-cli -host ip-address-of-cassandra-host --file conf/orchestration-schema.txt
        where ip-address-of-cassandra-host is the IP address of the host, for example 135.225.58.81
        note that the above assumes that the thrift port is the default of 9160.

The cassandra-cli includes online help that explains the statements below. You can access the help without connecting to a running 
cassandra instance by starting the client and typing "help;"
NOTE: Please ensure that the replication_factor is set correctly. Use Cassandra version 2.2.x or later.
*/

create keyspace Orchestration
    with strategy_options={replication_factor:1}
    and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';

use Orchestration;

create column family Document
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'JSON form of the scxml document, keyed by md5 of document';

create column family Session
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'JSON form of the session, keyed by session GUUID';

create column family ScheduleByTimeInterval
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'Column names are the concatenation of scheduled ActionGUUID, action type, and idealtime in msecs, 
    column values are the action content. The keys are in form of time since the epoch in msecs divided by some time increment, 
    say 60000, for 1 minute intervals.';

create column family ScheduleBySessionID
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'Column names are the concatenation of scheduled ActionGUUID, action type, and idealtime in msecs, 
    column values are the idealtime in msecs, keyed by session id';

create column family SessionIDServerInfo
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'Session id and assigned node, keyed by session id';

create column family SessionIDServerInfoRIndex
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'Columns are session ids and the column values are also the session id, keyed by the string 
    form of the server node which owns the session.';

create column family RecoverSessionIDServerInfoRIndex
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'Columns are session ids and the column values are also the session id, keyed by the string form 
    of the server node which owns the session. Entries are only those sessions for which recovery is enabled.';

create column family ORS8130000
    with comparator = UTF8Type
    and column_type = Standard  
    and memtable_throughput = 128
    and memtable_operations = 0.29
    and read_repair_chance = 1.0
    and max_compaction_threshold = 32
    and min_compaction_threshold = 4 
    and gc_grace = 86400
    and comment = 'Dummy column family to designate the Orchestration schema version.';
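
As an aside, the row-key scheme described in the ScheduleByTimeInterval comment (time since the epoch in msecs divided by a time increment) can be sketched in Python. This is an illustration only: it assumes integer division and an integer key, whereas the exact key encoding Orchestration uses is not specified here, and the name schedule_row_key is hypothetical.

```python
INTERVAL_MS = 60000  # 1-minute intervals, the example increment from the schema comment


def schedule_row_key(ideal_time_ms, interval_ms=INTERVAL_MS):
    """Return the interval number that a timestamp (msecs since the epoch) falls into.

    Assumption: the key is the integer quotient of the timestamp and the interval.
    """
    return ideal_time_ms // interval_ms


if __name__ == "__main__":
    # Two actions scheduled within the same minute share a row key.
    print(schedule_row_key(120000), schedule_row_key(179999), schedule_row_key(180000))
```

Bucketing by interval keeps all actions that fall due in the same minute together in one row, so they can be fetched with a single row read.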

As the examples above show, various attributes can be configured when creating keyspaces and column families. These attributes are described in the tables below:

Keyspace Attributes

name: (default: N/A, required)
Name for the keyspace.

placement_strategy: (default: org.apache.cassandra.locator.SimpleStrategy)
Determines how replicas are distributed among nodes in a Cassandra cluster. Allowed values:

  • org.apache.cassandra.locator.SimpleStrategy
  • org.apache.cassandra.locator.NetworkTopologyStrategy

SimpleStrategy distributes replicas to the next N-1 nodes in the ring for a replication_factor of N. NetworkTopologyStrategy requires the Cassandra cluster to be location-aware (able to determine the rack and data center of each node). In this case, the replication factor is set on a per-data-center basis.

strategy_options: (default: N/A)
Specifies configuration options for the replication strategy.
For SimpleStrategy, specify replication_factor:number_of_replicas.
For NetworkTopologyStrategy, specify datacentre_name:number_of_replicas.
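
To show how the two forms of strategy_options differ, the following Python sketch builds a create keyspace statement in the style of the sample schema. It is a hedged illustration: the helper name create_keyspace_statement is hypothetical, the data center names DC1 and DC2 are taken from the cassandra-topology.properties example, and the multi-entry strategy_options form is assumed to follow the same brace syntax as the single-entry form in the sample schema.

```python
def create_keyspace_statement(name, placement_strategy, strategy_options):
    """Build a cassandra-cli style 'create keyspace' statement as a string.

    placement_strategy: 'SimpleStrategy' or 'NetworkTopologyStrategy'.
    strategy_options: dict such as {'replication_factor': 1} or {'DC1': 2, 'DC2': 2}.
    """
    options = ", ".join(f"{key}:{value}" for key, value in strategy_options.items())
    return (
        f"create keyspace {name}\n"
        f"    with strategy_options={{{options}}}\n"
        f"    and placement_strategy = 'org.apache.cassandra.locator.{placement_strategy}';"
    )


if __name__ == "__main__":
    # Single data center: one replication_factor for the whole cluster.
    print(create_keyspace_statement(
        "Orchestration", "SimpleStrategy", {"replication_factor": 1}))
    # Multiple data centers: one replica count per data center name.
    print(create_keyspace_statement(
        "Orchestration", "NetworkTopologyStrategy", {"DC1": 2, "DC2": 2}))
```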

Column Family Attributes

comparator: (default: BytesType)
Defines the data type to use when validating or sorting column names. The comparator cannot be changed once a column family has been created.

column_type: (default: Standard)
Determines whether the column family is a regular column family or a super column family. Use Super for super column families.

read_repair_chance: (default: 0.1)
Specifies the probability that read repairs are invoked on non-quorum reads. The value must be between 0 and 1. Lower values improve read throughput but increase the chance of stale values when not using a strong consistency level.

min_compaction_threshold: (default: 4)
Sets the minimum number of SSTables to trigger a minor compaction when compaction_strategy=SizeTieredCompactionStrategy. Raising this value causes minor compactions to start less frequently and be more I/O-intensive. Setting this value to 0 disables minor compactions.

gc_grace_seconds: (default: 864000, that is, 10 days)
Specifies the time to wait before garbage collecting tombstones (items marked for deletion). In a single-node cluster, it can be safely set to zero. This attribute is written as gc_grace in the sample schema above.

comment: (default: N/A)
A human-readable comment describing the column family.

column_metadata: (default: N/A)
Defines the attributes of a column. For each column, values for name and validation_class must be specified. It is also possible to create a secondary index for a column by setting index_type and index_name.

Performance Tuning

Besides configuring keyspaces and column families, it is possible to further tweak the performance of Cassandra by editing cassandra.yaml (Node and Cluster Configuration) or by editing cassandra-env.sh (JVM Configuration).

Descriptions of tunable properties can be found in both cassandra.yaml and cassandra-env.sh. A summary of these properties can be seen in the tables below:

Performance Tuning Properties (cassandra.yaml)

column_index_size_in_kb: (default: 64)
The size at which column indexes are added to a row. Keep the value small if only a select few columns are consistently read from each row, because a higher value means that more row data must be deserialized for each read (until the index is added).

commitlog_sync: (default: periodic)
Allowed values are periodic or batch. In periodic mode, writes are acknowledged immediately and the value of commitlog_sync_period_in_ms determines how frequently the commitlog is synchronized to disk. In batch mode, writes are not acknowledged until they have been fsynced to disk.

commitlog_sync_period_in_ms: (default: 10000, that is, 10 seconds)
Determines how often (in milliseconds) the commitlog is synced to disk when commitlog_sync is set to periodic.

commitlog_total_space_in_mb: (default: 4096)
When the commitlog reaches the specified size, Cassandra flushes memtables to disk for the oldest commitlog segments. This reduces the amount of data to replay on startup.

compaction_throughput_mb_per_sec: (default: 16)
Throttles compaction to the given total throughput across the entire system. The value should be proportional to the rate of write throughput (16 to 32 times that rate). Setting it to 0 disables compaction throttling.

concurrent_compactors: (default: 1 per CPU core)
Maximum number of concurrent compaction processes allowed on a node.

concurrent_reads: (default: 32)
Recommended setting is 16 * number_of_drives. This allows enough operations to queue such that the OS and drives can reorder them and minimize disk fetches.

concurrent_writes: (default: 32)
The number of concurrent writes should be proportional to the number of CPU cores in the system. Recommended setting is 8 * number_of_cpu_cores.

in_memory_compaction_limit_in_mb: (default: 64)
Size limit for rows being compacted in memory. Larger rows spill to disk and use a slower two-pass compaction process. Recommended value is 5 to 10 percent of the available Java heap size.

index_interval: (default: 128)
Influences the granularity of SSTable indexes in memory. A smaller value means higher sampling of the index files, resulting in more effective indexes at the cost of memory. Recommended value is between 128 and 512 with a large column family key cache. Use a larger value for small rows, or a smaller value to increase read performance.

memtable_flush_writers: (default: 1 per data directory)
Number of memtable flush writer threads. Influences flush performance and can be increased if you have a large Java heap size and many data directories.

memtable_total_space_in_mb: (default: 1/3 of heap)
Total memory used for all column family memtables on a node.

multithreaded_compaction: (default: false)
When true, each compaction operation uses one thread per SSTable being merged in addition to one thread per core. Typically only useful on nodes with SSD hardware.

reduce_cache_capacity_to: (default: 0.6)
Sets the target maximum cache capacity when Java heap usage reaches the threshold defined by reduce_cache_sizes_at. This is an emergency measure for preventing out-of-memory errors, along with flush_largest_memtables_at.

reduce_cache_sizes_at: (default: 0.85)
When Java heap usage exceeds this fraction (after CMS garbage collection), Cassandra reduces the cache capacity as specified by reduce_cache_capacity_to. Set to 1.0 to disable.

sliced_buffer_size_in_kb: (default: 64)
Buffer size to use for reading contiguous columns. Should match the size of the columns typically retrieved using query operations involving a slice predicate.

stream_throughput_outbound_megabits_per_sec: (default: 400)
Maximum outbound throughput on a node for streaming file transfers.
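
The sizing rules of thumb quoted above can be turned into a small calculation. The Python sketch below simply applies those rules (16 reads per drive, 8 writes per CPU core, 5 to 10 percent of heap for in-memory compaction, 16 to 32 times the write rate for compaction throughput). The function name recommended_settings and its parameters are hypothetical, and the results are starting points rather than tuned values.

```python
def recommended_settings(num_drives, num_cpu_cores, heap_mb, write_mb_per_sec):
    """Apply the rules of thumb from the tuning table to a node's hardware.

    Range-valued recommendations are returned as (low, high) tuples.
    """
    return {
        # 16 * number_of_drives
        "concurrent_reads": 16 * num_drives,
        # 8 * number_of_cpu_cores
        "concurrent_writes": 8 * num_cpu_cores,
        # 5 to 10 percent of the available Java heap, rounded down to whole MB
        "in_memory_compaction_limit_in_mb": (heap_mb * 5 // 100, heap_mb * 10 // 100),
        # 16 to 32 times the rate of write throughput
        "compaction_throughput_mb_per_sec": (16 * write_mb_per_sec, 32 * write_mb_per_sec),
    }


if __name__ == "__main__":
    # Hypothetical node: 2 data drives, 8 cores, 8000 MB heap, 1 MB/s of writes.
    for option, value in recommended_settings(2, 8, 8000, 1).items():
        print(option, value)
```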

JVM Configuration Settings (Linux: conf/cassandra-env.sh; Windows: bin\cassandra.bat)

MAX_HEAP_SIZE: (default: half of available physical memory)
Maximum heap size for the JVM. The same value is used for the minimum heap size, allowing the heap to be locked in memory. Should be set in conjunction with HEAP_NEWSIZE.

HEAP_NEWSIZE: (default: 100 MB per physical CPU core)
Size of the young generation. A larger value leads to longer GC pause times, while a smaller value typically leads to more expensive GC. Set in conjunction with MAX_HEAP_SIZE.

com.sun.management.jmxremote.port: (default: 7199)
Port on which Cassandra listens for JMX connections.

com.sun.management.jmxremote.ssl: (default: false)
Enable/disable SSL for JMX.

com.sun.management.jmxremote.authenticate: (default: false)
Enable/disable remote authentication for JMX.

-Djava.rmi.server.hostname: (default: N/A)
Sets the interface hostname or IP that JMX should use to connect. Set this if you have trouble connecting.

Logging

Changes to logging are made through the log4j-server.properties and log4j-tools.properties files. In these files, you can change the default logging level (log4j.rootLogger), the logging handlers, the log message templates (ConversionPattern), and the default log file path (log4j.appender.R.File). Note that Cassandra 2.1 and later configure logging through logback.xml instead of the log4j properties files.

Example:

# output messages into a rolling log file as well as stdout
log4j.rootLogger=DEBUG,stdout,R

# stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p %d{HH:mm:ss,SSS} %m%n

# rolling log file
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.maxFileSize=20MB
log4j.appender.R.maxBackupIndex=50
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
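# Edit the next line to point to your logs directory.
# Use / (or \\) as the path separator: a single \ is treated as an escape character in properties files.
log4j.appender.R.File=C:/Cassandra/logs/system.log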

# Application logging options
#log4j.logger.org.apache.cassandra=DEBUG
#log4j.logger.org.apache.cassandra.db=DEBUG
#log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG

# Adding this to avoid thrift logging disconnect errors.
log4j.logger.org.apache.thrift.server.TNonblockingServer=ERROR