Notes on Cassandra: Cassandra Is a NoSQL Database
Notes on Cassandra
Cassandra is
• Massively scalable, free, open source
– Like Mongo, the community edition is free
– Enterprise C* users can purchase support and add-on
features through a vendor like DataStax
• Designed for High Performance and High Scalability
• Designed for High Availability: fault tolerant, with no single point of failure (SPOF)
• Abbreviated as C*
Cassandra
Cassandra Concepts
Cassandra Terminology
Cassandra Concepts
– BUT – no joins
C* is a peer-to-peer, fully distributed system where
• Each node is assigned a token (a number), whose value
determines
– The logical position of the node in the ring
– The range of data the node is assigned
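The token-to-range relationship above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation: the class name TokenRing, the node names, and the 0 to 99 token space are all hypothetical, and the tokens are passed in directly rather than computed by a partitioner hashing the row key. The sketch assumes each node owns the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around the ring.

```python
import bisect


class TokenRing:
    """Toy token ring: each node owns the range (previous_token, own_token]."""

    def __init__(self, node_tokens):
        # node_tokens maps a (hypothetical) node name to its token, a number.
        self._entries = sorted((token, name) for name, token in node_tokens.items())
        self._tokens = [token for token, _ in self._entries]

    def owner(self, key_token):
        """Return the node whose range contains key_token."""
        # First node whose token is >= key_token.
        idx = bisect.bisect_left(self._tokens, key_token)
        if idx == len(self._tokens):
            idx = 0  # past the highest token: wrap around to the first node
        return self._entries[idx][1]

    def token_range(self, name):
        """Return (exclusive_start, inclusive_end) for the named node."""
        for i, (token, node) in enumerate(self._entries):
            if node == name:
                previous = self._entries[i - 1][0]  # i == 0 wraps to the last node
                return (previous, token)
        raise KeyError(name)


ring = TokenRing({"node-a": 0, "node-b": 25, "node-c": 50, "node-d": 75})
print(ring.owner(30))               # token 30 falls in (25, 50]
print(ring.owner(90))               # beyond the highest token, so it wraps
print(ring.token_range("node-a"))   # the wrapping range
```

Sorting the nodes by token gives the logical position of each node in the ring, and a binary search over the sorted tokens finds the node responsible for a given key's token.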
Let's look at an example
• CQL:
Configurable Replication
CAP Theorem: It is impossible for a distributed system to
guarantee consistency, availability, and partition tolerance all at once
– It illustrates the trade-offs when building a system that
may suffer network partitions, which is true of any distributed
system
CAP Theorem:
Trade-Offs:
C* provides what is called Tunable Consistency
• Data is replicated to multiple nodes
• Consider what happens when an update occurs?
– The update may not propagate to all replicas immediately
• What happens if a read occurs before an update is propagated to all replicas?
– The read may retrieve older (pre-update) data from a replica
– What about consistency?
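The stale-read scenario described here can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the Replica class, the key "user:42", and the integer timestamps are hypothetical, and real replication involves a coordinator, the network, and repair mechanisms that are not modeled. It assumes each replica keeps the value with the newest timestamp.

```python
class Replica:
    """Toy replica holding key -> (value, timestamp)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Keep whichever value carries the newest timestamp.
        if current is None or timestamp > current[1]:
            self.data[key] = (value, timestamp)

    def read(self, key):
        return self.data.get(key)


replicas = [Replica("r1"), Replica("r2"), Replica("r3")]

# The initial value has reached every replica.
for replica in replicas:
    replica.write("user:42", "old@example.com", timestamp=1)

# An update arrives, but so far it has reached only r1.
replicas[0].write("user:42", "new@example.com", timestamp=2)

stale = replicas[2].read("user:42")   # r3 has not seen the update yet
fresh = replicas[0].read("user:42")   # r1 already has it
print(stale)
print(fresh)

# Once the update propagates, every replica agrees again.
for replica in replicas[1:]:
    replica.write("user:42", "new@example.com", timestamp=2)
converged = {replica.read("user:42") for replica in replicas}
print(converged)
```

A read served by r3 before propagation returns the pre-update value, which is exactly the consistency question the slide raises.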
LEVEL         DESCRIPTION
ANY           A write must be written to at least one node. If all replica nodes for
              the given row key are down, the write can still succeed once a hinted
              handoff has been written. Note that if all replica nodes are down at
              write time, an ANY write will not be readable until the replica nodes
              for that row key have recovered.
ONE           A write must be written to the commit log and memory table of at least
              one replica node.
QUORUM        A write must be written to the commit log and memory table on a quorum
              of replica nodes.
LOCAL_QUORUM  A write must be written to the commit log and memory table on a quorum
              of replica nodes in the same data center as the coordinator node.
              Avoids latency of inter-data center communication.
EACH_QUORUM   A write must be written to the commit log and memory table on a quorum
              of replica nodes in all data centers.
ALL           A write must be written to the commit log and memory table on all
              replica nodes in the cluster for that row key.
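To make the write levels concrete, the sketch below computes how many nodes must acknowledge a write at each level. It is an illustration under assumptions, not a Cassandra API: the function names, the rf_by_dc mapping, and the data center names are hypothetical, and a quorum is taken to be a majority, floor(RF / 2) + 1, of the relevant replication factor.

```python
def quorum(replication_factor):
    """Majority of the replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_write_acks(level, rf_by_dc, local_dc):
    """Number of nodes that must acknowledge a write at the given level.

    rf_by_dc maps a (hypothetical) data center name to its replication factor;
    local_dc is the coordinator's data center.
    """
    total_rf = sum(rf_by_dc.values())
    if level == "ANY":
        return 1  # one node; may be a hinted handoff rather than a replica
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(total_rf)
    if level == "LOCAL_QUORUM":
        return quorum(rf_by_dc[local_dc])
    if level == "EACH_QUORUM":
        # Every data center must reach its own quorum; this is the total.
        return sum(quorum(rf) for rf in rf_by_dc.values())
    if level == "ALL":
        return total_rf
    raise ValueError(f"unknown consistency level: {level}")


rf_by_dc = {"dc1": 3, "dc2": 3}
for level in ("ANY", "ONE", "QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL"):
    print(level, required_write_acks(level, rf_by_dc, local_dc="dc1"))
```

With three replicas in each of two data centers, LOCAL_QUORUM waits for only two local replicas, while QUORUM and EACH_QUORUM both wait for four, which shows why LOCAL_QUORUM avoids inter-data center latency.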
LEVEL         DESCRIPTION
ONE           Returns a response from the closest replica (as determined by the
              snitch). By default, a read repair runs in the background to make the
              other replicas consistent.
QUORUM        Returns the record with the most recent timestamp once a quorum of
              replicas has responded.
LOCAL_QUORUM  Returns the record with the most recent timestamp once a quorum of
              replicas in the same data center as the coordinator node has
              responded. Avoids latency of inter-data center communication.
EACH_QUORUM   Returns the record with the most recent timestamp once a quorum of
              replicas in each data center of the cluster has responded.
ALL           Returns the record with the most recent timestamp once all replicas
              have responded. The read operation will fail if a replica does not
              respond.
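The "most recent timestamp" rule used by these read levels can be sketched as follows. The function names and the (value, timestamp) response shape are hypothetical, and the overlap check at the end is a widely cited rule of thumb rather than something stated in these notes: if the number of replicas read plus the number written exceeds the replication factor, every read overlaps at least one replica that holds the latest write.

```python
def resolve_read(responses, required):
    """Pick the newest value once enough replicas have answered.

    responses is a list of (value, timestamp) pairs from replicas that replied;
    required is the number of replies the consistency level demands.
    """
    if len(responses) < required:
        raise TimeoutError(
            f"only {len(responses)} of {required} required replicas responded"
        )
    # The record with the most recent timestamp wins.
    return max(responses, key=lambda response: response[1])[0]


def read_overlaps_write(read_replicas, write_replicas, replication_factor):
    """Rule of thumb: R + W > RF means reads see the latest acknowledged write."""
    return read_replicas + write_replicas > replication_factor


# QUORUM read with RF = 3: two replies suffice, and the newer one is returned.
latest = resolve_read([("old@example.com", 1), ("new@example.com", 2)], required=2)
print(latest)

print(read_overlaps_write(2, 2, 3))  # QUORUM reads with QUORUM writes
print(read_overlaps_write(1, 1, 3))  # ONE reads with ONE writes
```

An ALL read corresponds to required equal to the replication factor, so a single unresponsive replica is enough to make resolve_read raise, matching the failure behavior described for ALL.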