
CAP Theorem

Dr. Richa Sharma


Commonwealth University

Introduction

• Actions performed through online applications reflect real-world
database transactions (involving real-time updates) that are
triggered by events such as buying a product, registering for a
course, or making a deposit into a checking account.

• Transactions are likely to contain many parts, such as updating
a customer’s account, updating the seller’s accounts receivable,
and adjusting product inventory.

• All parts of a transaction must complete successfully to
prevent data integrity problems. An RDBMS handles such
transactions well: executing and managing transactions are
important relational database system activities.
Introduction
• A transaction is a logical unit of work that must be entirely
completed or entirely aborted; no intermediate states are acceptable.

• Transactions do not occur in isolation. Multiple users are
connected to the database, so many transactions take place at the
same time, and concurrent transactions can interfere with one
another.

• To ensure successful execution of concurrent transactions and a
consistent state of the database, a DBMS maintains four
properties of transactions.

• These properties are atomicity, consistency, isolation, and
durability, popularly known as the ACID properties.
ACID Properties
• Atomicity – requires that all operations of a transaction be
completed successfully; if not, the transaction is aborted. In
other words, a transaction is treated as a single, indivisible,
logical unit of work.

• Partial completion would leave the database in an inconsistent
state. A consistent database state is one in which all data
integrity constraints are satisfied.

• If any of the SQL statements fails, the entire transaction should
be rolled back to the original database state that existed
before the transaction started. A successful transaction
changes the database from one consistent state to another.
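
The rollback behaviour described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the `accounts` table, the `transfer` helper, and the balances are all hypothetical. The second transfer violates a CHECK constraint, so the credit that was already applied is rolled back as well.

```python
import sqlite3

def transfer(conn, buyer, seller, amount):
    """Move `amount` from buyer to seller as one all-or-nothing transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, seller),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, buyer),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the earlier credit was rolled back too.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical schema: the CHECK constraint is the integrity rule.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('buyer', 100), ('seller', 0)")
conn.commit()

ok = transfer(conn, "buyer", "seller", 30)       # both updates commit
failed = transfer(conn, "buyer", "seller", 500)  # debit violates the CHECK
print(ok, failed, balances(conn))
```

After the failed transfer the balances are the same as before it started, which is the "one consistent state to another" guarantee in miniature.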
ACID Properties
• Consistency – refers to a database condition in which all
data integrity constraints are satisfied. When a transaction is
completed, the database must be in a consistent state. If any
of the transaction parts violates an integrity constraint, the
entire transaction is aborted.

• Isolation – requires that a data item used by one transaction
is not available to other transactions until the first one ends.
• An example: if transaction T1 is being executed and is using the
data item X, then X cannot be accessed by any other transaction
(T2 … Tn) until T1 ends.

ACID Properties
• Durability – ensures that once transaction changes are done
and committed, they cannot be undone or lost, even in the
event of a system failure. The consistent database is then
available to users.

• Together, the ACID properties ensure that transaction execution
on an RDBMS leaves the database in a consistent state,
available for future transactions.

• Distributed databases are used to harness the performance of
multiple nodes, which adds a new parameter: the data is
partitioned/distributed across nodes. Which properties can such
a database satisfy? Can the ACID properties still be satisfied?
CAP Theorem
• The CAP theorem (also known as Brewer’s theorem) states that a
distributed database can provide at most two of the following
three guarantees at the same time: consistency, availability,
and partition tolerance.

When designing a cloud application, choose a data management
system that delivers the two of these three qualities that the
application needs most.

Image Source: https://en.wikipedia.org/wiki/CAP_theorem


CAP Theorem
• Consistency – Consistency means that all clients see the
same data at the same time, no matter which node they
connect to, i.e. every read receives the most recent write or
an error.
• For this to happen, whenever data is written to one node, it
must be instantly forwarded or replicated to all the other nodes
in the system before the write is deemed ‘successful’.

• Availability – Availability means that any client making a
request for data gets a response, even if one or more nodes
are down. Put another way: all working nodes in the
distributed system return a valid response for any request,
without the guarantee that it contains the most recent write.
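
The two definitions above can be made concrete with a toy in-memory cluster. This is only a sketch under stated assumptions: `Node`, `Cluster`, and the `up` flag are hypothetical names, and "replication" is a plain loop rather than network traffic. A write counts as successful only once every node holds it, and a read is answered by any working node.

```python
class Node:
    """One replica holding its own copy of the data (hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

class Cluster:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def write(self, key, value):
        """Consistency: succeed only if every node can store the write."""
        if not all(node.up for node in self.nodes):
            return False
        for node in self.nodes:
            node.data[key] = value
        return True

    def read(self, key):
        """Availability: any working node answers the request."""
        for node in self.nodes:
            if node.up:
                return node.data.get(key)
        raise ConnectionError("no working node")

cluster = Cluster(["a", "b", "c"])
wrote = cluster.write("x", 1)
copies = [node.data["x"] for node in cluster.nodes]  # same value everywhere

cluster.nodes[0].up = False               # one node goes down
read_while_down = cluster.read("x")       # still answered by a working node
write_while_down = cluster.write("x", 2)  # refused: cannot reach every node
print(wrote, copies, read_while_down, write_while_down)
```

Refusing the write while a node is down keeps every copy identical, at the price of not accepting the request.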
CAP Theorem
• Partition tolerance – A partition is a communications break
within a distributed system – a lost or temporarily delayed
connection between two nodes. Partition tolerance means that
the cluster must continue to work despite any number of
communication breakdowns between nodes in the system.

• When a network partition occurs, the system must choose between
consistency and availability:
• cancel the operation and thus decrease availability but
ensure consistency.
• proceed with the operation and thus provide availability but risk
inconsistency.
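
The two options above can be sketched as a toy store with two replicas and a link that can be cut. Everything here is an illustrative assumption (the `TwoNodeStore` class, the `mode` argument, the `"ok"`/`"rejected"` results); real databases make this choice through far more involved protocols.

```python
class Replica:
    def __init__(self):
        self.value = None

class TwoNodeStore:
    """Two replicas joined by a link that can be cut (hypothetical model)."""
    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.left = Replica()
        self.right = Replica()
        self.partitioned = False

    def write(self, replica, value):
        other = self.right if replica is self.left else self.left
        if not self.partitioned:
            # Healthy network: both replicas receive the write.
            replica.value = value
            other.value = value
            return "ok"
        if self.mode == "CP":
            # Cancel the operation: consistent, but not available.
            return "rejected"
        # AP: proceed on the reachable replica only; replicas now differ.
        replica.value = value
        return "ok"

cp = TwoNodeStore("CP")
cp.write(cp.left, "v1")
cp.partitioned = True
cp_result = cp.write(cp.left, "v2")

ap = TwoNodeStore("AP")
ap.write(ap.left, "v1")
ap.partitioned = True
ap_result = ap.write(ap.left, "v2")

print(cp_result, cp.left.value, cp.right.value)
print(ap_result, ap.left.value, ap.right.value)
```

In CP mode both replicas still agree after the partition; in AP mode the write is accepted and the two replicas diverge until the partition heals.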
NoSQL Databases
• NoSQL databases are well suited to distributed network applications.
• Unlike vertically scalable (SQL-based) RDBMS, NoSQL
databases are horizontally scalable and distributed by design,
which allows them to scale rapidly across a growing network of
multiple interconnected nodes.

• Based on the CAP theorem, NoSQL databases are classified as:

• CP database – delivers consistency and partition tolerance
at the expense of availability. When a partition occurs, the
system has to shut down the non-consistent node (i.e., make
it unavailable) until the partition is fixed. Examples: HBase,
MongoDB.
NoSQL Databases
• CA database – delivers consistency and availability across
all nodes. Examples: Redis, Neo4j.
• This is possible only in the absence of a partition; it cannot be
guaranteed if there is a partition between any two nodes in the system.
• Database systems designed with traditional ACID guarantees in mind,
such as RDBMS, choose consistency over availability.
• Systems based on the BASE** philosophy (eventual consistency),
common in NoSQL databases, choose availability over consistency.

• AP database – delivers availability and partition tolerance at
the expense of consistency. When a partition occurs, all
nodes remain available but those at the wrong end of a
partition might return an older version of data than others.
Examples: CouchDB, Cassandra.
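
The "older version of data" behaviour of an AP system, and the later convergence that eventual consistency promises, can be sketched as follows. The version counter and the last-write-wins merge are assumptions chosen for brevity; they are not how CouchDB or Cassandra actually reconcile replicas, and this simple rule would silently drop one side's update if both sides wrote during the partition.

```python
class VersionedReplica:
    """A replica that tags its value with a version counter (hypothetical)."""
    def __init__(self):
        self.version = 0
        self.value = None

    def write(self, value):
        self.version += 1
        self.value = value

    def merge(self, other):
        """Adopt the other replica's state if it is newer (last-write-wins)."""
        if other.version > self.version:
            self.version = other.version
            self.value = other.value

def heal(a, b):
    """Exchange state in both directions once the partition is fixed."""
    a.merge(b)
    b.merge(a)

near, far = VersionedReplica(), VersionedReplica()
near.write("price=10")
heal(near, far)                 # network healthy: both replicas hold price=10

near.write("price=12")          # partition: only `near` sees the new write
stale_read = far.value          # `far` stays available but returns old data

heal(near, far)                 # partition fixed: replicas converge
converged_read = far.value
print(stale_read, converged_read)
```

The read served during the partition is valid but stale; once the replicas exchange state again, both return the latest write.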
Summary - CAP Theorem
• Understanding the CAP theorem helps choose the best
database to be used for the application.

• Example: if the ability to quickly iterate on the data model and
scale horizontally is essential to the application, and eventual
(as opposed to strict) consistency can be tolerated, an AP
database like Cassandra or Apache CouchDB is a good fit.

• Example: if the application depends heavily on data
consistency, such as an eCommerce application or a
payment service, then a relational database is a better
choice.

Microservices
• Microservices are loosely coupled, independently deployable
application components that incorporate their own stack –
including their own database and database model, and
communicate with each other over a network.

• Microservices have become highly popular for hybrid and
multi-cloud applications.

• Related terms: API, web services.

• A microservice contains the code required for a particular
application function, whereas an API is a communication
mechanism to access that function.
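
The distinction between a microservice and its API can be illustrated with Python's standard library alone. This is a sketch under assumptions: the inventory function, the `/stock/<item>` path, and the JSON shape are all hypothetical. `stock_level` is the microservice's code; `InventoryAPI` is the communication mechanism that lets another service reach it over the network.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# The microservice's own data and logic (hypothetical inventory function).
STOCK = {"widget": 5}

def stock_level(item):
    return STOCK.get(item, 0)

class InventoryAPI(BaseHTTPRequestHandler):
    """The API: an HTTP endpoint that exposes stock_level to other services."""
    def do_GET(self):
        item = self.path.rsplit("/", 1)[-1]
        body = json.dumps({"item": item, "stock": stock_level(item)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def call_inventory(port, item):
    """What another microservice would do: call the API over the network."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", f"/stock/{item}")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()

server = HTTPServer(("127.0.0.1", 0), InventoryAPI)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
reply = call_inventory(server.server_port, "widget")
server.shutdown()
server.server_close()
print(reply)
```

The caller never imports `stock_level`; it only knows the API, which is what allows each microservice to keep its own stack and database.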
Microservices
• Microservices expose functionality via APIs so other
microservices can use them when required.
• However, there are also APIs in use that are unrelated to
microservices, such as APIs from third-party vendors and
partners.

• Microservice architecture is an evolution of service-oriented
architecture (SOA).
• Developers decompose an entire application into individual
functions that run as small, standalone programs; the
microservices interact with each other to perform more
complex tasks.
References

• https://www.ibm.com/topics/cap-theorem
• https://en.wikipedia.org/wiki/CAP_theorem
• https://www.abstractapi.com/guides/api-vs-web-services

• ** BASE: Basically Available, Soft state, Eventually consistent
