17 Database Architectures
CS220
• Introduce Homework
• Parallelism
• Distributed Databases
• Introduction to NoSQL
Homework
Parallelism
We Need More Power!
• Definitions
• Point query: a query that looks up a single tuple (e.g., age = 25)
• Range query: a query that looks up a range of values (e.g., age > 25 and age < 60)
Partitioning Techniques: Round Robin
• Partitioning techniques (number of disks = n):
• Round-robin: send the i-th tuple inserted into the relation to disk i mod n (sketched below)
• Good for sequential reads of the entire table
• Even distribution of data over disks
• Range queries are expensive
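A minimal Python sketch of round-robin placement (the disk list and helper names are illustrative, not from any particular DBMS):

n_disks = 4
disks = [[] for _ in range(n_disks)]

def insert_round_robin(i, row):
    # The i-th inserted tuple goes to disk i mod n.
    disks[i % n_disks].append(row)

for i, row in enumerate([("alice", 25), ("bob", 41), ("carol", 33), ("dave", 58)]):
    insert_round_robin(i, row)

# Data spreads evenly, but a point or range query on age must scan
# every disk, because placement ignores attribute values.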
Partitioning Techniques: Hash Partitioning
• Partitioning techniques (number of disks = n):
• Hash partitioning: choose one or more partitioning attribute(s) and apply a hash function to their values that produces a value in the range 0…n − 1, identifying the target disk (sketched below)
• Good for sequential scans, and for point queries on the partitioning attribute(s)
• Range queries are expensive
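A hash-partitioning sketch under the same illustrative setup, hashing on age as the partitioning attribute:

n_disks = 4
disks = [[] for _ in range(n_disks)]

def disk_for(age):
    return hash(age) % n_disks        # map the attribute value into 0…n-1

def insert_hashed(row, age):
    disks[disk_for(age)].append(row)

def point_query(age):
    # A point query on the partitioning attribute touches exactly one disk.
    return [row for row in disks[disk_for(age)] if row[1] == age]

# A range query (age > 25 and age < 60) still scans all disks,
# because hashing scatters adjacent values.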
Partitioning Techniques: Range Partitioning
• Partitioning techniques (number of disks = n):
• Range partitioning: choose a partitioning attribute and divide its values into ranges; tuples whose value falls in a given range go in the corresponding partition (sketched below)
• Clusters data by partition value (e.g., by date range)
• Good for sequential access and point queries on the partitioning attribute
• Supports range queries on the partitioning attribute
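A range-partitioning sketch on age; the boundary values below are illustrative:

import bisect

boundaries = [30, 60]                 # ages < 30, 30–59, and 60+
partitions = [[] for _ in range(len(boundaries) + 1)]

def partition_for(age):
    return bisect.bisect_right(boundaries, age)

def insert_ranged(row, age):
    partitions[partition_for(age)].append(row)

# A range query (age > 25 and age < 60) only touches partitions whose
# ranges overlap 26–59: partitions 0 and 1, never partition 2.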
Potential Problems
• Skew: non-uniform distribution of database records across partitions
• Hash partitioning: a bad hash function (one that is not uniform or random)
• Range partitioning: many records landing in the same partition (e.g., web traffic/orders stored in a date-partitioned table spike during the shopping season)
Distributed Databases
One Database, Multiple Locations
• A distributed database is stored on several computers located at multiple physical sites
• Types of distributed database
• Homogeneous – all systems run the same brand of
DBMS software on the same OS and hardware
• Coordination is easier in this setup
• Heterogeneous – systems run different DBMS software on potentially different OS and hardware
Advantages and Disadvantages of Distributed Systems
• Advantages
• Sharing of data generated at different sites
• Disadvantages
• Increased complexity
• Difficult to debug
Fragmentation
• Splitting a table up between sites
• Also called sharding
• Horizontal fragmentation – each site stores a subset of the rows
• Vertical fragmentation – each site stores a subset of the columns (both sketched after this list)
• Advantages
• Allows data to be moved without user needing to know
• Allows query planner to determine the most efficient way to
get data
• Allows access of replicated data from another site if local
copy is unavailable
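A toy sketch of both fragmentation styles over an in-memory table (all names and values are illustrative):

customers = [
    {"id": 1, "name": "Ann", "region": "EU", "balance": 100},
    {"id": 2, "name": "Bo",  "region": "US", "balance": 250},
    {"id": 3, "name": "Cy",  "region": "EU", "balance": 75},
]

# Horizontal fragmentation: each site stores a subset of the rows.
eu_site = [r for r in customers if r["region"] == "EU"]
us_site = [r for r in customers if r["region"] == "US"]

# Vertical fragmentation: each site stores a subset of the columns,
# keeping the key in every fragment so rows can be rejoined.
site_a = [{"id": r["id"], "name": r["name"]} for r in customers]
site_b = [{"id": r["id"], "balance": r["balance"]} for r in customers]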
Names of Data Items
• Criteria – Each data item in a distributed system should be
• Uniquely named
• Efficient to find
• Easy to relocate
• Each site should be able to create new items autonomously
• Approaches
• Centralized naming server
• Keeps item names unique, easy to find, easy to move (via lookup)
• Names cannot be created locally -- high communication cost to get new
names
• What happens if the naming server goes down?
• Incorporate site ID into names
• Meets criteria, but at the cost of location transparency
• Maintain a set of aliases at each site mapping local names to actual names (sketched below)
• e.g., customer => site17.customer
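A sketch of the alias approach (the table contents are illustrative):

# Each site keeps a table mapping location-transparent local names
# to actual, site-qualified names.
aliases = {"customer": "site17.customer"}

def resolve(name):
    # Relocating an item only requires updating this table,
    # not rewriting every query that names the item.
    return aliases.get(name, name)

print(resolve("customer"))            # site17.customer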
Querying Distributed Data
• Queries and transactions can be either
• Local – all data is stored at current site
• Global – needs data from one or more remote sites
• Transaction might originate locally and need data from
elsewhere
• Transaction might originate elsewhere, and need data
stored locally
• Semijoin (⋉)
• r1 ⋉ r2 = π_R1(r1 ⋈ r2)
• Transfer only those tuples in r1 that match in the natural join with r2 between sites
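A semijoin sketch over in-memory relations (the relation contents are illustrative); only the join-attribute values of r2 need to cross the network:

def semijoin(r1, r2, attr):
    r2_keys = {t[attr] for t in r2}   # shipped from r2's site
    return [t for t in r1 if t[attr] in r2_keys]

checkout = [{"call_no": "QA76", "patron": 9}, {"call_no": "PS35", "patron": 4}]
book_info = [{"call_no": "QA76", "title": "Databases"}]
print(semijoin(checkout, book_info, "call_no"))   # only the QA76 checkout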
Global Query Library Example
• Given
• checkout relation stored locally
• (Large) book_info relation (call_no, title, etc.) stored centrally
Two-Phase Commit (2PC)
• One site (usually the site originating the update) acts as the coordinator
• Each site completes its work on the transaction, becomes partially committed, and notifies the coordinator
• Once the coordinator receives completion messages from all sites, it can begin the commit protocol (sketched below)
• If the coordinator receives a failure message from one or more sites, it instructs all sites to abort the transaction
• If the coordinator does not receive any message from a site in a reasonable amount of time, it instructs all sites to abort the transaction
• The site or a communication link might have failed during the transaction
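A high-level coordinator sketch of this flow (send and recv are assumed messaging helpers, not a real API):

def coordinate(transaction, sites, send, recv, timeout):
    # Phase 1: ask every site to prepare, then collect the votes.
    for site in sites:
        send(site, ("prepare", transaction))
    votes = [recv(site, timeout) for site in sites]   # None models silence

    # Phase 2: commit only if every site voted ready; a failure message
    # or a timeout from any site aborts the whole transaction.
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    for site in sites:
        send(site, (decision, transaction))
    return decision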
2PC Phase 1: Obtaining a Decision
• Coordinator writes a <prepare T> entry to its log and
forces all log entries to stable storage
• Coordinator sends a prepare-to-commit message to all
participating sites
• Ideally, each site writes a <ready T> entry to its log,
forces all log entries to stable storage, and sends a ready
message to the coordinator
• If a site needs to abort the transaction, it writes a <no T>
entry to its log, forces all entries to stable storage, and sends
an abort message to the coordinator
• Once a site sends a ready message to the coordinator, it
gives up its right to abort the transaction
• It must commit if/when the coordinator instructs it to
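A participant-side sketch of Phase 1 (log_force and send_to_coordinator are assumed helpers; forcing the log record to stable storage before replying is the essential ordering):

def on_prepare(t, can_commit, log_force, send_to_coordinator):
    if can_commit:
        log_force(f"<ready {t}>")     # stable storage first...
        send_to_coordinator("ready")  # ...then give up the right to abort
    else:
        log_force(f"<no {t}>")
        send_to_coordinator("abort")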
2PC Phase 2: Recording the Decision
• Coordinator waits for each site to respond to the prepare-to-commit
message
• If any site responds negatively or fails to respond, coordinator writes an
<abort T> entry to its log and sends an abort message to all sites
• If all responses are positive, coordinator writes a <commit T> entry to its log
and sends a commit message to all sites
• At this point, the coordinator’s decision is final
• The 2PC protocol will carry it out even if a site fails
• If the coordinator fails after sending its final decision to at least one site, it can figure out what to do after it recovers based on its log (sketched below):
• <start T> but no <prepare T> → abort the transaction
• <prepare T> but no <commit T> → find out the status from the sites, or abort the transaction
• <abort T> or <commit T>, but no <complete T> → restart sending commit/abort messages and waiting for acknowledgements
• Sites may be able to find out what to do from each other while the coordinator is down
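A sketch of the coordinator's recovery rules from this slide, replayed over its log:

def recover(log_records):
    recs = set(log_records)
    if ("<commit T>" in recs or "<abort T>" in recs) and "<complete T>" not in recs:
        return "resend decision, wait for acknowledgements"
    if "<prepare T>" in recs and "<commit T>" not in recs and "<abort T>" not in recs:
        return "find out status from sites, or abort"
    if "<start T>" in recs and "<prepare T>" not in recs:
        return "abort transaction"
    return "nothing to do"            # <complete T> was already logged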
Updating Replicated Data
• All replicas of a given data item must be
kept synchronized when updates occur
• How to do this
• Simultaneous updates of all replicas for each
transaction
• Ensures consistency across replicas
• Slows down update transactions and breaks
replication transparency
• What happens if a replica is unreachable during
an update?
Primary Copy
• Designate a primary copy of the data at some site
• Reads can happen on any replica, but updates happen on primary
copy first
• The primary copy’s site sends updates to the replica sites
• Immediately after each update, or periodically (if eventual consistency is OK)
• Updates to sites that are down are re-sent periodically (sketched below)
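A primary-copy sketch (class and helper names are illustrative; send returns False when a replica is unreachable):

class PrimaryCopy:
    def __init__(self, replicas):
        self.value = None
        self.replicas = replicas
        self.pending = []             # (site, value) pairs to retry

    def update(self, value, send):
        self.value = value            # the primary applies the write first
        for site in self.replicas:
            if not send(site, value):
                self.pending.append((site, value))

    def retry_pending(self, send):
        # Called periodically to bring recovered sites up to date.
        self.pending = [(s, v) for (s, v) in self.pending if not send(s, v)]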
Locking for Distributed Concurrency Control
• Disadvantages
• More message overhead – need to send a lock-request message, receive a lock-granted message, and send an unlock message, in addition to the data involved
• Deadlock detection gets harder
• Further complications when updating replicated data
• How many replica locks are needed to do an update? (All of them? Most of them?)
• The primary copy method helps with this, as only the primary copy needs to be locked
Timestamps for Distributed Concurrency Control
• Must ensure consistency and uniqueness of
timestamps across sites
• Combine a locally generated timestamp and the site ID into the transaction’s global timestamp (sketched below)
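A sketch of one common scheme, pairing a local counter with the site ID (names are illustrative):

class GlobalTimestamps:
    def __init__(self, site_id):
        self.site_id = site_id
        self.counter = 0              # locally generated component

    def next(self):
        self.counter += 1
        # Tuples compare counter-first, with the site ID breaking ties,
        # so no two sites can ever issue the same global timestamp.
        return (self.counter, self.site_id)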