Unit 5 Lecture 1
SQL (Structured Query Language) is the primary interface used to communicate with relational databases. SQL became a standard
of the American National Standards Institute (ANSI) in 1986. Standard ANSI SQL is supported by the popular relational database
engines, and some of these engines also add extensions to ANSI SQL to support functionality that is specific to that engine. SQL is
used to add, update, and delete rows of data, to retrieve subsets of data for transaction processing and analytics applications, and to
manage all aspects of the database.
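The operations listed above can be tried out with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for a relational engine; the employee table, its columns, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Manage the database: define a table.
cur.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Add rows.
cur.executemany(
    "INSERT INTO employee (id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [(1, "Asha", "sales", 50000.0), (2, "Ravi", "sales", 60000.0), (3, "Meena", "hr", 55000.0)],
)

# Update a row.
cur.execute("UPDATE employee SET salary = ? WHERE id = ?", (65000.0, 2))

# Delete a row.
cur.execute("DELETE FROM employee WHERE id = ?", (3,))
conn.commit()

# Retrieve a subset of the data (transaction-style lookup).
sales_rows = cur.execute(
    "SELECT id, name, salary FROM employee WHERE dept = ? ORDER BY id", ("sales",)
).fetchall()

# Retrieve an aggregate (analytics-style query).
avg_salary = cur.execute("SELECT AVG(salary) FROM employee").fetchone()[0]

print(sales_rows)   # [(1, 'Asha', 50000.0), (2, 'Ravi', 65000.0)]
print(avg_salary)   # 57500.0
conn.close()
```

The same statements should run with little or no change on other engines, although each engine may differ in data types and in its extensions.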
Relational Databases
• There are two types of database storage layout: row oriented and column oriented.
• Row-oriented databases are the traditional databases such as Oracle and MySQL. They
store a table row by row; a common method of storing a table is to serialize each row of
data. Row-based systems are designed to efficiently return data for an entire row, or
record.
• Column-based databases, on the other hand, include "NoSQL" databases such as HBase
and Cassandra. These databases do not support "traditional" transactional
secondary indices. It is the responsibility of the user to maintain an "inverted index".
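The difference between the two layouts can be illustrated with a minimal sketch. It serializes the same small table both ways using plain Python lists in place of disk blocks; the table contents and the helper names read_record and read_column are hypothetical.

```python
# A tiny table; the column names and values are made up for illustration.
columns = ["id", "name", "city"]
rows = [
    (1, "Asha", "Pune"),
    (2, "Ravi", "Delhi"),
    (3, "Meena", "Pune"),
]

# Row-oriented layout: each record's values are stored next to each other.
row_layout = [value for row in rows for value in row]

# Column-oriented layout: all values of one column are stored next to each other.
column_layout = [value for column in zip(*rows) for value in column]


def read_record(layout, n_cols, i):
    """Fetch record i from the row layout: one contiguous slice."""
    return tuple(layout[i * n_cols:(i + 1) * n_cols])


def read_column(layout, n_rows, j):
    """Fetch column j from the column layout: one contiguous slice."""
    return list(layout[j * n_rows:(j + 1) * n_rows])


print(row_layout)     # [1, 'Asha', 'Pune', 2, 'Ravi', 'Delhi', 3, 'Meena', 'Pune']
print(column_layout)  # [1, 2, 3, 'Asha', 'Ravi', 'Meena', 'Pune', 'Delhi', 'Pune']
print(read_record(row_layout, len(columns), 1))   # (2, 'Ravi', 'Delhi')
print(read_column(column_layout, len(rows), 2))   # ['Pune', 'Delhi', 'Pune']
```

Reading a whole record is one contiguous read in the row layout, while reading a whole column is one contiguous read in the column layout. Each layout makes the other access pattern jump around.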
Sr. No. | Key            | Row Oriented Database                        | Column Oriented Database
2       | Data Accessing | Data accessing happens row by row            | Data accessing happens column by column
3       | Storage        | Storage size optimization is limited due to  | Column-based systems provide better
        |                | the reduced ability to compress data in      | storage size optimization capabilities
        |                | row-based systems                            |
4       | Performance    | Takes longer than a column-oriented database | Faster than a row-oriented database
        |                | because it requires multiple disk reads      |
5       | Use Case       | Best suited for OLTP                         | Best suited for OLAP
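One plausible reason for the storage row in the table is that the values of a single column often repeat, so simple schemes such as run-length encoding compress them well. The sketch below compares the two layouts on a made-up table; run-length encoding is chosen here only as an example, and real column stores may use other schemes.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    return [[value, len(list(group))] for value, group in groupby(values)]


# Made-up table of (id, city), already sorted by city.
rows = [(1, "Delhi"), (2, "Delhi"), (3, "Delhi"), (4, "Pune"), (5, "Pune")]

# Column layout: the city column is stored contiguously, so repeats are adjacent.
city_column = [city for _, city in rows]
encoded_column = run_length_encode(city_column)

# Row layout: ids and cities interleave, so there are no runs to collapse.
row_stream = [value for row in rows for value in row]
encoded_rows = run_length_encode(row_stream)

print(encoded_column)     # [['Delhi', 3], ['Pune', 2]]
print(len(encoded_rows))  # 10
```

Five city values shrink to two pairs in the column layout, while the interleaved row stream does not shrink at all.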
Data Storage Techniques
Types of data stores
Key / value stores (opaque)
key => value
Example values:
{ name: "foo", age: 25, city: "bar" } => JSON, but the store will not care about it
\xde\xad\xb0\x0b => binary, but the store will not care about it
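A minimal sketch of what "opaque" means in practice: the store below accepts only raw bytes and never inspects them, so encoding and decoding are left to the client. The class OpaqueKVStore and the key names are hypothetical and do not correspond to any particular product.

```python
import json


class OpaqueKVStore:
    """Hypothetical in-memory key/value store that treats values as raw bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        if not isinstance(value, bytes):
            raise TypeError("values must be bytes; encoding is the client's job")
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]


store = OpaqueKVStore()

# The client chooses JSON for one value...
store.put("user:1", json.dumps({"name": "foo", "age": 25, "city": "bar"}).encode("utf-8"))
# ...and arbitrary binary for another. The store handles both identically.
store.put("blob:1", b"\xde\xad\xb0\x0b")

# Only the client knows how to interpret what comes back.
user = json.loads(store.get("user:1").decode("utf-8"))
blob = store.get("blob:1")
print(user["age"], blob.hex())  # 25 deadb00b
```

Because the store never looks inside a value, it cannot answer a question such as "all users older than 20". Lookups are by key only.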
Key / value stores (typed)
Document stores (non-shaped)
Document stores (shaped)
Parallel Database Architectures
• Shared memory
– Suitable for servers with multiple CPUs
– Memory address space is shared and managed by a symmetric multi-processing (SMP)
operating system
– The SMP operating system schedules processes in parallel, exploiting all the processors
• Shared nothing
– Cluster of independent servers each with its own disk space
– Connected by a network
• Shared disk
– Hybrid architecture
– Independent server clusters share storage through high-speed network storage viz. NAS
(network attached storage) or SAN (storage area network)
– Clusters are connected to storage via standard Ethernet, or faster Fibre Channel or InfiniBand
connections
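In a shared-nothing cluster each server sees only its own disk, so the data has to be split across the servers. A minimal sketch of one way to do this follows, using hash partitioning with a coordinator that combines per-node results. The node count, the order data, and the helper node_for are assumptions for illustration; real systems may partition by range or by other schemes.

```python
import zlib

NODE_COUNT = 3  # assumed cluster size


def node_for(key: str) -> int:
    """Pick a node with a stable hash (CRC32) so every client agrees on placement."""
    return zlib.crc32(key.encode("utf-8")) % NODE_COUNT


# Each node has its own private storage; nothing is shared.
nodes = [dict() for _ in range(NODE_COUNT)]

# Made-up order amounts keyed by order id.
orders = {"o1": 120, "o2": 80, "o3": 200, "o4": 50, "o5": 75}
for order_id, amount in orders.items():
    nodes[node_for(order_id)][order_id] = amount

# A query runs on every node against its local partition...
partial_sums = [sum(partition.values()) for partition in nodes]
# ...and a coordinator combines the partial results.
total = sum(partial_sums)

print(total)  # 525
```

A stable hash is used instead of Python's built-in hash() because the latter is randomized per process for strings, which would send the same key to different nodes on different runs.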
Advantages of Parallel DB over Relational DB
• Efficient execution of SQL queries by exploiting multiple processors
• For shared nothing architecture:
– Tables are partitioned and distributed across multiple processing nodes
– SQL optimizer handles distributed joins
• Distributed two-phase commit and locking for transaction isolation between processors
• Fault tolerant
– System failures handled by transferring control to “stand-by” system [for transaction
processing]
– Restoring computations [for data warehousing applications]
• A transaction is said to follow the two-phase locking (2PL) protocol if its locking and unlocking are
done in two phases:
1. Growing phase: new locks on data items may be acquired, but none can be released.
2. Shrinking phase: existing locks may be released, but no new locks can be acquired.
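The two phases can be sketched as a small check on a single transaction. The class TwoPhaseTransaction is hypothetical and enforces only the ordering rule; it does not model a lock manager or conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Tracks locks for one transaction and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # flips to True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item!r} in shrinking phase")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True
        self.held.remove(item)


txn = TwoPhaseTransaction("T1")
txn.lock("A")     # growing phase
txn.lock("B")     # growing phase
txn.unlock("A")   # shrinking phase begins

try:
    txn.lock("C")  # violates the protocol: a lock request after a release
    violation = None
except RuntimeError as err:
    violation = str(err)

print(violation)  # T1: cannot lock 'C' in shrinking phase
```

Once the first lock is released, the transaction can never acquire another, which is exactly the boundary between the growing and shrinking phases.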
1. What is NoSQL?
5. Explain the difference between row and column oriented data storage.