UNIT 2-rdbms

The document covers relational query languages, including relational algebra and SQL, and discusses their applications in database management systems. It explains normalization, its types, advantages, and disadvantages, as well as the importance of relational decomposition and dependency preservation. Additionally, it compares Oracle Database and IBM Db2, highlighting their features and use cases.

Uploaded by

9923022056
Copyright
© © All Rights Reserved

UNIT 2

Relational Query Languages, Relational Algebra, Tuple and Domain Relational
Calculus, SQL and QBE. Relational Database Design: Domain and Data Dependency,
Armstrong's Axioms, Normal Forms, Dependency Preservation, Lossless Design,
Comparison of Oracle and DB2.

Relational Query Languages in DBMS

Relational Query Languages are used to interact with and manipulate relational
databases. These languages allow users to retrieve, insert, delete, and update data
stored in a relational model.

1. Types of Relational Query Languages

1.1 Relational Algebra

o A procedural query language used for retrieving data from a relational
database.
o Operations produce new relations from input relations, which can then be
used in further operations.
o Basic operations:
1. Selection (σ): Filters rows based on a condition.
   σ condition (Relation)
   Example: σ Salary > 50000 (Employees)
2. Projection (π): Selects specific columns and removes duplicate rows.
   π Column1, Column2 (Relation)
   Example: π Name, Salary (Employees)
3. Union (∪): Combines the tuples of two union-compatible relations, with no
   duplicates.
   Relation1 ∪ Relation2
4. Set Difference (-): Finds tuples that are in one relation but not in
   another.
   Relation1 - Relation2
5. Cartesian Product (×): Combines every tuple of one relation with every
   tuple of another.
   Relation1 × Relation2
6. Join (⨝): Combines related tuples from two relations, for example tuples
   that agree on their common attributes.
   Relation1 ⨝ Relation2
7. Division (÷): Finds the tuples in one relation that are associated with
   every tuple in another.
   Relation1 ÷ Relation2
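
The operations listed above can be sketched in a few lines of Python. This is only an illustrative model, assuming a relation is a list of dicts with one dict per tuple. The function names and the sample data are made up for this sketch and do not come from any library.

```python
# Relations are modelled as lists of dicts, one dict per tuple.
# All function names here are illustrative; they do not come from a library.

def select(relation, predicate):
    """Selection (σ): keep the tuples for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Projection (π): keep the named columns and drop duplicate tuples."""
    result = []
    for row in relation:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result

def union(r1, r2):
    """Union (∪): tuples that occur in r1 or r2, without duplicates."""
    result = []
    for row in r1 + r2:
        if row not in result:
            result.append(row)
    return result

def difference(r1, r2):
    """Set difference (-): tuples of r1 that do not occur in r2."""
    return [row for row in r1 if row not in r2]

def product(r1, r2):
    """Cartesian product (×): assumes the relations share no column names."""
    return [{**a, **b} for a in r1 for b in r2]

def natural_join(r1, r2):
    """Natural join (⨝): pair tuples that agree on every shared column."""
    if not r1 or not r2:
        return []
    common = set(r1[0]) & set(r2[0])
    return [{**a, **b} for a in r1 for b in r2
            if all(a[c] == b[c] for c in common)]

def divide(r1, r2):
    """Division (÷): values of r1's extra columns paired with every tuple of r2.

    Simplification: an empty r1 or r2 gives an empty result.
    """
    if not r1 or not r2:
        return []
    extra = [c for c in r1[0] if c not in r2[0]]
    return [x for x in project(r1, extra)
            if all({**x, **y} in r1 for y in r2)]

# Hypothetical sample data.
employees = [
    {"Name": "Alice", "Salary": 60000},
    {"Name": "Bob", "Salary": 45000},
    {"Name": "Carol", "Salary": 70000},
]
enrolled = [
    {"Student": "Ann", "Course": "DB"},
    {"Student": "Ann", "Course": "OS"},
    {"Student": "Raj", "Course": "DB"},
]
required = [{"Course": "DB"}, {"Course": "OS"}]

# π Name, Salary (σ Salary > 50000 (Employees))
high_paid = project(select(employees, lambda r: r["Salary"] > 50000),
                    ["Name", "Salary"])
# Students enrolled in every required course.
took_all = divide(enrolled, required)
print(high_paid)
print(took_all)
```

Because every function returns a new relation, the calls can be nested freely, which is the sense in which relational algebra is closed under its own operations.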

1.2 Structured Query Language (SQL)

o A high-level, declarative language widely used in relational databases.
o Key components:
1. Data Query Language (DQL): For retrieving data.
   SELECT Name, Salary FROM Employees WHERE Salary > 50000;
2. Data Definition Language (DDL): For defining database structures.
   CREATE TABLE Employees (
       EmployeeID INT PRIMARY KEY,
       Name VARCHAR(50),
       Salary DECIMAL(10, 2)
   );
3. Data Manipulation Language (DML): For modifying data.
   INSERT INTO Employees (EmployeeID, Name, Salary) VALUES (1, 'Alice', 60000);
4. Data Control Language (DCL): For managing permissions.
   GRANT SELECT ON Employees TO User1;
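
The DDL, DML, and DQL statements above can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch under a few assumptions: it uses a throwaway in-memory SQLite database, the sample rows are hypothetical, and the DCL step is left out because SQLite has no GRANT statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the table structure.
conn.execute("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        Name       VARCHAR(50),
        Salary     DECIMAL(10, 2)
    )
""")

# DML: insert rows, then modify one of them (sample values are hypothetical).
conn.executemany(
    "INSERT INTO Employees (EmployeeID, Name, Salary) VALUES (?, ?, ?)",
    [(1, "Alice", 60000), (2, "Bob", 45000)],
)
conn.execute("UPDATE Employees SET Salary = 52000 WHERE Name = 'Bob'")
conn.commit()

# DQL: retrieve the rows that satisfy the condition.
rows = conn.execute(
    "SELECT Name, Salary FROM Employees WHERE Salary > 50000 ORDER BY Name"
).fetchall()
print(rows)
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid building statements by string concatenation.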

Comparison of Relational Query Languages

Feature            Relational Algebra                   SQL

Type               Procedural                           Declarative

Complexity         Requires step-by-step instructions   User-friendly

Standardization    No standard                          Standardized (ANSI SQL)

Usage              Academic and theoretical             Practical and widely used

Ease of Learning   Moderate                             Easy


2. Practical Examples

Using Relational Algebra

Find the names and salaries of employees earning more than 50,000:
π Name, Salary (σ Salary > 50000 (Employees))

Using SQL

SELECT Name, Salary FROM Employees WHERE Salary > 50000;
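
To see that the two formulations describe the same result, the sketch below evaluates the algebra expression step by step in plain Python and runs the SQL query through sqlite3, then compares the answers. The sample rows and the in-memory database are assumptions made only for this illustration.

```python
import sqlite3

# Hypothetical sample rows: (EmployeeID, Name, Salary).
employees = [(1, "Alice", 60000), (2, "Bob", 45000), (3, "Carol", 70000)]

# Relational algebra, evaluated step by step: selection first, then projection.
selected = [row for row in employees if row[2] > 50000]      # σ Salary > 50000
algebra_result = sorted({(name, salary) for _, name, salary in selected})  # π Name, Salary

# SQL: the same request stated declaratively.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)"
)
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", employees)
sql_result = sorted(
    conn.execute("SELECT Name, Salary FROM Employees WHERE Salary > 50000").fetchall()
)
conn.close()

print(algebra_result == sql_result)
```

One difference worth keeping in mind: projection in relational algebra removes duplicate tuples, while a SQL SELECT keeps them unless DISTINCT is used. The two results agree here because no two employees share both name and salary.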

What is Normalization?

o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy in a relation or set
of relations. It is also used to eliminate undesirable characteristics like
Insertion, Update, and Deletion Anomalies.
o Normalization divides a larger table into smaller tables and links them
using relationships.
o The normal forms are used to reduce redundancy in the database tables.

Need for Normalization

The main reason for normalizing relations is to remove these anomalies.
Failure to eliminate anomalies leads to data redundancy and can cause data
integrity and other problems as the database grows. Normalization consists of
a series of guidelines that help you create a good database structure.

Types of Normal Forms:


Normalization works through a series of stages called normal forms. The
normal forms apply to individual relations. A relation is said to be in a
particular normal form if it satisfies the constraints of that form.

Following are the various types of normal forms:

Normal Form   Description

1NF           A relation is in 1NF if every attribute holds only atomic values.

2NF           A relation is in 2NF if it is in 1NF and all non-key attributes
              are fully functionally dependent on the primary key.

3NF           A relation is in 3NF if it is in 2NF and no transitive
              dependency exists.

BCNF          Boyce-Codd normal form is a stronger definition of 3NF: the
              left side of every non-trivial functional dependency must be a
              super key.

4NF           A relation is in 4NF if it is in Boyce-Codd normal form and
              has no multi-valued dependency.

5NF           A relation is in 5NF if it is in 4NF and does not contain any
              join dependency; joining should be lossless.

Advantages of Normalization

o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.

Disadvantages of Normalization

o You cannot start building the database before knowing what the user
needs.
o Query performance can degrade when relations are normalized to higher
normal forms, i.e., 4NF and 5NF, because more joins are needed to
reassemble the data.
o It is very time-consuming and difficult to normalize relations of a higher
degree.
o Careless decomposition may lead to a bad database design, leading to
serious problems.

First Normal Form (1NF)

o A relation is in 1NF if every attribute contains only atomic values.
o It states that an attribute of a table cannot hold multiple values. It must
hold only a single value.
o First normal form disallows multi-valued attributes, composite
attributes, and their combinations.

Example: Relation EMPLOYEE is not in 1NF because of the multi-valued
attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID   EMP_NAME   EMP_PHONE                 EMP_STATE

14       John       7272826385, 9064738238    UP

20       Harry      8574783832                Bihar

12       Sam        7390372389, 8589830302    Punjab

The above EMPLOYEE table is an un-normalized relation as it contains
multiple values corresponding to the EMP_PHONE attribute, i.e., these values
are non-atomic. Relations with multi-valued entries are called un-normalized
relations.

To overcome this problem, we have to eliminate the non-atomic values of the
EMP_PHONE attribute.

The decomposition of the EMPLOYEE table into 1NF has been shown below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385 UP

14 John 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389 Punjab

12 Sam 8589830302 Punjab
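
The conversion shown above can be sketched in Python. The snippet assumes the un-normalized phone numbers arrive as one comma-separated string per employee; the helper name `to_1nf` is hypothetical.

```python
# Un-normalized EMPLOYEE rows from the example: EMP_PHONE holds several values.
unnormalized = [
    {"EMP_ID": 14, "EMP_NAME": "John", "EMP_PHONE": "7272826385, 9064738238", "EMP_STATE": "UP"},
    {"EMP_ID": 20, "EMP_NAME": "Harry", "EMP_PHONE": "8574783832", "EMP_STATE": "Bihar"},
    {"EMP_ID": 12, "EMP_NAME": "Sam", "EMP_PHONE": "7390372389, 8589830302", "EMP_STATE": "Punjab"},
]

def to_1nf(rows, column):
    """Emit one row per value of a comma-separated column (hypothetical helper)."""
    result = []
    for row in rows:
        for value in row[column].split(","):
            # Copy the row and replace the multi-valued cell with one atomic value.
            result.append({**row, column: value.strip()})
    return result

employee_1nf = to_1nf(unnormalized, "EMP_PHONE")
for row in employee_1nf:
    print(row["EMP_ID"], row["EMP_NAME"], row["EMP_PHONE"], row["EMP_STATE"])
```

After the split, EMP_ID alone no longer identifies a row; in this data a row is identified by the pair (EMP_ID, EMP_PHONE).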

Second Normal Form (2NF)

o For 2NF, the relation must be in 1NF.
o In the second normal form, all non-key attributes are fully functionally
dependent on the primary key; no non-key attribute depends on only a part
of a composite key.

Example: Let's assume a school stores the data of teachers and the
subjects they teach. In a school, a teacher can teach more than one subject.

TEACHER table

TEACHER_ID SUBJECT TEACHER_AGE

25 Chemistry 30

25 Biology 30

47 English 35

83 Math 38
83 Computer 38

In the given table, the candidate key is {TEACHER_ID, SUBJECT}. The non-prime
attribute TEACHER_AGE is dependent on TEACHER_ID alone, which is a proper
subset of the candidate key. That's why it violates the rule for 2NF.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL table:

TEACHER_ID TEACHER_AGE

25 30

47 35

83 38

TEACHER_SUBJECT table:

TEACHER_ID SUBJECT

25 Chemistry

25 Biology

47 English

83 Math

83 Computer
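
The partial dependency and the resulting decomposition can be checked on the sample rows with a short Python sketch. The helper names `fd_holds` and `project` are made up for this illustration. Note that sample data can only refute a functional dependency, never prove it, since a dependency is a rule about every legal state of the table.

```python
# TEACHER rows from the example above: (TEACHER_ID, SUBJECT, TEACHER_AGE).
COLUMNS = ("TEACHER_ID", "SUBJECT", "TEACHER_AGE")
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]
INDEX = {name: i for i, name in enumerate(COLUMNS)}

def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[INDEX[c]] for c in lhs)
        value = tuple(row[INDEX[c]] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def project(rows, columns):
    """Keep only the named columns and drop duplicate rows."""
    return sorted({tuple(row[INDEX[c]] for c in columns) for row in rows})

# TEACHER_AGE depends on TEACHER_ID alone: a partial dependency on the key.
partial = fd_holds(teacher, ["TEACHER_ID"], ["TEACHER_AGE"])
# SUBJECT does not depend on TEACHER_ID alone, so TEACHER_ID is not a key.
id_is_key = fd_holds(teacher, ["TEACHER_ID"], ["SUBJECT"])

teacher_detail = project(teacher, ["TEACHER_ID", "TEACHER_AGE"])
teacher_subject = project(teacher, ["TEACHER_ID", "SUBJECT"])
print(partial, id_is_key)
print(teacher_detail)
print(teacher_subject)
```

The age of each teacher is now stored once in `teacher_detail` instead of once per subject taught.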

Third Normal Form (3NF)

o A relation is in 3NF if it is in 2NF and does not contain any transitive
dependency of a non-prime attribute on a candidate key.
o 3NF is used to reduce data duplication. It is also used to achieve
data integrity.
o If a relation in 2NF has no transitive dependency for non-prime
attributes, then the relation is in third normal form.

A relation is in third normal form if it holds at least one of the following
conditions for every non-trivial functional dependency X → Y.

1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate
key.

To explain 3NF, let us consider the EMPLOYEE_DETAIL relation shown below.

Example:

EMPLOYEE_DETAIL table:

EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY

222 Harry 201010 UP Noida

333 Stephan 02228 US Boston

444 Lan 60007 US Chicago

555 Katharine 06389 UK Norwich

666 John 462007 MP Bhopal

Super keys in the table above:

{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.

Candidate key: {EMP_ID}

Non-prime attributes: In the given table, all attributes except EMP_ID are
non-prime.

1. Here, EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is
dependent on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are
therefore transitively dependent on the super key (EMP_ID).
2. This violates the rule of third normal form. Reducing a 2NF relation to
3NF consists of splitting it into appropriate relations such that every
non-key attribute is functionally dependent on the primary key of its own
relation directly, not transitively.

That's why we need to move the EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary key.

EMPLOYEE table:

EMP_ID EMP_NAME EMP_ZIP

222 Harry 201010

333 Stephan 02228

444 Lan 60007

555 Katharine 06389

666 John 462007

EMPLOYEE_ZIP table:

EMP_ZIP EMP_STATE EMP_CITY

201010 UP Noida

02228 US Boston
60007 US Chicago

06389 UK Norwich

462007 MP Bhopal
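
The two 3NF conditions can be tested mechanically with the attribute-closure algorithm. The sketch below is one possible way to do it, not a standard library routine: the function names are hypothetical, candidate keys are found by brute force (fine for a handful of attributes), and the functional dependencies are the ones stated in the EMPLOYEE_DETAIL example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute-force search for minimal attribute sets whose closure is the schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(schema):
                keys.append(combo)
    return keys

def is_3nf(schema, fds):
    """Check each given FD X -> Y: X is a super key, or Y - X is all prime."""
    prime = {attr for key in candidate_keys(schema, fds) for attr in key}
    for lhs, rhs in fds:
        is_super_key = closure(lhs, fds) == set(schema)
        if not is_super_key and not (set(rhs) - set(lhs)) <= prime:
            return False
    return True

# Dependencies taken from the EMPLOYEE_DETAIL example.
detail_schema = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
detail_fds = [
    (("EMP_ID",), ("EMP_NAME", "EMP_ZIP")),
    (("EMP_ZIP",), ("EMP_STATE", "EMP_CITY")),
]

# The two relations produced by the decomposition.
employee_schema = {"EMP_ID", "EMP_NAME", "EMP_ZIP"}
employee_fds = [(("EMP_ID",), ("EMP_NAME", "EMP_ZIP"))]
zip_schema = {"EMP_ZIP", "EMP_STATE", "EMP_CITY"}
zip_fds = [(("EMP_ZIP",), ("EMP_STATE", "EMP_CITY"))]

print(candidate_keys(detail_schema, detail_fds))
print(is_3nf(detail_schema, detail_fds))
print(is_3nf(employee_schema, employee_fds), is_3nf(zip_schema, zip_fds))
```

The original relation fails the test because EMP_ZIP is not a super key and EMP_STATE and EMP_CITY are not prime, while both decomposed relations pass.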

Relational Decomposition

o When a relation in the relational model is not in an appropriate normal
form, decomposition of the relation is required.
o In a database, decomposition breaks a table into multiple tables.
o If a relation is not decomposed properly, it may lead to problems like
loss of information.
o Decomposition is used to eliminate some of the problems of bad design
like anomalies, inconsistencies, and redundancy.

Types of Decomposition

Lossless Decomposition

o If no information is lost from the relation that is decomposed, then
the decomposition is lossless.
o A lossless decomposition guarantees that joining the decomposed
relations results in the same relation that was decomposed.
o A decomposition is said to be lossless if the natural join of all the
decomposed relations gives the original relation.
Example:

EMPLOYEE_DEPARTMENT table:

EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME

22 Denim 28 Mumbai 827 Sales

33 Alina 25 Delhi 438 Marketing

46 Stephan 30 Bangalore 869 Finance

52 Katherine 36 Mumbai 575 Production

60 Jack 40 Noida 678 Testing

The above relation is decomposed into two relations, EMPLOYEE and
DEPARTMENT.

EMPLOYEE table:

EMP_ID EMP_NAME EMP_AGE EMP_CITY

22 Denim 28 Mumbai

33 Alina 25 Delhi

46 Stephan 30 Bangalore

52 Katherine 36 Mumbai
60 Jack 40 Noida

DEPARTMENT table

DEPT_ID EMP_ID DEPT_NAME

827 22 Sales

438 33 Marketing

869 46 Finance

575 52 Production

678 60 Testing

Now, when these two relations are joined on the common column "EMP_ID",
then the resultant relation will look like:

Employee ⋈ Department

EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME

22 Denim 28 Mumbai 827 Sales

33 Alina 25 Delhi 438 Marketing

46 Stephan 30 Bangalore 869 Finance

52 Katherine 36 Mumbai 575 Production

60 Jack 40 Noida 678 Testing


Hence, the decomposition is a lossless join decomposition.
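
The lossless property can be verified on the sample data by projecting, joining, and comparing with the original. The sketch below uses hypothetical helper names. The second, lossy split on EMP_CITY is not part of the example above and is added only as a contrast.

```python
COLUMNS = ["EMP_ID", "EMP_NAME", "EMP_AGE", "EMP_CITY", "DEPT_ID", "DEPT_NAME"]
DATA = [
    (22, "Denim", 28, "Mumbai", 827, "Sales"),
    (33, "Alina", 25, "Delhi", 438, "Marketing"),
    (46, "Stephan", 30, "Bangalore", 869, "Finance"),
    (52, "Katherine", 36, "Mumbai", 575, "Production"),
    (60, "Jack", 40, "Noida", 678, "Testing"),
]
original = [dict(zip(COLUMNS, row)) for row in DATA]

def project(rows, columns):
    """Keep only the named columns and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(r1, r2):
    """Pair rows that agree on every shared column (inputs must be non-empty)."""
    common = set(r1[0]) & set(r2[0])
    return [{**a, **b} for a in r1 for b in r2
            if all(a[c] == b[c] for c in common)]

def same_relation(r1, r2):
    """Compare two relations as sets of rows, ignoring row and column order."""
    def canon(rows):
        return {tuple(sorted(row.items())) for row in rows}
    return canon(r1) == canon(r2)

# Decomposition from the example: the shared column EMP_ID is a key of EMPLOYEE.
employee = project(original, ["EMP_ID", "EMP_NAME", "EMP_AGE", "EMP_CITY"])
department = project(original, ["DEPT_ID", "EMP_ID", "DEPT_NAME"])
rejoined = natural_join(employee, department)
lossless = same_relation(rejoined, original)

# Contrast (not in the example): sharing only EMP_CITY, which is not a key.
by_city = project(original, ["EMP_CITY", "DEPT_ID", "DEPT_NAME"])
lossy_join = natural_join(employee, by_city)
lossy_ok = same_relation(lossy_join, original)

print(lossless, len(rejoined))
print(lossy_ok, len(lossy_join))
```

In the contrast case the two Mumbai employees are each paired with both Mumbai departments, so the join returns spurious rows. "Lossy" therefore means that the original relation can no longer be told apart from the extra tuples, not that rows go missing.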

Dependency Preserving

o It is an important constraint of the database.
o In dependency preservation, every dependency must be satisfied by at
least one decomposed table.
o If a relation R is decomposed into relations R1 and R2, then each
dependency of R must either be a part of R1 or R2, or be derivable from
the combination of the functional dependencies of R1 and R2.
o For example, suppose there is a relation R(A, B, C, D) with functional
dependency set {A → BC}. The relation R is decomposed into R1(ABC)
and R2(AD), which is dependency preserving because the FD A → BC is a
part of relation R1(ABC).
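
Dependency preservation can also be tested without computing the projected dependency sets explicitly. The sketch below follows the usual textbook procedure: for each FD X → Y, grow X using only what each decomposed relation can "see", then check that Y was reached. The function names are hypothetical, and the second example (R(A, B, C) split into AB and AC) is added here only as a contrast.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserves_dependencies(fds, decomposition):
    """Check that every FD can be enforced using the decomposed relations."""
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for relation in decomposition:
                # Attributes derivable from what this relation can see.
                gained = closure(reached & relation, fds) & relation
                if not gained <= reached:
                    reached |= gained
                    changed = True
        if not set(rhs) <= reached:
            return False
    return True

# Example from the text: R(A, B, C, D), F = {A -> BC}, R1(ABC), R2(AD).
# Attributes are single letters, so a string such as "BC" stands for {B, C}.
text_example = preserves_dependencies(
    [("A", "BC")], [set("ABC"), set("AD")]
)

# Contrast (not from the text): R(A, B, C), F = {A -> B, B -> C}, R1(AB), R2(AC).
# B -> C cannot be checked inside R1 or R2, so it is not preserved.
contrast = preserves_dependencies(
    [("A", "B"), ("B", "C")], [set("AB"), set("AC")]
)

print(text_example, contrast)
```

When a dependency is not preserved, enforcing it requires joining the decomposed tables on every update, which is the practical cost the constraint is meant to avoid.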

Comparison of Oracle and IBM Db2

Oracle Database and IBM Db2 are two popular Relational Database Management
Systems (RDBMS) widely used in enterprise environments. Both have distinct
features, strengths, and use cases. Here's a detailed comparison:

Aspect                  Oracle Database                             IBM Db2

Developer               Oracle Corporation                          IBM (International Business Machines)

Initial Release         1979                                        1983

Platform Support        Cross-platform (Linux, Windows, Solaris,    Cross-platform (Linux, Windows, z/OS,
                        etc.)                                       AIX, etc.)

Primary Use Cases       General-purpose enterprise database with    Enterprise environments, especially in
                        advanced analytics, transaction             mainframes and transactional systems.
                        processing, and scalability.

Architecture            Multi-tenant architecture with Pluggable    Strong integration with IBM mainframes,
                        Databases (PDBs) for scalability and        z/OS, and robust OLTP (Online
                        consolidation.                              Transaction Processing).

Query Language          SQL (Structured Query Language) with        SQL with procedural extension SQL PL.
                        PL/SQL for procedural extensions.

Key Features            - Data Guard for disaster recovery.         - Strong support for transactional
                        - Automatic Storage Management (ASM).         systems and analytics.
                        - Multi-model support (relational, JSON,    - BLU Acceleration for in-memory
                          XML, spatial).                              analytics.
                        - Real Application Clusters (RAC) for       - Deep integration with IBM Watson and
                          high availability.                          AI/ML tools.
                        - Oracle Autonomous Database for            - Native XML support and JSON
                          self-tuning.                                compatibility.

Performance             Highly optimized for mixed workloads        Excellent for high-volume OLTP workloads,
                        (transactional and analytical).             especially in mainframe environments.

Scalability             Extremely scalable with RAC for             Highly scalable, especially in mainframe
                        horizontal scaling.                         (z/OS) environments.

Security                - Advanced encryption (TDE).                - Row and column-level access control.
                        - Fine-grained access control.              - Native encryption.
                        - Data masking and redaction.

Cost                    Typically higher TCO (Total Cost of         Comparatively cost-effective, especially
                        Ownership), especially for licenses and     in IBM ecosystems.
                        support.

Cloud Integration       Oracle Cloud Infrastructure (OCI) with      IBM Cloud and integration with hybrid
                        Autonomous Database offerings.              cloud solutions.

Tooling and             - Oracle Enterprise Manager for             - Db2 Administration Tool.
Management                comprehensive management.                 - IBM Data Studio for development and
                        - Oracle GoldenGate for replication.          management.

Market Popularity       Widely adopted across industries,           Strong presence in banking, government,
                        especially in finance, healthcare, and      and industries leveraging IBM mainframes.
                        retail.

Community and           Large community and extensive online        Strong support ecosystem within IBM and a
Support                 resources. Excellent vendor support.        loyal user base in mainframe environments.

Integration             Excellent integration with Oracle           Seamless integration with IBM tools like
                        applications (e.g., Oracle ERP, CRM).       Cognos, Watson, and WebSphere.

Key Strengths of Each

Oracle Database:

1. High Availability: Real Application Clusters (RAC) and Data Guard offer
excellent fault tolerance and disaster recovery.
2. Autonomous Features: Self-tuning and machine learning-driven
optimizations.
3. Broad Application Support: Widely used across diverse industries for
enterprise applications.
4. Multi-Model Database: Supports relational, document (JSON), spatial, and
graph data.

IBM Db2:

1. Mainframe Integration: Best suited for environments running IBM z/OS.
2. BLU Acceleration: In-memory processing for fast analytical queries.
3. Hybrid Transaction and Analytical Processing (HTAP): Allows
simultaneous OLTP and OLAP workloads.
4. Cost-Effective for IBM Ecosystem Users: Offers better pricing and
performance in IBM-centric environments.
