DBMS – Unit 2, Lecture 1
Relational Model & Relational Algebra (Basics)
1. Relational Model
Introduced by E.F. Codd in 1970, the Relational Model is the most popular way of organizing data. It stores data in the
form of relations (tables) made of tuples (rows) and attributes (columns).
Key Terms
- Relation (Table): A collection of rows and columns.
- Tuple (Row): A single record in the relation.
- Attribute (Column): A property that describes the entity.
- Domain: The set of valid values for an attribute.
- Degree: Number of attributes (columns) in a relation.
- Cardinality: Number of tuples (rows) in a relation.
- Relation Schema: Structure/design → name + attributes.
- Relation Instance: Data stored at a particular moment.
Example: STUDENT Relation
RollNo Name Dept Semester
101 Amit CSE 5
102 Neha ECE 3
103 Rahul CSE 5
Schema: STUDENT(RollNo, Name, Dept, Semester)
- Tuple Example: (101, Amit, CSE, 5)
- Degree: 4
- Cardinality: 3
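The degree and cardinality above can be checked mechanically. The following is a minimal Python sketch, assuming the relation is modelled as a schema tuple plus a set of row tuples; this representation is only an illustration and is not part of the lecture.

```python
# STUDENT modelled as plain Python values (an assumption for illustration;
# a real DBMS stores relations very differently).
STUDENT_SCHEMA = ("RollNo", "Name", "Dept", "Semester")
STUDENT = {
    (101, "Amit", "CSE", 5),
    (102, "Neha", "ECE", 3),
    (103, "Rahul", "CSE", 5),
}

degree = len(STUDENT_SCHEMA)  # number of attributes (columns)
cardinality = len(STUDENT)    # number of tuples (rows)

print(degree, cardinality)  # 4 3
```

A set is used for the instance because a relation, by definition, contains no duplicate tuples.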
2. Relational Keys
Keys uniquely identify tuples and maintain integrity.
1. Super Key – Any set of attributes that can uniquely identify a tuple.
Example: {RollNo}, {RollNo, Name}
2. Candidate Key – Minimal super key (no unnecessary attributes).
Example: {RollNo}
3. Primary Key – One candidate key chosen as the main identifier.
Example: RollNo
4. Foreign Key – An attribute that refers to the primary key of another relation.
Example: ENROLL(RollNo, CourseID) → RollNo is foreign key referencing STUDENT.
5. Composite Key – Key formed using more than one attribute.
Example: (RollNo, CourseID) in ENROLL relation.
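The difference between a super key and a candidate key can be sketched in Python. The helper names `is_super_key` and `is_candidate_key` are hypothetical. Note that a key is a property of the schema (it must hold for every valid instance), so a check against one instance, as done here, can only rule a key out; it cannot prove one.

```python
from itertools import combinations

SCHEMA = ("RollNo", "Name", "Dept", "Semester")
ROWS = [
    (101, "Amit", "CSE", 5),
    (102, "Neha", "ECE", 3),
    (103, "Rahul", "CSE", 5),
]

def is_super_key(attrs, schema=SCHEMA, rows=ROWS):
    """True if no two rows of this instance agree on every attribute in attrs."""
    idx = [schema.index(a) for a in attrs]
    seen = {tuple(row[i] for i in idx) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(attrs, schema=SCHEMA, rows=ROWS):
    """A super key is minimal if dropping any one attribute breaks uniqueness."""
    if not is_super_key(attrs, schema, rows):
        return False
    smaller = [s for s in combinations(attrs, len(attrs) - 1) if s]
    return not any(is_super_key(s, schema, rows) for s in smaller)

print(is_super_key(("RollNo", "Name")))      # True
print(is_candidate_key(("RollNo", "Name")))  # False: RollNo alone suffices
print(is_candidate_key(("RollNo",)))         # True
print(is_super_key(("Dept",)))               # False: two students share CSE
```

Checking only the subsets that are one attribute smaller is enough, because any superset of a super key is itself a super key.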
3. Integrity Constraints
Constraints are rules to ensure correctness of data.
- Domain Constraint – Attribute values must be from the defined domain.
Example: Age must be an integer between 18–60.
- Entity Integrity – Primary key cannot be NULL or duplicated.
Example: RollNo must always be unique.
- Referential Integrity – Foreign key must refer to an existing primary key or be NULL.
Example: [Link] must match [Link].
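Entity and referential integrity can be illustrated with two small checks. This is a sketch under stated assumptions: the function names are hypothetical, the `CourseID` values are invented sample data, and ENROLL.RollNo is taken to reference STUDENT.RollNo as described in the Foreign Key example.

```python
STUDENT = [
    {"RollNo": 101, "Name": "Amit"},
    {"RollNo": 102, "Name": "Neha"},
]
ENROLL = [
    {"RollNo": 101, "CourseID": "C1"},
    {"RollNo": 104, "CourseID": "C2"},  # 104 has no row in STUDENT
]

def entity_integrity_ok(rows, key):
    """Primary key values must be non-NULL and unique."""
    values = [row[key] for row in rows]
    return None not in values and len(set(values)) == len(values)

def dangling_references(child, fk, parent, pk):
    """Child rows whose non-NULL foreign key matches no parent primary key."""
    parent_keys = {row[pk] for row in parent}
    return [row for row in child if row[fk] is not None and row[fk] not in parent_keys]

print(entity_integrity_ok(STUDENT, "RollNo"))                    # True
print(dangling_references(ENROLL, "RollNo", STUDENT, "RollNo"))  # the row with RollNo 104
```

A DBMS enforces these rules at insert and update time; the sketch only detects violations after the fact.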
4. Database Languages
Languages used to interact with databases:
- Procedural (HOW + WHAT): Relational Algebra, PL/SQL
Example (RA): σ Dept='CSE' (STUDENT)
- Non-Procedural (WHAT only): SQL, Tuple Relational Calculus
Example: SELECT * FROM STUDENT WHERE Dept='CSE';
5. Relational Algebra (Basic Operators)
Relational Algebra is a procedural query language – tells how to retrieve data. In this lecture, we cover only Selection
(σ) and Projection (π).
5.1 Selection (σ) – Row Filtering
- Used to select tuples (rows) that satisfy a condition.
Syntax:
- All rows & columns: σ (Relation)
- Specific rows: σ condition (Relation)
Comparison Operators:
= (equal), != or <> (not equal), <, >, <=, >=
Character Data Rule:
- Must be enclosed in single quotes ' '
- Case-sensitive (e.g., 'Ram' ≠ 'ram')
Logical Operators:
- AND (∧) – all conditions must hold
- OR (∨) – at least one condition holds
- NOT (¬) – negates a condition
Examples (STUDENT):
1. σ Dept='CSE' (STUDENT) → Students from CSE
2. σ Semester>3 (STUDENT) → Students in higher semesters
3. σ Dept='CSE' ∧ Semester=5 (STUDENT) → CSE students in 5th semester
4. σ Dept='CSE' ∨ Dept='ECE' (STUDENT) → Students from CSE or ECE
5. σ ¬(Dept='CSE') (STUDENT) → Students not in CSE
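⚠️ Common Mistake Alert:
Do not write σ Color='Red' ∧ Color='Green' (BOAT) to find boats that are red or green.
A single tuple cannot have two different values in the same column, so this condition is never true and the result is always empty. Use ∨ instead: σ Color='Red' ∨ Color='Green' (BOAT).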
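Selection can be sketched in Python as a filter over rows. The function name `select` and the dictionary representation are assumptions made for illustration. The last query mirrors the common mistake on the STUDENT relation: combining two different values of the same column with AND.

```python
STUDENT = [
    {"RollNo": 101, "Name": "Amit", "Dept": "CSE", "Semester": 5},
    {"RollNo": 102, "Name": "Neha", "Dept": "ECE", "Semester": 3},
    {"RollNo": 103, "Name": "Rahul", "Dept": "CSE", "Semester": 5},
]

def select(relation, predicate):
    """σ: keep only the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]

cse = select(STUDENT, lambda t: t["Dept"] == "CSE")
cse_sem5 = select(STUDENT, lambda t: t["Dept"] == "CSE" and t["Semester"] == 5)
cse_or_ece = select(STUDENT, lambda t: t["Dept"] == "CSE" or t["Dept"] == "ECE")
not_cse = select(STUDENT, lambda t: not (t["Dept"] == "CSE"))

# Same column, two different values, joined by AND: never true for any tuple.
impossible = select(STUDENT, lambda t: t["Dept"] == "CSE" and t["Dept"] == "ECE")

print([t["Name"] for t in cse])      # ['Amit', 'Rahul']
print([t["Name"] for t in not_cse])  # ['Neha']
print(impossible)                    # []
```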
5.2 Projection (π) – Column Filtering
- Used to select attributes (columns).
Syntax:
- All rows, specific columns: π column1, column2 (Relation)
- Specific rows + columns: π column1, column2 (σ condition (Relation))
Examples (STUDENT):
1. π Name, Dept (STUDENT) → Shows only Name and Dept of all students.
2. π Name (σ Dept='CSE' (STUDENT)) → Shows Names of students from CSE.
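Projection can be sketched the same way. One detail worth seeing in code is that the result of π is a relation, so duplicate tuples are removed. The function name `project` is hypothetical.

```python
STUDENT = [
    {"RollNo": 101, "Name": "Amit", "Dept": "CSE", "Semester": 5},
    {"RollNo": 102, "Name": "Neha", "Dept": "ECE", "Semester": 3},
    {"RollNo": 103, "Name": "Rahul", "Dept": "CSE", "Semester": 5},
]

def project(relation, attrs):
    """π: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

name_dept = project(STUDENT, ["Name", "Dept"])
depts = project(STUDENT, ["Dept"])  # CSE appears twice in STUDENT, once here
cse_names = project([t for t in STUDENT if t["Dept"] == "CSE"], ["Name"])

print(len(name_dept))  # 3
print(depts)           # [{'Dept': 'CSE'}, {'Dept': 'ECE'}]
print(cse_names)       # [{'Name': 'Amit'}, {'Name': 'Rahul'}]
```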
DBMS – Unit 2, Lecture 2
Relational Algebra (Advanced Operators)
1. Reminder: Relational Algebra
A procedural query language (tells how to retrieve data).
Works as a mathematical foundation for SQL.
Queries are expressed using operators that take relations as input and produce
relations as output.
2. Assignment Operator ( ← )
Purpose: To store the result of a relational algebra query into a temporary
relation.
Makes complex queries easier by reusing results.
Notation:
r ← relational_algebra_expression
Example
r ← π s_name, course (σ fee_bal < 10000 (STUDENT))
π s_name (r)
First query selects students having fee balance < 10000, projects s_name and
course, and stores the result in r.
Second query displays only s_name from r.
👉 Think of it as creating a temporary logical table that can be used later.
3. Union Operator ( ∪ )
Definition: Merges the tuples of two relations into one.
Result: All tuples from both relations (duplicates removed).
Conditions:
1. Both relations must have the same degree (same number of attributes).
2. Corresponding attributes must have the same domain/data type.
Example
π c_name (ACCOUNTS) ∪ π c_name (BORROWER)
→ Displays names of all customers who either have an account or a loan (or both).
4. Intersection Operator ( ∩ )
Definition: Returns only the tuples that are common in both relations.
Conditions: Same as Union (same degree and same data type).
Example
π c_name (ACCOUNTS) ∩ π c_name (BORROWER)
→ Displays names of customers who have both an account and a loan.
5. Set Difference / Minus Operator ( – )
Definition: Returns tuples that are in the first relation but not in the second.
Conditions: Same as Union and Intersection.
Example
π c_name (ACCOUNTS) – π c_name (BORROWER)
→ Displays names of customers who have only an account but no loan.
| Operator | Meaning | Example | Result |
|----------|---------|---------|--------|
| Union (∪) | All tuples from both relations | Accounts ∪ Borrower | Customers with Account OR Loan |
| Intersection (∩) | Common tuples only | Accounts ∩ Borrower | Customers with BOTH Account & Loan |
| Set Difference (–) | Tuples in first, not in second | Accounts – Borrower | Customers with ONLY Account |
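The three set operators map directly onto Python's built-in set operations. In this sketch the customer names are invented sample data, and each relation is assumed to have already been projected onto c_name, so both sides have the same degree and domain.

```python
# π c_name (ACCOUNTS) and π c_name (BORROWER), with hypothetical names.
accounts = {("Asha",), ("Ravi",), ("Meena",)}
borrower = {("Ravi",), ("Kiran",)}

either = accounts | borrower        # union: account OR loan (duplicates removed)
both = accounts & borrower          # intersection: account AND loan
only_account = accounts - borrower  # set difference: account but no loan

print(sorted(either))        # [('Asha',), ('Kiran',), ('Meena',), ('Ravi',)]
print(sorted(both))          # [('Ravi',)]
print(sorted(only_account))  # [('Asha',), ('Meena',)]
```

Set difference is not commutative: `borrower - accounts` would instead give customers with a loan but no account.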
DBMS – Unit 2, Lecture 3
Relational Algebra – Cartesian Product ( × )
1. Why Study Cartesian Product?
In relational algebra, projection (π) works only on columns of the same relation.
But real-world queries often need attributes from different relations.
Example:
- STUDENT(roll_no, name, reg_no)
- FEE_BAL(reg_no, fee_amount)
- Query: Display student names with their fee balances.
👉 Problem: name exists in STUDENT, fee_amount exists in FEE_BAL.
👉 Solution: Use Cartesian Product ( × ) to merge both relations into a single logical
relation at runtime.
2. Definition of Cartesian Product
Binary operator in relational algebra.
Combines every tuple of one relation with every tuple of another relation.
Result Relation:
- Degree (columns): sum of attributes of both relations.
- Cardinality (rows): product of the number of tuples.
Notation:
R×S
where R and S are two relations.
3. Example Working
STUDENT Relation (3 tuples):
roll_no name reg_no
1 Amit 101
2 Neha 102
3 Rahul 103
FEE_BAL Relation (2 tuples):
reg_no fee_amount
101 5000
102 7000
Step 1: Cartesian Product
STUDENT × FEE_BAL
Result (3 × 2 = 6 tuples):
roll_no name reg_no (S) reg_no (F) fee_amount
1 Amit 101 101 5000
1 Amit 101 102 7000
2 Neha 102 101 5000
2 Neha 102 102 7000
3 Rahul 103 101 5000
3 Rahul 103 102 7000
👉 This table contains many meaningless combinations (e.g., Rahul with fee records of reg_no
101 or 102).
Step 2: Apply Selection Condition
σ STUDENT.reg_no = FEE_BAL.reg_no (STUDENT × FEE_BAL)
Filtered Result:
roll_no name reg_no fee_amount
1 Amit 101 5000
2 Neha 102 7000
Now the result makes sense: student names with their fee balances.
4. Key Points About Cartesian Product
Combines tuples from two relations (all possible pairs).
Degree = sum of degrees, Cardinality = product of cardinalities.
Produces many meaningless tuples → usually followed by Selection (σ).
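The worked example can be reproduced in a few lines of Python. This is a minimal sketch that models each relation as a list of tuples and uses `itertools.product` for ×; the positional indexing is an assumption of this representation.

```python
from itertools import product

STUDENT = [(1, "Amit", 101), (2, "Neha", 102), (3, "Rahul", 103)]  # roll_no, name, reg_no
FEE_BAL = [(101, 5000), (102, 7000)]                               # reg_no, fee_amount

# STUDENT × FEE_BAL: every student tuple paired with every fee tuple.
cross = [s + f for s, f in product(STUDENT, FEE_BAL)]

# σ STUDENT.reg_no = FEE_BAL.reg_no: index 2 is STUDENT.reg_no, index 3 is FEE_BAL.reg_no.
matched = [row for row in cross if row[2] == row[3]]

# π name, fee_amount
result = [(row[1], row[4]) for row in matched]

print(len(cross), len(cross[0]))  # 6 5  (cardinality 3 × 2, degree 3 + 2)
print(result)                     # [('Amit', 5000), ('Neha', 7000)]
```

Rahul disappears from the result because no FEE_BAL tuple has reg_no 103.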
DBMS – Unit 2, Lecture 4
Relational Algebra — Division Operator (÷)
1) Why Division Operator?
Most operators in relational algebra handle queries with conditions like “some” or “at least
one” (e.g., Union, Selection). But some queries require a “for all” condition.
Example: Find the names of students who have registered for all courses offered by the
department. Here, the keyword 'all' indicates the need for the Division Operator (÷).
2) Definition of Division Operator (÷)
The Division operator is used when:
• One relation (Dividend) has two attributes.
• Another relation (Divisor) has one attribute.
• One column of the Dividend must match the column of the Divisor.
• The result is the set of values from the other column of the Dividend that are associated
with all values in the Divisor.
Notation: R ÷ S
Where:
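• R (Dividend) → 2 columns in the basic form used in these notes.
• S (Divisor) → 1 column, which must also be a column of R.
• Result → single column relation.
In general, R may have more columns: the attributes of S must be a subset of the attributes of R, and the result contains the attributes of R that are not in S.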
3) Example 1 — Students Enrolled in All Courses
STUDENT_COURSE (s_name, c_name)
s_name c_name
Amit DBMS
Amit OS
Neha DBMS
Neha OS
Rahul DBMS
COURSE (c_name)
c_name
DBMS
OS
Query: Find students who have registered for all courses.
Operation: STUDENT_COURSE ÷ COURSE
Result:
s_name
Amit
Neha
Explanation: Amit has both DBMS and OS. Neha also has both DBMS and OS. Rahul has only
DBMS, so he is not included.
4) Example 2 — Sailors Who Reserved All Boats
BOAT (b_id, b_name, color)
b_id b_name color
1 A Red
2 B Blue
3 C Green
SP (s_id, b_id) — shows which sailor reserved which boat
s_id b_id
101 1
101 2
101 3
102 1
102 2
103 1
103 3
Query: Find s_id of sailors who reserved all boats.
Step 1: Dividend = SP(s_id, b_id), Divisor = π b_id (BOAT), because BOAT has three columns and the divisor must contain only b_id.
Step 2: Operation: SP ÷ π b_id (BOAT)
Step 3: Result:
s_id
101
Explanation: Sailor 101 reserved all boats {1, 2, 3}. Sailor 102 reserved only {1, 2} and Sailor
103 reserved only {1, 3}, so both miss at least one boat and are excluded.
5) Key Insights for Students
• Division is used only for “for all” queries.
• Dividend relation → 2 columns.
• Divisor relation → 1 column.
• Result → tuples from Dividend that are associated with all tuples of Divisor.
• If even one value is missing, the tuple is excluded.
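Both division examples can be checked with a short Python sketch. The function names are hypothetical. The second function uses the standard identity that expresses division through the basic operators, R ÷ S = π_x(R) − π_x((π_x(R) × S) − R), and is included only to show that the two formulations agree.

```python
def divide(dividend, divisor):
    """dividend: set of (x, y) pairs; divisor: set of y values.
    Returns every x that is paired with ALL y values in the divisor."""
    candidates = {x for x, _ in dividend}
    return {x for x in candidates if all((x, y) in dividend for y in divisor)}

def divide_via_basic_ops(dividend, divisor):
    xs = {x for x, _ in dividend}                      # π_x(R)
    all_pairs = {(x, y) for x in xs for y in divisor}  # π_x(R) × S
    missing = all_pairs - dividend                     # pairs that R lacks
    return xs - {x for x, _ in missing}                # drop x with any missing pair

STUDENT_COURSE = {("Amit", "DBMS"), ("Amit", "OS"), ("Neha", "DBMS"),
                  ("Neha", "OS"), ("Rahul", "DBMS")}
COURSE = {"DBMS", "OS"}

SP = {(101, 1), (101, 2), (101, 3), (102, 1), (102, 2), (103, 1), (103, 3)}
BOAT_IDS = {1, 2, 3}  # π b_id (BOAT)

print(sorted(divide(STUDENT_COURSE, COURSE)))  # ['Amit', 'Neha']
print(divide(SP, BOAT_IDS))                    # {101}
```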
Joins in Relational Algebra
Introduction
In Relational Algebra, a Join is a Cartesian Product followed by a Selection. First, we take
the Cartesian product of two relations, then we apply a join condition (most often equality)
to keep only the matching tuples.
Types of Joins
We will discuss the following types of joins:
1. Theta Join
Definition: A general join where the condition (θ) can be any comparison operator like =, <,
>, ≤, ≥.
Equi Join is a special case of Theta Join.
Example: Employee ⨝ [Link] > [Link] Manager
2. Equi Join
Definition: A special case of Theta Join where the condition uses only equality (=). The result
includes all columns from both relations, so common attributes appear twice.
Example: Student ⨝ [Link] = [Link] Department
3. Natural Join
Definition: A special case of Equi Join. We do not need to write the equality condition
explicitly. It automatically joins on all columns with the same name in both relations and
keeps only one copy of each common column.
Example: Student ⨝ Department
Key Differences between Equi Join and Natural Join
| Join Type | Condition | Result |
|-----------|-----------|--------|
| Equi Join | Equality condition explicitly written | Duplicate common columns included |
| Natural Join | Automatically joins on same column names | Duplicate columns removed |
Example with Tables
Consider the following relations:
Student(SID, SName, DeptID)
Department(DeptID, DName)
Student Table
SID SName DeptID
1 A 10
2 B 20
3 C 10
Department Table
DeptID DName
10 CS
20 Math
30 Physics
Cartesian Product: 3 × 3 = 9 rows (all combinations).
Equi Join ([Link] = [Link]): 3 rows (students matched with their
department).
Natural Join: Same as Equi Join, but DeptID appears only once.
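The row and column counts above can be verified with a small Python sketch. It assumes the equi join is on DeptID, the only attribute the two schemas share; the helper names `qualify`, `theta_join` and `natural_join` are hypothetical.

```python
from itertools import product

STUDENT = [
    {"SID": 1, "SName": "A", "DeptID": 10},
    {"SID": 2, "SName": "B", "DeptID": 20},
    {"SID": 3, "SName": "C", "DeptID": 10},
]
DEPARTMENT = [
    {"DeptID": 10, "DName": "CS"},
    {"DeptID": 20, "DName": "Math"},
    {"DeptID": 30, "DName": "Physics"},
]

def qualify(row, name):
    """Prefix attribute names with the relation name so both DeptID columns survive."""
    return {f"{name}.{attr}": value for attr, value in row.items()}

def theta_join(r, r_name, s, s_name, condition):
    """Cartesian product followed by a selection on an arbitrary condition."""
    return [{**qualify(a, r_name), **qualify(b, s_name)}
            for a, b in product(r, s) if condition(a, b)]

def natural_join(r, s):
    """Equality on all same-named attributes; each common column kept once."""
    if not r or not s:
        return []
    common = [attr for attr in r[0] if attr in s[0]]
    return [{**a, **b} for a, b in product(r, s)
            if all(a[attr] == b[attr] for attr in common)]

cartesian = theta_join(STUDENT, "Student", DEPARTMENT, "Department", lambda a, b: True)
equi = theta_join(STUDENT, "Student", DEPARTMENT, "Department",
                  lambda a, b: a["DeptID"] == b["DeptID"])
natural = natural_join(STUDENT, DEPARTMENT)

print(len(cartesian), len(equi), len(natural))  # 9 3 3
print(len(equi[0]), len(natural[0]))            # 5 4  (DeptID twice vs once)
```

Department 30 (Physics) has no students, so it appears in neither join result.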
Lecture: Aggregate Functions in Extended Relational Algebra
1. Motivation
- Basic Relational Algebra (RA) can only select, project, join, and manipulate tuples.
- It cannot compute summaries/statistics (like total salary, average marks, number of
students).
- For this, Extended Relational Algebra introduces the aggregate operator.
2. Definition
Aggregate Function → A function that takes a set of values as input and returns a single
value.
Common Aggregate Functions:
- SUM(A) → sum of values of attribute A
- AVG(A) → average of values of attribute A
- MAX(A) → maximum value
- MIN(A) → minimum value
- COUNT(A) → number of tuples
- COUNT DISTINCT(A) → number of unique values
3. Notations
Two notations are commonly used:
(i) General / Mathematical Notation:
G_{grouping_attributes, F(A)}(R)
(ii) Book-style Notation:
grouping_attributes G F(A)(R)
Both mean the same thing, just written differently.
4. Simple Examples (No Grouping)
Relation: Student(rollno, name, dept, marks)
1. Sum of all students’ marks:
G_{SUM(marks)}(Student) or G SUM(marks)(Student)
2. Average marks of all students:
G_{AVG(marks)}(Student)
3. Total number of students:
G_{COUNT(rollno)}(Student)
5. Grouping Examples
Relation: Student(rollno, name, dept, marks)
Example: Average marks per department:
G_{dept, AVG(marks)}(Student)
or dept G AVG(marks)(Student)
Output:
| dept | AVG(marks) |
|------|------------|
| CS | 75 |
| IT | 75 |
6. Multiple Aggregates Together
Example: Find average and maximum marks in each department:
dept G AVG(marks), MAX(marks)(Student)
Output:
| dept | AVG(marks) | MAX(marks) |
|------|------------|------------|
| CS | 75 | 80 |
| IT | 75 | 90 |
7. Aggregates with Selection
We can filter rows before applying aggregation.
Example: Average marks of CS students only:
G_{AVG(marks)}(σ_{dept='CS'}(Student))
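The grouping operator can be sketched in Python as "partition by the grouping attribute, then apply each function to every partition". The notes do not give the input rows, so the names and marks below are invented so that they reproduce the output tables shown above; the function names `group_aggregate` and `avg` are hypothetical.

```python
from collections import defaultdict

# rollno, name, dept, marks (hypothetical sample data)
STUDENT = [
    (1, "A", "CS", 70),
    (2, "B", "CS", 80),
    (3, "C", "IT", 60),
    (4, "D", "IT", 90),
]

def avg(values):
    return sum(values) / len(values)

def group_aggregate(rows, group_index, value_index, functions):
    """dept G F1(marks), F2(marks), ... : one result entry per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_index]].append(row[value_index])
    return {g: tuple(f(vals) for f in functions) for g, vals in groups.items()}

per_dept = group_aggregate(STUDENT, 2, 3, [avg, max])
print(per_dept)  # {'CS': (75.0, 80), 'IT': (75.0, 90)}

# Aggregate with selection: filter first, then aggregate without grouping.
cs_avg = avg([marks for (_, _, dept, marks) in STUDENT if dept == "CS"])
print(cs_avg)    # 75.0
```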
8. Complex Example (Nested Aggregates)
Query: Find the names of students whose age is greater than the age of all IT department
students.
Relation: Student(rollno, name, dept, age)
Stepwise Solution:
1. Find maximum age in IT dept:
T1 = π_{MAX(age)}(σ_{dept='IT'}(dept G MAX(age)(Student)))
2. Remove IT students from the table:
T2 = Student - σ_{dept='IT'}(Student)
3. Compare other students’ ages with IT max age:
T3 = σ_{[Link] > [Link](age)}(T2 × T1)
4. Project only names:
Answer = π_{name}(T3)
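The four steps can be traced in Python with one intermediate variable per step. The rows below are invented sample data, and the sketch assumes at least one IT student exists (otherwise `max` has nothing to work on).

```python
# rollno, name, dept, age (hypothetical sample data)
STUDENT = [
    (1, "Amit", "IT", 20),
    (2, "Neha", "IT", 22),
    (3, "Rahul", "CS", 23),
    (4, "Sara", "CS", 21),
    (5, "Vikram", "ECE", 25),
]

# Step 1: T1 = maximum age among IT students.
t1 = max(age for (_, _, dept, age) in STUDENT if dept == "IT")

# Step 2: T2 = Student minus the IT students.
t2 = [row for row in STUDENT if row[2] != "IT"]

# Step 3: T3 = tuples of T2 whose age exceeds the IT maximum.
t3 = [row for row in t2 if row[3] > t1]

# Step 4: project only the names.
answer = {row[1] for row in t3}

print(t1)              # 22
print(sorted(answer))  # ['Rahul', 'Vikram']
```

Being older than the oldest IT student is equivalent to being older than all IT students, which is why a single MAX is enough.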
9. General Strategy for Complex Queries
1. Break the query into sub-tasks.
2. Store each result as an intermediate relation (logical table).
3. Reuse them with set operations (−, ×) and selection/projection.
✅Summary
- Aggregate functions = SUM, AVG, MIN, MAX, COUNT, COUNT DISTINCT
- Use G operator with/without grouping attributes
- Combine with selection (σ), projection (π), set operations (−, ×) for complex queries
- Store intermediate results in logical tables for readability
Lecture Notes: Generalized Projection & Outer Join in Extended Relational Algebra
1. Generalized Projection
1.1 Motivation
- Basic Projection (π) can only select specific attributes (columns).
- In real queries, we may need computed attributes (like salary * 1.10), renaming of attributes, or storing results for
reuse.
- For this, Extended Relational Algebra introduces Generalized Projection.
1.2 Definition
Generalized Projection allows:
1. Selecting attributes (columns)
2. Applying arithmetic operations (+, −, *, /) on attributes
3. Renaming attributes at runtime
4. Storing results in auxiliary tables for reuse
5. Updating relations using projection + set operations
1.3 Notation
Generalized Projection is denoted as:
π_{expressions}(R)
Where expressions may include:
- Attributes
- Arithmetic operations
- Renaming (→ or ← notation)
1.4 Examples
(i) Basic Projection:
π_{e_name, salary}(Emp)
(ii) Projection with Arithmetic Operation (Increase salary by 10%):
π_{e_name, salary * 1.10}(Emp)
(iii) Projection with Renaming:
π_{e_name, salary * 1.10 ← new_sal}(Emp)
(iv) Projection + Selection:
π_{e_name, salary * 1.10 ← new_sal}(σ_{c_name='SBC'}(Emp))
1.5 Using Generalized Projection for Updates
Example: Increase salary of all employees in company SBC by 10%.
Step 1: Compute updated salaries for SBC employees:
R1 = π_{e_name, salary * 1.10 ← salary, c_name}(σ_{c_name='SBC'}(Emp))
Step 2: Remove old SBC employees from Emp:
Emp := Emp - σ_{c_name='SBC'}(Emp)
Step 3: Add updated employees back:
Emp := Emp ∪ R1
→ Now the Emp table is updated with new salaries.
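The three-step update can be sketched with Python sets. The employee rows are invented sample data, Emp is assumed to have the schema (e_name, salary, c_name), and `round` is used only to avoid floating-point noise from multiplying by 1.10.

```python
# e_name, salary, c_name (hypothetical sample data)
EMP = {
    ("Asha", 50000, "SBC"),
    ("Ravi", 40000, "SBC"),
    ("Meena", 60000, "XYZ"),
}

# σ c_name='SBC' (Emp)
sbc = {t for t in EMP if t[2] == "SBC"}

# Step 1: R1 = generalized projection with salary * 1.10 renamed to salary.
r1 = {(name, round(salary * 1.10, 2), company) for (name, salary, company) in sbc}

# Step 2: remove the old SBC tuples.  Step 3: add the updated tuples back.
emp = (EMP - sbc) | r1

for row in sorted(emp):
    print(row)
```

Employees of other companies pass through unchanged because they are never removed in Step 2.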
1.6 Summary
- Basic Projection (π) → only selects columns.
- Generalized Projection → allows arithmetic, renaming, and updates.
- Strategy: Compute new values, remove old tuples, add updated tuples.
2. Outer Join
2.1 Motivation
- Inner Join (⋈) drops unmatched tuples.
- In many cases, we need to keep unmatched tuples (with NULL for missing values).
- Outer Join solves this by retaining unmatched tuples and filling NULLs.
2.2 Types of Outer Join
1. Left Outer Join (⟕): Keeps all tuples from the left relation; unmatched tuples are padded with NULLs.
2. Right Outer Join (⟖): Keeps all tuples from the right relation; unmatched tuples are padded with NULLs.
3. Full Outer Join (⟗): Keeps all tuples from both relations; unmatched tuples are padded with NULLs.
2.3 Notation
- Left Outer Join: R ⟕_θ S
- Right Outer Join: R ⟖_θ S
- Full Outer Join: R ⟗_θ S
where θ is the join condition.
2.4 Example Scenario
Relations:
Supplier(s_id, s_name)
Supply(s_id, p_id, quantity)
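Normal join (⋈): Supplier ⋈ Supply → drops suppliers that have no matching tuple in Supply. A supplier whose quantity is NULL still matches on s_id and is kept.
Outer join also keeps the unmatched suppliers, with NULLs in the Supply attributes.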
2.5 Example
Query: Find all suppliers with their supply details, including suppliers that supply no parts and suppliers whose quantity is not known.
Left Outer Join:
Supplier ⟕_{Supplier.s_id = Supply.s_id} Supply
Result includes:
- Suppliers with matching parts.
- Suppliers without parts (NULL in part fields).
- Suppliers with unknown quantity (NULL in quantity).
2.6 Step-by-Step Illustration
Supplier Table:
| s_id | s_name |
|------|---------|
| S1 | Alpha |
| S2 | Beta |
| S3 | Gamma |
Supply Table:
| s_id | p_id | quantity |
|------|------|----------|
| S1 | P1 | 100 |
| S2 | P2 | NULL |
Left Outer Join Result:
| s_id | s_name | p_id | quantity |
|------|--------|------|----------|
| S1 | Alpha | P1 | 100 |
| S2 | Beta | P2 | NULL |
| S3 | Gamma | NULL | NULL |
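The illustration above can be reproduced with a minimal Python sketch, using `None` to stand for NULL. The function names are hypothetical. An inner join is included for contrast: it keeps S2 (whose quantity is NULL but whose s_id matches) and loses only S3.

```python
SUPPLIER = [("S1", "Alpha"), ("S2", "Beta"), ("S3", "Gamma")]  # s_id, s_name
SUPPLY = [("S1", "P1", 100), ("S2", "P2", None)]               # s_id, p_id, quantity

def inner_join(left, right):
    """Keep only supplier tuples that have a matching s_id in Supply."""
    return [(s_id, s_name, p_id, qty)
            for (s_id, s_name) in left
            for (r_id, p_id, qty) in right if r_id == s_id]

def left_outer_join(left, right):
    """Like inner_join, but unmatched left tuples are padded with None."""
    result = []
    for s_id, s_name in left:
        matches = [(p_id, qty) for (r_id, p_id, qty) in right if r_id == s_id]
        if not matches:
            matches = [(None, None)]
        result.extend((s_id, s_name, p_id, qty) for p_id, qty in matches)
    return result

print(inner_join(SUPPLIER, SUPPLY))
# [('S1', 'Alpha', 'P1', 100), ('S2', 'Beta', 'P2', None)]
print(left_outer_join(SUPPLIER, SUPPLY))
# [('S1', 'Alpha', 'P1', 100), ('S2', 'Beta', 'P2', None), ('S3', 'Gamma', None, None)]
```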
2.7 Why Outer Join?
- Ensures no tuple is lost from one or both relations.
- Useful for incomplete data queries, such as:
• Suppliers with no parts.
• Parts not supplied by anyone.
• Employees without departments, or departments without employees.
2.8 Summary
- Inner Join (⋈) drops unmatched tuples.
- Outer Join (⟕, ⟖, ⟗) preserves them with NULLs.
- Left Outer Join: keeps left.
- Right Outer Join: keeps right.
- Full Outer Join: keeps both.