UNIT 1
DATABASE MANAGEMENT SYSTEMS
Authors – Raghu Ramakrishnan, Johannes Gehrke
Edition – 3
Unit -1
• History of Database Systems, Database System Applications, Database
System vs File System – View of Data – Data Abstraction – Instances and
Schemas – Data Models – the ER Model – Relational Model – Other
Models
• Database Languages – DDL, DML – Transaction Management
• Database System Structure – Storage Manager – the Query Processor.
• Database Design and E-R Diagrams – Beyond E-R Design – Entities,
Attributes and Entity Sets – Relationships and Relationship Sets –
Additional Features of ER Model – Conceptual Design with the ER Model –
Conceptual Design for Large Enterprises.
Introduction to DBMS
• A database management system consists of two parts:
• Database and
• Management System
What is Database?
• To find out what a database is, we have to start from data, which is the
basic building block of any DBMS.
• Data: Raw information, Facts, Figures, Statistics etc. having no
particular meaning [ e.g. 1, ABC, 10 etc]
• Record: collection of related data items. In the previous example the
three data items had no meaning. But if we organize them in the
following way, then they collectively represent meaningful
information.
[Figure: Table 1 and Table 2]
• Like object-based models, record-based models also describe data at the
conceptual and view levels. These models specify the logical structure of the
database with records, fields and attributes.
1. Relational Model:
1. This is the most widely accepted data model.
2. In this model, the database is represented as a collection of
relations, each in the form of a two-dimensional table of rows
and columns.
3. Each row is known as a tuple (a tuple contains all the data for an
individual record) while each column represents an attribute.
2. Hierarchical Model:
• In this data model, the data is organized in a hierarchical tree-like structure.
3. Network Model:
• The network model is the same as the hierarchical model except that it has a
graph-like structure rather than a tree-based one. Unlike the hierarchical
model, it allows each record to have more than one parent record.
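The relational model described above can be tried out with a few lines of Python. This is a minimal sketch using the standard-library sqlite3 module; the table name, column names and sample rows are illustrative assumptions, not part of the course material. It shows that the columns of a relation are its attributes and that each row comes back as one tuple.

```python
import sqlite3

# In-memory database; the "student" relation and its rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 21)],
)

cursor = conn.execute("SELECT * FROM student ORDER BY stud_id")
attributes = [col[0] for col in cursor.description]  # column names = attributes
tuples = cursor.fetchall()                           # one tuple per row

print(attributes)
for row in tuples:
    print(row)
```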
3. Physical Data Models - Types of Data Model
1. Frame Memory Model:
• Frame memory is a virtual view of secondary storage that can be implemented
with reasonable overhead to support database record storage and accessing
requirements
2. Unifying Model:
• A unified data model brings together data from different sources and platforms
in one place so that all your data is considered when conducting analyses and
making decisions.
• Unified data is important because it enables you to look at every collected data
point so you can understand the entire data narrative.
Data Independence
• Data independence refers to the ability to
modify the schema at one level of the
database system without altering the schema at the
next higher level.
Types of Data Independence:
1. Physical Data Independence: This is defined as
the ability to modify the physical schema of the
database without the modification causing any
changes in the logical/conceptual or
view/external level.
2. Logical Data Independence: Logical data
independence is the ability to modify the logical
schema without requiring changes to the
external schema or forcing application
programs to be rewritten.
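One common way to get logical data independence is to let applications read through a view. The sketch below assumes a hypothetical employee table and a view named employee_public (both names are invented for illustration) and uses Python's built-in sqlite3 module. The logical schema is reorganized into two tables, the view is redefined, and the application function keeps working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original logical schema: a single table.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 50000)")
# External schema: applications read only this view.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM employee")
conn.commit()

def read_public(connection):
    """Application code: depends on the view, not on the base tables."""
    return connection.execute(
        "SELECT emp_id, name FROM employee_public ORDER BY emp_id"
    ).fetchall()

before = read_public(conn)

# Change the logical schema: split the table in two and redefine the view.
conn.execute("CREATE TABLE emp_core (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE emp_pay (emp_id INTEGER PRIMARY KEY, salary REAL)")
conn.execute("INSERT INTO emp_core SELECT emp_id, name FROM employee")
conn.execute("INSERT INTO emp_pay SELECT emp_id, salary FROM employee")
conn.commit()
conn.execute("DROP VIEW employee_public")
conn.execute("DROP TABLE employee")
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM emp_core")

after = read_public(conn)  # same function, same result
```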
Database Languages
• Database Language is a particular type of programming language used to define
and manipulate a database. Based on their application, database languages are
classified into four types: DDL, DML, DCL, and TCL. Database languages are
used to perform various critical tasks that help a database management system
function correctly.
1. Data Definition Language:
• Data definition language is used to define the database schema. The
resulting metadata (the tables and schemas, their names, indexes, the
columns in each table, constraints, etc.) is stored in the data dictionary.
• Here are some tasks that come under DDL:
• Create: It is used to create objects in the database.
• Alter: It is used to alter the structure of the database.
• Drop: It is used to delete objects from the database.
• Truncate: It is used to remove all records from a table.
• Rename: It is used to rename an object.
• Comment: It is used to comment on the data dictionary.
2. Data Manipulation Language:
• DML stands for Data Manipulation Language. It is used for accessing
and manipulating data in a database. It handles user requests.
• Here are some tasks that come under DML:
• Select: It is used to retrieve data from a database
• Insert: It is used to insert data into a table
• Update: It is used to update existing data within a table
• Delete: It is used to delete records from a table (all of them, or only
those that match a condition).
• Merge: It performs UPSERT operation, i.e., insert or update
operations
• Call: It is used to call a stored subprogram (for example, a PL/SQL
or Java subprogram)
• Explain Plan: It describes the access path the DBMS will use to
retrieve the data
• Lock Table: It controls concurrency.
3. Data Control Language:
• DCL stands for Data Control Language. It is used to control who may
access the stored data, by granting and revoking privileges.
• Whether DCL statements are transactional and can be rolled back depends
on the DBMS.
• Here are some tasks that come under DCL:
• Grant: It is used to give user access privileges to a database
• Revoke: It is used to take back permissions from the user
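The DDL and DML tasks listed above can be tried with Python's built-in sqlite3 module. This is only a sketch: the student table and its rows are assumptions made for illustration, and SQLite does not implement TRUNCATE, COMMENT, GRANT or REVOKE, so those statements are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def column_names(connection, table):
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]

# DDL: define and change the structure of the database.
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN ques_solved INTEGER")
columns = column_names(conn, "student")

# DML: work with the rows stored in that structure.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 60), (2, "Ravi", 40), (3, "Meena", 75)],
)
conn.execute("UPDATE student SET ques_solved = 55 WHERE stud_id = 2")
conn.execute("DELETE FROM student WHERE stud_id = 3")  # removes only the matching row
remaining = conn.execute(
    "SELECT stud_id, name, ques_solved FROM student ORDER BY stud_id"
).fetchall()
conn.commit()

# DDL again: DROP removes the table itself, not just its rows.
conn.execute("DROP TABLE student")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

Note that the DELETE carries a WHERE clause; without it, every row of the table would be removed.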
Database System Structure
• The typical structure of a DBMS is based on the relational data model.
• The top part of the architecture shows the application interfaces used by
naive users, the application programs created by application programmers,
the query tools used by sophisticated users and the administration tools
used by the database administrator.
• A database system is partitioned into modules that deal with each of the
responsibilities of the overall system.
• The functional components of a database system can be broadly divided
into the storage manager and the query processor components.
• The storage manager is important because databases typically require a
large amount of storage space.
• The query processor is important because it helps the database system
simplify and facilitate access to data.
Two-tier architecture
• A two-tier architecture, also known as a
client-server architecture, consists of two
main layers: a client layer and a server layer.
• The client layer is responsible for interacting
with the user and presenting data to them,
while the server layer is responsible for
processing requests and managing the
application's data and business logic
Three-tier architecture
• A three-tier architecture, also known as a
multi-tier architecture, consists of three main
layers: a presentation layer, an application
layer, and a data layer.
• The presentation layer is responsible for
interacting with the user and presenting data
to them, the application layer is responsible
for processing requests and managing the
application's business logic, and the data
layer is responsible for managing the
application's data.
Query Processor: The interactive query processor helps the database
system to simplify and facilitate access to data. It consists of DDL
interpreter, DML compiler and query evaluation engine.
The following are the components of the query processor and their
functions:
• DDL interpreter: This is a translator which interprets DDL statements
and records the resulting definitions in the data dictionary.
• DML compiler: It translates DML statements in a query language into an
evaluation plan. This plan consists of low-level instructions that the
query evaluation engine understands.
• Query evaluation engine: It executes the low-level instructions
generated by the DML compiler.
Storage manager:
• The storage manager is responsible for storing, retrieving, and
updating data in the database. The storage manager components
include
• Authorization and integrity manager: Validates the users who want
to access the data and tests for integrity constraints.
• Transaction manager: Ensures that the database remains in a consistent
state despite system failures, and that concurrent transactions
execute without conflicting.
• File manager: Manages allocation of space on disk storage and
representation of the information on disk.
• Buffer manager: Manages the fetching of data from disk storage into
main memory. The buffer manager also decides what data to cache in
main memory. Buffer manager is a crucial part of database system.
Disk Storage: It contains the following components –
Data Files: It stores the data.
Relational Algebra
2. Optimization:
1. After parsing the query, the DBMS looks for the most efficient way
to execute it.
2. The optimizer considers factors such as available indexes, join methods,
and other optimization mechanisms, and from these it chooses the best
execution plan for the query. The main goal of this step is to retrieve the
required data with minimal cost in terms of resources and time.
3. Evaluation:
• After finding the best execution plan, the DBMS executes the optimized
query and returns the results from the database.
• In this step, DBMS can perform operations on the data. These operations are
selecting the data, inserting something, updating the data, and so on.
Example:
Let's consider an example to show you the query processing steps. Suppose we have a
database with a table “student”. It contains the stud_id, first_name, last_name, and
ques_solved. The following SQL query is used to retrieve the names of all students
whose ques_solved is greater than 50:
select first_name, last_name from student where ques_solved >50;
Parsing: First, the query is parsed, which also checks whether its syntax is
correct. The query is then converted into a parse tree.
[Figure: parse tree of the query, with the student table as a leaf]
• Optimization:
• The DBMS determines the most efficient way to execute the query
by considering factors such as whether an index exists on the
ques_solved field. In this case, the DBMS might use an index on
the ques_solved field to efficiently retrieve the matching rows.
• Evaluation:
• The DBMS executes the optimized query and retrieves the results
from the database. It then returns the first_name and last_name of
all students whose ques_solved is greater than 50.
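The example can be reproduced with Python's built-in sqlite3 module. This is a sketch under assumptions: the sample rows and the index name idx_ques_solved are invented, and SQLite stands in for whatever DBMS the course uses. EXPLAIN QUERY PLAN asks SQLite to describe the plan its optimizer chose; the exact wording of that description varies between SQLite versions, so it is printed rather than compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "stud_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, ques_solved INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Rao", 72), (2, "Ravi", "Kumar", 35), (3, "Meena", "Iyer", 51)],
)
# Hypothetical index the optimizer may decide to use.
conn.execute("CREATE INDEX idx_ques_solved ON student (ques_solved)")

query = "SELECT first_name, last_name FROM student WHERE ques_solved > 50"

# Parsing and optimization: the last column of each plan row is a text description.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan:
    print(step)

# Evaluation: run the query and collect the matching rows.
result = sorted(conn.execute(query).fetchall())
print(result)
```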
Database Design - Entity Relationship(ER)
Modelling
The database design process can be divided into 6 steps. The ER model is
most relevant to the first three steps:
• Requirement Analysis : involves understanding user needs,
analyzing existing systems, and identifying frequent operations
for performance requirements.
• Conceptual database design: employed to create a broad
overview of the data and associated constraints for the database.
Typically, the (ER) model is used in this step.
• Logical database design: In this step we convert the conceptual
database design (ER) into a relational database schema.
Beyond ER design, the remaining three steps are:
• Schema Refinement: involves analyzing the relations in the
relational database schema to identify and address potential issues
through a more objective and theory-guided process.
• Physical Database Design: This step considers the anticipated
workloads and refines the database design to meet the
desired performance criteria.
• Application and Security Design: This step involves recognizing
distinct user groups and their respective roles. It requires specifying
which parts of the database each group can access, implementing
restrictions to ensure authorized access, and safeguarding against
unauthorized access to sensitive areas.
ER (Entity Relationship ) Model
• ER stands for Entity-Relationship. The ER model is a high-level
data model used to define the data elements and the
relationships between them for a specified system.
• It produces a conceptual design for the database, giving a
simple, easy-to-design view of the data.
• For example, suppose we design a school database. In this database,
the student will be an entity with attributes like address, name, id, age,
etc.
[Figure: ER diagram of the Student entity with attributes ID, Name, Address and Phn_No]
Components of ER diagram
1. Entities: An entity is anything in the real world, such as an object,
class, person, or place.
Example:
STUDENT
• Entity types are the basic building blocks for describing the structure
of data.
Types of Entity:
1. Strong Entity:
• A strong entity is not dependent on any other entity in the schema.
• A strong entity will always have a primary key.
• Strong entities are represented by a single rectangle.
• The relationship of two strong entities is represented by a single diamond.
2. Weak Entity:
• A weak entity is dependent on a strong entity to ensure its existence.
• A weak entity does not have a primary key of its own.
• It instead has a partial key, also called a discriminator.
• A weak entity is represented by a double rectangle.
• The relationship between one strong and one weak entity is represented by a
double diamond.
• This relationship is also known as identifying relationship.
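When a weak entity is mapped to a table, its key is usually formed from the owner's key plus the partial key. The sketch below assumes a hypothetical employee (strong) and dependent (weak) pair, which is not taken from the slides, and uses Python's sqlite3 module. A dependent row cannot outlive its employee because of ON DELETE CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Strong entity: has its own primary key.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
# Weak entity: dep_name alone is only a partial key (discriminator).
conn.execute("""
    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id) ON DELETE CASCADE,
        dep_name TEXT NOT NULL,
        PRIMARY KEY (emp_id, dep_name)
    )
""")

conn.execute("INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi')")
# The same dep_name may appear under different owners.
conn.execute("INSERT INTO dependent VALUES (1, 'Kiran'), (2, 'Kiran')")
count_before = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]

# Deleting the owner removes its dependents as well.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
left = conn.execute("SELECT emp_id, dep_name FROM dependent").fetchall()
```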
2. Attributes:
• An attribute in an Entity-Relationship
Model describes a property or
characteristic of an entity.
• It is represented by an oval or
ellipse in the ER diagram.
• Every oval represents one attribute
and is directly connected to its entity,
which is drawn as a rectangle.
• For example, employee_id,
employee_name, Gender,
employee_age, Salary, and Mobile no.
are the attributes which define the entity
type Employee.
Types of attributes:
• Simple Attributes: Cannot be divided into sub-attributes. Also called
atomic attributes.
Example: First_Name, Middle_Name and Last_Name are simple attributes
that together make up Name.
• Single Valued Attributes: Single-valued attributes are attributes that have only
one value for each instance of an entity and cannot store more than one value.
Example: a student's ID is single-valued, because each student has exactly
one ID.
• Multi Valued Attributes: An attribute that can hold multiple values at a time
for a single entity, for example the phone numbers of a person.
• Derived attributes: Values that can be derived from other attributes
and are always dependent on other attributes for their value, for example
age derived from date of birth.
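The attribute types above can be pictured with a small Python dataclass. This is a sketch, not a prescribed mapping: the Student fields, the grouping of Name from First_Name, Middle_Name and Last_Name, and the age_on helper are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Name:
    # Three simple (atomic) attributes grouped under one name.
    first_name: str
    middle_name: str
    last_name: str

@dataclass
class Student:
    stud_id: int                 # simple, single-valued
    name: Name                   # built from simple attributes
    date_of_birth: date          # simple, single-valued
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

student = Student(
    stud_id=1,
    name=Name("Asha", "K", "Rao"),
    date_of_birth=date(2004, 5, 17),
    phone_numbers=["555-0100", "555-0101"],
)
age_day_before_birthday = student.age_on(date(2024, 5, 16))
age_on_birthday = student.age_on(date(2024, 5, 17))
```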
Types of Relationships:
• One-to-One relationship (1:1): When a single instance of an entity is
associated with a single instance of another entity, it is called a one-to-one
relationship.
• One-to-Many relationship (1:M): When a single instance of an entity is
associated with more than one instance of another entity, it is called a
one-to-many relationship.
• Many-to-One relationship (M:1): When more than one instance of an entity is
related to a single instance of another entity, it is called a many-to-one
relationship.
• Many-to-Many relationship (M:M): When more than one instance of an entity
is associated with more than one instance of another entity, it is called a
many-to-many relationship.
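One possible way to realize these relationship types in tables is sketched below with Python's sqlite3 module. All table and column names are assumptions made for illustration. A foreign key on the "many" side gives one-to-many (and, read from the other side, many-to-one), a UNIQUE foreign key gives one-to-one, and a separate linking table gives many-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    -- One-to-many: one department, many students.
    CREATE TABLE student (
        stud_id INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    -- One-to-one: UNIQUE allows at most one card per student.
    CREATE TABLE id_card (
        card_no INTEGER PRIMARY KEY,
        stud_id INTEGER UNIQUE REFERENCES student(stud_id)
    );
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Many-to-many: a linking table between student and course.
    CREATE TABLE enrolled (
        stud_id   INTEGER REFERENCES student(stud_id),
        course_id INTEGER REFERENCES course(course_id),
        PRIMARY KEY (stud_id, course_id)
    );
    INSERT INTO department VALUES (1, 'CSE');
    INSERT INTO student VALUES (1, 'Asha', 1), (2, 'Ravi', 1);
    INSERT INTO id_card VALUES (100, 1);
    INSERT INTO course VALUES (10, 'DBMS'), (11, 'OS');
    INSERT INTO enrolled VALUES (1, 10), (1, 11), (2, 10);
""")

def count(sql):
    return conn.execute(sql).fetchone()[0]

students_in_cse = count("SELECT COUNT(*) FROM student WHERE dept_id = 1")
courses_of_asha = count("SELECT COUNT(*) FROM enrolled WHERE stud_id = 1")
students_in_dbms = count("SELECT COUNT(*) FROM enrolled WHERE course_id = 10")

# A second card for the same student violates the one-to-one rule.
try:
    conn.execute("INSERT INTO id_card VALUES (101, 1)")
    second_card_accepted = True
except sqlite3.IntegrityError:
    second_card_accepted = False
```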
4. Relationship Set:
• A relationship set is a set of relationships of same type.
Example:
• {Emp_ID}
• {Emp_Aadhar}
Example
The candidate keys Stud ID, Roll No., and Email allow us to identify
each student record individually.
Primary Key: The candidate key chosen to identify the rows of a table. It
must be unique and not NULL, and its values should not be updated.
• A value must be present in the Primary Key column, and the Primary Key
field cannot be left NULL.
• No two rows in the table may have the same value in that particular
column.
• Values in the primary key column should not be changed once assigned.
• Example: in a table with the columns Emp_ID, Emp_Name, Emp_Aadhar,
Emp_Salary, Emp_phone and Emp_Email, either {Emp_ID} or {Emp_Aadhar}
can be chosen as the primary key.
Alternate Key: An alternate key is a candidate key that is capable of
identifying a row uniquely but was not chosen as the primary key. Out of
all the candidate keys, only one is selected as the primary key; the
remaining ones are known as Alternate Keys or Secondary Keys.
Example:
• In a table with the columns Emp_ID, Emp_Name, Emp_Aadhar, Emp_Salary,
Emp_phone and Emp_Email, Emp_phone and Emp_Email can be unique keys.
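The difference between the primary key and the alternate keys can be sketched with Python's sqlite3 module. The column names follow the employee example above in lower case; the sample values and the choice of emp_id as primary key are assumptions. PRIMARY KEY marks the chosen key and UNIQUE marks the remaining candidate keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,   -- chosen primary key
        emp_name   TEXT NOT NULL,
        emp_aadhar TEXT NOT NULL UNIQUE,  -- alternate key
        emp_salary REAL,
        emp_phone  TEXT UNIQUE,           -- alternate key
        emp_email  TEXT UNIQUE            -- alternate key
    )
""")
conn.execute(
    "INSERT INTO employee VALUES "
    "(1, 'Asha', '1111-2222-3333', 50000, '555-0100', 'asha@example.com')"
)

def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Same emp_id as the existing row: rejected by the primary key.
duplicate_pk = try_insert((1, "Ravi", "4444-5555-6666", 40000, "555-0101", "ravi@example.com"))
# New emp_id but repeated emp_aadhar: rejected by the alternate key.
duplicate_aadhar = try_insert((2, "Ravi", "1111-2222-3333", 40000, "555-0101", "ravi@example.com"))
# All key values distinct: accepted.
distinct_keys = try_insert((2, "Ravi", "4444-5555-6666", 40000, "555-0101", "ravi@example.com"))
```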
Foreign Key: A foreign key is an attribute (or set of attributes) in one
table that refers to the primary key of another table. It differs from a super
key, candidate key or primary key because its purpose is to link two
tables together rather than to identify rows.
[Figure: Student table and Department table linked by a foreign key]
Composite Key: Two or more attributes together form a composite key
that can uniquely identify a tuple in a table. When no single column is
unique, we look for a combination of columns that can serve as a
candidate key; such a combination is a composite key.
Example:
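A composite key can be sketched with a hypothetical enrolled table (the table, its columns and its rows are assumptions, run through Python's sqlite3 module). Neither stud_id nor course_id is unique on its own, but the pair is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrolled (
        stud_id   INTEGER NOT NULL,
        course_id INTEGER NOT NULL,
        grade     TEXT,
        PRIMARY KEY (stud_id, course_id)  -- composite key
    )
""")
conn.execute("INSERT INTO enrolled VALUES (1, 10, 'A')")
conn.execute("INSERT INTO enrolled VALUES (1, 11, 'B')")  # same student, other course
conn.execute("INSERT INTO enrolled VALUES (2, 10, 'A')")  # same course, other student

# Repeating the whole (stud_id, course_id) pair is rejected.
try:
    conn.execute("INSERT INTO enrolled VALUES (1, 10, 'C')")
    repeated_pair_accepted = True
except sqlite3.IntegrityError:
    repeated_pair_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM enrolled").fetchone()[0]
```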
Partial Key: The set of attributes used to distinguish the entities of a
weak entity set that belong to the same owner entity is called the partial
key. The partial key of the weak entity set is also known as a discriminator.
Additional features of ER model
2. Participation Constraints:
• Total Participation: Every entity in the entity set is involved in the
relationship. Total participation is represented by double lines.
• Partial Participation: Not all entities in the entity set are involved in
the relationship. Partial participation is represented by single lines.
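In a relational schema, one simple way to approximate these constraints is through NULL handling on a foreign key. The sketch below is an assumption-laden illustration using Python's sqlite3 module: every employee must work in a department (total participation, NOT NULL foreign key), while having a mentor is optional (partial participation, nullable foreign key). This shortcut only works when each entity takes part in at most one relationship of that kind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT,
        -- Total participation: a department is required.
        dept_id   INTEGER NOT NULL REFERENCES department(dept_id),
        -- Partial participation: a mentor is optional.
        mentor_id INTEGER REFERENCES employee(emp_id)
    );
    INSERT INTO department VALUES (1, 'CSE');
""")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

without_mentor = try_insert((1, "Asha", 1, None))     # allowed: mentor is optional
without_department = try_insert((2, "Ravi", None, 1)) # rejected: department is required
```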
Additional features of ER model - 3. Generalization,
Specialization and Aggregation:
Generalization :
• Generalization is like a bottom-up approach in which two or more
entities of lower level combine to form a higher level entity if they
have some attributes in common.
• Generalization is more like subclass and superclass system, but the
only difference is the approach.
• In generalization, entities are combined to form a more generalized
entity, i.e., subclasses are combined to make a superclass.
Specialization:
• It is a top-down approach to designing an ER model, in which
subclasses of an entity type are defined that inherit the
attributes of that entity type.
• The entity type which passes on its attributes to other entities
is called the superclass of the specialization.
• In other words, specialization is the process of classifying
a class of objects into more specialized subclasses. It is a
conceptual refinement.
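The superclass and subclass idea behind generalization and specialization resembles class inheritance in a programming language. The following Python sketch uses hypothetical Person, Student and Teacher entity types (assumptions, not from the slides). Reading it bottom-up, the attributes that Student and Teacher share are collected in Person (generalization); reading it top-down, Person is refined into subclasses that add their own attributes (specialization).

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    # Superclass: attributes common to all lower-level entities.
    person_id: int
    name: str

@dataclass
class Student(Person):
    roll_no: str    # attribute specific to students

@dataclass
class Teacher(Person):
    salary: float   # attribute specific to teachers

student = Student(person_id=1, name="Asha", roll_no="CS-01")
teacher = Teacher(person_id=2, name="Ravi", salary=50000.0)

# Attributes the two subclasses have in common are exactly the inherited ones.
student_attrs = {f.name for f in fields(Student)}
teacher_attrs = {f.name for f in fields(Teacher)}
shared = sorted(student_attrs & teacher_attrs)
```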
Aggregation:
• Aggregation is an abstraction through which we can represent
relationships as a higher level entity.
• It is an abstraction concept for building composite objects from their
component objects.
• A limitation of basic ER modeling is that it has no way to
represent a relationship among relationships.
• Aggregation overcomes this limitation.
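A sketch of how aggregation might look once mapped to tables, using Python's sqlite3 module. The scenario is an assumption chosen for illustration: departments sponsor projects, and an employee monitors a particular sponsorship. The sponsors relationship is treated as a higher-level entity, so monitors refers to its whole key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project    (proj_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE employee   (emp_id  INTEGER PRIMARY KEY, name TEXT);
    -- Relationship between department and project.
    CREATE TABLE sponsors (
        dept_id INTEGER REFERENCES department(dept_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (dept_id, proj_id)
    );
    -- Relationship between an employee and a sponsors relationship (the aggregate).
    CREATE TABLE monitors (
        emp_id  INTEGER REFERENCES employee(emp_id),
        dept_id INTEGER,
        proj_id INTEGER,
        PRIMARY KEY (emp_id, dept_id, proj_id),
        FOREIGN KEY (dept_id, proj_id) REFERENCES sponsors(dept_id, proj_id)
    );
    INSERT INTO department VALUES (1, 'CSE');
    INSERT INTO project VALUES (10, 'Library System'), (11, 'Payroll');
    INSERT INTO employee VALUES (100, 'Asha');
    INSERT INTO sponsors VALUES (1, 10);
""")

def try_monitor(emp_id, dept_id, proj_id):
    """Return True if the monitoring row is accepted, False otherwise."""
    try:
        conn.execute("INSERT INTO monitors VALUES (?, ?, ?)", (emp_id, dept_id, proj_id))
        return True
    except sqlite3.IntegrityError:
        return False

existing_sponsorship = try_monitor(100, 1, 10)  # department 1 sponsors project 10
missing_sponsorship = try_monitor(100, 1, 11)   # no such sponsorship exists
```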
CONCEPTUAL DATABASE DESIGN WITH THE
ER MODEL
• Entity versus Attribute
• Entity versus Relationship
• Binary versus Ternary Relationships
• Aggregation versus Ternary Relationships