DBMS Unit-1

Data is raw unprocessed facts, while information is derived from processed data that has meaning and context. A database is an organized collection of related data or information. Database management systems (DBMS) allow users to define, construct, and manipulate databases through functions like querying, updating, and generating reports. Some key advantages of DBMSs over file-based systems include reduced data redundancy, improved data consistency, easier data integration, enhanced security, and built-in integrity checks to ensure atomic transactions. Major DBMSs have evolved from hierarchical and network models to the dominant relational model used today.

Uploaded by

prabandha putti

UNIT - I

Data:
• Facts that can be recorded.
• It is in raw or unprocessed form.
• Decisions cannot be taken based on data alone.
• Data by itself has no meaning.
Example(s):
• Text, numbers, images, videos, etc.
• 25, Karthik, Karimnagar.

Information: Data that has been organized and processed in a given context, so that it has meaning.

Data + Data + Data -> Processing -> Information

Example(s):

• "The age of Karthik is 25" gives meaning to the raw values 25 and Karthik.
• 505467 is data; "505467 is a pincode" is information.
• The marks/results of a class are data.
  The pass percentage of that class is information.
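The marks example above can be sketched in a few lines of Python. The marks and the pass mark of 40 are made-up values used only for illustration.

```python
# Raw data: marks of a class (hypothetical values, out of 100).
marks = [72, 35, 58, 91, 28, 40]
PASS_MARK = 40  # assumed pass threshold, not taken from the notes


def pass_percentage(marks, pass_mark):
    """Process raw marks into one piece of information: the pass percentage."""
    passed = sum(1 for m in marks if m >= pass_mark)
    return 100.0 * passed / len(marks)


info = pass_percentage(marks, PASS_MARK)
print(f"Pass percentage: {info:.1f}%")
```

The list of marks alone supports no decision; the single processed figure does.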

Database (db): A collection of similar or related data. Equivalently, an organized collection of related information is known as a database.

Example:

• A book can be treated as a database, and a pen is what we use to manipulate it (manually).
• A dictionary, a university (system).

Types of databases include traditional databases (text and numbers), multimedia databases (audio, video, movies, speech), Geographic Information Systems (satellite images) and real-time databases (store).

Data warehouse: A data warehouse is a kind of database in which the data is very large in volume and historical.

Example: Data about the past 100 years of a company; stock market rates.

*** Good decisions require good information that is derived from raw facts (data).

Database Management System (DBMS): A software system that allows the user to define and maintain the database and provides controlled access to it. Equivalently, a DBMS is a collection of interrelated data and a set of programs to access that data. The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.

** A DBMS is software, or a set of programs, to define, construct and manipulate a database, i.e. it helps us to add, modify and delete data in the database.

• A DBMS is an intermediate layer between programs and data.

Database (Book) + Database Management System (Pen) = Database System

Summary:

Some of the database management systems are:

• Oracle, MySQL, Microsoft SQL Server, PostgreSQL, Informix, DB2, Sybase and MongoDB (a NoSQL system), etc.

Functionalities:

1. Define: Specifying the data type, structure and constraints for the data to be
stored.
2. Construct: Process of storing data on some storage medium.
3. Manipulate: Querying the database to retrieve specific data, updating database
and generating reports.
4. Share: Allows multiple users and programs to access the database concurrently.
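The define, construct and manipulate functions listed above can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; the `student` table and its rows are hypothetical. Concurrent sharing between users is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Define: specify data types, structure and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " city TEXT)"
)

# Construct: store data on the storage medium.
conn.executemany(
    "INSERT INTO student (roll_no, name, city) VALUES (?, ?, ?)",
    [(501, "Raju", "Karimnagar"), (502, "Ravi", "Hyderabad")],
)

# Manipulate: update the database, then query it.
conn.execute("UPDATE student SET city = ? WHERE roll_no = ?", ("Warangal", 502))
rows = conn.execute(
    "SELECT roll_no, name, city FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```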

• Other functionalities include protection and maintenance.

Database System Applications:

1. Banking: For customer information, accounts, loans, and banking transactions.
2. Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner.
3. Universities: For student information, course registration and grades.
4. Credit card transactions: For purchases on credit cards and generation of monthly statements.
5. Telecommunications: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks.
6. Finance: For storing information about sales and purchases of financial instruments such as stocks and bonds, and for storing real-time market data to enable on-line trading by customers and automated trading by the firm.
7. Railway Reservation System: A database is required to keep records of ticket bookings and of train departure and arrival status. If a train is running late, people learn of it through a database update.
8. Social media sites: Millions of users sign up every day for social media accounts such as Facebook and WhatsApp to share their views and connect with friends. Storing all this user information and connecting users to one another is made possible by a DBMS.

A Historical Perspective:

Manual System – 1950s:

• Data was stored as paper records.
• Huge manpower was involved.
• Time was wasted unnecessarily, for example when searching for a particular record.
• This was inefficient.

1950s and Early 1960s:

• Data processing used magnetic tapes for storage.
• Tapes provided only sequential access.
• Data was stored in files; this is known as the file-based system.

1960s:

• Charles Bachman designed the first general-purpose DBMS, called the Integrated Data Store (IDS).
• IDS formed the basis for the network data model.

Late 1960s:

• IBM (International Business Machines) developed the Information Management System (IMS).
• IMS formed the basis for an alternative data representation framework called the hierarchical data model.

1970:

• In 1970, Edgar Codd, at IBM's research laboratory, proposed a new data representation framework called the relational data model.

1980s:

• The relational model consolidated its position as the dominant DBMS paradigm, and database systems continued to gain widespread use.
• The SQL query language for relational databases, developed as part of IBM's System R project, is now the standard query language.

Late 1980s and 1990s:

• Several vendors (e.g. IBM's DB2, Oracle 8, Informix) extended their systems with the ability to store new data types such as images and videos, and to ask more complex queries.
• Specialized systems were developed by numerous vendors for creating data warehouses.

Present era:

• DBMSs have entered the Internet age. While the first generation of websites stored their data in operating system files, the use of a database accessed through a web browser is becoming widespread.
• Queries are generated through web-accessible forms and answers are formatted using a markup language such as HTML.

File Systems vs a DBMS

Drawbacks of File Systems:

1. Data Redundancy:
The file system has no way to check for duplicate data on insertion. Any user can enter any data. The file system validates neither the kind of data being entered nor whether the same data already exists in the same file. Duplicate data wastes space and leads to confusion and misleading results. When a file contains duplicates and we need to update or delete a record, we may end up updating or deleting only one copy and leaving the other in the file.
2. Data Inconsistency:
For example, suppose the student and student_report files both hold the student's address, and there is a change request for one particular student's address. The program searches only the student file and updates the address there. Another program prints the student's report and mails it to the address in the student_report file. The two addresses now differ, and the report is sent to the old address. This mismatch between different copies of the same data is called data inconsistency.
3. Data Isolation:
Imagine we have to generate a single report for a student studying in a particular class, covering his study report, library book details and hostel information. All of this information is stored in different files. How do we get all these details into one report? We have to write a program. Before writing it, the programmer must find out which files hold the information needed, what the format of each file is, how to search each file, and so on. Only once this analysis is done can the program be written. If 2-3 files are involved, the programming is fairly simple, but with many files it requires a lot of effort. Since the data are isolated from each other in different files, programming becomes difficult.
4. Security:
Each file can be password protected. But what if we have to give access to only a few records in the file? For example, a user should be allowed to view only their own bank account details in the file. This is very difficult in a file system.

5. Integrity:
If certain criteria must be checked while data is entered into a file, the file system cannot do it directly. The check has to be written into a program; for example, restricting students to those above age 18 can be done only by program code. The file system has no direct checking facility, so integrity checks of this kind are not easy in a file system.
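By contrast, a DBMS lets the rule be declared once with the data. The sketch below uses a `CHECK` constraint in SQLite through Python's `sqlite3` module. It reads the age rule as "age must be above 18", which is an interpretation of the example; the table and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age > 18))"  # integrity rule stored with the schema
)
conn.execute("INSERT INTO student VALUES (1, 'Raju', 21)")  # accepted

try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 16)")  # violates CHECK
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS refused the row; no application code was needed

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, count)
```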
6. Atomicity:
If an insert, update or delete fails in the file system, there is no mechanism to switch back to the previous state. Consider a program to transfer 500 from the account of A to the account of B. If a system failure occurs during the execution of the program, it is possible that the 500 was removed from account A but was not credited to account B, resulting in an inconsistent database state. It is essential to database consistency that either both the credit and the debit occur, or that neither occurs, i.e. the funds transfer must be atomic: it must happen in its entirety or not at all. It is difficult to ensure atomicity in a file processing system.
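A minimal sketch of the same transfer inside a DBMS transaction, using Python's `sqlite3` module. The `account` table, the balances and the `fail_midway` flag that simulates a crash are all illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Debit src and credit dst as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back by the context manager


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(conn, "A", "B", 500, fail_midway=True)
after_failure = balances(conn)   # unchanged: neither debit nor credit happened
transfer(conn, "A", "B", 500)
after_success = balances(conn)   # both debit and credit happened
print(after_failure, after_success)
```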
7. Limited data sharing:
Data are scattered across various files; the files may have different formats and may be stored in different folders belonging to different departments. As a result, it is difficult to share data among different applications.
8. Data Mapping and Access:
Although related information is grouped and stored in different files, there is no mapping between any two files, i.e. two dependent files are not linked. Even though the Student and Student_Report files are related, they are two separate files and are not linked by any means. Hence, if we need to display a student's details along with his report, we cannot pick them directly from the two files. We have to write a lengthy program that searches the Student file first, gets all the details, then goes to the Student_Report file and searches for his report. When the amount of data is very large, searching the file system for particular information is time-consuming and inefficient.
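In a relational DBMS the two files become two tables linked by a common column, and one query fetches the combined result. A minimal sketch with Python's `sqlite3` module follows; the table and column names are assumptions chosen to mirror the Student and Student_Report files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute(
    "CREATE TABLE student_report ("
    " student_id INTEGER REFERENCES student(student_id),"  # the link between the tables
    " subject TEXT,"
    " marks INTEGER)"
)
conn.execute("INSERT INTO student VALUES (501, 'Raju', 'Karimnagar')")
conn.executemany(
    "INSERT INTO student_report VALUES (?, ?, ?)",
    [(501, "Java", 78), (501, "DBMS", 85)],
)

# One declarative query replaces the lengthy search-both-files program.
rows = conn.execute(
    "SELECT s.name, s.address, r.subject, r.marks"
    " FROM student AS s JOIN student_report AS r ON r.student_id = s.student_id"
    " WHERE s.student_id = ? ORDER BY r.subject",
    (501,),
).fetchall()
print(rows)
```

Because the address is stored only in `student`, the report can never be mailed to a stale copy of it.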

S.No | File System | DBMS
1 | Suitable for small organizations. | Suitable for both small and large organizations.
2 | It can have unstructured data. | Can have structured data.
3 | Storing and retrieving data cannot be done efficiently in a file system. | A DBMS is efficient to use, as there is a wide variety of methods to store and retrieve data.
4 | The file system does not have a crash recovery mechanism. | A DBMS provides a crash recovery mechanism.
5 | Protecting a file system is very difficult. | A DBMS offers a good protection mechanism.
6 | Data redundancy is greater in a file management system. | Data redundancy is low in a DBMS.
7 | Data inconsistency is higher in the file system. | Data inconsistency is low in a database management system.
8 | It does not offer backup and recovery of data if it is lost. | A DBMS provides backup and recovery of data even if it is lost.
9 | There is no efficient query processing in the file system. | Data can be queried easily using the SQL language.
10 | This system does not offer concurrency. | A DBMS provides a concurrency facility.
11 | Data cannot easily be shared. | Data can be shared.

Data Models:

• Planning the structure of data in a database is called data modeling.
• A data model is a collection of high-level data description constructs that hide many low-level storage details. A DBMS allows a user to define the data to be stored in terms of a data model.

Data Models are:

1. Hierarchical Model.
2. Network Model
3. Entity – Relationship Model
4. Relational Model

Hierarchical Model:
This database model organizes data into a tree-like structure with a single root, to which all the other data is linked. The hierarchy starts from the root data and expands like a tree, adding child nodes to the parent nodes. In this model, a child node has only a single parent node.
This model efficiently describes many real-world relationships, such as the index of a book or recipes.
It supports one-to-many relationships between two different types of data; for example, one department can have many courses, many professors and many students.
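The single-parent rule can be sketched as a small tree structure in Python. This illustrates the shape of the model only, not how a hierarchical DBMS such as IMS stores data; the `Node` class and the university records are hypothetical.

```python
class Node:
    """A record in a hierarchy: exactly one parent (None for the root), many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk up the single-parent chain to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


root = Node("University")
dept = root.add_child("CSE Department")      # one department ...
course = dept.add_child("Course: DBMS")      # ... has many courses
dept.add_child("Professor: Rao")             # ... and many professors
print(course.path())
```

Every record is reached by exactly one path from the root, which is why a child with two parents cannot be expressed in this model.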

Network Model:

This is an extension of the hierarchical model. In this model data is organized more like a graph, and a child node is allowed to have more than one parent node.

Because more relationships are established in this model, the data is more interconnected, which can make related data easier and faster to reach. This database model was used to map many-to-many data relationships.
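A minimal sketch of the multiple-parent idea as a graph in Python, with links kept in both directions. The `link` helper and the course and student names are hypothetical, and real network DBMSs use owner/member record sets rather than dictionaries.

```python
from collections import defaultdict

members = defaultdict(set)  # owner  -> records it owns
owners = defaultdict(set)   # member -> records that own it (may be more than one)


def link(owner, member):
    """Record one owner-member relationship in both directions."""
    members[owner].add(member)
    owners[member].add(owner)


link("Course: DBMS", "Student: Raju")
link("Course: Java", "Student: Raju")   # Raju now has two parents
link("Course: DBMS", "Student: Ravi")

raju_parents = sorted(owners["Student: Raju"])
dbms_children = sorted(members["Course: DBMS"])
print(raju_parents, dbms_children)
```

One course has many students and one student has many courses, which is the many-to-many relationship a strict tree cannot represent.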

Entity – relationship Model:

E-R models represent entities and their relationships in pictorial form, to make them easier for different stakeholders to understand.

This model is well suited to designing a database, which can then be turned into tables in the relational model.

For example, if we have to design a school database, then Student will be an entity with attributes name, age, address, etc. As address is generally complex, it can be another entity with attributes street name, pincode, city, etc., and there will be a relationship between the two entities.

Relational Model:
In this model, data is organised in two-dimensional tables, and the relationship between tables is maintained by storing a common field.
This model was introduced by E. F. Codd in 1970, and since then it has been the most widely used database model around the world.
The basic structure of data in the relational model is the table. All the information related to a particular type is stored in the rows of that table.
Hence, tables are also known as relations in the relational model.

Student_id | Name of the student | Subject_name
501 | Raju | Java
502 | Ravi | dbms

Demerits of Database Systems:

1. Complex design: Database design is complex, difficult and time consuming.
2. Hardware and software cost: A large investment is needed to set up the required hardware or to repair software failures.
3. Damaged part: If one part of the database is corrupted or damaged, the entire database may be affected.
4. Conversion cost: If the current system is a conventional file system and needs to be converted to a database system, a large cost is incurred in purchasing different tools and adopting different techniques as per the requirement.
5. Training: People need to be trained to design and maintain database systems.

Levels of Abstraction in a DBMS
Data abstraction is a process of hiding the implementation details (such as how the data are
stored and maintained) and representing only the essential features to simplify user's interaction
with the system.

To simplify user's interaction with the system, the complexity is hidden from the database users
through several levels of abstraction.

Three levels of data abstraction are:

1. View Level / External Schema


2. Conceptual Level / Logical Schema
3. Physical Level / Internal Schema

View Level:

• Highest level of abstraction.
• Describes only part of the database for a particular group of users.
• There can be many different views of a database.

Example: If we have a login-id and password in a university system, then as a student we can view our marks, attendance, fee structure, etc. The faculty of the university will have a different view, with options such as salary, editing the marks of a student, entering the attendance of the students, etc. So the student and the faculty have different views. This also increases the security of the system: in this example, the student cannot edit his marks, but the faculty member who is authorized to edit marks can. Similarly, the dean of the college or university will have more authorization and a view to match. So different users have different views according to the authorization they have.
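Views of this kind can be sketched with `CREATE VIEW` in SQLite through Python's `sqlite3` module. The table, the view names and the columns are hypothetical. SQLite has no user accounts or `GRANT` statement, so the per-user authorization described above is not shown; a server DBMS would additionally restrict who may read each view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER, fee_due INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(501, "Raju", 78, 0), (502, "Ravi", 64, 1500)],
)

# One view per group of users; each exposes only part of the database.
conn.execute("CREATE VIEW marks_view AS SELECT roll_no, name, marks FROM student")
conn.execute("CREATE VIEW fees_view AS SELECT roll_no, name, fee_due FROM student")

marks_rows = conn.execute("SELECT * FROM marks_view ORDER BY roll_no").fetchall()
fees_rows = conn.execute("SELECT * FROM fees_view ORDER BY roll_no").fetchall()
print(marks_rows, fees_rows)
```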

Logical / Conceptual Level:

• Next highest level of abstraction.
• Describes what data are stored and what relationships exist among those data.
• Database administrator level.

Example: Suppose we use the relational model for storing the data. To store the data of a student, the columns in the student table will be student_name, age, mail_id, roll_no, etc. We have to define all of these at this level while creating the database. Although the data itself is stored in the database, the structure of tables such as the student table, teacher table and books table is defined here at the conceptual or logical level, as is how the tables are related to each other. Overall, we can say that we are creating a blueprint of the data at the conceptual level.

Physical Level:

• Lowest level of abstraction.
• Describes how the data are stored.
• Complex low-level data structures are described in detail.
• E.g. index, B-tree, hashing.

It tells the actual location of the data stored by the user. The database administrators (DBAs) decide which data should be kept on which disk drive, how the data has to be fragmented, where it has to be stored, etc. They decide whether the data has to be centralized or distributed. Though we see the data in the form of tables at the view level, here the data is actually stored in the form of files. How the database is managed at the physical level depends entirely on the DBA.

Data Independence

Data independence is the ability to modify a schema definition at one level without affecting the schema definition at the next higher level, i.e. the capacity to change the schema at one level without affecting another level.

There are two levels of data independence:

1. Physical Data Independence


2. Logical Data Independence

1. Physical Data Independence:

Physical Data Independence refers to the characteristic of changing the physical level without
affecting the logical level or conceptual level. Using this property we can easily change the
storage device of the database without affecting the logical schema.

Examples of changes under Physical Data Independence

Due to physical independence, none of the changes below affects the conceptual layer.

• Using a new storage device, such as a hard drive or magnetic tapes
• Switching to different data structures
• Changing compression techniques or hashing algorithms
• Changing the location of the database, say from the C drive to the D drive

2. Logical Data Independence:

It refers to the ability to change the logical level without affecting the external or view level. It also helps in separating the logical level from the view level: if we make changes at the logical level, the user's view of the data remains unaffected. Changes at the logical level are required whenever the logical structure of the database changes.

Examples of changes under Logical Data Independence

Due to logical independence, none of the changes below affects the external layer.

1. Adding, modifying or deleting an attribute, entity or relationship without rewriting existing application programs
2. Merging two tables into one
3. Breaking an existing table into two or more tables
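The third change can be sketched with Python's `sqlite3` module: a table is broken into two, the view is redefined over the new tables, and the application's query stays the same. All table, view and column names here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original logical schema: one table, read by applications through a view.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO student VALUES (501, 'Raju', 'Karimnagar')")
conn.execute("CREATE VIEW student_view AS SELECT roll_no, name, city FROM student")

APP_QUERY = "SELECT roll_no, name, city FROM student_view"  # the "application program"
before = conn.execute(APP_QUERY).fetchall()

# Logical change: break the existing table into two tables.
conn.execute("CREATE TABLE student_core (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE student_address (roll_no INTEGER PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO student_core SELECT roll_no, name FROM student")
conn.execute("INSERT INTO student_address SELECT roll_no, city FROM student")
conn.execute("DROP VIEW student_view")
conn.execute("DROP TABLE student")

# Redefine the view over the new tables; APP_QUERY itself is not rewritten.
conn.execute(
    "CREATE VIEW student_view AS"
    " SELECT c.roll_no, c.name, a.city"
    " FROM student_core AS c JOIN student_address AS a ON a.roll_no = c.roll_no"
)
after = conn.execute(APP_QUERY).fetchall()
print(before == after)
```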

Structure of a DBMS
• A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations on the database, such as insert, delete, update and retrieval.
• The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.

Naive users:
Any user who does not have any knowledge of databases falls in this category. Their task is just to use the developed application and get the desired results. For example, the clerical staff in a bank are naive users. They do not have any DBMS knowledge, but they still use the database to perform their given tasks.

Application programmers:

As the name shows, application programmers are the ones who write application programs that use the database. These application programs are written in programming languages like PHP, Java, etc., and are built according to user requirements. Retrieving information, creating new information and changing existing information is done by these application programs.

Sophisticated Users:

They are database developers who write SQL queries to select/insert/delete/update data. They do not use any application or program to make requests to the database; they interact with it directly by means of a query language like SQL. These users are typically scientists, engineers and analysts who have studied SQL and DBMS thoroughly and apply the concepts to their requirements.

Database Administrators:

The life cycle of a database runs from design and implementation to administration. A database for any kind of requirement needs to be designed carefully so that it works without issues. Once the design is complete, the database needs to be installed, after which users start using it. The database grows as the data in it grows. When the database becomes huge, its performance comes down, and accessing the data becomes a challenge. There will also be unused space in the database, making it larger than necessary. This administration and maintenance of the database is taken care of by the database administrator (DBA). A DBA has many responsibilities; a well-performing database is in the hands of the DBA.

• Installing and upgrading the DBMS servers:

The DBA is responsible for installing a new DBMS server for new projects, and for upgrading these servers as new versions come to market or as requirements change. If an upgrade of an existing server fails, the DBA should be able to revert the changes back to the older version, thus keeping the DBMS working.

• Design and implementation:

Designing and implementing the database is also the DBA's responsibility. The DBA should be able to decide on proper memory management, file organizations, error handling, log maintenance, etc. for the database.

• Performance tuning:

Since the database is huge and has many tables, data, constraints and indices, its performance will vary from time to time. Design issues or data growth can also cause the database not to work as expected. It is the responsibility of the DBA to tune the database performance and to make sure all queries and programs run in a fraction of a second.

• Migrating database servers:

Sometimes users of Oracle would like to shift to SQL Server or MySQL. It is the responsibility of the DBA to make sure that the migration happens without any failure and without data loss.

• Backup and recovery:

Proper backup and recovery programs need to be developed and maintained by the DBA. This is one of the DBA's main responsibilities. Data and objects should be backed up regularly so that, if there is a crash, they can be recovered without much effort or data loss.

• Security:

The DBA is responsible for creating various database users and roles and giving them different levels of access rights.

• Documentation:

The DBA should properly document all activities so that, if the DBA leaves or a new DBA comes in, the newcomer can understand the database without much effort. The DBA should maintain records of all installation, backup, recovery and security methods, and should keep various reports about database performance.

DDL Interpreter: It interprets DDL statements and records them in tables containing metadata.

DML Compiler: DML commands such as insert, update, delete and retrieve from the application program are sent to the DML compiler, which compiles them into object code for database access. The query optimizer then optimizes the object code to find the best way to execute the query, and the result is sent to the data manager.

Query Evaluation Engine: It executes low-level instructions generated by the DML compiler.

Buffer Manager: It is responsible for fetching data from disk storage into main memory. It enables the database to handle data sizes that are much larger than the size of main memory.

File Manager: It manages the allocation of space on disk storage and the data structures used to represent information stored on disk.

Authorization and Integrity Manager: It checks the authority of users to access data and the satisfaction of the integrity constraints.

Transaction Manager: It ensures that the database remains in a consistent state despite system failures and that concurrent transactions execute without conflicting.

Data Files: These store the database itself.

Data Dictionary: It stores metadata about the database, in particular the schema of the database, such as the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
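As a concrete illustration, SQLite keeps its schema metadata in a catalog table named `sqlite_master`, and `PRAGMA table_info` lists the attributes of one table. The sketch below reads both through Python's `sqlite3` module. The `student` table is hypothetical. SQLite's catalog does not store the row count, so it is computed with a query here; other DBMSs organize their data dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)", [(501, "Raju", 21), (502, "Ravi", 22)]
)

# Names of the tables, read from SQLite's catalog table.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Attribute names and declared types of one table.
# PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

# Number of rows, computed on demand.
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(tables, columns, row_count)
```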

Indices: An index is a small table with two columns: the first column contains a copy of the primary or candidate key of a table, and the second column contains a set of pointers holding the addresses of the disk blocks where the matching records are stored.

Statistical data:

These statistics provide the optimizer with information about the state of the tables that will be accessed by the SQL statement being optimized. The types of statistical information stored in the system catalog include:

• Information about tables, including the total number of rows, information about compression, and the total number of pages.
• Information about columns, including the number of discrete values for the column and the distribution range of values stored in the column.
• Information about table spaces, including the number of active pages.
• The current status of the index, including whether an index exists or not, the organization of the index (number of leaf pages and number of levels), the number of discrete values for the index key, and whether the index is clustered.
• Information about the table space and partitions.

ER Diagrams

An ER diagram, or Entity Relationship diagram, is a conceptual model that gives a graphical representation of data, i.e. it is a visual representation of data that describes how data is related to each other.

An ER diagram is mainly composed of following three components

1. Entity sets
2. Attributes
3. Relationship sets

Entity: An entity is a thing or an object in the real world that is distinguishable from other objects based on the values of the attributes it possesses. (Any noun can be called an entity.)

Types of Entities:

Tangible / concrete: Entities that physically exist in the real world.
Example: car, pen, book, etc.

Intangible / abstract: Entities that exist only logically.
Example: Account.

Entity set: An entity set is a group of similar entities that share the same properties, i.e. it represents the schema / structure.

PERSON (name, age, address) ----- Entity set
(Raju, 26, knr) ----- Entity

• An entity cannot be represented in an ER diagram, as it is an instance / data.
• An entity set is represented by a rectangle in an ER diagram.

Symbol: [ Student ]

Attributes: Attributes are the units that describe the characteristics / properties of entities.

• For each attribute there is a set of permitted values, called its domain.
• In an ER diagram, an attribute is represented by an ellipse or oval.

Types of Attributes:

Simple Attribute: A simple attribute cannot be divided further. It is represented by a simple oval.

Example: DOB and Name are simple attributes of Student.

Composite Attribute: A composite attribute can be divided further into simple attributes. It is represented by an oval connected to other ovals.

Example: Name of Student is composed of F_name and L_name; Roll_No is a simple attribute of Student.

Single-valued Attribute: A single-valued attribute can have only one value at an instance of time. It is represented by an oval.

Example: DOB of Student.

Multi-valued Attribute: A multi-valued attribute can have more than one value at an instance of time. It is represented by a double oval.

Example: Phone_No of Student.

Stored Attribute: A stored attribute is physically stored in the database.

Example: DOB of Student.

Derived Attribute: Derived attributes do not exist in the physical database; their values are derived from other attributes present in the database. A derived attribute is represented by a dotted oval.

Example: age of Student is derived from DOB.
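The stored/derived distinction can be sketched in Python: `dob` is kept as a field, while `age` is computed whenever it is needed. The `Student` class and its values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Student:
    roll_no: int
    name: str
    dob: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob, never stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


s = Student(501, "Raju", date(2000, 6, 15))
print(s.age(date(2025, 6, 14)), s.age(date(2025, 6, 15)))
```

Storing the age instead would make it go stale on every birthday, which is why only DOB is kept in the database.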

Complex Attribute: A complex attribute is formed by nesting composite attributes and multi-valued attributes.
Example: A person can have more than one address (multi-valued), and each address can have a city, pin code and country (composite).

Relationship: A relationship is an association between two or more entities of the same or different entity sets. (Any verb can be treated as a relationship.)
• A relationship has no representation in an ER diagram, as it is an instance or data.

[Figure: set diagram linking teachers T1, T2, T3, ..., tn to students S1, S2, S3, ..., sn]

Relationship type / set: A relationship set is a set of relationships of the same type.

• In an ER diagram it is represented using a diamond symbol.

Example:

Teacher --- <teaches> --- Student

A relationship may also have attributes, called descriptive attributes.

Example:

Employee --- <Works for> --- Department
(Since is a descriptive attribute attached to Works for)

Case Study:

Requirement Analysis: Every employee works for exactly one department, and a department can have many employees. A new department need not have any employees.

[Figure: set diagram of the Works for relationship, linking employees E1 to E6 to departments D1 to D4]

Degree: The number of entity sets participating in a relationship.
In the above example the degree is 2 (binary relationship).

Cardinality ratio / Mapping cardinalities: The cardinality ratio is the maximum number of relationships in which an entity can participate.
In the above diagram an employee such as E1 can participate in only one relationship, i.e. 1.
A department can participate in more than one relationship, i.e. N.

Participation or existence: The minimum number of relationships in which an entity can participate; it is sometimes also called min-cardinality.
In the above figure an employee such as E1 must participate in at least 1 relationship.
A department can participate in 0.
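These three quantities can be read off a concrete relationship set. In the Python sketch below the (employee, department) pairs are made-up sample data, since the exact links in the figure are not recoverable. Note that it measures one instance, whereas the constraints themselves are rules that every instance must obey.

```python
from collections import Counter

# Hypothetical instance of the Works for relationship set.
employees = {"E1", "E2", "E3", "E4"}
departments = {"D1", "D2", "D3"}          # D3 is a new department with no employees
works_for = {("E1", "D1"), ("E2", "D1"), ("E3", "D2"), ("E4", "D2")}


def min_max(entities, counts):
    """Return (min, max) number of relationships per entity."""
    per_entity = [counts.get(e, 0) for e in entities]
    return min(per_entity), max(per_entity)


emp_counts = Counter(e for e, _ in works_for)
dept_counts = Counter(d for _, d in works_for)

emp_min, emp_max = min_max(employees, emp_counts)       # participation, cardinality
dept_min, dept_max = min_max(departments, dept_counts)
degree = len(next(iter(works_for)))                     # entity sets per relationship

print(degree, (emp_min, emp_max), (dept_min, dept_max))
```

Here every employee appears exactly once (min 1, max 1), while a department appears zero or more times (min 0, max many).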

Converting the above set theory diagram into an ER diagram:

Employee --N-- <Works for> --1-- Department

Figure - 1

Symbol | Meaning
Rectangles | Represent entity sets
Ellipses | Represent attributes
Diamonds | Represent relationships
Lines | Link attributes to entity sets and entity sets to relationships
Double ellipses | Represent multivalued attributes
Dashed ellipses | Denote derived attributes
Double lines | Indicate total participation of an entity in a relationship set
Double rectangles | Represent weak entity sets

Additional Features of ER Model:

1. Participation constraints

There are two types of participation constraints-

Total participation: If every entity in the entity set participates in at least one relationship of the relationship set, it is called total participation. It is also called mandatory participation.

Total participation is represented using a double line between the entity set and the relationship set.

Example:

[ER diagram: Employee linked by a double line to the diamond 'Works for', which is linked to Department (1)]

Partial participation: Each entity in the entity set may or may not participate in a relationship instance of the relationship set. It is also called optional participation.
Partial participation is represented using a single line between the entity set and the relationship set.

Example:

[ER diagram: Employee (N) and Department (1) joined through the diamond 'Works for']

2. Cardinality ratio / Mapping Cardinalities / Key constraints: It defines the maximum number of relationship instances in which an entity can participate.

There are 4 types of key constraints

 One to one
 One to many
 Many to one
 Many to many
1. One to one:

An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.

Example: Every department should have a head of department (HOD); only one employee manages a department, and an employee can manage only one department.

2. Many to one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A.

Example: Every employee works for exactly one department, and a department can have many employees. A new department need not have any employees.
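
One possible way to enforce this many-to-one requirement is sketched below with Python's built-in sqlite3 module. Mapping ER diagrams to tables is not covered at this point in the notes, so the table names, column names, and the helper try_insert are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
CREATE TABLE department (
    dept_id TEXT PRIMARY KEY
);
CREATE TABLE employee (
    emp_id  TEXT PRIMARY KEY,
    -- A single dept_id column: an employee works for at most one department.
    -- NOT NULL: every employee must work for some department (total participation).
    dept_id TEXT NOT NULL REFERENCES department(dept_id)
);
""")

conn.executemany("INSERT INTO department VALUES (?)", [("D1",), ("D2",)])
conn.executemany("INSERT INTO employee VALUES (?, ?)", [("E1", "D1"), ("E2", "D1")])


def try_insert(emp_id, dept_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False


no_department = try_insert("E3", None)       # rejected by NOT NULL
unknown_department = try_insert("E4", "D9")  # rejected by the foreign key

# A new department need not have any employee: D2 exists with no rows pointing to it.
empty_departments = conn.execute(
    "SELECT dept_id FROM department "
    "WHERE dept_id NOT IN (SELECT dept_id FROM employee)"
).fetchall()
```

Many employees may share D1, while D2 stays valid with no employees, which matches the partial participation of Department.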

[Figure: set diagram of the Works for relationship, with employees E1 to E6 on one side and departments D1 to D4 on the other]

[ER diagram: Employee (N) linked to Department (1) through the diamond 'Works for']

3. One to many: An entity in A is associated with any number (zero or more) of entities in B. An entity in B can be associated with at most one entity in A.

[ER diagram: Dept (1) linked to Employee (N) through the diamond 'Has']

4. Many to many: An entity in A is associated with any number (zero or
more) of entities in B, and an entity in B is associated with any number (zero
or more) of entities in A.

Example: Every employee is supposed to work on at least one project and can work on many projects; a project is supposed to have many employees and should have at least one employee.
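
The four mapping cardinalities can be told apart by counting, for each entity, how many partners it has. The function below is a small sketch under the assumption that a binary relationship is given as a list of (a, b) pairs; the name classify and the sample pairs are made up for this example. It only describes the given instance, not the constraint stated in the requirements.

```python
from collections import Counter


def classify(instances):
    """Classify a binary relationship from its (a, b) pairs, read from A to B."""
    pairs = set(instances)
    a_counts = Counter(a for a, _ in pairs)  # number of B entities per A entity
    b_counts = Counter(b for _, b in pairs)  # number of A entities per B entity
    a_has_many = any(n > 1 for n in a_counts.values())
    b_has_many = any(n > 1 for n in b_counts.values())
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"
    if b_has_many:
        return "many to one"
    return "one to one"


# Hypothetical instances for the four examples in the text.
manages = [("E1", "D1"), ("E2", "D2")]
works_for = [("E1", "D1"), ("E2", "D1"), ("E3", "D2")]
has = [("D1", "E1"), ("D1", "E2"), ("D2", "E3")]
works_on = [("E1", "P1"), ("E1", "P2"), ("E2", "P1")]

results = [classify(r) for r in (manages, works_for, has, works_on)]
```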

[Figure: set diagram of the relationship between employees E1 to E6 and projects P1 to P4]

[ER diagram: Employee (M) linked to Project (N) through the diamond 'Works on']

Key: A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set. For example, the roll number of a student makes him or her identifiable among students.
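
The idea of uniqueness can be checked on sample data with a short Python sketch. The function name is_key and the student records are assumptions for illustration. Note that a key is a property of the design: a check like this can show that an attribute is not a key, but passing on one sample does not prove that it is.

```python
def is_key(entities, attributes):
    """Return True if no two entities share the same values for the given attributes."""
    seen = set()
    for entity in entities:
        value = tuple(entity[a] for a in attributes)
        if value in seen:
            return False  # two entities collide, so these attributes cannot be a key
        seen.add(value)
    return True


# Hypothetical student entities.
students = [
    {"roll_no": 1, "name": "Karthik"},
    {"roll_no": 2, "name": "Karthik"},
    {"roll_no": 3, "name": "Anil"},
]

roll_no_is_key = is_key(students, ["roll_no"])
name_is_key = is_key(students, ["name"])
```

Two students share the name Karthik, so name alone cannot identify a student, while roll_no can.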

[ER diagram: entity set Student with attributes Roll No and Name]

Strong and weak entities:

Strong entity: An entity type is strong if its existence does not depend on some other entity type.

 A strong entity set contains sufficient attributes to uniquely identify all its entities.
 An entity set which has a key is called a strong entity set.

Representation

 A strong entity set is represented by a single rectangle.
 A relationship between two strong entity sets is represented by a single diamond.

Example:

 Consider an ER diagram which consists of two entity sets, Student and Course.
 Student is a strong entity set because it has a key, Student ID, which is enough to access each record uniquely.
 In the same way, Course has a Course ID attribute which can uniquely access each row, that is, the details of each course.

Weak Entity:

A weak entity set is an entity set that does not have sufficient attributes for unique identification of its records. Simply put, a weak entity set does not have a key attribute of its own.
 It contains a partial key, called the discriminator, which distinguishes among the weak entities that depend on the same strong entity.
 The discriminator is represented by underlining it with a dashed line.

Representation

 A double rectangle is used for representing a weak entity set.
 A double diamond is used for representing the relationship between a strong entity set and a weak entity set, which is known as the identifying relationship.
 A double line is used for connecting the weak entity set to the identifying relationship.

Example:

Note:

 A weak entity cannot exist without a strong entity.
 A weak entity set always has total participation in its identifying relationship.
 But total participation alone doesn't make an entity set weak.
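
A weak entity set could be stored as sketched below, using Python's built-in sqlite3 module. The Employee and Dependent entity sets are a hypothetical example (the figure from these notes is not reproduced), and the table layout is one common choice, not the only one. The dependent's name plays the role of the discriminator, and the full key is the owner's key plus the discriminator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE to take effect
conn.executescript("""
CREATE TABLE employee (
    emp_id TEXT PRIMARY KEY              -- strong entity set: has its own key
);
CREATE TABLE dependent (
    -- NOT NULL: every dependent must belong to an employee (total participation).
    emp_id TEXT NOT NULL REFERENCES employee(emp_id) ON DELETE CASCADE,
    name   TEXT NOT NULL,                -- discriminator (partial key)
    PRIMARY KEY (emp_id, name)           -- owner key + discriminator
);
""")

conn.executemany("INSERT INTO employee VALUES (?)", [("E1",), ("E2",)])
# The same dependent name may repeat under different employees.
conn.executemany("INSERT INTO dependent VALUES (?, ?)", [("E1", "Ravi"), ("E2", "Ravi")])

# Without the strong entity there is no weak entity: deleting E1 removes its dependents.
conn.execute("DELETE FROM employee WHERE emp_id = 'E1'")
remaining = conn.execute("SELECT emp_id, name FROM dependent").fetchall()
```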

Class Hierarchies / Extended features of ER Model.

Specialization: Specialization means creating a new subclass from an existing class. If it turns out that certain attributes apply only to some objects of the class, a subclass can be created for them.

 Top-down design approach.
 A high-level entity is converted into low-level entities by adding some attributes.
 A low-level entity inherits all the attributes of the high-level entity.

Example:

Suppose we have a Person entity type with attributes such as Name, Phone_no, and Address. Now suppose we are making software for a shop; a person can then be of two types, so the Person entity can be divided into two entities, Employee and Customer. How do we specialize it? We specialize the Person entity type to the Employee entity type by adding attributes like Emp_no and Salary. Likewise, we specialize the Person entity type to the Customer entity type by adding attributes like Customer_no, Email, and Credit. These lower-level entities inherit all the attributes of the higher-level entity. The Customer entity type, for example, also has the attributes Name, Phone_no, and Address, which were present in the higher-level entity.
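
Attribute inheritance in specialization resembles class inheritance in a programming language. The sketch below uses Python dataclasses with the attribute names from the example; the Python types chosen for each attribute are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Higher-level entity type.
    name: str
    phone_no: str
    address: str


@dataclass
class Employee(Person):
    # Specialization of Person: adds Emp_no and Salary.
    emp_no: int
    salary: float


@dataclass
class Customer(Person):
    # Specialization of Person: adds Customer_no, Email and Credit.
    customer_no: int
    email: str
    credit: float


# The lower-level entity type lists the inherited attributes first, then its own.
customer_attributes = [f.name for f in fields(Customer)]
employee_attributes = [f.name for f in fields(Employee)]
```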

Generalization:

It is the process of extracting shared characteristics from two or more classes and combining them into a generalized superclass. The shared characteristics can be attributes.

 Reverse of specialization.
 Bottom-up approach.

Example: Suppose we have two entity types, Employee and Customer.


The attributes of Employee entity type are Name, Phone, Salary, Employee_id, and
Address.

The attributes of Customer entity type are Name, Phone, Address, Credit,
Customer_id, and Email.

We can see that three attributes, Name, Phone, and Address, are common here. When we generalize these two entity types, we form a new higher-level entity type, Person. The new entity type is a generalized entity. After generalization, Person contains only the common attributes, i.e. Name, Phone, and Address. The Employee entity type contains only its specialized attributes, Employee_id and Salary. Similarly, the Customer entity type contains only its specialized attributes, Customer_id, Credit, and Email. From this example it is also clear that going from bottom to top is generalization and going from top to bottom is specialization. Hence, we can say that generalization is the reverse of specialization.
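
The extraction of common attributes is a set intersection, which can be sketched in Python. The function name generalize is made up for this illustration; the attribute names are those of the example.

```python
employee = {"Name", "Phone", "Salary", "Employee_id", "Address"}
customer = {"Name", "Phone", "Address", "Credit", "Customer_id", "Email"}


def generalize(*entity_types):
    """Split attribute sets into the shared part and what is left in each type."""
    common = set.intersection(*entity_types)              # attributes of the new superclass
    specialized = [attrs - common for attrs in entity_types]
    return common, specialized


person, (employee_only, customer_only) = generalize(employee, customer)
```

Here person holds the attributes moved up to the generalized entity, and the two remaining sets hold what stays in Employee and Customer.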

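Aggregation: Aggregation is the process of merging. It is completely different from generalization. In generalization, we merge entities of the same domain into one entity; in aggregation, we merge related entities into one entity.

 The basic ER model cannot express a relationship among relationships; aggregation makes this possible by treating a relationship set together with its participating entity sets as a single higher-level entity.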

Representation:

This is indicated on an ER diagram by drawing a dashed box around the aggregation.

Example:

Before aggregation

Drawbacks:

1. A relationship among relationships is not possible.
2. Redundant data.

After aggregation

Conceptual Design with ER Model

Developing the ER diagram presents several choices, including the following:

 Should a concept be modeled as an entity or an attribute?


 Should a concept be modeled as an entity or a relationship?
 What are the relationship sets and their participating entity sets? Should we
use binary or ternary relationships?
 Should we use aggregation?

Entity versus Attribute

While identifying the attributes of an entity set, it is sometimes not clear whether a property should be modeled as an attribute or as an entity set. For example, consider adding address information to the Employees entity set. One option is to use an attribute address. This option is appropriate if we need to record only one address per employee. An alternative is to create an entity set called Addresses and to record associations between employees and addresses using a relationship (say, Has_address). This more complex alternative is necessary in two situations:

 We have to record more than one address for an employee.
 We want to capture the structure of an address (such as city, state, country, and pin code). By representing an address as an entity with these attributes, we can support queries such as “Find all employees with an address in Hyderabad”.
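
The second design can be sketched with Python's built-in sqlite3 module. The table layout, the column names and the sample rows are assumptions for illustration; only Employees, Addresses, Has_address and the Hyderabad query come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE addresses (
    address_id INTEGER PRIMARY KEY,
    city TEXT, state TEXT, country TEXT, pin_code TEXT
);
-- Has_address relationship: one employee may have several addresses.
CREATE TABLE has_address (
    emp_id     TEXT    REFERENCES employees(emp_id),
    address_id INTEGER REFERENCES addresses(address_id),
    PRIMARY KEY (emp_id, address_id)
);
INSERT INTO employees VALUES ('E1', 'Karthik'), ('E2', 'Anil');
INSERT INTO addresses VALUES
    (1, 'Hyderabad',  'Telangana', 'India', '500001'),
    (2, 'Karimnagar', 'Telangana', 'India', '505001');
INSERT INTO has_address VALUES ('E1', 1), ('E1', 2), ('E2', 2);
""")

# "Find all employees with an address in Hyderabad"
in_hyderabad = conn.execute("""
    SELECT DISTINCT e.name
    FROM employees e
    JOIN has_address h ON h.emp_id = e.emp_id
    JOIN addresses a ON a.address_id = h.address_id
    WHERE a.city = 'Hyderabad'
""").fetchall()
```

With address stored as a single text attribute, the same query would need error-prone string matching, and a second address per employee could not be recorded at all.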

[Figures: Address as attribute; Address as entity set]
