DBMS 1 Chapter

The document discusses key concepts in database design including: 1) Characteristics and advantages of the database management system (DBMS) approach such as data integrity, security, consistency, and scalability. 2) Data models, schemas, and instances which provide abstraction of database structure. A data model defines concepts to describe database structure. A schema describes a database and instances represent the current data. 3) Database design principles like entity-relationship modeling to define entities, attributes, relationships and constraints to reduce anomalies and NULL values.

Uploaded by

01fe22bca106

* Characteristics of the Database Approach :

■ Self-describing nature of a database system
■ Insulation between programs and data, and data abstraction
■ Support of multiple views of the data
■ Sharing of data and multiuser transaction processing

Advantages of Using the DBMS Approach :
* Data Integrity: DBMS ensures data integrity by enforcing data constraints (such
as unique keys, foreign keys, and check constraints), which helps maintain
accurate and reliable data.
* Data Security: Access to the database can be controlled through authentication
and authorization mechanisms.
* Data Consistency: DBMS helps maintain consistency in data by providing
mechanisms like transactions, which ensure that the database moves from one
consistent state to another, even in the presence of failures.
* Data Independence: DBMS provides a layer of abstraction between the physical
data storage and the application, allowing changes to be made to the database
structure without affecting the applications that use the data.
* Concurrent Access and Transaction Management: DBMS allows multiple users
to access the database concurrently while ensuring that transactions are executed in
an isolated and consistent manner.
* Data Retrieval and Query Optimization: A DBMS provides a query language (e.g.,
SQL) that allows users to retrieve and manipulate data easily, and its query
optimizer chooses an efficient way to execute each query.
* Scalability: A DBMS is designed to handle large amounts of data and many
users, so it can scale as an organization's data needs grow.
* Data Recovery and Backup: A DBMS typically includes mechanisms for
data backup and recovery, reducing the risk of data loss due to hardware failures,
human errors, or other unforeseen events.
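As a minimal sketch of how a DBMS enforces integrity constraints and keeps the
database consistent, the following Python example uses the standard sqlite3
module. The department and employee tables, their columns, and the helper
try_insert are assumptions made up for illustration; they are not part of
these notes.

```python
import sqlite3

# Hypothetical two-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dnumber INTEGER PRIMARY KEY,
        dname   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        ssn    TEXT PRIMARY KEY,
        ename  TEXT NOT NULL,
        salary REAL CHECK (salary >= 0),
        dno    INTEGER NOT NULL REFERENCES department(dnumber)
    );
""")
conn.execute("INSERT INTO department VALUES (5, 'Research')")
conn.commit()

def try_insert(row):
    """Attempt one INSERT in its own transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(("111", "Smith", 30000, 5)),    # valid row
    try_insert(("111", "Wong", 40000, 5)),     # duplicate primary key
    try_insert(("222", "Zelaya", -1, 5)),      # violates the CHECK constraint
    try_insert(("333", "Wallace", 43000, 9)),  # department 9 does not exist
]
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(results, employee_count)
```

Only the first row is accepted. Each rejected insert is rolled back, so the
database moves from one consistent state to another.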

Data Models, Schemas, and Instances:


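Data abstraction hides the details of data storage and presents users with a
conceptual view of the data. A data model, a collection of concepts that can be
used to describe the structure of a database, provides the necessary means to
achieve this abstraction.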

Categories of Data Models :


* Conceptual (high-level) data models provide concepts that are close to the way
many users perceive data.
* Representational (or implementation) data models provide concepts
that may be easily understood by end users but that are not too far removed from
the way data is organized in computer storage. They represent data by using
record structures and hence are sometimes called record-based data models.
* Self-describing data models: the data storage in systems based on these models
combines the description of the data with the data values themselves.

Schemas, Instances, and Database State:


The description of a database is called the database schema.

The data in the database at a particular moment in time is called a database
state or snapshot. It is also called the current set of occurrences or instances in
the database.

A valid state is a state that satisfies the structure and constraints specified in
the schema.
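The difference between schema and state can be sketched with Python's standard
sqlite3 module. The student table and the helpers schema and state are
hypothetical names chosen for this illustration. The schema is read from
SQLite's own catalog table (sqlite_master), which also shows the
self-describing nature of a database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (name TEXT, student_number INTEGER PRIMARY KEY)"
)

def schema():
    """Return the stored description of every table (from the catalog)."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]

def state():
    """Return the current set of instances in the student table."""
    query = "SELECT name, student_number FROM student ORDER BY student_number"
    return conn.execute(query).fetchall()

schema_before = schema()
state_before = state()  # the empty state, right after the schema is defined

conn.execute("INSERT INTO student VALUES ('Smith', 17)")
conn.execute("INSERT INTO student VALUES ('Brown', 8)")

schema_after = schema()
state_after = state()
print(schema_after == schema_before, state_before, state_after)
```

Inserting rows changes the database state, while the schema stays the same.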
Three-Schema Architecture and Data Independence :

* The internal level has an internal schema, which describes the physical
storage structure of the database. The internal schema uses a physical data
model and describes the complete details of data storage and access paths for
the database.

* The conceptual level has a conceptual schema, which describes the structure
of the whole database for a community of users.
* The conceptual schema hides the details of physical storage structures and
concentrates on describing entities, data types, relationships, user operations, and
constraints.
* Usually, a representational data model is used to describe the conceptual schema
when a database system is implemented.

* The external or view level includes a number of external schemas or user
views. Each external schema describes the part of the database that a particular
user group is interested in and hides the rest of the database from that user group.
Data Independence:

Logical data independence is the capacity to change the conceptual schema
without having to change external schemas.

Physical data independence is the capacity to change the internal schema
without having to change the conceptual schema.
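Both kinds of data independence can be sketched with Python's standard sqlite3
module, treating a view as an external schema. The employee table, the
research_staff view, and the index name are assumptions made up for this
example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, ename TEXT, dno INTEGER);
    INSERT INTO employee VALUES ('111', 'Smith', 5), ('222', 'Wong', 4);
    -- External schema: one user group sees only the names in department 5
    CREATE VIEW research_staff AS
        SELECT ename FROM employee WHERE dno = 5;
""")

def view_rows():
    """Query the external schema exactly as an application would."""
    return conn.execute("SELECT ename FROM research_staff").fetchall()

before = view_rows()

# Change to the conceptual schema: add an attribute to the table.
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")
after_logical_change = view_rows()

# Change to the internal schema: add an access path (an index).
conn.execute("CREATE INDEX idx_employee_dno ON employee(dno)")
after_physical_change = view_rows()

print(before, after_logical_change, after_physical_change)
```

The query against the view is never rewritten and returns the same rows after
both changes.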

Using High-Level Conceptual Data Models for Database Design :


1.Entity Type:
An entity type is a category of objects or things in the real world that can be
identified and about which data can be stored.
Examples of entity types include "Person," "Product," or "Employee." Each entity
type has a set of common properties.
2.Entity Set:
An entity set is a collection of similar entities. It is the actual set of instances
(individual occurrences) of an entity type.
For example, the entity type "Person" may have an entity set consisting of
individual persons, each representing a specific person in the real world.
3.Attributes:
Attributes are properties or characteristics that describe the entities in an entity set.
Each entity in an entity set has values for its attributes.
For instance, a "Person" entity type may have attributes such as "Name," "Date of
Birth," and "Address."
4.Key:
A key is a unique identifier for an entity within an entity set. It helps distinguish
one entity from another. There are two main types of keys:
* Primary Key:
A primary key is a minimal set of attributes that uniquely identifies each entity in
an entity set.
For example, a "Person" entity set might have a primary key attribute such as
"Social Security Number."
* Composite Key:
Sometimes a single attribute is not sufficient to uniquely identify an entity. In
such cases, a combination of attributes, known as a composite key, can be used.
For instance, a composite key for an entity representing an "Address" might
include both "Street" and "City."
* Single-Valued and Multi-Valued Attributes:
1.Single-Valued Attribute:
An attribute that holds a single value for each entity. For example, the "Date of
Birth" attribute for a "Person" entity is typically single-valued.
2.Multi-Valued Attribute:
An attribute that can hold multiple values for each entity. For instance, the "Phone
Numbers" attribute for a "Person" entity could be multi-valued.
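The terms above can be sketched in Python. The Person class, its attribute
names, the sample data, and the helper is_key are hypothetical and serve only
as an illustration: the class plays the part of the entity type, the list is
the entity set, and phone_numbers is a multi-valued attribute.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Person:  # entity type
    ssn: str            # single-valued, intended as the key
    name: str           # single-valued
    date_of_birth: str  # single-valued
    phone_numbers: frozenset = field(default_factory=frozenset)  # multi-valued

def is_key(entity_set, attrs):
    """True if no two entities in entity_set agree on all attributes in attrs."""
    seen = set()
    for entity in entity_set:
        value = tuple(getattr(entity, a) for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

# Entity set: the current collection of Person entities.
people = [
    Person("111", "Smith", "1965-01-09", frozenset({"555-0101", "555-0102"})),
    Person("222", "Smith", "1972-07-31"),
    Person("333", "Wong", "1965-01-09"),
]

print(is_key(people, ["ssn"]))                    # single attribute
print(is_key(people, ["name"]))                   # two people are named Smith
print(is_key(people, ["name", "date_of_birth"]))  # composite of two attributes
```

A check like this can only show that a set of attributes fails to be a key for
the current entity set. A real key is a constraint that must hold for every
possible entity set of the type.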
Relationship Types, Relationship Sets, Roles, and Structural Constraints :

1.Relationship Type:
A relationship type is a category of association between two or more entity types. It
describes how entities are related to each other.
Examples of relationship types include "Works For," "Owns," or "Manages." Each
relationship type has a name that describes the nature of the association.
2.Relationship Set:
A relationship set is a collection of similar relationships. It represents the actual
instances (associations) between entities participating in a particular relationship
type.
For example, if "Employee" and "Department" are two entity types related by the
"Works For" relationship type, the "Works For" relationship set would consist of
specific instances of employees working for specific departments.
3.Roles:
Each entity type that participates in a relationship type plays a particular
role in the relationship. The role name describes the function of a participating
entity within that relationship.
For example, in the "Works For" relationship between "Employee" and
"Department," the "Employee" entity type plays the role of employee (or worker),
while the "Department" entity type plays the role of department (or employer).
4.Structural Constraints:
Structural constraints define the rules or conditions that limit the structure of a
relationship set. There are two main types of structural constraints:
* Cardinality Ratio:
The cardinality ratio specifies the maximum number of instances of one entity
type that can be related to a single instance of another entity type. It is
expressed as a ratio, such as 1:1, 1:N (one-to-many), or M:N (many-to-many).
* Participation Constraint:
The participation constraint specifies whether every instance of an entity type must
participate in a relationship. It can be total or partial. In a total participation, every
entity in the entity type must participate in the relationship, while in partial
participation, participation is optional.
Let's illustrate these concepts with an example:
Example: "Works For" Relationship
Relationship Type: Works For
Entities Involved: Employee, Department
Relationship Set: Instances of specific employees working for specific
departments.
Roles: Employee (plays the role of Employee), Department (plays the role of
Department).
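The structural constraints of this example can be checked against a concrete
relationship set. This is a minimal Python sketch: the entity identifiers, the
works_for pairs, and both helper functions are assumptions made up for
illustration.

```python
from collections import Counter

# Hypothetical entity sets and a "Works For" relationship set of
# (employee, department) pairs.
employees = {"e1", "e2", "e3"}
departments = {"d1", "d2", "d3"}
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

def max_participation(relationship_set, position):
    """Largest number of relationship instances any single entity appears in."""
    counts = Counter(pair[position] for pair in relationship_set)
    return max(counts.values(), default=0)

def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity participates, otherwise 'partial'."""
    participating = {pair[position] for pair in relationship_set}
    return "total" if entity_set <= participating else "partial"

# Each employee works for at most one department, while a department can
# have several employees: the data fits an N:1 ratio from Employee to Department.
print(max_participation(works_for, 0), max_participation(works_for, 1))

# Every employee participates; department d3 has no employees.
print(participation(employees, works_for, 0))
print(participation(departments, works_for, 1))
```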
DATABASE DESIGN

14.1 Informal Design Guidelines for Relation Schemas

1.Making Sure That The Semantics Of The Attributes Is Clear In The Schema :
* The semantics of a relation refers to its meaning resulting from the
interpretation of attribute values in a tuple.
Guideline 1.
* Do not combine attributes from multiple entity types and relationship types into a
single relation. Intuitively, if a relation schema corresponds to one entity type or
one relationship type, it is straightforward to explain its meaning.
Otherwise, if the relation corresponds to a mixture of multiple entities and
relationships, semantic ambiguities will result and the relation cannot be easily
explained.

2. Reducing The Redundant Information In Tuples :


* One goal of schema design is to minimize the storage space used by the base
relations (and hence the corresponding files).
* Storing natural joins of base relations leads to an additional problem referred to as
update anomalies. These can be classified into:
■ insertion anomalies
■ deletion anomalies
■ modification anomalies
* Guideline 2. Design the base relation schemas so that no insertion, deletion, or
modification anomalies are present in the relations.
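A small Python sketch can make two of the update anomalies concrete. The
emp_dept relation below stores employee and department data together (the join
of two base relations); its attribute names, sample rows, and the helper
department_names are hypothetical.

```python
# Hypothetical relation that mixes employee and department attributes.
emp_dept = [
    {"ssn": "111", "ename": "Smith", "dnumber": 5, "dname": "Research"},
    {"ssn": "222", "ename": "Wong", "dnumber": 5, "dname": "Research"},
    {"ssn": "333", "ename": "Borg", "dnumber": 1, "dname": "Headquarters"},
]

def department_names(rows):
    """Map each department number to the set of names stored for it."""
    names = {}
    for row in rows:
        names.setdefault(row["dnumber"], set()).add(row["dname"])
    return names

# Modification anomaly: renaming department 5 in only one tuple leaves
# two different names for the same department.
modified = [dict(row) for row in emp_dept]
modified[0]["dname"] = "R&D"
inconsistent = {
    dnumber
    for dnumber, names in department_names(modified).items()
    if len(names) > 1
}

# Deletion anomaly: deleting the last employee of department 1 also
# deletes everything known about that department.
after_delete = [row for row in emp_dept if row["ssn"] != "333"]
lost = set(department_names(emp_dept)) - set(department_names(after_delete))

print(inconsistent, lost)
```

The insertion anomaly is the reverse of the deletion case: a department with no
employees yet cannot be stored in emp_dept without leaving the employee
attributes empty.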
3. Reducing The NULL Values In Tuples :
* If many of the attributes do not apply to all tuples in the relation, we end up with
many NULLs in those tuples. This can waste space at the storage level and may
also lead to problems with understanding the meaning of the attribute.

* NULLs can have multiple interpretations, such as the following:
■ The attribute does not apply to this tuple.
For example, Visa_status may not apply to U.S. students.
■ The attribute value for this tuple is unknown.
For example, the Date_of_birth may be unknown for an employee.
■ The value is known but absent; that is, it has not been recorded yet. For example,
the Home_Phone_Number for an employee may exist, but may not be available
and recorded yet.

Guideline 3. As far as possible, avoid placing attributes in a base relation whose
values may frequently be NULL. If NULLs are unavoidable, make sure that they
apply in exceptional cases only and do not apply to a majority of tuples in the
relation.

4. Disallowing The Possibility Of Generating Spurious Tuples :


* Spurious tuples are tuples that appear in the result of a join but do not
represent valid information from the original relations.
* Guideline 4. Design relation schemas so that they can be joined with equality
conditions on attributes that are appropriately related (primary key, foreign key)
pairs in a way that guarantees that no spurious tuples are generated. Avoid
relations that contain matching attributes that are not (foreign key, primary key)
combinations.
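The following Python sketch shows how joining on an attribute that is not a
(primary key, foreign key) pair can produce tuples that were never in the
data. The relation names and sample rows are made up for illustration.

```python
# Hypothetical original relation: (employee name, project name, project location).
original = {
    ("Smith", "ProductX", "Bellaire"),
    ("Wong", "ProductZ", "Houston"),
    ("Narayan", "Reorganization", "Houston"),
}

# A poor decomposition: both parts keep only the location as the common
# attribute, and location is not a key of either part.
emp_locs = {(ename, loc) for ename, _, loc in original}
proj_locs = {(pname, loc) for _, pname, loc in original}

# Natural join of the two parts on the location attribute.
joined = {
    (ename, pname, loc1)
    for ename, loc1 in emp_locs
    for pname, loc2 in proj_locs
    if loc1 == loc2
}

# Tuples produced by the join that were not in the original relation.
spurious = joined - original
print(sorted(spurious))
```

The join returns every original tuple plus two extra ones that pair an employee
with a project merely because both are located in Houston.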

14.2 Functional Dependencies

* Functional Dependency (FD): In a relation (or table), an attribute (or a set of
attributes) A is functionally dependent on another attribute (or set of attributes) B,
written B → A, if each value of B is associated with exactly one value of A. In
other words, any two tuples that agree on B must also agree on A. For example:
a. Ssn → Ename
b. Pnumber → {Pname, Plocation}
c. {Ssn, Pnumber} → Hours
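Whether a functional dependency is violated by a given set of tuples can be
tested directly from the definition. The sketch below is one possible way to
do it in Python; the function fd_holds and the sample rows are assumptions
made up for illustration, while the attribute names follow the examples above.

```python
def fd_holds(rows, lhs, rhs):
    """True if tuples that agree on the lhs attributes also agree on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value and
        # compare every later tuple against it.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical tuples: which employee works how many hours on which project.
emp_proj = [
    {"Ssn": "111", "Ename": "Smith", "Pnumber": 1, "Hours": 32.5},
    {"Ssn": "111", "Ename": "Smith", "Pnumber": 2, "Hours": 7.5},
    {"Ssn": "222", "Ename": "Wong", "Pnumber": 2, "Hours": 10.0},
]

print(fd_holds(emp_proj, ["Ssn"], ["Ename"]))             # Ssn -> Ename
print(fd_holds(emp_proj, ["Ssn", "Pnumber"], ["Hours"]))  # {Ssn, Pnumber} -> Hours
print(fd_holds(emp_proj, ["Ssn"], ["Hours"]))             # Ssn alone is not enough
```

A functional dependency is a property of the schema, so sample data can only
disprove it. The first two checks pass for these tuples, but that alone does
not prove the dependencies hold in every valid state.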
14.3 Normal Forms Based on Primary Keys

* Normalization of data can be considered a process of analyzing the given
relation schemas based on their FDs and primary keys to achieve the desirable
properties of
(1) minimizing redundancy
(2) minimizing the insertion, deletion, and update anomalies
Definition : Denormalization is the process of storing the join of higher normal
form relations as a base relation, which is in a lower normal form.
Super Key : A super key is a set of one or more attributes (columns) in a database
table that can uniquely identify a record (row) within that table.
Candidate Key : A candidate key is a minimal super key, meaning it is a set of
attributes that uniquely identifies records without any unnecessary attributes.
Primary Key : The primary key is a specific candidate key chosen as the main
method for uniquely identifying records within a table.
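Super keys and candidate keys can be derived from a set of functional
dependencies by computing attribute closures. The following Python sketch
assumes a relation with the attributes Ssn, Pnumber, Hours, Ename, Pname, and
Plocation and the three dependencies listed in section 14.2; the function
names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """A super key determines every attribute of the relation."""
    return closure(attrs, fds) >= set(all_attrs)

def candidate_keys(all_attrs, fds):
    """Return the minimal super keys, trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            contains_key = any(set(key) <= set(combo) for key in keys)
            if not contains_key and is_superkey(combo, all_attrs, fds):
                keys.append(combo)
    return keys

attributes = ["Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"]
fds = [
    (["Ssn"], ["Ename"]),
    (["Pnumber"], ["Pname", "Plocation"]),
    (["Ssn", "Pnumber"], ["Hours"]),
]

print(is_superkey(["Ssn"], attributes, fds))                      # not a super key
print(is_superkey(["Ssn", "Pnumber", "Hours"], attributes, fds))  # super key, not minimal
print(candidate_keys(attributes, fds))                            # minimal super keys
```

The exhaustive search is only practical for small schemas, but it follows the
definitions directly: a candidate key is a super key that contains no smaller
super key.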
14.3.4 First Normal Form :

1NF disallows relations within relations. The only attribute values permitted by
1NF are single atomic values.
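One common way to reach 1NF is to move a multi-valued attribute into a relation
of its own. The Python sketch below assumes a hypothetical department relation
whose dlocations attribute holds a list of values.

```python
# Hypothetical relation that is not in 1NF: dlocations holds a set of values.
department = [
    {"dnumber": 5, "dname": "Research",
     "dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"dnumber": 1, "dname": "Headquarters", "dlocations": ["Houston"]},
]

# Keep only the atomic attributes in the original relation ...
department_1nf = [
    {"dnumber": d["dnumber"], "dname": d["dname"]} for d in department
]

# ... and store one (dnumber, dlocation) tuple per location in a new relation,
# whose key is the combination of both attributes.
dept_locations = [
    (d["dnumber"], location) for d in department for location in d["dlocations"]
]

print(department_1nf)
print(dept_locations)
```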
14.3.5 Second Normal Form :
Definition. A relation schema R is in 2NF if every nonprime attribute A in R is
fully functionally dependent on the primary key of R; that is, A does not depend
on any proper subset of the primary key.

14.3.6 Third Normal Form :


Definition : According to Codd’s original definition, a relation schema R is in 3NF
if it satisfies 2NF and no nonprime attribute of R is transitively dependent on the
primary key.

* Alternative Definition : A relation schema R is in 3NF if every nonprime
attribute of R meets both of the following conditions:
■ It is fully functionally dependent on every key of R.
■ It is nontransitively dependent on every key of R.

14.5 Boyce-Codd Normal Form


* Boyce-Codd Normal Form (BCNF) is a higher level of normalization in database
management systems (DBMS).
It is an extension of the third normal form (3NF) and was proposed by Raymond F.
Boyce and Edgar F. Codd to address certain anomalies that may still exist in a 3NF
schema.
* To understand BCNF, let's review the basics of normalization:
# First Normal Form (1NF): Ensures that data is stored in a tabular format with no
repeating groups or arrays. Each column contains atomic (indivisible) values, and
each cell in the table must hold a single, indivisible value.
# Second Normal Form (2NF): Requires the table to be in 1NF and ensures that all
non-prime attributes (attributes not part of any candidate key) are fully functionally
dependent on the primary key.
# Third Normal Form (3NF): Requires the table to be in 2NF and ensures that there
are no transitive dependencies. In other words, no non-prime attribute should
depend on another non-prime attribute.
* BCNF takes normalization further by addressing certain situations that might still
lead to anomalies in a 3NF schema. Specifically, BCNF deals with cases where
there are overlapping candidate keys.
A relation is in BCNF if, for every non-trivial functional dependency (X → Y), X
is a superkey. Here, a non-trivial functional dependency means that Y is not a
subset of X.
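Consider a relation R(A, B, C, D) with the functional dependencies:
A→B
B, C → D
In this example, the only candidate key is {A, C}: {A, C} determines B (through
A → B) and then D (through B, C → D), while neither A nor C alone determines
every attribute. {A} is not a superkey, because it determines only B. {B, C} is
not a superkey either, because it determines D but not A. Both dependencies
therefore violate BCNF, since their left-hand sides are not superkeys.
To bring the relation into BCNF, you might decompose it into three relations:

R1(A, B)
R2(B, C, D)
R3(A, C)
Now R1, R2, and R3 all satisfy BCNF. R3 keeps the key {A, C} together, which
makes the decomposition lossless; joining only R1 and R2 on B could generate
spurious tuples.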
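The superkey test in the BCNF definition can be checked mechanically with
attribute closures. The Python sketch below uses the relation and dependencies
of this example; the function names are hypothetical, single letters stand for
attributes, and the dependencies listed for R1 and R2 are the ones assumed to
hold inside each fragment.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """List the non-trivial FDs X -> Y whose left side X is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                # non-trivial
        and not set(attrs) <= closure(lhs, fds)    # X is not a superkey
    ]

# Each string is a set of single-letter attributes: "BC" means {B, C}.
fds = [("A", "B"), ("BC", "D")]

print(sorted(closure("A", fds)))    # attributes determined by {A}
print(sorted(closure("BC", fds)))   # attributes determined by {B, C}
print(sorted(closure("AC", fds)))   # attributes determined by {A, C}

print(bcnf_violations("ABCD", fds))          # R(A, B, C, D)
print(bcnf_violations("AB", [("A", "B")]))   # R1(A, B)
print(bcnf_violations("BCD", [("BC", "D")])) # R2(B, C, D)
```

For R, neither {A} nor {B, C} determines all four attributes, so both
dependencies are reported as violations, and {A, C} is the set that determines
everything. R1 and R2 report no violations.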
