Chapter 1: The Relational Data Model: Dbms Module 2
Module 2
Chapter 1: The Relational Data Model
Introduction
The relational data model was first introduced by Ted Codd of IBM Research in 1970 in a
classic paper (Codd 1970), and it attracted immediate attention due to its simplicity and
mathematical foundation. The model uses the concept of a mathematical relation, which
looks somewhat like a table of values, as its basic building block, and has its theoretical
basis in set theory and first-order predicate logic.
The first commercial implementations of the relational model became available in the early
1980s, such as the SQL/DS system on the MVS operating system by IBM and the Oracle
DBMS. Since then, the model has been implemented in a large number of commercial
systems. Current popular relational DBMSs (RDBMSs) include DB2 and Informix
Dynamic Server (from IBM), Oracle and Rdb (from Oracle), Sybase DBMS (from Sybase),
and SQL Server and Access (from Microsoft). In addition, several open source systems, such
as MySQL and PostgreSQL, are available.
The preceding are called logical definitions of domains. A data type or format is also
specified for each domain. For example, the data type for the domain
Usa_phone_numbers can be declared as a character string of the form (ddd)ddddddd,
where each d is a numeric (decimal) digit and the first three digits form a valid telephone
area code. The data type for Employee_ages is an integer number between 15 and 80.
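As a minimal sketch, the two domains above could be checked in Python as follows. The function names are hypothetical, the age bounds are assumed to be inclusive, and the check that the first three digits form a valid area code is omitted because the text does not list the valid codes.

```python
import re

# Format (ddd)ddddddd from the text: three digits in parentheses, then seven digits.
USA_PHONE_PATTERN = re.compile(r"\(\d{3}\)\d{7}")


def in_usa_phone_numbers(value):
    """Return True if value is a string of the form (ddd)ddddddd."""
    return isinstance(value, str) and USA_PHONE_PATTERN.fullmatch(value) is not None


def in_employee_ages(value):
    """Return True if value is an integer between 15 and 80 (assumed inclusive)."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 15 <= value <= 80
```

The logical definition (what the values mean) stays separate from the data type and format check, which is all that such a function can enforce.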
Attribute
An attribute Ai is the name of a role played by some domain D in the relation schema
R. D is called the domain of Ai and is denoted by dom(Ai).
Tuple
A tuple is a mapping from attributes to values drawn from the respective domains of those
attributes. Tuples are intended to describe some entity (or relationship between entities)
in the miniworld. Example: a tuple for a PERSON entity might be
Relation schema
The terms relation intension for the schema R and relation extension for a relation
state r(R) are also commonly used.
The order of attributes and their values is not important as long as the
correspondence between attributes and values is maintained. An alternative
definition of a relation can be given that makes the ordering of values in a tuple
unnecessary. In this definition, a relation schema R = {A1, A2, ..., An} is a set of
attributes, and a relation state r(R) is a finite set of mappings r = {t1, t2, ..., tm},
where each tuple ti is a mapping from R to D, the union of the attribute domains.
According to this definition of a tuple as a mapping, a tuple can be considered as a set
of (<attribute>, <value>) pairs, where each pair gives the value of the mapping from
an attribute Ai to a value vi from dom(Ai). The ordering of attributes is not important,
because the attribute name appears with its value.
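The mapping view of a tuple can be illustrated with Python dictionaries. This is only a sketch: the attribute names Ssn, Name, and Age are borrowed from the STUDENT example in these notes, and the values are made up for illustration.

```python
# Two tuples with the same (attribute, value) pairs, written in different orders.
t1 = {"Name": "Smith", "Ssn": "123456789", "Age": 25}
t2 = {"Age": 25, "Ssn": "123456789", "Name": "Smith"}

# Dictionary equality ignores the order of the pairs, matching the mapping view.
same_tuple = (t1 == t2)

# A relation state is a set of tuples; a frozenset of pairs makes each tuple hashable.
relation_state = {frozenset(t.items()) for t in (t1, t2)}
```

Because both dictionaries describe the same mapping, the relation state built from them contains a single tuple.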
3. Values and NULLs in the Tuples
Each value in a tuple is atomic. NULL values are used to represent the values of
attributes that may be unknown or may not apply to a tuple. For example, some
STUDENT tuples have NULL for their office phones because they do not have an
office. Another student has a NULL for home phone. In general, we can have several
meanings for NULL values, such as value unknown, value exists but is not
available, or attribute does not apply to this tuple (also known as value undefined).
4. Interpretation (Meaning) of a Relation
Some constraints cannot be directly expressed in the schemas of the data model,
and hence must be expressed and enforced by the application programs.
Examples of such constraints are "the salary of an employee should not exceed
the salary of the employee's supervisor" and "the maximum number of hours an
employee can work on all projects per week is 56."
1.2.1 Domain Constraints
Domain Constraints specify that within each tuple, the value of each attribute A must
be an atomic value from the domain dom(A). The data types associated with
domains typically include standard numeric data types for integers (such as short
integer, integer, and long integer) and real numbers (float and double-precision
float). Characters, Booleans, fixed-length strings, and variable-length strings are
also available, as are date, time, timestamp, and money, or other special data types.
1.2.2 Key Constraints and Constraints on NULL Values
All tuples in a relation must also be distinct. This means that no two tuples can
have the same combination of values for all their attributes. Usually there are other
subsets of attributes of a relation schema R with the property that no two tuples in any
relation state r of R should have the same combination of values for these
attributes.
Suppose that we denote one such subset of attributes by SK; then for any two
distinct tuples t1 and t2 in a relation state r of R, we have the constraint that
t1[SK] ≠ t2[SK]. Any such set of attributes SK is called a superkey of the relation
schema R.
Key
A key K of a relation schema R is a superkey of R with the additional property that
removing any attribute A from K leaves a set of attributes K' that is not a superkey
of R anymore. Hence, a key satisfies two properties:
1. Two distinct tuples in any state of the relation cannot have identical values for
(all) the attributes in the key. This first property also applies to a superkey.
2. It is a minimal superkey, that is, a superkey from which we cannot remove any
attributes and still have the uniqueness constraint in condition 1 hold. This
property is not required of a superkey.
Example: Consider the STUDENT relation.
The attribute set {Ssn} is a key of STUDENT because no two student tuples can
have the same value for Ssn.
Any set of attributes that includes Ssn, for example {Ssn, Name, Age}, is a
superkey.
The superkey {Ssn, Name, Age} is not a key of STUDENT because removing
Name or Age or both from the set still leaves us with a superkey.
In general, any superkey formed from a single attribute is also a key. A key with multiple
attributes must require all its attributes together to have the uniqueness property.
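The two properties of a key can be sketched as checks over a single relation state. The function names and the sample STUDENT tuples below are hypothetical. Note the limitation: a key is a property of the schema that must hold in every state, so a check on one state can show that a set of attributes is not a superkey, but it cannot prove that it is one.

```python
def is_superkey_in_state(state, attrs):
    """True if no two tuples in `state` agree on all attributes in `attrs`."""
    seen = set()
    for t in state:
        projection = tuple(t[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_key_in_state(state, attrs):
    """True if `attrs` is a superkey in `state` and removing any one attribute breaks that."""
    attrs = list(attrs)
    if not is_superkey_in_state(state, attrs):
        return False
    for a in attrs:
        rest = [b for b in attrs if b != a]
        if is_superkey_in_state(state, rest):
            return False  # a smaller superkey exists, so attrs is not minimal
    return True


# Made-up STUDENT tuples for illustration.
student = [
    {"Ssn": "111", "Name": "Ann", "Age": 19},
    {"Ssn": "222", "Name": "Bob", "Age": 19},
    {"Ssn": "333", "Name": "Ann", "Age": 21},
]
```

With this sample state, {Ssn} passes both checks, while {Ssn, Name, Age} passes only the superkey check because it is not minimal.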
Primary key
It is common to designate one of the candidate keys as the primary key of the
relation. This is the candidate key whose values are used to identify tuples in the
relation. We use the convention that the attributes that form the primary key of a
relation schema are underlined. Other candidate keys are designated as unique keys
and are not underlined.
Another constraint on attributes specifies whether NULL values are or are not permitted.
For example, if every STUDENT tuple must have a valid, non-NULL value for the Name
attribute, then Name of STUDENT is constrained to be NOT NULL.
Figure 1.2.3(a): Schema diagram for the COMPANY relational database schema.
A relational database state is a set of relation states DB = {r1, r2, ..., rm} such that each
ri is a state of Ri and the ri relation states satisfy the integrity constraints specified in IC.
Figure 1.2.3(b): One possible database state for the COMPANY relational database schema.
To define referential integrity more formally, first we define the concept of a foreign key. The
conditions for a foreign key, given below, specify a referential integrity constraint between the
two relation schemas R1 and R2.
A set of attributes FK in relation schema R1 is a foreign key of R1 that references relation R2
if it satisfies the following rules:
1. The attributes in FK have the same domain(s) as the primary key attributes PK
of R2; the attributes FK are said to reference or refer to the relation R2.
2. A value of FK in a tuple t1 of the current state r1(R1) either occurs as a value
of PK for some tuple t2 in the current state r2(R2) or is NULL.
In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1 references or refers
to the tuple t2.
In this definition, R1 is called the referencing relation and R2 is the referenced relation. If
these two conditions hold, a referential integrity constraint from R1 to R2 is said to hold.
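Rule 2 of the foreign key definition can be sketched as a check over two relation states. The function name and the sample tuples are hypothetical; the attribute names Dno and Dnumber follow the EMPLOYEE and DEPARTMENT examples used later in these notes. For simplicity the sketch treats a foreign key value as NULL only when all of its attributes are NULL.

```python
def referential_violations(referencing, fk, referenced, pk):
    """Return the tuples of `referencing` whose FK value is neither NULL
    nor equal to the PK value of some tuple in `referenced`."""
    pk_values = {tuple(t[a] for a in pk) for t in referenced}
    violations = []
    for t1 in referencing:
        value = tuple(t1[a] for a in fk)
        if all(v is None for v in value):
            continue  # a NULL foreign key is allowed by rule 2
        if value not in pk_values:
            violations.append(t1)  # no tuple t2 with t1[FK] = t2[PK]
    return violations


# Made-up relation states for illustration.
department = [{"Dnumber": 4}, {"Dnumber": 5}]
employee = [
    {"Ssn": "111", "Dno": 4},
    {"Ssn": "222", "Dno": None},
    {"Ssn": "333", "Dno": 7},
]

bad = referential_violations(employee, ["Dno"], department, ["Dnumber"])
```

Here only the tuple with Dno = 7 is reported, because no DEPARTMENT tuple has Dnumber = 7.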
Semantic integrity constraints can be specified and enforced within the application
programs that update the database, or by using a general-purpose constraint specification
language. Examples of such constraints are "the salary of an employee should not exceed
the salary of the employee's supervisor" and "the maximum number of hours an employee
can work on all projects per week is 56." Mechanisms called triggers and assertions can
be used to enforce them. In SQL, the CREATE ASSERTION and CREATE TRIGGER statements
can be used for this purpose.
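As an illustration, the 56-hour rule can be sketched as a trigger using Python's built-in sqlite3 module. This is an assumption-laden example, not the notes' own schema: the simplified works_on table, its column names, and the trigger name are hypothetical, the trigger covers INSERT only (an UPDATE trigger would be needed as well), and SQLite is used here because it supports CREATE TRIGGER but not CREATE ASSERTION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE works_on (
    essn  TEXT    NOT NULL,
    pno   INTEGER NOT NULL,
    hours REAL    NOT NULL,
    PRIMARY KEY (essn, pno)
);

-- Reject an insert if the employee's total hours would exceed 56.
CREATE TRIGGER max_weekly_hours
BEFORE INSERT ON works_on
WHEN (SELECT COALESCE(SUM(hours), 0) FROM works_on WHERE essn = NEW.essn)
     + NEW.hours > 56
BEGIN
    SELECT RAISE(ABORT, 'total weekly hours would exceed 56');
END;
""")


def try_insert(essn, pno, hours):
    """Insert a row; return True on success, False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO works_on VALUES (?, ?, ?)", (essn, pno, hours))
        return True
    except sqlite3.DatabaseError:
        return False


first = try_insert("123456789", 1, 40.0)   # 40 hours in total: accepted
second = try_insert("123456789", 2, 20.0)  # would make 60 hours: rejected
total = conn.execute("SELECT SUM(hours) FROM works_on").fetchone()[0]
```

Placing the rule in the database rather than in each application program means every program that inserts into the table is subject to the same check.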
A functional dependency constraint establishes a functional relationship between two sets of attributes X and
Y. This constraint specifies that the value of X determines a unique value of Y in all states of a
relation; it is denoted as a functional dependency X → Y. We use functional dependencies and other
types of dependencies as tools to analyze the quality of relational designs and to improve that quality.
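A functional dependency X → Y can be tested against one relation state with a short sketch like the following. The function name and the sample tuples are hypothetical. As with keys, the dependency must hold in all states, so a single state can refute X → Y but cannot establish it.

```python
def fd_holds_in_state(state, lhs, rhs):
    """True if any two tuples of `state` that agree on `lhs` also agree on `rhs`."""
    seen = {}
    for t in state:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # Remember the first Y value seen for each X value and compare later ones to it.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Made-up tuples for illustration.
emp_dept = [
    {"Ssn": "111", "Dno": 5, "Dname": "Research"},
    {"Ssn": "222", "Dno": 5, "Dname": "Research"},
    {"Ssn": "333", "Dno": 4, "Dname": "Administration"},
]

dno_determines_dname = fd_holds_in_state(emp_dept, ["Dno"], ["Dname"])
dname_determines_ssn = fd_holds_in_state(emp_dept, ["Dname"], ["Ssn"])
```

In this sample state Dno → Dname is not contradicted, while Dname → Ssn fails because two tuples share the Dname value but differ on Ssn.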
Define the constraints that a valid state of the database must satisfy
An Insert operation can violate any of the following types of constraints:
1. Domain constraints: if an attribute value is given that does not appear in the corresponding
domain or is not of the appropriate data type.
2. Key constraints: if a key value in the new tuple t already exists in another tuple in the relation
r(R).
3. Entity integrity: if any part of the primary key of the new tuple t is NULL.
4. Referential integrity: if the value of any foreign key in t refers to a tuple that does not exist in
the referenced relation.
Examples:
1. Operation:
Insert into EMPLOYEE a new tuple <…, F, 28000, NULL, 4> whose Ssn value is NULL.
Result: This insertion violates the entity integrity constraint (NULL for the primary key
Ssn), so it is rejected.
2. Operation:
Insert into EMPLOYEE a new tuple whose Ssn value already exists in another EMPLOYEE tuple.
Result: This insertion violates the key constraint because another tuple with the same Ssn
value already exists in the EMPLOYEE relation, and so it is rejected.
3. Operation:
Insert into EMPLOYEE a new tuple whose Dno value is 7.
Result: This insertion violates the referential integrity constraint specified on Dno in
EMPLOYEE because no corresponding referenced tuple exists in DEPARTMENT
with Dnumber = 7.
4. Operation:
-04-
If an insertion violates one or more constraints, the default option is to reject the insertion. It would
be useful if the DBMS could provide a reason to the user as to why the insertion was rejected.
Another option is to attempt to correct the reason for rejecting the insertion.
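The default behavior of rejecting an invalid insertion can be observed with Python's built-in sqlite3 module. The sketch below uses a heavily simplified, hypothetical version of EMPLOYEE and DEPARTMENT (three and one columns only) and made-up Ssn values. Two SQLite-specific assumptions are flagged in the comments: foreign keys must be switched on explicitly, and a primary key column must be declared NOT NULL for entity integrity to be enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE department (dnumber INTEGER PRIMARY KEY);
CREATE TABLE employee (
    ssn    TEXT NOT NULL PRIMARY KEY,  -- NOT NULL is needed in SQLite for entity integrity
    salary INTEGER CHECK (salary IS NULL OR typeof(salary) = 'integer'),
    dno    INTEGER REFERENCES department(dnumber)
);
INSERT INTO department VALUES (4);
INSERT INTO employee VALUES ('999887777', 25000, 4);
""")


def try_insert(row):
    """Attempt an insert; return the DBMS error message if rejected, else None."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = {
    "domain": try_insert(("111223333", "high", 4)),            # salary is not an integer
    "key": try_insert(("999887777", 28000, 4)),                # Ssn already exists
    "entity integrity": try_insert((None, 28000, 4)),          # NULL primary key
    "referential integrity": try_insert(("677678989", 28000, 7)),  # no department 7
    "valid": try_insert(("677678989", 28000, 4)),              # satisfies all constraints
}
```

Each rejected insertion yields an error message from the DBMS, which is the kind of explanation to the user that the paragraph above asks for; only the last insertion is accepted.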
Examples:
1. Operation:
Delete the WORKS_
Result: This deletion is acceptable and deletes exactly one tuple.
2. Operation:
Result: This deletion is not acceptable, because there are tuples in WORKS_ON
that refer to this tuple. Hence, if the tuple in EMPLOYEE is deleted,
referential integrity violations will result.
3. Operation:
Result: This deletion will result in even worse referential integrity violations,
because the tuple involved is referenced by tuples from the EMPLOYEE,
DEPARTMENT, WORKS_ON, and DEPENDENT relations.
Several options are available if a deletion operation causes a violation:
1. Restrict: reject the deletion.
2. Cascade: attempt to cascade (or propagate) the deletion by deleting tuples that
reference the tuple that is being deleted.
3. Set null or set default: modify the referencing attribute values that cause the
violation; each such value is either set to NULL or changed to reference another
default valid tuple.
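The effect of these options can be compared with a small sqlite3 sketch: rejecting the deletion (RESTRICT in SQL), cascading it, and setting the referencing value to NULL. The two-table schema, the helper name, and the sample values are hypothetical simplifications of the COMPANY example, and the set default option is left out.

```python
import sqlite3


def outcome_of_delete(on_delete):
    """Create a tiny schema with the given ON DELETE action, delete the
    referenced department, and report (rejected?, remaining employee rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
    conn.executescript(f"""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY);
    CREATE TABLE employee (
        ssn TEXT NOT NULL PRIMARY KEY,
        dno INTEGER REFERENCES department(dnumber) ON DELETE {on_delete}
    );
    INSERT INTO department VALUES (5);
    INSERT INTO employee VALUES ('123456789', 5);
    """)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM department WHERE dnumber = 5")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    rows = conn.execute("SELECT ssn, dno FROM employee").fetchall()
    conn.close()
    return rejected, rows


restrict = outcome_of_delete("RESTRICT")
cascade = outcome_of_delete("CASCADE")
set_null = outcome_of_delete("SET NULL")
```

With RESTRICT the deletion is rejected and the employee row is unchanged; with CASCADE the referencing row is deleted too; with SET NULL the row survives with a NULL Dno. Which option is appropriate depends on the meaning of the relationship, so it is chosen per foreign key when the schema is designed.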