Principles and Techniques of
DATABASE DESIGN
Introduction to Database Design Principles
Database design is the process of producing a detailed data model of a database. This data model contains all
the logical and physical design choices and physical storage parameters needed to generate a design
in a data definition language, which can then be used to create a database.
The Database Life Cycle consists of five stages:
1. Requirements analysis
2. Logical design
3. Physical design
4. Implementation
5. Monitoring, modification, and maintenance
The first three stages are database-design stages, which are briefly described below.
I. Requirements analysis
Requirements Analysis is the first and most important stage in the Database Life Cycle.
It is also the most labor-intensive stage for the database designer.
This stage involves assessing the informational needs of an organization so that a database can
be designed to meet those needs.
II. Logical design
During the first part of Logical Design, a conceptual model is created based on the needs
assessment performed in stage one. A conceptual model is typically an entity-relationship (ER)
diagram that shows the tables, fields, and primary keys of the database, and how tables are
related (linked) to one another.
The tables sketched in the ER diagram are then normalized. The normalization process resolves
structural problems in the database design, such as redundant data, so that data can be stored
consistently and accessed efficiently.
Certain database design books consider converting an ER diagram into SQL statements to be the
final task in the logical-design stage. According to such books, implementation is just a matter of
feeding SQL statements into an RDBMS and populating the database with data. The difference is
not especially important.
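As a minimal sketch of that last step, the example below feeds the SQL statements for a small ER diagram into an RDBMS and then populates the database. The two entities (department and employee, one department having many employees), their fields, and the use of SQLite through Python's sqlite3 module are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical ER diagram: DEPARTMENT (one) ---< EMPLOYEE (many).
DDL = """
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
"""

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(DDL)             # feed the SQL statements into the RDBMS

# Populate the database with data.
conn.execute("INSERT INTO department (dept_id, name) VALUES (1, 'Sales')")
conn.executemany(
    "INSERT INTO employee (emp_id, name, dept_id) VALUES (?, ?, ?)",
    [(1, "Ann", 1), (2, "Bob", 1)],
)

# The relationship drawn in the ER diagram becomes a join on the foreign key.
rows = conn.execute(
    "SELECT e.name, d.name FROM employee e "
    "JOIN department d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
).fetchall()
print(rows)
```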
V. Monitoring, modification, and maintenance
A successfully implemented database must be carefully monitored to ensure that it's functioning
properly and that it's secure from unauthorized access. The RDBMS usually provides utilities to
help monitor database functionality and security.
Database modification involves adding and deleting records, importing data from other systems
(as needed), and creating additional tables, user views, and other objects and tools. As an
organization grows, its information system must grow to remain useful.
Information system: Interrelated components (e.g., people, hardware, software, databases,
telecommunications, policies, and procedures) that input, process, output, and store data to
provide an organization with useful information.
Well-designed Database
A well-designed database enhances the organization's ability to expand its information system.
Periodic database backups, for example, are an important ongoing maintenance procedure.
Again, the RDBMS provides utilities to assist in this task.
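What such a backup utility looks like depends on the RDBMS. As one hedged illustration, Python's sqlite3 module exposes SQLite's online backup through Connection.backup(). The customer table and the in-memory databases below are assumptions chosen to keep the sketch self-contained; in practice the backup target would normally be a file.

```python
import sqlite3

# Hypothetical live database (in memory here so the sketch is self-contained).
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
live.execute("INSERT INTO customer (name) VALUES ('Ann')")
live.commit()

# Backup target; a real backup would usually connect to a file path instead.
backup = sqlite3.connect(":memory:")
live.backup(backup)  # copies the whole live database into the target

copied = backup.execute("SELECT id, name FROM customer").fetchall()
print(copied)
```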
Basic Design Principles
Avoid redundant information
That is, keep duplication of data to a minimum. The process of finding such a design is called
normalization. The normalization process can be formalized into a set of strict rules, and the level
of normalization can be expressed in terms of different normal forms, e.g., First Normal Form
(1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form
(BCNF), Fourth Normal Form (4NF), and Fifth Normal Form (5NF). However, in most cases it
is enough to follow the simple rule "avoid redundant information" using common sense.
This is the approach taken in this material.
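The rule can be illustrated with a small sketch. The tables and data below are hypothetical: order_flat repeats the customer's city on every order, while the customer / customer_order pair stores it once. SQLite through Python's sqlite3 module is used only as a convenient way to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Redundant design: the customer's city is repeated on every order row.
conn.executescript("""
CREATE TABLE order_flat (
    order_id      INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    customer_city TEXT NOT NULL
);
INSERT INTO order_flat VALUES (1, 'Ann', 'Oslo'), (2, 'Ann', 'Oslo'), (3, 'Bob', 'Rome');
""")

# Design without the redundancy: each customer's city is stored exactly once.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Ann', 'Oslo'), (2, 'Bob', 'Rome');
INSERT INTO customer_order VALUES (1, 1), (2, 1), (3, 2);
""")

# Ann moves: the flat table needs one change per order, the split design only one.
flat_updates = conn.execute(
    "UPDATE order_flat SET customer_city = 'Bergen' WHERE customer_name = 'Ann'"
).rowcount
split_updates = conn.execute(
    "UPDATE customer SET city = 'Bergen' WHERE name = 'Ann'"
).rowcount
print(flat_updates, split_updates)
```

If one of the redundant rows were missed during such an update, the flat table would hold two different cities for the same customer, which is the kind of inconsistency the rule is meant to prevent.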
Is the field value always required, i.e., are NULL values denied (NOT NULL constraint in SQL)?**
Should the field values be unique (PRIMARY KEY constraint, UNIQUE constraint)?***
What set of values is allowed (CHECK constraint, FOREIGN KEY constraint)?
Are there more complex constraints or business rules to be set up (ASSERTION constraint,
triggers)?
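The first three questions map directly onto declarative constraints. The sketch below is one possible illustration, assuming a hypothetical department / employee schema and SQLite driven from Python's sqlite3 module. SQLite has no ASSERTION constraint, so that case is left out, and it enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,                  -- required and unique
    salary  INTEGER NOT NULL CHECK (salary >= 0),  -- allowed set of values
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Sales');
INSERT INTO employee VALUES (1, 'ann@example.com', 3000, 1);
""")

def rejected(sql):
    """Return True if the RDBMS refuses the statement because of a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    "NOT NULL":    rejected("INSERT INTO employee VALUES (2, NULL, 3000, 1)"),
    "UNIQUE":      rejected("INSERT INTO employee VALUES (3, 'ann@example.com', 3000, 1)"),
    "CHECK":       rejected("INSERT INTO employee VALUES (4, 'bob@example.com', -5, 1)"),
    "FOREIGN KEY": rejected("INSERT INTO employee VALUES (5, 'cat@example.com', 3000, 99)"),
}
print(results)
```

Each of the four statements breaks exactly one constraint, so the database itself rejects the bad row without any application code having to check it.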
* Technically speaking, date and time data types fall into the category of numeric data types.
** Make sure that you understand the difference between an empty string and NULL. With string data
types they can look the same from the user's point of view, but they behave differently in search
conditions. To avoid confusion, some developers advise denying NULLs in fields with a string data type.
*** A table can have only one primary key but many alternate keys, i.e., keys defined with the UNIQUE
constraint. Notice that both constraints can involve a set of fields rather than only one field. Make sure
that you understand the difference between these two:
a) table has two UNIQUE constraints: UNIQUE(a), UNIQUE(b);
b) table has one UNIQUE constraint: UNIQUE(a,b).
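Both notes (** and ***) can be checked with a short experiment. The sketch below assumes three hypothetical tables in SQLite, run through Python's sqlite3 module: t_separate and t_combined correspond to cases a) and b), and person holds one empty-string nickname and one NULL nickname.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_separate (a INTEGER, b INTEGER, UNIQUE (a), UNIQUE (b));  -- case a)
CREATE TABLE t_combined (a INTEGER, b INTEGER, UNIQUE (a, b));           -- case b)
CREATE TABLE person (name TEXT, nickname TEXT);
INSERT INTO person VALUES ('Ann', ''), ('Bob', NULL);
""")

def accepts(table, rows):
    """Insert rows one by one and return how many the table accepts."""
    accepted = 0
    for a, b in rows:
        try:
            conn.execute(f"INSERT INTO {table} (a, b) VALUES (?, ?)", (a, b))
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # the row violates a UNIQUE constraint
    return accepted

rows = [(1, 1), (1, 2), (2, 1), (1, 1)]
separate = accepts("t_separate", rows)  # a and b must each be unique on their own
combined = accepts("t_combined", rows)  # only the pair (a, b) must be unique
print(separate, combined)

# An empty string is a value and matches '='; NULL is found only with IS NULL.
empty_matches = conn.execute("SELECT name FROM person WHERE nickname = ''").fetchall()
null_matches = conn.execute("SELECT name FROM person WHERE nickname IS NULL").fetchall()
equals_null = conn.execute("SELECT name FROM person WHERE nickname = NULL").fetchall()
print(empty_matches, null_matches, equals_null)
```

With two separate constraints only the first row is accepted, because every later row repeats a value of a or of b. With the combined constraint only the exact repeat of the pair (1, 1) is refused.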
Another way to speed up data searches is denormalization, which helps when a query would otherwise
have to join tables. Denormalization simply means combining two or more tables into one table.
Then the join operation does not have to be done by the query, which saves time. Again, this means
slowing down data update operations, because denormalized tables contain redundancies: the
same information has to be updated in many places.
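The trade-off can be sketched as follows, assuming hypothetical customer and customer_order tables in SQLite (run through Python's sqlite3 module) and a denormalized table order_denorm that stores their join result. The sketch shows only the structural difference; it does not measure query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: the city is stored once per customer.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Ann', 'Oslo');
INSERT INTO customer_order VALUES (10, 1), (11, 1);

-- Denormalized: the result of the join is stored as one table.
CREATE TABLE order_denorm AS
    SELECT o.order_id, c.name AS customer_name, c.city AS customer_city
    FROM customer_order o JOIN customer c ON c.customer_id = o.customer_id;
""")

# Same answer, but the second query needs no join.
joined = conn.execute(
    "SELECT o.order_id, c.city FROM customer_order o "
    "JOIN customer c ON c.customer_id = o.customer_id ORDER BY o.order_id"
).fetchall()
direct = conn.execute(
    "SELECT order_id, customer_city FROM order_denorm ORDER BY order_id"
).fetchall()

# The price: a change of city touches one row when normalized,
# but one row per order in the denormalized table.
norm_rows = conn.execute(
    "UPDATE customer SET city = 'Bergen' WHERE customer_id = 1"
).rowcount
denorm_rows = conn.execute(
    "UPDATE order_denorm SET customer_city = 'Bergen' WHERE customer_name = 'Ann'"
).rowcount
print(joined == direct, norm_rows, denorm_rows)
```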
Of the models mentioned below, the relational model is the most commonly used for
database designs. In some special cases, however, other models can be more beneficial.