Chap - 6 Normalization Database Tables

The document discusses database table normalization. Normalization is the process of evaluating and correcting database table structures to minimize data redundancy. It covers normal forms including 1NF, 2NF, 3NF and BCNF and how they are used to transform tables into higher normal forms for improved database design.

Uploaded by

吴阳轩
Copyright
© All Rights Reserved
Chapter 6

Normalization of Database Tables


Learning Objectives

• After completing this chapter, you will be able to:
• Explain normalization and its role in the database design process
• Identify and describe each of the normal forms: 1NF, 2NF, 3NF, BCNF, and 4NF
• Explain how normal forms can be transformed from lower normal forms to higher
normal forms
• Apply normalization rules to evaluate and correct table structures
• Identify situations that require denormalization to generate information efficiently
• Use a data-modeling checklist to check that the ERD meets a set of minimum
requirements

© 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use.

Database Tables and Normalization (1 of 2)

• Normalization: evaluating and correcting table structures to minimize data redundancies
• Reduces data anomalies
• Assigns attributes to tables based on determination
• Normal forms
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)

Database Tables and Normalization (2 of 2)

• Structural point of view of normal forms


• Higher normal forms are better than lower normal forms
• Properly designed 3NF structures meet the requirement of fourth normal form (4NF)
• Denormalization: produces a lower normal form
• Results in increased performance and greater data redundancy

The Need for Normalization

• Used while designing a new database structure


• Analyzes the relationship among the attributes within each entity
• Determines if the structure can be improved through normalization
• Improves the existing data structure and creates an appropriate database design

The Normalization Process (1 of 5)

• Objective is to ensure that each table conforms to the concept of well-formed relations
• Each table represents a single subject
• Each row/column intersection contains only one value and not a group of values
• No data item will be unnecessarily stored in more than one table
• All nonprime attributes in a table are dependent on the primary key
• Each table has no insertion, update, or deletion anomalies

The Normalization Process (2 of 5)

• Ensures that all tables are in at least 3NF


• Higher forms are not likely to be encountered in a business environment
• Works one relation at a time
• Identifies the dependencies of a relation (table)
• Progressively breaks the relation up into a new set of relations

The Normalization Process (3 of 5)

Table 6.2: Normal Forms

Normal Form                      Characteristic                                                Section
First normal form (1NF)          Table format, no repeating groups, and PK identified          6-3a
Second normal form (2NF)         1NF and no partial dependencies                               6-3b
Third normal form (3NF)          2NF and no transitive dependencies                            6-3c
Boyce-Codd normal form (BCNF)    Every determinant is a candidate key (special case of 3NF)    6-6a
Fourth normal form (4NF)         3NF and no independent multivalued dependencies               6-6b

The Normalization Process (4 of 5)

Table 6.3: Functional Dependence Concepts


Concept                        Definition
Functional dependence          The attribute B is fully functionally dependent on the attribute A
                               if each value of A determines one and only one value of B.
                               Example: PROJ_NUM → PROJ_NAME (read as "PROJ_NUM functionally
                               determines PROJ_NAME"). In this case, the attribute PROJ_NUM is
                               known as the determinant attribute, and the attribute PROJ_NAME is
                               known as the dependent attribute.
Functional dependence          Attribute A determines attribute B (that is, B is functionally
(generalized definition)       dependent on A) if all of the rows in the table that agree in value
                               for attribute A also agree in value for attribute B.
Fully functional dependence    If attribute B is functionally dependent on a composite key A but
(composite key)                not on any subset of that composite key, the attribute B is fully
                               functionally dependent on A.
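
The generalized definition above can be tested mechanically against a set of rows. The following is a minimal sketch, not part of the original slides: the helper name `holds` and the sample values are invented for illustration. Note that sample data can only refute a dependency; whether it holds in general is a business rule.

```python
def holds(rows, determinant, dependent):
    """Return True if all rows that agree on `determinant` also agree on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows (values invented for illustration).
rows = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 103},
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 101},
    {"PROJ_NUM": 18, "PROJ_NAME": "Amber Wave", "EMP_NUM": 103},
]

proj_num_determines_name = holds(rows, ["PROJ_NUM"], ["PROJ_NAME"])
emp_num_determines_name = holds(rows, ["EMP_NUM"], ["PROJ_NAME"])
print(proj_num_determines_name, emp_num_determines_name)
```

In this sample, PROJ_NUM → PROJ_NAME is not contradicted, while EMP_NUM → PROJ_NAME is refuted because employee 103 appears with two different project names.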

The Normalization Process (5 of 5)

• Partial dependency: functional dependence in which the determinant is only part of the primary key
• Assumption: one candidate key
• Straightforward
• Easy to identify
• Transitive dependency: attribute is dependent on another attribute that is not part of the primary key
• More difficult to identify among a set of data
• Occurs only when a functional dependence exists among nonprime attributes
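
The two definitions above can be turned into a small classifier. This is a sketch under the slide's assumption of a single candidate key; the function name `classify` and the dependency list are assumptions made for illustration (only PROJ_NUM → PROJ_NAME appears in the slides).

```python
def classify(primary_key, determinant, dependent):
    """Classify one functional dependency relative to a single primary key."""
    pk, det, dep = set(primary_key), set(determinant), set(dependent)
    if det == pk:
        return "full"        # determined by the whole primary key
    if det < pk:
        return "partial"     # determinant is only part of the primary key
    if not det & pk and not dep & pk:
        return "transitive"  # a nonprime attribute determines a nonprime attribute
    return "other"

# Hypothetical dependencies for a project-assignment table.
PRIMARY_KEY = ("PROJ_NUM", "EMP_NUM")
DEPENDENCIES = [
    (("PROJ_NUM", "EMP_NUM"), ("HOURS",)),
    (("PROJ_NUM",), ("PROJ_NAME",)),
    (("EMP_NUM",), ("EMP_NAME", "JOB_CLASS", "CHG_HOUR")),
    (("JOB_CLASS",), ("CHG_HOUR",)),
]

labels = [classify(PRIMARY_KEY, det, dep) for det, dep in DEPENDENCIES]
for (det, dep), label in zip(DEPENDENCIES, labels):
    print(", ".join(det), "->", ", ".join(dep), ":", label)
```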

Conversion to First Normal Form (1NF) (1 of 3)

• Repeating group: a group of multiple entries of the same type that can exist for any single key attribute occurrence
• Eliminating repeating groups reduces data redundancies
• Three-step procedure
• Eliminate the repeating groups
• Identify the primary key
• Identify all dependencies
• Dependency diagram: depicts all dependencies found within a given table structure
• Helps to get an overview of all relationships among a table's attributes
• Makes it less likely that an important dependency will be overlooked
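
The first two steps of the procedure can be sketched in a few lines. The record layout, the `ASSIGNMENTS` field, and the helper name below are hypothetical; they stand in for any structure in which one key occurrence carries a repeating group.

```python
def eliminate_repeating_groups(records, group):
    """Emit one flat row per entry of the repeating group, copying the parent values."""
    rows = []
    for record in records:
        parent = {k: v for k, v in record.items() if k != group}
        for entry in record[group]:
            rows.append({**parent, **entry})
    return rows

# Hypothetical unnormalized data: each project carries a repeating group of assignments.
records = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen",
     "ASSIGNMENTS": [{"EMP_NUM": 103, "HOURS": 23.8}, {"EMP_NUM": 101, "HOURS": 19.4}]},
    {"PROJ_NUM": 18, "PROJ_NAME": "Amber Wave",
     "ASSIGNMENTS": [{"EMP_NUM": 103, "HOURS": 12.6}]},
]

flat = eliminate_repeating_groups(records, "ASSIGNMENTS")

# Step 2: the proposed primary key must identify every flat row uniquely.
keys = [(row["PROJ_NUM"], row["EMP_NUM"]) for row in flat]
pk_is_unique = len(set(keys)) == len(keys)
print(len(flat), pk_is_unique)
```

After flattening, PROJ_NUM alone no longer identifies a row, which is why a composite key such as (PROJ_NUM, EMP_NUM) is assumed here.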

Conversion to First Normal Form (1NF) (2 of 3)

• 1NF describes tabular format in which:


• All key attributes are defined
• There are no repeating groups in the table
• All attributes are dependent on the primary key
• All relational tables satisfy 1NF requirements
• Some tables contain partial dependencies
• Subject to update, insertion, or deletion anomalies

Conversion to First Normal Form (1NF) (3 of 3)

Conversion to Second Normal Form (2NF) (1 of 2)

• Conversion to 2NF occurs only when the 1NF table has a composite primary key
• If the 1NF table has a single-attribute primary key, then the table is automatically in 2NF
• The 1NF-to-2NF conversion is simple
• Make new tables to eliminate partial dependencies
• Reassign corresponding dependent attributes
• Table is in 2NF when it:
• Is in 1NF
• Includes no partial dependencies
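
As a rough illustration of the conversion, each partial dependency becomes its own table by projection, and the attributes that depend on the whole key stay behind. The table names, attribute names, and rows below are assumptions chosen for the sketch.

```python
def project(rows, attributes):
    """Project rows onto `attributes`, dropping duplicate tuples (order preserved)."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

# Hypothetical 1NF table with composite primary key (PROJ_NUM, EMP_NUM).
rows_1nf = [
    {"PROJ_NUM": 15, "EMP_NUM": 103, "PROJ_NAME": "Evergreen", "EMP_NAME": "Employee A", "HOURS": 23.8},
    {"PROJ_NUM": 15, "EMP_NUM": 101, "PROJ_NAME": "Evergreen", "EMP_NAME": "Employee B", "HOURS": 19.4},
    {"PROJ_NUM": 18, "EMP_NUM": 103, "PROJ_NAME": "Amber Wave", "EMP_NAME": "Employee A", "HOURS": 12.6},
]

# Partial dependencies: PROJ_NUM -> PROJ_NAME and EMP_NUM -> EMP_NAME.
project_table = project(rows_1nf, ["PROJ_NUM", "PROJ_NAME"])
employee_table = project(rows_1nf, ["EMP_NUM", "EMP_NAME"])
# HOURS depends on the whole key, so it stays with the composite key.
assignment_table = project(rows_1nf, ["PROJ_NUM", "EMP_NUM", "HOURS"])

print(len(project_table), len(employee_table), len(assignment_table))
```

Each project name and employee name is now stored once, so a rename touches a single row.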

Conversion to Second Normal Form (2NF) (2 of 2)

Conversion to Third Normal Form (3NF) (1 of 2)

• The data anomalies created by the database organization shown in Figure 6.4
are easily eliminated
• Make new tables to eliminate transitive dependencies
• Reassign corresponding dependent attributes
• Table is in 3NF when it:
• Is in 2NF
• Contains no transitive dependencies
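
The same idea can be sketched with Python's built-in `sqlite3` module. The schema and rows are hypothetical: a 2NF employee table in which JOB_CLASS → CHG_HOUR is assumed to be the transitive dependency. The determinant moves to a new table together with its dependent attribute and stays behind as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical 2NF table: CHG_HOUR depends on JOB_CLASS, not directly on EMP_NUM.
CREATE TABLE EMPLOYEE_2NF (
    EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT, JOB_CLASS TEXT, CHG_HOUR REAL);
INSERT INTO EMPLOYEE_2NF VALUES
    (101, 'Employee A', 'Database Designer', 105.0),
    (103, 'Employee B', 'Elect. Engineer', 84.5),
    (105, 'Employee C', 'Database Designer', 105.0);

-- New table for the transitive dependency.
CREATE TABLE JOB (JOB_CLASS TEXT PRIMARY KEY, CHG_HOUR REAL);
INSERT INTO JOB SELECT DISTINCT JOB_CLASS, CHG_HOUR FROM EMPLOYEE_2NF;

-- The determinant remains in the original table as a foreign key.
CREATE TABLE EMPLOYEE (
    EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT,
    JOB_CLASS TEXT REFERENCES JOB (JOB_CLASS));
INSERT INTO EMPLOYEE SELECT EMP_NUM, EMP_NAME, JOB_CLASS FROM EMPLOYEE_2NF;
""")

original = conn.execute(
    "SELECT EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR FROM EMPLOYEE_2NF ORDER BY EMP_NUM"
).fetchall()
rejoined = conn.execute(
    "SELECT E.EMP_NUM, E.EMP_NAME, E.JOB_CLASS, J.CHG_HOUR "
    "FROM EMPLOYEE E JOIN JOB J ON J.JOB_CLASS = E.JOB_CLASS ORDER BY E.EMP_NUM"
).fetchall()
job_count = conn.execute("SELECT COUNT(*) FROM JOB").fetchone()[0]
print(job_count, original == rejoined)
```

Joining the two new tables reproduces the original rows, so the decomposition loses no information while storing each hourly charge once per job class.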

Conversion to Third Normal Form (3NF) (2 of 2)

Improving the Design

• Normalization is valuable because its use helps eliminate data redundancies


• Evaluate PK assignments and naming conventions
• Refine attribute atomicity
- Atomic attribute: cannot be further subdivided
- Atomicity: characteristic of an atomic attribute
• Identify new attributes and new relationships
• Refine primary keys as required for data granularity
- Granularity: Level of detail represented by the values stored in a table’s row
• Maintain historical accuracy and evaluate using derived attributes

Surrogate Key Considerations

• Used by designers when the primary key is considered to be unsuitable


• System-defined attribute
• Created and managed via the DBMS
• Has a numeric value that is automatically incremented for each new row
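
One way to see a surrogate key in action is SQLite's `INTEGER PRIMARY KEY AUTOINCREMENT`, used here through Python's `sqlite3` module. The table and column names are hypothetical, and other DBMSs provide the same behavior through their own mechanisms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ASSIGNMENT (
        ASSIGN_NUM   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, managed by the DBMS
        PROJ_NUM     INTEGER NOT NULL,
        EMP_NUM      INTEGER NOT NULL,
        ASSIGN_HOURS REAL
    )
""")

cur = conn.cursor()
generated_ids = []
# The same employee may work on the same project more than once,
# so (PROJ_NUM, EMP_NUM) alone would not be a suitable primary key here.
for proj_num, emp_num, hours in [(15, 103, 3.5), (18, 101, 4.2), (15, 103, 2.0)]:
    cur.execute(
        "INSERT INTO ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS) VALUES (?, ?, ?)",
        (proj_num, emp_num, hours),
    )
    generated_ids.append(cur.lastrowid)

print(generated_ids)
```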

The Boyce-Codd Normal Form (1 of 4)

• Every determinant in the table should be a candidate key


• Candidate key: same characteristics as primary key but not chosen to be the primary
key
• Equivalent to 3NF when the table contains only one candidate key
• Violated only when the table contains more than one candidate key
• Considered to be a special case of 3NF

The Boyce-Codd Normal Form (2 of 4)

The Boyce-Codd Normal Form (3 of 4)

The Boyce-Codd Normal Form (4 of 4)

Table 6.5: Sample Data for a BCNF Conversion

STU_ID    STAFF_ID    CLASS_CODE    ENROLL_GRADE
125       25          21334         A
125       20          32456         C
135       20          28458         B
144       25          27563         C
144       20          32456         B
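
The sample rows can be probed for a BCNF violation. The sketch below assumes the dependency CLASS_CODE → STAFF_ID (the slide that stated the dependencies did not survive extraction) and checks whether that determinant could serve as a candidate key. The helper names are invented, and sample data can only refute a dependency or a key, never prove one.

```python
ROWS = [
    {"STU_ID": 125, "STAFF_ID": 25, "CLASS_CODE": 21334, "ENROLL_GRADE": "A"},
    {"STU_ID": 125, "STAFF_ID": 20, "CLASS_CODE": 32456, "ENROLL_GRADE": "C"},
    {"STU_ID": 135, "STAFF_ID": 20, "CLASS_CODE": 28458, "ENROLL_GRADE": "B"},
    {"STU_ID": 144, "STAFF_ID": 25, "CLASS_CODE": 27563, "ENROLL_GRADE": "C"},
    {"STU_ID": 144, "STAFF_ID": 20, "CLASS_CODE": 32456, "ENROLL_GRADE": "B"},
]

def holds(rows, determinant, dependent):
    """True if rows agreeing on `determinant` also agree on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True

def unique_in_sample(rows, attributes):
    """True if no two sample rows share the same values for `attributes`."""
    keys = [tuple(row[a] for a in attributes) for row in rows]
    return len(set(keys)) == len(keys)

fd_holds = holds(ROWS, ["CLASS_CODE"], ["STAFF_ID"])
determinant_is_key = unique_in_sample(ROWS, ["CLASS_CODE"])
composite_key_ok = unique_in_sample(ROWS, ["STU_ID", "STAFF_ID"])

# A determinant that cannot be a candidate key signals a BCNF violation.
print(fd_holds, determinant_is_key, composite_key_ok)
```

CLASS_CODE 32456 appears in two rows, so CLASS_CODE cannot be a candidate key even though the sample is consistent with CLASS_CODE determining STAFF_ID.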

Fourth Normal Form (4NF) (1 of 2)

• Rules
• All attributes must be dependent on the primary key, but they must be independent
of each other
• No row may contain two or more multivalued facts about an entity
• Table is in 4NF when it:
• Is in 3NF
• Has no multivalued dependencies
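
To make the second rule concrete, the sketch below stores two independent multivalued facts about one employee in a single table and then splits them. The employee number, organization codes, and service codes are all hypothetical.

```python
from itertools import product

# Hypothetical independent multivalued facts about employee 10123.
orgs = {10123: ["RC", "UW"]}       # organizations the employee volunteers for
services = {10123: [1, 5, 12]}     # service assignments, unrelated to the organizations

# One table holding both facts must pair every organization with every service.
combined = [
    (emp, org, svc)
    for emp in orgs
    for org, svc in product(orgs[emp], services[emp])
]

# 4NF decomposition: one table per multivalued fact.
emp_org = sorted({(emp, org) for emp, org, _ in combined})
emp_service = sorted({(emp, svc) for emp, _, svc in combined})

# Joining the two tables on the employee number restores the combined rows.
rejoined = sorted(
    (emp, org, svc)
    for emp, org in emp_org
    for emp2, svc in emp_service
    if emp == emp2
)

print(len(combined), len(emp_org), len(emp_service))
```

The combined table needs six rows to record five facts; the two decomposed tables hold exactly five rows and lose nothing.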

Fourth Normal Form (4NF) (2 of 2)

Normalization and Database Design (1 of 6)

• Normalization should be part of the design process


• Proposed entities must meet the required normal form before table structures are created
• Normalization principles and procedures must be understood in order to redesign and modify databases
• ERD is created through an iterative process
• Normalization focuses on the characteristics of specific entities

Normalization and Database Design (2 of 6)

Normalization and Database Design (3 of 6)

Normalization and Database Design (4 of 6)

Normalization and Database Design (5 of 6)

Normalization and Database Design (6 of 6)

Denormalization (1 of 2)

• Design goals
• Creation of normalized relations
• Processing requirements and speed
• Number of database tables expands
• Tables are decomposed to conform to normalization requirements
• Joining a larger number of tables
• Takes additional input/output (I/O) operations and processing logic
• Reduces system speed
• Defects in unnormalized tables
• Data updates are less efficient because tables are larger
• Indexing is more cumbersome
• No simple strategies for creating virtual tables known as views

Denormalization (2 of 2)
Table 6.6: Common Denormalization Examples

Case: Redundant data
Example: Storing ZIP and CITY attributes in the AGENT table when ZIP determines CITY (see Figure 2.2)
Rationale and controls:
• Avoid extra join operations
• Program can validate city (drop-down box) based on the zip code

Case: Derived data
Example: Storing STU_HRS and STU_CLASS (student classification) when STU_HRS determines STU_CLASS (see Figure 3.28)
Rationale and controls:
• Avoid extra join operations
• Program can validate classification (lookup) based on the student hours

Case: Preaggregated data (also derived data)
Example: Storing the student grade point average (STU_GPA) aggregate value in the STUDENT table when this can be calculated from the ENROLL and COURSE tables (see Figure 3.28)
Rationale and controls:
• Avoid extra join operations
• Program computes the GPA every time a grade is entered or updated
• STU_GPA can be updated only via administrative routine

Case: Information requirements
Example: Using a temporary denormalized table to hold report data; this is required when creating a tabular report in which the columns represent data that are stored in the table as rows (see Figures 6.17 and 6.18)
Rationale and controls:
• Impossible to generate the data required by the report using plain SQL
• No need to maintain table
• Temporary table is deleted once report is done
• Processing speed is not an issue
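
The preaggregated-data case can be sketched with Python's `sqlite3` module. Only STU_GPA and the STUDENT, ENROLL, and COURSE table names come from the table above; the remaining columns, the credit-weighted GPA formula, and the `refresh_gpa` routine are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_GPA REAL);
CREATE TABLE COURSE  (CRS_CODE TEXT PRIMARY KEY, CRS_CREDIT INTEGER);
CREATE TABLE ENROLL  (STU_NUM INTEGER, CRS_CODE TEXT, GRADE_POINTS REAL,
                      PRIMARY KEY (STU_NUM, CRS_CODE));
INSERT INTO STUDENT (STU_NUM) VALUES (321452);
INSERT INTO COURSE VALUES ('ACCT-211', 3), ('CIS-220', 3), ('QM-261', 4);
INSERT INTO ENROLL VALUES
    (321452, 'ACCT-211', 4.0), (321452, 'CIS-220', 3.0), (321452, 'QM-261', 2.0);
""")

def refresh_gpa(conn, stu_num):
    """Recompute the stored STU_GPA; meant to run whenever a grade is entered or updated."""
    conn.execute(
        """
        UPDATE STUDENT
           SET STU_GPA = (SELECT SUM(E.GRADE_POINTS * C.CRS_CREDIT) / SUM(C.CRS_CREDIT)
                            FROM ENROLL E JOIN COURSE C ON C.CRS_CODE = E.CRS_CODE
                           WHERE E.STU_NUM = ?)
         WHERE STU_NUM = ?
        """,
        (stu_num, stu_num),
    )

refresh_gpa(conn, 321452)
# Reports can now read the redundant value without joining ENROLL and COURSE.
gpa = conn.execute("SELECT STU_GPA FROM STUDENT WHERE STU_NUM = ?", (321452,)).fetchone()[0]
print(round(gpa, 2))
```

The trade-off is the one the table names: reads avoid the join, but the stored value is only as accurate as the routine that maintains it.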

Data-Modeling Checklist (1 of 6)

• Business rules
• Properly document and verify all business rules with the end users
• Ensure that all business rules are written precisely, clearly, and simply
- The business rules must help identify entities, attributes, relationships, and constraints
• Identify the source of all business rules, and ensure that each business rule is justified,
dated, and signed off by an approving authority

Data-Modeling Checklist (2 of 6)

• Data modeling
• Naming conventions: all names should be limited in length (database-dependent size)
• Entity names:
• Should be nouns that are familiar to business and should be short and meaningful
• Should document abbreviations, synonyms, and aliases for each entity
• Should be unique within the model
• For composite entities, may include a combination of abbreviated names of the
entities linked through the composite entity

Data-Modeling Checklist (3 of 6)

• Attribute names:
• Should be unique within the entity
• Should use the entity abbreviation as a prefix
• Should be descriptive of the characteristic
• Should use suffixes such as _ID, _NUM, or _CODE for the PK attribute
• Should not be a reserved word
• Should not contain spaces or special characters such as @, !, or &
• Relationship names:
• Should be active or passive verbs that clearly indicate the nature of the relationship
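
Several of the attribute-naming rules above lend themselves to an automated check. The sketch below is illustrative only: the function name is invented, and the reserved-word set is a tiny sample, since the real list depends on the DBMS.

```python
import re

# Small illustrative subset; the actual reserved-word list is DBMS-specific.
RESERVED_WORDS = {"SELECT", "TABLE", "ORDER", "GROUP"}

def check_attribute_name(name, entity_prefix):
    """Return a list of naming-rule violations for one attribute name."""
    problems = []
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        problems.append("contains spaces or special characters")
    if name.upper() in RESERVED_WORDS:
        problems.append("is a reserved word")
    if not name.upper().startswith(entity_prefix.upper() + "_"):
        problems.append("lacks the entity prefix")
    return problems

good = check_attribute_name("EMP_NUM", "EMP")
reserved = check_attribute_name("ORDER", "ORD")
special = check_attribute_name("EMP_E@MAIL", "EMP")
print(good, reserved, special)
```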

Data-Modeling Checklist (4 of 6)

• Entities:
• Each entity should represent a single subject
• Each entity should represent a set of distinguishable entity instances
• All entities should be in 3NF or higher
- Any entities below 3NF should be justified
• Granularity of the entity instance should be clearly defined
• PK should be clearly defined and support the selected data granularity

Data-Modeling Checklist (5 of 6)

• Attributes:
• Should be simple and single-valued (atomic data)
• Should document default values, constraints, synonyms, and aliases
• Derived attributes should be clearly identified and include source(s)
• Should not be redundant unless this is required for transaction accuracy, performance,
or maintaining a history
• Nonkey attributes must be fully dependent on the PK attribute
• Relationships:
• Should clearly identify relationship participants
• Should clearly define participation, connectivity, and document cardinality

Data-Modeling Checklist (6 of 6)

• ER model:
• Should be validated against expected processes: inserts, updates, and deletions
• Should evaluate where, when, and how to maintain a history
• Should not contain redundant relationships except as required (see attributes)
• Should minimize data redundancy to ensure single-place updates
• Should conform to the minimal data rule: All that is needed is there, and all that is
there is needed

Summary (1 of 2)

• Normalization is a technique used to design tables in which data redundancies are minimized
• A table is in 1NF when all key attributes are defined and all remaining attributes
are dependent on the primary key
• A table is in 2NF when it is in 1NF and contains no partial dependencies
• A table is in 3NF when it is in 2NF and contains no transitive dependencies
• A table that is not in 3NF may be split into new tables until all of the tables
meet the 3NF requirements
• Normalization is an important part—but only a part—of the design process
• A table in 3NF might contain multivalued dependencies that produce either
numerous null values or redundant data

Summary (2 of 2)

• The larger the number of tables, the more additional I/O operations and
processing logic you need to join them
• The data-modeling checklist provides a way for the designer to check that the
ERD meets a set of minimum requirements

