Chapter 6: Normalization of Database Tables
© 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use.
Database Tables and Normalization (1 of 2)
Database Tables and Normalization (2 of 2)
The Need for Normalization
The Normalization Process (1 of 5)
The Normalization Process (2 of 5)
The Normalization Process (3 of 5)
• First normal form (1NF): table format, no repeating groups, and PK identified (Section 6-3a)
• Boyce-Codd normal form (BCNF): every determinant is a candidate key (special case of 3NF) (Section 6-6a)
• Fourth normal form (4NF): 3NF and no independent multivalued dependencies (Section 6-6b)
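The BCNF condition, that every determinant is a candidate key, can be checked mechanically once the functional dependencies are written down. The following is a minimal sketch, not a definitive implementation: the attribute names A, B, C, D and the dependency list are hypothetical, and the test uses the slightly weaker superkey check (the determinant's closure covers every attribute).

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already hold the whole determinant, we gain its dependents.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List nontrivial dependencies whose determinant does not determine the whole table."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Hypothetical table and dependencies, chosen only for illustration.
ATTRS = ("A", "B", "C", "D")
FDS = [
    (("A", "B"), ("C", "D")),  # A + B is a key
    (("C",), ("B",)),          # C is a determinant but not a key
]

violations = bcnf_violations(ATTRS, FDS)
```

Here `violations` contains only the dependency C → B, so the hypothetical table fails the BCNF condition because of that one dependency.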
The Normalization Process (4 of 5)
The Normalization Process (5 of 5)
Conversion to First Normal Form (1NF) (1 of 3)
• Repeating group: a group of multiple entries of the same type that can exist for any single key attribute occurrence
  • Eliminating repeating groups reduces data redundancies
• Three-step procedure
  • Eliminate the repeating groups
  • Identify the primary key
  • Identify all dependencies
• Dependency diagram: depicts all dependencies found within a given table structure
  • Helps to get an overview of all relationships among a table’s attributes
  • Makes it less likely that an important dependency will be overlooked
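The first two steps of the procedure can be sketched in a few lines. This is only an illustration under stated assumptions: the attribute names (PROJ_NUM, PROJ_NAME, EMP_NUM, HOURS) and all data values are hypothetical, and `is_key` tests uniqueness on the sample rows only, which is a necessary but not sufficient sign of a primary key.

```python
# Hypothetical unnormalized data: each project carries a repeating group of assignments.
projects = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen",
     "ASSIGNMENTS": [{"EMP_NUM": 103, "HOURS": 23.8},
                     {"EMP_NUM": 101, "HOURS": 19.4}]},
    {"PROJ_NUM": 18, "PROJ_NAME": "Amber Wave",
     "ASSIGNMENTS": [{"EMP_NUM": 114, "HOURS": 24.6}]},
]


def to_1nf(records):
    """Step 1: eliminate the repeating group by emitting one flat row per entry."""
    rows = []
    for rec in records:
        for entry in rec["ASSIGNMENTS"]:
            rows.append({
                "PROJ_NUM": rec["PROJ_NUM"],
                "PROJ_NAME": rec["PROJ_NAME"],
                "EMP_NUM": entry["EMP_NUM"],
                "HOURS": entry["HOURS"],
            })
    return rows


def is_key(rows, attrs):
    """Step 2 aid: True when the attribute combination is unique across the given rows."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


rows_1nf = to_1nf(projects)
```

On this sample, PROJ_NUM alone repeats, while the combination of PROJ_NUM and EMP_NUM identifies each row, which suggests a composite primary key. Step 3, identifying all dependencies, still requires knowledge of the business rules and cannot be read off the sample data.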
Conversion to First Normal Form (1NF) (2 of 3)
Conversion to First Normal Form (1NF) (3 of 3)
Conversion to Second Normal Form (2NF) (1 of 2)
• Conversion to 2NF is needed only when the 1NF table has a composite primary key
  • If the 1NF table has a single-attribute primary key, then the table is automatically in 2NF
• The 1NF-to-2NF conversion is simple
  • Make new tables to eliminate partial dependencies
  • Reassign corresponding dependent attributes
• Table is in 2NF when it:
  • Is in 1NF
  • Includes no partial dependencies
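A partial dependency is one whose determinant is only part of the composite primary key. The sketch below finds such dependencies and groups the dependent attributes into new tables keyed by their determinant. The attribute names and the dependency list are hypothetical, and the function handles only dependencies whose determinant lies inside the primary key.

```python
# Hypothetical composite key and dependencies, for illustration only.
PK = ("PROJ_NUM", "EMP_NUM")
FDS = [
    (("PROJ_NUM", "EMP_NUM"), ("HOURS",)),
    (("PROJ_NUM",), ("PROJ_NAME",)),
    (("EMP_NUM",), ("EMP_NAME", "JOB_CLASS", "CHG_HOUR")),
]


def partial_dependencies(pk, fds):
    """Dependencies whose determinant is a proper subset of the primary key."""
    return [(lhs, rhs) for lhs, rhs in fds if set(lhs) < set(pk)]


def decompose_2nf(pk, fds):
    """Map each determinant inside the key to the attributes that depend on it."""
    tables = {tuple(pk): []}  # the original table keeps the full composite key
    for lhs, rhs in fds:
        if set(lhs) <= set(pk):
            tables.setdefault(tuple(lhs), []).extend(rhs)
    return tables


partials = partial_dependencies(PK, FDS)
tables_2nf = decompose_2nf(PK, FDS)
```

The result has three tables: one keyed by PROJ_NUM and EMP_NUM that holds HOURS, one keyed by PROJ_NUM that holds PROJ_NAME, and one keyed by EMP_NUM that holds the employee attributes.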
Conversion to Second Normal Form (2NF) (2 of 2)
Conversion to Third Normal Form (3NF) (1 of 2)
• The data anomalies created by the database organization shown in Figure 6.4 are easily eliminated
  • Make new tables to eliminate transitive dependencies
  • Reassign corresponding dependent attributes
• Table is in 3NF when it:
  • Is in 2NF
  • Contains no transitive dependencies
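The two conversion steps can be sketched as follows. As a simplifying assumption, a dependency is treated as transitive when its determinant contains no primary key attribute. The table, attribute names, and dependencies are hypothetical.

```python
# Hypothetical 2NF table and its dependencies, for illustration only.
EMPLOYEE = ["EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"]
PK = ("EMP_NUM",)
FDS = [
    (("EMP_NUM",), ("EMP_NAME", "JOB_CLASS")),
    (("JOB_CLASS",), ("CHG_HOUR",)),  # nonkey attribute determines another nonkey attribute
]


def transitive_dependencies(pk, fds):
    """Dependencies whose determinant shares no attribute with the primary key."""
    return [(lhs, rhs) for lhs, rhs in fds if not set(lhs) & set(pk)]


def decompose_3nf(table_attrs, pk, fds):
    """Move transitively dependent attributes into new tables keyed by their determinant."""
    remaining = list(table_attrs)
    new_tables = {}
    for lhs, rhs in transitive_dependencies(pk, fds):
        new_tables[tuple(lhs)] = list(lhs) + list(rhs)
        # The determinant stays behind as a foreign key; its dependents leave.
        remaining = [a for a in remaining if a not in rhs]
    return remaining, new_tables


employee_3nf, job_tables = decompose_3nf(EMPLOYEE, PK, FDS)
```

After the split, the employee table keeps JOB_CLASS as a foreign key, and CHG_HOUR is stored once per job class in its own table.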
Conversion to Third Normal Form (3NF) (2 of 2)
Improving the Design
Surrogate Key Considerations
The Boyce-Codd Normal Form (1 of 4)
The Boyce-Codd Normal Form (2 of 4)
The Boyce-Codd Normal Form (3 of 4)
The Boyce-Codd Normal Form (4 of 4)
Fourth Normal Form (4NF) (1 of 2)
• Rules
  • All attributes must be dependent on the primary key, but they must be independent of each other
  • No row may contain two or more multivalued facts about an entity
• Table is in 4NF when it:
  • Is in 3NF
  • Has no multivalued dependencies
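When one row stores two independent multivalued facts about the same entity, every combination of the two facts has to be stored. A minimal sketch of the usual remedy, projecting each fact into its own table, follows. The columns (an employee number, a service organization code, and an assignment number) and all values are hypothetical.

```python
# Hypothetical rows: (EMP_NUM, ORG_CODE, ASSIGN_NUM).
# Organizations and assignments are independent facts about an employee,
# so every combination appears.
rows = [
    (10123, "RC", 1), (10123, "RC", 3),
    (10123, "UW", 1), (10123, "UW", 3),
    (10124, "RC", 2),
]


def project(rows, indexes):
    """Keep only the chosen columns and drop duplicate rows."""
    return sorted({tuple(row[i] for i in indexes) for row in rows})


def natural_join(service, assignment):
    """Rebuild the combined rows by matching on EMP_NUM."""
    return sorted(
        (emp, org, assign)
        for emp, org in service
        for emp2, assign in assignment
        if emp == emp2
    )


service = project(rows, (0, 1))      # one multivalued fact per table
assignment = project(rows, (0, 2))
rebuilt = natural_join(service, assignment)
```

The two projections hold six rows between them, and joining them reproduces the original five combined rows, so no information is lost by the split.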
Fourth Normal Form (4NF) (2 of 2)
Normalization and Database Design (1 of 6)
Normalization and Database Design (2 of 6)
Normalization and Database Design (3 of 6)
Normalization and Database Design (4 of 6)
Normalization and Database Design (5 of 6)
Normalization and Database Design (6 of 6)
Denormalization (1 of 2)
• Design goals
  • Creation of normalized relations
  • Processing requirements and speed
• Number of database tables expands
  • Tables are decomposed to conform to normalization requirements
• Joining a larger number of tables
  • Takes additional input/output (I/O) operations and processing logic
  • Reduces system speed
• Defects in unnormalized tables
  • Data updates are less efficient because tables are larger
  • Indexing is more cumbersome
  • No simple strategies for creating virtual tables known as views
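The join cost of a normalized design can be seen in a small in-memory SQLite sketch. The tables, column names beyond ZIP and CITY, and the data are hypothetical, chosen only to show that reading an agent's city requires a join once the city is stored in its own table, and that a view can package that join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: CITY is stored once per ZIP, not once per agent.
    CREATE TABLE ZIPCODE (
        ZIP  TEXT PRIMARY KEY,
        CITY TEXT NOT NULL
    );
    CREATE TABLE AGENT (
        AGENT_CODE INTEGER PRIMARY KEY,
        AGENT_NAME TEXT NOT NULL,
        ZIP        TEXT NOT NULL REFERENCES ZIPCODE (ZIP)
    );
    INSERT INTO ZIPCODE VALUES ('37001', 'Springfield');
    INSERT INTO AGENT VALUES (501, 'Alby', '37001');

    -- A view hides the join from the programs that read agent data.
    CREATE VIEW AGENT_CITY AS
        SELECT A.AGENT_CODE, A.AGENT_NAME, A.ZIP, Z.CITY
        FROM AGENT A
        JOIN ZIPCODE Z ON A.ZIP = Z.ZIP;
""")

agent_rows = conn.execute(
    "SELECT AGENT_NAME, CITY FROM AGENT_CITY"
).fetchall()
```

Every read through the view still performs the join. Storing CITY redundantly in AGENT would avoid it, at the price of keeping the two copies consistent.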
Denormalization (2 of 2)
Table 6.6: Common Denormalization Examples
Case: Redundant data
Example: Storing ZIP and CITY attributes in the AGENT table when ZIP determines CITY (see Figure 2.2)
Rationale and Controls:
• Avoid extra join operations
• Program can validate city (drop-down box) based on the zip code
Case: Derived data
Example: Storing STU_HRS and STU_CLASS (student classification) when STU_HRS determines STU_CLASS (see Figure 3.28)
Rationale and Controls:
• Avoid extra join operations
• Program can validate classification (lookup) based on the student hours
Case: Preaggregated data (also derived data)
Example: Storing the student grade point average (STU_GPA) aggregate value in the STUDENT table when this can be calculated from the ENROLL and COURSE tables (see Figure 3.28)
Rationale and Controls:
• Avoid extra join operations
• Program computes the GPA every time a grade is entered or updated
• STU_GPA can be updated only via administrative routine
Case: Information requirements
Example: Using a temporary denormalized table to hold report data; this is required when creating a tabular report in which the columns represent data that are stored in the table as rows (see Figures 6.17 and 6.18)
Rationale and Controls:
• Impossible to generate the data required by the report using plain SQL
• No need to maintain table
• Temporary table is deleted once report is done
• Processing speed is not an issue
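The control listed for the derived-data case, a program that validates the stored classification against the student hours, might look like the sketch below. The hour thresholds and the class codes are assumptions made for illustration; the source does not state them.

```python
# Hypothetical lookup: minimum hours required for each classification,
# highest first. The real thresholds would come from the business rules.
CLASS_BY_MIN_HOURS = [(90, "Sr"), (60, "Jr"), (30, "So"), (0, "Fr")]


def classify(stu_hrs):
    """Derive STU_CLASS from STU_HRS using the lookup table."""
    for min_hours, stu_class in CLASS_BY_MIN_HOURS:
        if stu_hrs >= min_hours:
            return stu_class
    raise ValueError("STU_HRS must be non-negative")


def validate_student(row):
    """True when the stored STU_CLASS agrees with the value derived from STU_HRS."""
    return row["STU_CLASS"] == classify(row["STU_HRS"])


consistent = validate_student({"STU_HRS": 12, "STU_CLASS": "Fr"})
stale = validate_student({"STU_HRS": 75, "STU_CLASS": "So"})
```

With these assumed thresholds, the first row passes and the second fails, because 75 hours maps to "Jr". A check like this lets the design keep the redundant column while catching rows where the two attributes have drifted apart.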
Data-Modeling Checklist (1 of 6)
• Business rules
  • Properly document and verify all business rules with the end users
  • Ensure that all business rules are written precisely, clearly, and simply
    - The business rules must help identify entities, attributes, relationships, and constraints
  • Identify the source of all business rules, and ensure that each business rule is justified, dated, and signed off by an approving authority
Data-Modeling Checklist (2 of 6)
• Data modeling
  • Naming conventions: all names should be limited in length (database-dependent size)
  • Entity names:
    - Should be nouns that are familiar to the business and should be short and meaningful
    - Should document abbreviations, synonyms, and aliases for each entity
    - Should be unique within the model
    - For composite entities, may include a combination of abbreviated names of the entities linked through the composite entity
Data-Modeling Checklist (3 of 6)
• Attribute names:
  • Should be unique within the entity
  • Should use the entity abbreviation as a prefix
  • Should be descriptive of the characteristic
  • Should use suffixes such as _ID, _NUM, or _CODE for the PK attribute
  • Should not be a reserved word
  • Should not contain spaces or special characters such as @, !, or &
• Relationship names:
  • Should be active or passive verbs that clearly indicate the nature of the relationship
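Several of the attribute-naming rules above are mechanical enough to check in code. The sketch below is one possible checker, not a standard tool: the reserved-word list is a tiny illustrative subset, the 30-character limit is an assumed stand-in for the database-dependent size, and the function name and messages are hypothetical.

```python
import re

# Illustrative subset only; a real list depends on the target DBMS.
RESERVED_WORDS = {"SELECT", "TABLE", "ORDER", "GROUP", "DATE", "USER"}
PK_SUFFIXES = ("_ID", "_NUM", "_CODE")


def check_attribute_name(name, entity_prefix, is_pk=False, max_len=30):
    """Return a list of naming-rule problems; an empty list means the name passes."""
    problems = []
    upper = name.upper()
    if len(name) > max_len:
        problems.append("too long")
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        problems.append("contains spaces or special characters")
    if upper in RESERVED_WORDS:
        problems.append("reserved word")
    if not upper.startswith(entity_prefix.upper() + "_"):
        problems.append("missing entity prefix")
    if is_pk and not upper.endswith(PK_SUFFIXES):
        problems.append("PK lacks _ID, _NUM, or _CODE suffix")
    return problems


ok = check_attribute_name("EMP_NUM", "EMP", is_pk=True)
bad_chars = check_attribute_name("EMP_E@MAIL", "EMP")
bad_word = check_attribute_name("DATE", "EMP")
```

Rules such as "descriptive of the characteristic" and "unique within the entity" need human review or the full attribute list, so they are left out of this sketch.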
Data-Modeling Checklist (4 of 6)
• Entities:
  • Each entity should represent a single subject
  • Each entity should represent a set of distinguishable entity instances
  • All entities should be in 3NF or higher
    - Any entities below 3NF should be justified
  • Granularity of the entity instance should be clearly defined
  • PK should be clearly defined and support the selected data granularity
Data-Modeling Checklist (5 of 6)
• Attributes:
  • Should be simple and single-valued (atomic data)
  • Should document default values, constraints, synonyms, and aliases
  • Derived attributes should be clearly identified and include source(s)
  • Should not be redundant unless this is required for transaction accuracy, performance, or maintaining a history
  • Nonkey attributes must be fully dependent on the PK attribute
• Relationships:
  • Should clearly identify relationship participants
  • Should clearly define participation and connectivity, and document cardinality
Data-Modeling Checklist (6 of 6)
• ER model:
  • Should be validated against expected processes: inserts, updates, and deletions
  • Should evaluate where, when, and how to maintain a history
  • Should not contain redundant relationships except as required (see attributes)
  • Should minimize data redundancy to ensure single-place updates
  • Should conform to the minimal data rule: All that is needed is there, and all that is there is needed
Summary (1 of 2)
Summary (2 of 2)
• The larger the number of tables, the more additional I/O operations and
processing logic you need to join them
• The data-modeling checklist provides a way for the designer to check that the
ERD meets a set of minimum requirements