Lunch and Learn SQL

This document discusses data normalization and how it can save time and effort. It covers the first three normal forms: first normal form requires a single value at each row-and-column intersection; second normal form removes partial dependencies on a composite key; and third normal form eliminates dependencies between non-key columns. Examples are provided for each form. The document also discusses when table splitting may not be necessary, such as with similar tables that can be combined by making columns more generic. Normalization optimizes data structure to reduce duplication and improve code efficiency.

Uploaded by Nicolas Costa

LUNCH AND LEARN
SQL and Data Normalization:
How to save time and effort
What is Data Normalization?
• It is the process of organizing data into tables in a way that
reduces duplication, both of data and of tables.
• Normalization was first proposed by Edgar F. Codd in 1970.
• This presentation covers its first three levels (normal forms);
each one builds on the previous level.
First Normal Form
The first normal form requires that a table satisfies the
following conditions:
1. Rows are not ordered
2. Columns are not ordered
3. There are no duplicate rows
4. Every row-and-column intersection holds exactly one value
5. All columns are “regular”, with no hidden values
Example
• The following example is a table that breaks First Normal
Form.
• It contains more than one value in the Dept column. To
respect First Normal Form, we should split the table:
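As an illustration, here is a minimal sketch using Python's built-in sqlite3 module. The table names (Employee, EmployeeDept), the EmployeeId column, and the sample rows are hypothetical, since the slide's original table is not reproduced here; only the Dept column comes from the text. It shows one possible way to split a Dept column that holds several comma-separated values into one row per value.

```python
import sqlite3

def split_multivalued_dept(raw_rows):
    """Load rows like ("Alice", "Sales, HR") into two 1NF tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (
            EmployeeId INTEGER PRIMARY KEY,
            Name       TEXT NOT NULL
        );
        -- One row per (employee, department) pair: each cell holds one value.
        CREATE TABLE EmployeeDept (
            EmployeeId INTEGER NOT NULL REFERENCES Employee(EmployeeId),
            Dept       TEXT NOT NULL,
            PRIMARY KEY (EmployeeId, Dept)
        );
    """)
    for employee_id, (name, depts) in enumerate(raw_rows, start=1):
        conn.execute("INSERT INTO Employee VALUES (?, ?)", (employee_id, name))
        for dept in depts.split(","):
            conn.execute(
                "INSERT INTO EmployeeDept VALUES (?, ?)",
                (employee_id, dept.strip()),
            )
    return conn

# Hypothetical input that breaks 1NF: Alice has two values in Dept.
raw = [("Alice", "Sales, HR"), ("Bob", "IT")]
conn = split_multivalued_dept(raw)
rows = conn.execute("""
    SELECT e.Name, d.Dept
    FROM Employee e
    JOIN EmployeeDept d ON d.EmployeeId = e.EmployeeId
    ORDER BY e.Name, d.Dept
""").fetchall()
print(rows)
```

After the split, a query such as "who works in HR" becomes a plain equality filter on Dept instead of a substring search.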
Second Normal Form
An entity is in second normal form if all its attributes depend
on the whole primary key. In other words, no non-key column may
depend on only part of the key.
1. The table must already be in 1NF, and all non-key columns of the
table must depend on the whole PRIMARY KEY
2. Partial dependencies are removed and placed in a separate
table
Note: Second Normal Form (2NF) is only ever a problem when we’re using a composite primary
key, that is, a primary key made of two or more columns.
Example
• In this example, the Name and Date columns form a
composite key, but the Title column depends on only part
of that key.
• To respect Second Normal Form we should separate the Title
column from the Date, since there is only a partial
dependency.
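The partial dependency described above can be sketched as follows, again with Python's sqlite3 module. This is only an illustration under stated assumptions: the table names (Attendance, Person, PersonDate) and the sample rows are hypothetical, and the sketch assumes that Title depends on Name alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Breaks 2NF: the key is (Name, Date), but Title is assumed
    -- to depend on Name alone, so it repeats for every Date.
    CREATE TABLE Attendance (
        Name  TEXT NOT NULL,
        Date  TEXT NOT NULL,
        Title TEXT NOT NULL,
        PRIMARY KEY (Name, Date)
    );
    INSERT INTO Attendance VALUES
        ('Alice', '2024-01-10', 'Engineer'),
        ('Alice', '2024-01-11', 'Engineer'),
        ('Bob',   '2024-01-10', 'Analyst');

    -- 2NF: Title moves to a table keyed by Name only.
    CREATE TABLE Person (
        Name  TEXT PRIMARY KEY,
        Title TEXT NOT NULL
    );
    CREATE TABLE PersonDate (
        Name TEXT NOT NULL REFERENCES Person(Name),
        Date TEXT NOT NULL,
        PRIMARY KEY (Name, Date)
    );
    INSERT INTO Person     SELECT DISTINCT Name, Title FROM Attendance;
    INSERT INTO PersonDate SELECT Name, Date FROM Attendance;
""")

# A title change now touches exactly one row instead of one row per date.
cur = conn.execute(
    "UPDATE Person SET Title = 'Senior Engineer' WHERE Name = 'Alice'"
)
updated_rows = cur.rowcount
person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
date_count = conn.execute("SELECT COUNT(*) FROM PersonDate").fetchone()[0]
print(updated_rows, person_count, date_count)
```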
Third Normal Form
The third normal form states that you should eliminate fields in
a table that do not depend directly on the key.
1. The table is already in 2NF
2. Non-primary-key columns must not depend on other
non-primary-key columns
3. There is no transitive functional dependency
Example
• In the following example, in the Employee table the employee
id determines the department id of an employee, and the
department id determines the department name.
• To remove this transitive dependency we should split the table again.
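A minimal sketch of that split, using Python's sqlite3 module, could look like this. The column names (EmployeeId, EmployeeName, DepartmentId, DepartmentName), the EmployeeFlat staging table, and the sample rows are assumptions made for illustration; the slide's original table is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Breaks 3NF: EmployeeId -> DepartmentId -> DepartmentName.
    CREATE TABLE EmployeeFlat (
        EmployeeId     INTEGER PRIMARY KEY,
        EmployeeName   TEXT NOT NULL,
        DepartmentId   INTEGER NOT NULL,
        DepartmentName TEXT NOT NULL
    );
    INSERT INTO EmployeeFlat VALUES
        (1, 'Alice', 10, 'Sales'),
        (2, 'Bob',   10, 'Sales'),
        (3, 'Carol', 20, 'IT');

    -- 3NF: the department name lives only in Department.
    CREATE TABLE Department (
        DepartmentId   INTEGER PRIMARY KEY,
        DepartmentName TEXT NOT NULL
    );
    CREATE TABLE Employee (
        EmployeeId   INTEGER PRIMARY KEY,
        EmployeeName TEXT NOT NULL,
        DepartmentId INTEGER NOT NULL REFERENCES Department(DepartmentId)
    );
    INSERT INTO Department
        SELECT DISTINCT DepartmentId, DepartmentName FROM EmployeeFlat;
    INSERT INTO Employee
        SELECT EmployeeId, EmployeeName, DepartmentId FROM EmployeeFlat;
""")

# Renaming a department is a single-row update that every employee sees.
conn.execute(
    "UPDATE Department SET DepartmentName = 'Global Sales' WHERE DepartmentId = 10"
)
names = conn.execute("""
    SELECT e.EmployeeName, d.DepartmentName
    FROM Employee e
    JOIN Department d ON d.DepartmentId = e.DepartmentId
    ORDER BY e.EmployeeId
""").fetchall()
print(names)
```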
When to avoid unnecessary table splitting
• Sometimes it may seem that having two separate tables is the
right approach. Consider the following example:

• This is the OrganizationUnits Permissions table for
Documents.
• And this is the User Permissions table for Documents.

• As can be seen, they have a nearly identical data structure;
the only difference between them is the name of the second
column. Both tables also relate to Permissions in the
Document Module, so how can we use those similarities to
improve the design?
• By making the second column more generic and adding a
new property to identify if the Id belongs to a User or an
Organization Unit:

• In the example above, the second column has been renamed
MemberId (which can represent both a User and an OU), and a
fifth column, MemberType, has been added. Now we can store
all of these permissions in a single table.
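The merge described above might be sketched like this with Python's sqlite3 module. Only MemberId and MemberType come from the text; the table names and the Id, DocumentId, and Permission columns are hypothetical stand-ins for the columns shown on the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The two near-identical tables (column names are assumptions).
    CREATE TABLE DocumentUserPermissions (
        Id INTEGER PRIMARY KEY, UserId INTEGER NOT NULL,
        DocumentId INTEGER NOT NULL, Permission TEXT NOT NULL
    );
    CREATE TABLE DocumentOrganizationUnitPermissions (
        Id INTEGER PRIMARY KEY, OrganizationUnitId INTEGER NOT NULL,
        DocumentId INTEGER NOT NULL, Permission TEXT NOT NULL
    );
    INSERT INTO DocumentUserPermissions VALUES (1, 7, 100, 'read');
    INSERT INTO DocumentOrganizationUnitPermissions VALUES (1, 3, 100, 'write');

    -- The single table: generic MemberId plus a MemberType discriminator.
    CREATE TABLE DocumentPermissions (
        Id         INTEGER PRIMARY KEY,
        MemberId   INTEGER NOT NULL,
        DocumentId INTEGER NOT NULL,
        Permission TEXT NOT NULL,
        MemberType TEXT NOT NULL
            CHECK (MemberType IN ('User', 'OrganizationUnit'))
    );
    INSERT INTO DocumentPermissions (MemberId, DocumentId, Permission, MemberType)
        SELECT UserId, DocumentId, Permission, 'User'
        FROM DocumentUserPermissions;
    INSERT INTO DocumentPermissions (MemberId, DocumentId, Permission, MemberType)
        SELECT OrganizationUnitId, DocumentId, Permission, 'OrganizationUnit'
        FROM DocumentOrganizationUnitPermissions;
""")

merged = conn.execute("""
    SELECT MemberId, DocumentId, Permission, MemberType
    FROM DocumentPermissions
    ORDER BY Id
""").fetchall()
print(merged)
```

One trade-off worth weighing with this design: because MemberId can point at either a User or an Organization Unit, it can no longer carry a single foreign key constraint, so the MemberType check has to be enforced some other way.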
• And the benefits extend further:
• By avoiding unnecessary splits you also reduce the amount
of code needed, because the logic that would otherwise
manipulate two tables now targets a single one.

• Fewer lines of code translate into fewer hours of
implementation, enhancement, and maintenance, simply as a
result of optimizing your data structure.
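To make the code saving concrete, here is a small sketch in Python with sqlite3 in which one grant function and one lookup function serve both Users and Organization Units. The function names (create_schema, grant, has_permission) and the DocumentId and Permission columns are hypothetical; with two separate tables, each function would likely need a second, nearly identical copy.

```python
import sqlite3

MEMBER_TYPES = ("User", "OrganizationUnit")

def create_schema(conn):
    """Create the single, generic permissions table."""
    conn.execute("""
        CREATE TABLE DocumentPermissions (
            Id         INTEGER PRIMARY KEY,
            MemberId   INTEGER NOT NULL,
            DocumentId INTEGER NOT NULL,
            Permission TEXT NOT NULL,
            MemberType TEXT NOT NULL,
            UNIQUE (MemberId, MemberType, DocumentId, Permission)
        )
    """)

def grant(conn, member_id, member_type, document_id, permission):
    """Grant a permission; the same code path handles both member types."""
    if member_type not in MEMBER_TYPES:
        raise ValueError(f"unknown member type: {member_type}")
    conn.execute(
        "INSERT OR IGNORE INTO DocumentPermissions "
        "(MemberId, DocumentId, Permission, MemberType) VALUES (?, ?, ?, ?)",
        (member_id, document_id, permission, member_type),
    )

def has_permission(conn, member_id, member_type, document_id, permission):
    """Return True if the member holds the permission on the document."""
    row = conn.execute(
        "SELECT 1 FROM DocumentPermissions "
        "WHERE MemberId = ? AND MemberType = ? "
        "AND DocumentId = ? AND Permission = ?",
        (member_id, member_type, document_id, permission),
    ).fetchone()
    return row is not None

conn = sqlite3.connect(":memory:")
create_schema(conn)
grant(conn, 7, "User", 100, "read")
grant(conn, 3, "OrganizationUnit", 100, "write")
user_can_read = has_permission(conn, 7, "User", 100, "read")
ou_can_read = has_permission(conn, 7, "OrganizationUnit", 100, "read")
print(user_can_read, ou_can_read)
```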
