Data Normalization

Data normalization is a process that scales attribute values into a smaller range to make data easier to analyze and understand. It is necessary when attributes have values on different scales, which can dilute the effectiveness of important attributes or lead to poor data models. There are several normalization methods, including decimal scaling, min-max normalization, and z-score normalization; the last of these rescales values based on the mean and standard deviation. Normalization transforms data to fall within a common range and can improve the performance of machine learning algorithms.

Uploaded by

Ruchira Saha

Data Normalization
Data normalization makes data easier to classify and understand. It is used to
scale the data of an attribute so that it falls within a smaller range.

Why Is Normalization Needed?
• Normalization is generally required when there are multiple attributes whose values
are on different scales; left as-is, this may lead to poor data models when performing
data mining operations.
• Without it, an equally important attribute on a smaller scale can lose effectiveness
because another attribute has values on a larger scale.
• Heterogeneous data with different units usually needs to be normalized. If the data
already shares the same unit and the same order of magnitude, normalization might not
be necessary.
• Unless normalized at pre-processing, variables with disparate ranges or varying
precision acquire different driving values, that is, unequal influence on the result.
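The dilution effect described in the bullets above can be seen in a distance calculation. The following is a minimal sketch, not taken from the slides: the three records, the attribute names (age, income), and the helper functions `euclidean` and `rescale_columns` are all invented for illustration.

```python
import math

# Hypothetical records, invented for illustration: (age in years, income).
records = {
    "a": (25, 50_000),
    "b": (60, 51_000),
    "c": (26, 58_000),
}

def euclidean(p, q):
    """Straight-line distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def rescale_columns(rows):
    """Min-max scale every column of `rows` to [0, 1] independently.

    Assumes no column is constant (otherwise hi - lo would be zero).
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        scaled_columns.append([(v - lo) / (hi - lo) for v in col])
    return list(zip(*scaled_columns))

a, b, c = records["a"], records["b"], records["c"]
raw_ab, raw_ac = euclidean(a, b), euclidean(a, c)

na, nb, nc = rescale_columns([a, b, c])
norm_ab, norm_ac = euclidean(na, nb), euclidean(na, nc)

print(f"raw:        d(a,b)={raw_ab:.1f}  d(a,c)={raw_ac:.1f}")
print(f"normalized: d(a,b)={norm_ab:.3f}  d(a,c)={norm_ac:.3f}")
```

On the raw values, the 35-year age gap between a and b barely registers next to the income differences, so the distances are driven almost entirely by income. After rescaling, both attributes contribute on a comparable footing.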
2. Data Transformation: Data Normalization contd..
Example
[Figure: chart for raw data]
[Figure: chart for normalized data]

Methods of Data Normalization:
a. Decimal Scaling
b. Min-Max Normalization
c. z-Score Normalization (zero-mean Normalization)

There are also several normalization approaches that can be used in deep learning models:

Batch Normalization
Layer Normalization
Group Normalization
Instance Normalization
Weight Normalization
a. Decimal Scaling Normalization
- It normalizes by moving the decimal point of the data values.
- To normalize the data by this technique, we divide each data value by a power of 10
chosen from the maximum absolute value of the data set.
- The data value v_i is normalized to v'_i by using the formula

$$v'_i = \frac{v_i}{10^j}$$

[where j is the smallest integer such that max(|v'_i|) < 1.]

In this technique, the values are scaled in terms of decimals: each value is multiplied
or divided by a power of 10, pow(10, j).

Example:
- Input data: 15, 121, 201, 421, 561, 601, 850
- Step 1: The maximum absolute value in the given data (m) is 850, so the smallest
power of 10 that exceeds it is 1000.
- Step 2: Divide the given data by 1000 (i.e. j = 3).
- Result: 0.015, 0.121, 0.201, 0.421, 0.561, 0.601, 0.850
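The two steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the slides; the function name `decimal_scaling` is an assumption, and the loop simply searches for the smallest j that brings every absolute value below 1.

```python
def decimal_scaling(values):
    """Divide every value by 10**j, with j the smallest integer
    such that the largest absolute scaled value is below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values], j

data = [15, 121, 201, 421, 561, 601, 850]
scaled, j = decimal_scaling(data)
print(j)       # exponent chosen for the divisor 10**j
print(scaled)  # values after moving the decimal point
```

For the example data the loop stops at j = 3, so every value is divided by 1000. The sketch assumes a non-empty list of numbers.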
b. Min-Max Normalization (Linear Transformation)
- The minimum and maximum values of the data are fetched and each value is
replaced according to the following formula.

$$v' = \frac{v - \min(A)}{\max(A) - \min(A)} \bigl(\text{new\_max}(A) - \text{new\_min}(A)\bigr) + \text{new\_min}(A)$$

Where - A is the attribute data (column)
- v and v' are the old and new value of each entry in the data
- min(A), max(A) are the minimum and maximum of A
- new_max(A), new_min(A) are the max and min value of the
required range (i.e. boundary values) respectively.

Example
Input:- 10, 15, 50, 60
Normalized to range 0 to 1.
Here min=10, max= 60, new_min=0, new_max=1
Output:- 0, 0.1, 0.8, 1
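As a rough sketch of the min-max transformation, the example above can be reproduced in Python. The function name `min_max_normalize` and its parameter names are assumptions chosen to mirror the symbols min(A), max(A), new_min(A) and new_max(A); they do not come from the slides.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly map values from [min, max] onto [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # The formula divides by (max - min), so constant data has no defined result.
        raise ValueError("all values are identical; min-max scaling is undefined")
    scale = (new_max - new_min) / (old_max - old_min)
    return [(v - old_min) * scale + new_min for v in values]

data = [10, 15, 50, 60]
normalized = min_max_normalize(data)             # target range 0 to 1
rescaled = min_max_normalize(data, -1.0, 1.0)    # a different target range
print(normalized)
print(rescaled)
```

With the default range the result matches the slide's output of 0, 0.1, 0.8, 1 up to floating-point rounding. The second call only illustrates that the same formula handles any target range.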
c. z-Score Normalization (zero-mean Normalization)
- Values are normalized based on the mean and standard deviation of the data A.
- It is also called the Standard Deviation method.
- Unstructured data can be normalized using the z-score formula,

$$v' = \frac{v - \bar{A}}{S}$$

where - $\bar{A}$ is the mean of A
- S is the standard deviation of A
- v and v' are the old and new value of each data entry

Input:- 10, 15, 50, 60

$$\text{mean} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 33.75$$

$$SD = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \approx 24.96$$

Output:- -0.9515, -0.7512, 0.6510, 1.0517
