DL_EDA_process

Uploaded by

nickn1390

Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) in machine learning is the process of analyzing and
visualizing datasets to understand their main characteristics, identify patterns, detect anomalies,
and check assumptions. EDA helps prepare data for modeling by revealing insights and guiding
feature selection, data transformation, and pre-processing steps. Here’s an overview of EDA’s
purpose and components:

1. Understanding Data Structure

● Data Summary: Get a high-level overview of the data, including the number of rows,
columns, and data types.
● Data Types and Format: Identify the data types (e.g., numerical, categorical) to
determine which statistical or visualization techniques to apply.
● Null Values: Count missing values per column by chaining .isna() and .sum() (as in
pandas), then decide on a strategy for handling them (imputation, deletion, etc.).
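
The structure checks above can be sketched in pandas roughly as follows. This is a minimal illustration, not a prescribed workflow: the DataFrame `df` and its columns (`age`, `income`, `city`) are hypothetical and invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset, invented for illustration only.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 29],
    "income": [40000.0, 52000.0, 61000.0, np.nan, 48000.0],
    "city": ["Pune", "Delhi", "Pune", None, "Mumbai"],
})

# Data summary: number of rows and columns.
n_rows, n_cols = df.shape

# Data types: one entry per column.
dtypes = df.dtypes

# Null values: isna() marks missing cells, sum() counts them per column.
missing = df.isna().sum()

print(f"{n_rows} rows, {n_cols} columns")
print(dtypes)
print(missing)
```

On a real dataset, `df.info()` and `df.head()` give a similar overview in one call each.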

2. Descriptive Statistics

● Central Tendency: Examine measures like mean, median, and mode to understand the
central values of each feature.
● Spread and Range: Check variance, standard deviation, and range to understand how
data points are spread out.
● Distribution: Visualize distributions using histograms, box plots, or density plots to spot
skewness, kurtosis, and outliers.
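
As a rough sketch, the measures listed above can be computed with pandas as shown here. The `scores` series is hypothetical; its last value is deliberately large so that the mean and median diverge.

```python
import pandas as pd

# Hypothetical values; the final entry is deliberately large to create right skew.
scores = pd.Series([10, 12, 12, 13, 14, 15, 95], name="score")

# Central tendency.
mean = scores.mean()
median = scores.median()
mode = scores.mode().tolist()  # mode() returns a Series because ties are possible

# Spread and range.
variance = scores.var()  # sample variance (ddof=1 is the pandas default)
std = scores.std()       # sample standard deviation
value_range = scores.max() - scores.min()

# Shape of the distribution.
skewness = scores.skew()  # positive values indicate a longer right tail
kurt = scores.kurt()      # excess kurtosis (a normal distribution is about 0)

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, std={std:.2f}, range={value_range}")
print(f"skewness={skewness:.2f}, excess kurtosis={kurt:.2f}")
```

A mean well above the median, as here, is a quick numeric hint of right skew that a histogram or box plot would confirm visually. `scores.describe()` reports several of these statistics at once.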

3. Data Relationships

● Correlation Analysis: Use correlation matrices or heatmaps to identify relationships
between numerical features, which can inform feature selection and multicollinearity
concerns.
● Categorical Analysis: Analyze counts and distributions of categorical features using bar
charts, pie charts, and value_counts().
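
A minimal sketch of both analyses, assuming a small hypothetical DataFrame (the column names and values are invented for illustration):

```python
import pandas as pd

# Hypothetical dataset with three numerical features and one categorical feature.
df = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [50, 58, 69, 77, 90],
    "shoe_size": [36, 38, 41, 43, 46],
    "team": ["A", "B", "A", "A", "B"],
})

# Correlation analysis: pairwise Pearson correlation of the numerical columns.
corr = df.select_dtypes(include="number").corr()

# Categorical analysis: frequency of each category.
team_counts = df["team"].value_counts()

print(corr.round(2))
print(team_counts)
```

The `corr` matrix is what a heatmap would display. Pairs with correlation close to 1 or -1 are the ones to examine for multicollinearity.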

4. Identifying Outliers and Anomalies

● Outlier Detection: Use box plots, scatter plots, and z-scores to detect unusual data
points that may need addressing.
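
Two of these checks can be sketched numerically as follows. The data is hypothetical, and the cutoffs used (|z| > 3, and 1.5 times the interquartile range, which is the rule behind box-plot whiskers) are common conventions assumed here, not values given in these notes.

```python
import pandas as pd

# Hypothetical measurements: 19 values near 50 and one unusual value (120).
values = pd.Series([48, 49, 50, 51, 52, 50, 49, 51, 50, 48,
                    52, 50, 49, 51, 50, 49, 51, 50, 52, 120])

# z-score method: distance from the mean in units of standard deviation.
z_scores = (values - values.mean()) / values.std()
z_outliers = values[z_scores.abs() > 3]  # threshold of 3 is an assumed convention

# IQR method: the rule that box plots use to draw whiskers.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = values[(values < lower) | (values > upper)]

print("z-score outliers:", z_outliers.tolist())
print("IQR outliers:", iqr_outliers.tolist())
```

Note that in very small samples a z-score can never exceed a fixed bound, so the |z| > 3 rule may flag nothing even when an extreme value is present. The IQR rule is less sensitive to this.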

5. Feature Engineering Insights

● Identifying Useful Transformations: Based on data distributions, you may identify
opportunities for transformations (e.g., log transformation, normalization, or encoding of
categorical variables).
● Creating New Features: EDA can reveal patterns suggesting new feature combinations
or aggregations.
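
The transformations named above might look like this in pandas. This is a sketch under stated assumptions: the DataFrame, its columns, and the derived feature `income_per_person` are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with one heavily right-skewed column (income).
df = pd.DataFrame({
    "income": [20000.0, 35000.0, 50000.0, 1000000.0],
    "household_size": [1, 2, 4, 5],
    "segment": ["basic", "plus", "basic", "premium"],
})

# Log transformation: log1p computes log(1 + x), which compresses large values.
df["log_income"] = np.log1p(df["income"])

# Normalization (min-max scaling) to the range [0, 1].
income_min, income_max = df["income"].min(), df["income"].max()
df["income_scaled"] = (df["income"] - income_min) / (income_max - income_min)

# New feature: a ratio combining two existing columns.
df["income_per_person"] = df["income"] / df["household_size"]

# Encoding of a categorical variable: one indicator column per category.
encoded = pd.get_dummies(df, columns=["segment"], prefix="segment")

print(df[["income", "log_income", "income_scaled", "income_per_person"]])
print(encoded.columns.tolist())
```

Comparing `df["income"].skew()` with `df["log_income"].skew()` is one way to check whether the log transformation actually reduced the skew.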

6. Data Cleaning

● Address missing values, incorrect or inconsistent data, and outliers based on insights
gained from EDA.
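
One possible cleaning pass is sketched below. The data is hypothetical, and the specific choices (treating ages outside 0 to 120 as invalid, median imputation, an "Unknown" placeholder) are assumptions made for illustration; the right strategy depends on the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with a missing age, an implausible age,
# inconsistently formatted city names, and a missing city.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 200.0, 28.0],
    "city": [" pune", "Pune", "DELHI", "delhi ", None],
})

clean = df.copy()

# Inconsistent data: normalise whitespace and capitalisation.
clean["city"] = clean["city"].str.strip().str.title()

# Incorrect data: treat ages outside an assumed plausible range as missing.
clean.loc[~clean["age"].between(0, 120), "age"] = np.nan

# Missing values: impute numbers with the median, categories with a placeholder.
clean["age"] = clean["age"].fillna(clean["age"].median())
clean["city"] = clean["city"].fillna("Unknown")

print(clean)
```

Working on a copy keeps the raw data available, so each cleaning decision can be revisited later.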

Benefits of EDA in Machine Learning

EDA provides critical information for building effective models by:

● Highlighting patterns that might affect modeling.
● Guiding the choice of algorithms and model parameters.
● Improving model performance by informing data preparation steps.

Overall, EDA is a foundational step in the machine learning pipeline, setting the stage for more
reliable, accurate models.
