Process of Data Cleaning & Analysis
NAME – ANKIT KUMAR
– SHREYA LADHA
What is Data Analysis
• Remove Duplicates: Identify and remove duplicate records to ensure data integrity.
• Handle Missing Values: Decide on a method to handle missing data, such as
imputation or deletion.
• Correct Inconsistencies: Fix inconsistent data entries (e.g., standardizing job titles).
• Normalize Data: Ensure data is in a consistent format (e.g., date formats, categorical
values).
• Outlier Detection: Identify and manage outliers that could skew the analysis.
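The cleaning steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed workflow. The records, field names, title mapping, date formats, and the 1.5 × IQR outlier rule are all assumptions chosen for the example. A library such as pandas is commonly used for the same tasks on larger datasets.

```python
from datetime import datetime
from statistics import median, quantiles

# Hypothetical HR records; every field name and value is invented for illustration.
records = [
    {"id": 1, "title": "Sr. Engineer", "joined": "2021-03-15", "salary": 82000},
    {"id": 1, "title": "Sr. Engineer", "joined": "2021-03-15", "salary": 82000},
    {"id": 2, "title": "senior engineer", "joined": "15/04/2020", "salary": None},
    {"id": 3, "title": "Analyst", "joined": "2019-11-01", "salary": 55000},
    {"id": 4, "title": "analyst ", "joined": "2022-01-10", "salary": 58000},
    {"id": 5, "title": "Analyst", "joined": "2018-06-30", "salary": 950000},
]

# Assumed mapping from messy job titles to one standard spelling.
TITLE_MAP = {
    "sr. engineer": "Senior Engineer",
    "senior engineer": "Senior Engineer",
    "analyst": "Analyst",
}
# Assumed set of date formats present in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def remove_duplicates(rows):
    """Keep the first occurrence of each fully identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    return unique


def impute_salary(rows):
    """Fill missing salaries with the median of the observed ones."""
    fill = median(r["salary"] for r in rows if r["salary"] is not None)
    for r in rows:
        if r["salary"] is None:
            r["salary"] = fill


def normalize_date(text):
    """Convert a date in any known format to ISO 8601 (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def flag_outliers(rows):
    """Return ids whose salary lies outside 1.5 * IQR of the quartiles."""
    q1, _, q3 = quantiles([r["salary"] for r in rows], n=4, method="inclusive")
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return [r["id"] for r in rows if not low <= r["salary"] <= high]


cleaned = remove_duplicates(records)
impute_salary(cleaned)
for r in cleaned:
    r["title"] = TITLE_MAP[r["title"].strip().lower()]
    r["joined"] = normalize_date(r["joined"])
outlier_ids = flag_outliers(cleaned)
print(cleaned, outlier_ids)
```

Outliers are only flagged here, not deleted. Whether to drop, cap, or keep them depends on the analysis and is left as a judgment call.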
Step 4 – Analyzing the data
Data analysis applies statistical and computational techniques to the cleaned and
transformed data to uncover patterns, trends, correlations, and insights. This step can
be broken down into several key activities:
1. Exploratory Data Analysis (EDA): Explore data using visualization techniques to uncover patterns, trends, and
relationships.
• Descriptive Analysis: Use descriptive statistics to summarize the main features of the data.
• Correlation Analysis: Assess relationships between variables using correlation coefficients or
visual tools like heatmaps.
2. Statistical Analysis / Hypothesis Testing: Apply statistical methods to test hypotheses and make inferences (e.g.,
regression analysis, ANOVA).
3. Predictive Modeling: Build predictive models to forecast future HR trends (e.g., employee turnover, performance).
4. Machine Learning: Implement machine learning algorithms to uncover complex patterns and predictive insights.
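A few of the activities above (descriptive statistics, correlation, and a simple regression-based forecast) can be illustrated with a small sketch. The tenure and salary figures are invented, and the helper names `describe`, `pearson_r`, and `fit_line` are hypothetical. Hypothesis testing and machine learning are not shown, since they usually rely on dedicated libraries.

```python
from statistics import mean, median, stdev

# Hypothetical sample: tenure in years and salary in thousands (invented values).
tenure = [1, 2, 3, 4, 5]
salary = [42, 44, 51, 55, 58]


def describe(values):
    """Descriptive analysis: summarize the main features of one variable."""
    return {"mean": mean(values), "median": median(values), "stdev": stdev(values)}


def pearson_r(xs, ys):
    """Correlation analysis: Pearson correlation coefficient of two variables."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Simple predictive model: least-squares line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


summary = describe(salary)
r = pearson_r(tenure, salary)
slope, intercept = fit_line(tenure, salary)
forecast = intercept + slope * 6  # predicted salary at 6 years of tenure

print(summary, round(r, 3), round(forecast, 1))
```

With only five points the fit is purely illustrative. A real forecast would need more data and some check of how well the model generalizes.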
Step 5 – Data Visualization & Interpretation
Data visualization involves creating graphical
representations of data to communicate insights clearly
and effectively. Visualization makes complex data
more accessible, understandable, and usable.
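As a rough illustration of turning counts into a picture, the sketch below renders a bar chart as plain text. The department labels and the helper name `text_bar_chart` are assumptions made for the example. In practice a plotting library would normally produce the chart, and the text version is used here only to keep the example dependency-free.

```python
from collections import Counter

# Hypothetical department labels, invented for illustration.
departments = ["Sales", "Engineering", "Sales", "HR", "Engineering", "Sales"]


def text_bar_chart(labels, symbol="#"):
    """Return one line per category, most frequent first, with a bar of symbols."""
    counts = Counter(labels)
    width = max(len(name) for name in counts)
    return [
        f"{name.ljust(width)} | {symbol * n} {n}"
        for name, n in counts.most_common()
    ]


chart = text_bar_chart(departments)
print("\n".join(chart))
```

Sorting categories by frequency is one small design choice that makes the largest group easy to spot at a glance.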