
EMPLOYEE PERFORMANCE ANALYSIS

As the name suggests, employee performance analysis is the process of analyzing employee data to identify patterns and trends that can help improve employee productivity, engagement, and retention. It is an excellent practice area because you will deal with data containing different data types, like numerical (attendance, turnover rates, etc.) and categorical (job satisfaction, feedback, etc.).

In such a project, you will need to:

● Set goals and decide on performance metrics,
● Collect feedback data,
● Preprocess and analyze this data,
● Infer who performs best.

BUSINESS CASE & GOAL OF PROJECT

Based on the given features of the dataset, we need to predict the performance rating of each employee. The project deliverables are:

● Department-wise performance analysis.
● The top 3 important factors affecting employee performance.
● A trained model which can predict employee performance based on factors given as inputs. This will be used in hiring employees.
● Recommendations to improve employee performance based on insights from the analysis.

The given employee dataset consists of 1200 rows and 28 columns, so the shape of the dataset is 1200x28. The 28 features are classified into quantitative and qualitative: 19 features are quantitative (11 columns contain numeric data & 8 columns contain ordinal data) and 8 features are qualitative. EmpNumber contains alphanumeric data (distinct values) and does not play a role as a relevant feature for performance rating.
● Correlation gives us important aspects of the data. Correlation is a statistical measure that expresses the extent to which two variables are linearly related; here we examine the correlation between each feature and the performance rating. The analysis for this project went through the stages of univariate, bivariate & multivariate analysis, correlation analysis, and analysis by each department to satisfy the project goal.
● The dataset consists of categorical data and numerical data. The target variable contains ordinal data, so this is a classification problem. The machine learning models used in this project are the Support Vector Classifier, Random Forest Classifier & Artificial Neural Network (Multilayer Perceptron). Of these models, the Artificial Neural Network (Multilayer Perceptron) achieved the highest accuracy, 95.80%.
● One of the important goals of this project is to find the important features affecting the performance rating. The important features were identified using the feature importance technique of the machine learning models. The main preprocessing techniques used were the manual & frequency encoding methods, which convert string categorical data into numerical data, because most machine learning methods are numerical and do not support strings. The overall project was performed and its goals achieved using machine learning models and visualization techniques.
1. Analysis

The data were analyzed by describing the features present in the dataset. The features play the bigger part in the analysis: they tell us the relation between the dependent and independent variables. Pandas also helps describe the dataset and answer the following questions early in our project. The data present in the dataset are divided into numerical and categorical data.
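As a minimal sketch of this first step, assuming the data is loaded from a CSV file (the file name employee_data.csv is hypothetical), pandas can describe the dataset directly:

# Load the dataset and get an early picture of its shape and dtypes.
import pandas as pd

df = pd.read_csv("employee_data.csv")

print(df.shape)        # expected (1200, 28) for this dataset
df.info()              # column names, dtypes, non-null counts
print(df.describe())   # summary statistics for numerical features

# Separate numerical and categorical columns for later analysis.
numeric_cols = df.select_dtypes(include="number").columns
categorical_cols = df.select_dtypes(include="object").columns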

2. Univariate, Bivariate & Multivariate Analysis

● Univariate Analysis: In univariate analysis we get the unique labels of categorical features, as well as the range & density of numerical features.
● Bivariate Analysis: In bivariate analysis we check each feature's relationship with the target variable.
● Multivariate Analysis: In multivariate analysis we check the relationship between two variables with respect to the target variable.
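A minimal sketch of these three levels with seaborn, continuing from the earlier pandas sketch; the column names used here (Gender, Age, EmpDepartment, EmpLastSalaryHikePercent, PerformanceRating) are assumptions based on the dataset description:

import seaborn as sns
import matplotlib.pyplot as plt

# Univariate: unique labels of a categorical feature, and the
# range/density of a numerical feature.
print(df["Gender"].unique())
sns.histplot(df["Age"], kde=True)
plt.show()

# Bivariate: a single feature against the target variable.
sns.countplot(data=df, x="EmpDepartment", hue="PerformanceRating")
plt.show()

# Multivariate: two variables with respect to the target variable.
sns.scatterplot(data=df, x="Age", y="EmpLastSalaryHikePercent",
                hue="PerformanceRating")
plt.show()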
3. Exploratory Data Analysis
Distribution of Continuous Features:
● In general, one of the first steps in exploring the data is to get a rough idea of how the features are distributed. To do so, we invoke the familiar distplot function from the Seaborn plotting library. The distribution is plotted for each numerical feature; it gives an overall idea of the density and of where the majority of the data lies.
● The age distribution runs from 18 to 60, with most employees lying between 30 and 40. Employees have worked in up to 8 companies, with most having worked in up to 2 companies before joining here. The hourly rate is between 65 and 95 for the majority of employees in this company. In general, most employees work up to 5 years in this company, and most employees get an 11% to 15% salary hike.
Check Skewness and Kurtosis of Numerical Features: checking whether the data is normally distributed or not using skewness and kurtosis.
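A sketch of both checks, reusing df and numeric_cols from the earlier sketches; note that seaborn's distplot is deprecated in recent releases, so histplot(..., kde=True) stands in for it here:

import seaborn as sns
import matplotlib.pyplot as plt

# Plot the distribution of every numerical feature with a density curve.
for col in numeric_cols:
    sns.histplot(df[col], kde=True)
    plt.title(col)
    plt.show()

# pandas uses Fisher's definition of kurtosis (normal -> 0), so skewness
# and kurtosis values near 0 suggest an approximately normal feature.
print(df[numeric_cols].skew())
print(df[numeric_cols].kurtosis())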

4. Data Pre-Processing

a. Check Missing Values
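A one-line check, assuming the df DataFrame from the earlier sketch:

# Count missing values per column; all zeros means nothing to impute.
print(df.isnull().sum())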


b. Categorical Data Conversion: Handle categorical data with the help of frequency and manual encoding, because some features contain lots of labels.

● Manual Encoding: Manual encoding is a good technique for handling a categorical feature with the help of the map function, mapping the labels based on frequency.
● Frequency Encoding: Frequency encoding is an encoding technique that transforms an original categorical variable into a numerical variable by considering the frequency distribution of the data, obtained from value counts.
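A sketch of both encodings; the column names and the manual mapping are illustrative assumptions:

# Manual encoding: map each label to an integer with the map function
# (the labels and the integers assigned to them are assumed here).
df["Gender"] = df["Gender"].map({"Male": 1, "Female": 0})

# Frequency encoding: replace each label with its value count.
freq = df["EmpJobRole"].value_counts()
df["EmpJobRole"] = df["EmpJobRole"].map(freq)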

c. Outlier Handling: Some features contain outliers, so we impute these outliers with the help of the IQR method, because the data in these features is not normally distributed.
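A sketch of IQR-based handling; capping values to the IQR fences is one common way to impute outliers, and treating it as the method used here is an assumption:

# Clip values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] to the fence values.
def cap_outliers_iqr(series):
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Apply it to one feature that contains outliers (column name assumed).
df["TotalWorkExperienceInYears"] = cap_outliers_iqr(df["TotalWorkExperienceInYears"])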
d. Feature Transformation: YearsSinceLastPromotion shows some skewness & kurtosis, so we use the Square Root Transformation technique.
● Square Root Transformation: The square root transformation is one of the many standard transformations. It is used for count data (data that follow a Poisson distribution) or small whole numbers. Each data point is replaced by its square root; negative data is first made positive by adding a constant, and then transformed.
● Q-Q Plot: A Q-Q plot is a probability plot, a graphical method for comparing two probability distributions by plotting their quantiles against each other.
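A sketch of the transformation and the Q-Q plot check, using numpy and scipy:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# YearsSinceLastPromotion holds non-negative counts, so no shift
# constant is needed before taking the square root.
df["YearsSinceLastPromotion"] = np.sqrt(df["YearsSinceLastPromotion"])

# Q-Q plot against a normal distribution; points near the reference
# line indicate approximate normality after the transformation.
stats.probplot(df["YearsSinceLastPromotion"], dist="norm", plot=plt)
plt.show()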

e. Scaling the Data: We scale the data with the help of StandardScaler.

● Standard Scaling: Standardization is the process of scaling a feature assuming it follows a normal distribution, rescaling it using its mean and standard deviation so that the result has mean 0 and standard deviation 1.
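A sketch with scikit-learn's StandardScaler, excluding the target column from scaling (the target name PerformanceRating is an assumption):

from sklearn.preprocessing import StandardScaler

# z = (x - mean) / std, giving each scaled feature mean 0 and std 1.
feature_cols = [c for c in numeric_cols if c != "PerformanceRating"]
scaler = StandardScaler()
df[feature_cols] = scaler.fit_transform(df[feature_cols])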

5. Machine Learning Model Creation & Evaluation

a. Define Dependent and Independent Features.

b. Balancing the Data: The data is imbalanced, so we need to balance it with the help of SMOTE.

SMOTE: SMOTE (Synthetic Minority Oversampling Technique) is one of the most commonly used oversampling methods to solve the imbalance problem. It aims to balance the class distribution by increasing the number of minority-class examples; rather than simply replicating them, SMOTE synthesises new minority instances between existing minority instances.

c. Splitting Training and Testing Data: 80% of the data is used for training & 20% for testing.
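A sketch of steps a–c; SMOTE comes from the third-party imbalanced-learn package (pip install imbalanced-learn), the target column name is assumed, and, following the write-up, the data is balanced before splitting:

from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

# a. Dependent (y) and independent (X) features; EmpNumber is dropped
#    because it is not a relevant feature for performance rating.
X = df.drop(columns=["PerformanceRating", "EmpNumber"])
y = df["PerformanceRating"]

# b. Balance the class distribution with SMOTE.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X, y)

# c. 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.2, random_state=42)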

d. Model Creation: The following classifiers were trained and compared:

1. Support Vector Machine
2. Random Forest
3. Artificial Neural Network [MLP Classifier]
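A sketch that trains and compares the three classifiers; the hyperparameters shown are illustrative defaults, not the settings that produced the reported 95.80% accuracy:

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

models = {
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Artificial Neural Network (MLP)": MLPClassifier(max_iter=1000, random_state=42),
}

# Fit each model on the training split and report its test accuracy.
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {accuracy_score(y_test, model.predict(X_test)):.2%}")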
