0% found this document useful (0 votes)

41 views9 pages

A Novel Optimized Approach For Machine Learning Techniques For Predicting Employee Attrition

Uploaded by

sohamvadje24

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

41 views9 pages

A Novel Optimized Approach For Machine Learning Techniques For Predicting Employee Attrition

Uploaded by

sohamvadje24

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

2022 International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON)

Karnataka, India. Dec 23-25, 2022

A Novel Optimized Approach for Machine

Learning Techniques for Predicting Employee

Attrition
Sahil Y. Bansal Baljeet Kaur Jatinderkumar R. Saini*
Symbiosis Institute of Computer Studies Symbiosis Institute of Computer Studies Symbiosis Institute of Computer Studies
and Research, Symbiosis International and Research, Symbiosis International and Research, Symbiosis International
(Deemed University), (Deemed University), (Deemed University),
Pune, India Pune, India Pune, India
0000-0003-1566-2130 0000-0001-5070-1401 0000-0001-5205-5263

Abstract—As there are numerous opportunities for domain have not discussed about optimizing the results of
competent people throughout the world, workers frequently best performing algorithm. This paper attempts to optimize
switch employers to take advantage of these opportunities, the results using ROC curve method.
which causes a high attrition rate inside the organization. All
firms increasingly view employee attrition as a major problem The cost of hiring, training, and supporting employees
because of its negative impact on workplace productivity and can outweigh the value of human capital investment. This
the timely achievement of corporate goals and vision. paper provides a systematic survey of machine learning
Businesses are using machine learning technology to estimate techniques that are used in employee attrition prediction
worker turnover rates in an effort to solve this problem. In models. These techniques aim to predict whether an
order to most precisely anticipate employee attrition, various individual will leave his/her current place of employment
machine learning algorithms are investigated and their results soon or stay longer. LR algorithm in Machine Learning is
are compared in this study. The current study also optimizes used to analyze categorical target data. LR can be classified
the results of the most efficient machine learning algorithm for either as binomial, multinomial or ordinal. Based on the
the given data using the ROC method. Notably, optimization of provided dataset of independent variables, LR evaluates the
machine learning algorithms has not been studied in earlier likelihood of an event occurring, such as whether an
research works related to employee attrition. In the current employee will quit the organization or not. As the sigmoid
study, an attempt is made to optimize the performance of the
function is used to model the data in logistic regression, only
selected algorithm and a model is proposed.
when a decision threshold is introduced into the equation,
Keywords—AUC, confusion matrix, F1-score, employee logistic regression becomes a classification approach [1].
attrition, machine learning, Precision, Recall, ROC curve The threshold value is an important feature of Logistic
regression and is determined by the classification issue itself.
I. INTRODUCTION The precision and recall levels have a large influence on the
threshold value determination. DT has a tree-like structure,
Machine learning has been a central part of business for
with the core having multiple dataset attributes, branches
decades. It is used for innumerable tasks in different
providing a rule base, and each child node representing the
industries and every day across the world. In recent years,
result. The judgments or tests are constructed using the
machine-learning techniques have become more prevalent in
attributes of the provided dataset. It only poses a Yes/No
tactical decision making and prediction.
question and splits into subtrees in accordance with the
Employee attrition is a key downside risk to response. It is a visual representation of each potential
organizations and is difficult to predict. In recent years, the solution to a dilemma or choice under consideration [2]. On
use of machine learning algorithms in the prediction of the other side, Random Forest averages many decision trees
employee turnover has been explored to arrive at promising applied to various subsets of a given dataset to enhance the
results. This paper aims to develop a machine learning model projected efficiency of the prediction on that dataset. The
to predict employee attrition based on the data that has been random forest gathers projections from each decision tree
gathered by human resources personnel and compares the structure and forecasts the ultimate result based on the
performance of the six machine learning techniques majority vote of predictions, as opposed to relying just on
commonly used to predict employee attrition. The six one decision tree [3]. The accuracy increases and the
techniques assessed in this study are Logistic Regression possibility of overfitting decrease as the size of the forest
(LR), Decision Tree Classifiers (DT), Support Vector increases. Both continuous data, as in regression, and
Machines (SVM), Naïve Bayes Classifiers (NB), K-Nearest discrete data, as in classification, may be handled by the
Neighbors Classifiers (KNN), and Random Forests (RF). Random Forest Algorithm. In categorization tasks, it
The previous studies conducted in this performs better than other algorithms [4]. The Naïve Bayes
strategy is a supervised approach to learning that addresses
______________ classification challenges by applying the Bayes theorem. As
*Corresponding author a probabilistic classifier, it makes predictions based on the
likelihood of an item [5]. The Bayes theorem can be used to

978-1-6654-5499-5/22/$31.00 ©2022 IEEE 1

Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
determine the probability of A happening, knowing B has In [10], the authors used the IBM Employee Attrition
already happened[6]. In this instance, B stands for the dataset and performed the prediction using random forest
evidence whereas A stands for the hypothesis. K-Nearest algorithm and achieved the accuracy of 85.12%. The
Neighbor is a classifier that can be categorized as a comparison of results made with other algorithms is not done
supervised learning classifier that is non-parametric and on the same dataset, hence it should not be a metric to verify
employs closeness to classify or predict the grouping of a the accuracy. In another study [11], the authors compared the
single data point [7]. The k parameter in this algorithm performance of various ML algorithms on the employee
specifies how many neighbors will be evaluated to classify a attrition data and concluded the Gaussian Naïve Bayes
single query point. Different values of k might lead to classifier to be the most optimum with the lowest false
overfitting or under-fitting, therefore defining it can be positive rate of 4.5% and the best recall score of 0.541.
tricky. Lower k values can have high variance but low bias, Techniques like the feature scaling and holdout technique
while higher k values can have high bias but low variation was implemented to minimize the error due to uneven
[7]. Support Vector Machines (SVM), look for a hyperplane distribution in the training and validation set, and identified
that distinctly separates the points of data in an n- the most important factors that may affect employee attrition
dimensional field, where n represents the number of features. the most. However, test accuracy after testing the models
The quantity of features affects the hyperplane's size. To was higher than training accuracy, which is a good sign but
divide the two sets of data points, one might employ a might be improved.
variety of hyperplanes. Finding the plane with the largest
margin, or the spacing among sample points from both Employee attrition in [12] was predicted using four
classes, is the goal. Maximizing the margin distance provides techniques including KNN, Logistic Regression, MLP
some support, enabling the categorization of following data Classifier and Gaussian Naïve Bayes on a sample data after
points with greater assurance. In contrast to other preprocessing and the AUC, accuracy and F1 scores were
classification techniques, SVMs choose the decision compared, with KNN outperforming the other algorithms
boundary that minimizes the distance from the nearest data and yielding an accuracy of 94.32%. KNN also predicted the
points for all categories. The decision boundary produced by most true positive values based on the ROC curve. The data
SVMs is known as the maximum margin classifier or considered to train and test the algorithms was not
maximum margin hyper plane [8]. preprocessed accurately and may not be sufficient, hence,
larger data may produce different results. As the number of
This paper aims to develop a machine learning model to features considered are few, they may not have an impact on
predict employee attrition based on the six machine learning the predictor variable.
algorithms i.e. Logistic Regression (LR), Decision Tree
Classifiers (DT), Support Vector Machines (SVM), Naïve In another study [13], the authors used the deep neural
Bayes Classifiers (NB), K-Nearest Neighbors Classifiers network approach and predicted the employee attrition a 7-
(KNN), and Random Forests (RF). Confusion Matrix and hidden layer neural network with soft plus activation
ROC curve are used to evaluate performance of trained function thereby achieving an accuracy and precision of
models using testing data. The matrices used to evaluate the 94.16% after balancing the data and removing the biases. As
trained models are accuracy, precision, recall, F1-score, and the same dataset was considered, the authors preprocessed
AUC. and normalized the data and derived various correlations in
the data, Since the data had majority records of employees
The rest of the paper is organized in the following not leaving the organization they removed this bias and
sections as follows. Section II presents the motivation for tested their network on this synthetic dataset thereby
this work through the literature review, Sections III presents achieving higher accuracy and precision.
the methodology used and the discussion on the dataset
considered, and the exploration, preprocessing and feature The authors in [14] used logistic regression as the
selection. Section IV presents the results obtained after algorithm after scaling to predict the employee attrition after
testing the model and optimizing the selected algorithm an preprocessing and exploratory data analysis of the IBM
ideal model is proposed which is followed by the conclusion dataset. Hyper-parameters were tuned using the
in Section V. GridSearchCV to achieve the optimum performance. The
precision recorded by using the algorithm could be improved
II. LITERATURE REVIEW from 0.63 by modifying and tuning the hyper-parameters.
The accuracy and AUC ROC score of 0.79 could be further
Measuring and analyzing employee attrition rate helps improved by either training the model further or
companies to develop relevant and effective retention implementing a different technique.
strategies to reduce attrition of good employees. The author
in [9] studied the reasons for employee departure and the In a similar study [15], the author used employee data
challenges in employee retention. Various retention from SAS and predicted the attrition using various ML and
strategies were also proposed which included compensation, ANN algorithms namely MLP, Random Forest, Gradient
leadership, appraisal, job flexibility and training. Moreover Boosting and applied it using the ensemble model on
the recent trends in employee attrition was studied that separate train, test and validate set to conclude that the
included sustainable HRM and Gig economy. According to Neural Network approach or the MLP gave the highest
the author, organizations expend significant efforts to accuracy with the least error and lowest misclassification rate
forestall this undesirable outcome by selecting qualified and of 0.075 and 0.08 respectively. As accuracy is generally not
skilled people. They [9] achieved this by developing 'great considered a reliable metric, the authors additionally
package' incentives to retain brilliant and competent compared the algorithms based on the Gini coefficients, the
personnel and to offer a working environment that is misclassification rates or false positives and the average
harmonious in which these employees may meaningfully square error rate of every algorithm considered to achieve the
contribute to the attainment of the organization’s goals. most accurate comparison. In [15], the employee attrition

2
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
was predicted using tree-based models. These models and efficiencies. The accuracy was compared before and
contain random forests and light gradient enhanced trees, after feature selection and parameter tuning, However no
which performed the best. They utilized their own dataset significant change was observed and moreover the execution
with 5550 samples which was generated from anonymously time recorded for these algorithms was also considerably
submitted resumes on Glassdoor. The comparison of the high, hence, fine tuning in the algorithms should be done to
performances is based only on the ROC curves and hence it improve the performance. In [22], the authors compared 4
may not be comprehensive, also the ROC performance Machine Learning algorithms on an employee dataset and
recorded for light gradient enhanced trees of 76% which can observed that KNN and Random Forest returned the most
further be improved and enhanced. accurate results, and random forest algorithm having the
highest precision of 87%. Although the data has been
In [16], the authors performed the EDA of the IBM
visualized and preprocessed, there is still bias in the data
dataset and balanced the dataset, after which they compared
with the number of records for both classes which could have
the results from various classification algorithms namely
resulted in poor results.
random forest, Decision tree, K-Nearest neighbor (KNN),
Logistic regression, and Stochastic Gradient Descent over [23] discovered the disparity in the retrieved data in a
evaluation metrics like Accuracy, Precision, Recall, and F1 study utilizing the IBM HR Employee Attrition &
score with random forest having the most accurate results. Performance dataset. During the data exploration stage, the
The overall comparison has been made on all the algorithms correlation plot and histogram visualization were used to
that were taken into consideration by the authors and it was show the correlation between the continuous variables in the
very comprehensive by considering all the parameters for model. To balance the Attrition class, the SMOTE (Synthetic
accuracy. The authors in [17] selected supervised learning Minority Oversampling Technique) was used. The authors in
techniques to build the ML model to predict the employee [23] concluded that among the five methods, Logistic
attrition and evaluate the performance based on the Regression showed the highest precision accuracy of 87%,
confusion matrix and pseudo R square estimate of error rate however the accuracy could be improved by accurate feature
and concluded that random forest is the ideal model to selection and appropriate scaling.
predict attrition. They have compared various supervised
models for prediction, however they have not compared all Another study was conducted by [16]which trained and
the metrics before arriving at a conclusion. compared various machine learning models including
Decision trees, KNN, SVC and Light GBM and recorded the
In another study [18], the authors proposed a model highest accuracy with Light GBM of 99.13%. The attributes
comparing the performance analysis of 6 ML techniques which impacted the most to the dependent variable were
namely ANN, SVM, Bagging, GBT, RF and DT after carefully selected which led to the overall accuracy of all the
preprocessing and using chi-square feature selection. It was algorithms. They have not only compared the algorithms
observed that GBT performed consistently better than the based on the accuracy that was recorded but also highlighted
other techniques at predicting the employee attrition. The the benefits and tradeoffs of each algorithm. The accuracy
ML techniques were measured for their performance across for all the algorithms discussed was very high although most
various feature selection methods to achieve a clear of them were prone to overfitting.
understanding of the features that are actually contributing to
the accurate prediction. The authors in [19]investigated the A study was conducted in [24] on the data collected from
usage of the Extreme Gradient Boosting (XGBoost) the personnel records of employees in one of the Higher
approach, which is more resilient due to its regularization Institutions in South-West Nigeria which upon cleaning and
formulation. Using data from a multinational retailer's HRIS preprocessing was utilized to predict the employee attrition,
and BLS, XGBoost is compared to six previously used various decision tree models were used and compared
supervised machine learning classifiers, demonstrating much including C4.5 (J48), REPTree, and CART. WEKA
greater accuracy in predicting staff turnover. The findings of Classifier was used to compare the performance based on the
this [19] study show that the XGBoost classifier is a superior TP and FP Rate, precision, recall, F-measure and ROC Area
method for predicting turnover in terms of much greater and concluded that C4.5 performed better than the others
accuracy, comparatively short runtimes, and efficient taken into consideration. Although the data is preprocessed
memory use. In comparison to other classifiers, the before training the models the accuracy recorded is still
formulation of its regularization makes it a robust approach pretty low at 67% for C4.5 and even lower for the other,
capable of tolerating noise in HRIS data, meeting the which could be improved by making minor adjustments
primary issue in this area. thereby improving the prediction accuracy.

In a similar study [20], the authors applied data mining Employee attrition was compared and predicted on
techniques and compared the performance of Decision trees varying sizes of the data using all machine learning
(C4.5, Random Forest) and Neural Networks (MLP, Radial algorithms in [25]. The authors considered two different
Basic Function Network ) and recommended using C4.5 dataset, one collected primarily from a bank and the second
Decision trees with the highest accuracy of 95.14%. The data one being the IBM Watson dataset, and upon preprocessing
was modelled and tested using limited classifiers, and could and feature scaling the models were validated. The effects of
be implemented on other classifiers to obtain the best changes the size of the dataset in accurate prediction and
accuracy and feature selection or attribute reduction could be correlation was studied. The approach towards the problem
performed to ignore the irrelevant attributes. In [21], of employee attrition is very unique and reliable. The
conducted a comparative study to develop machine learning problem is addressed taking all the possible scenarios into
models, i.e., J84 Decision Trees, SVM, and Artificial Neural consideration. Multiple data was considered to provide a
Networks, for predicting probable employee attrition and better understanding of the accuracy of the models.
compare between the algorithms in terms of their accuracy

3
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
In [26], the authors aimed at optimizing the K-Nearest strong correspondence with ‘YearsAtCompany’. Likewise,
Neighbor classifier to predict employee attrition more ‘YearsAtCompany’ and ‘YearsWithCurrentManager’ are
accurately and precisely. The correlation among the features also strongly associated as well. Another observation from
of the dataset was explored using the Pearson correlation Fig.4 is that the ‘EmployeeCount’ and ‘StandardHours' have
coefficient method. The strongest correlation was recorded no correlation with the other attributes in the dataset as they
with distance from home, the authors proposed the improved contain a constant value, hence these attributes are dropped
KNN algorithm and upon validating the model it yielded an and finally label encoding is applied to the remaining
accuracy of 86.7%. The authors have tried to improve the categorical attributes before feature selection and training the
performance of the existing KNN classifier to predict the model.
employee attrition by proposing the optimized algorithm,
various metrics were also considered to accurately measure
the performance of the proposed algorithm like the k-value,
ROC Curve, reliability and cumulative curve.

III. METHODOLOGY
A. Dataset
The population that was considered under this study was
employees who had been working in organizations, and their
personal as well as professional details were recorded to
predict the attrition, the data was compiled and made
available by IBM HR Analytics, the dataset contained unique
records of 1470 employees and had no missing values[27].
Each employee had 35 attributes including details about their
current job, their personal background and educational
details as well as their personal details including marital
status and relationship satisfaction based on which the target
variable (i.e. Attrition) which is categorical and binary in
nature is generated. Out of the total of 1470 observations, Fig. 1. Employee Attrition Distribution
1233 is No whereas 167 is Yes for attrition attribute. There
are 588 females and 883 male employee in data. The
attributes like geography, domicile, size of industry were not
available in the dataset which can impact the employee
attrition too.
B. Data Preprocessing and Representation
As discussed the dataset contains no missing values, the
categorical attributes containing a constant value (over18)
are identified and dropped from the data. Since
EmployeeNumber contains discrete values it is dropped as
well because it is supposed not to contribute in the
prediction.
Since this problem falls under Binary Classification
problem, i.e. whether the employee would leave the
company or not the distribution is visualized in Fig. 1. As Fig. 2. Job Satisfaction
observed the number of people who actually left the
company are very less compared to those who didn’t, which
further led to understanding how many employees were
actually satisfied working in their current role. As observed
in Fig. 2 almost 80% of employees that forms the majority of
the population considered in the study are actually satisfied
which could result in lower attrition rate in the data.
As per Fig. 3 the employees in the age group of 25-40
have shown the highest attrition rates. To get a better
understanding of the features, a heat map is also generated to
depict the correlation among the attributes. As observed from
Fig. 4, the ‘MonthlyIncome’ and ‘JobLevel’ have a strong
positive correlation as the job level increases the monthly
income increases as well. ‘SalaryHike’ and
‘PerformanceRating' are also correlated, higher the rating Fig. 3. Age-wise Attrition
higher the hike in salary, ‘TotalWorkingYears' has a weak
correlation with the ‘NumCompaniesWorked’ but has a

4
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
Fig. 4. Correlation Heat-map

C. Feature Selection As observed in Fig. 5, BusinessTravel,

PerformanceRating, PercentSalaryHike, HourlyRate,
To get a better understanding of the features, and Gender, Education, EducationField, WorkLifeBalance,
understand which features actually affect the attrition the Department, RelationshipSatisfaction have higher p-value
most, Chi-Square Test, a filtering technique was used. A compared to the rest of the attributes, hence, they cannot be
Chi-Square test is a feature selection statistical tool that considered while training the model and therefore were not
compares the distribution of sample data to a theoretical selected. After applying the chi-square test, 20 features were
distribution generated by theoretical assumptions. It makes extracted from the data and the remaining 15 features were
predictions about whether or not the given data are more dropped as they did not contribute effectively to our target
likely to be drawn from the target population than from other prediction.
populations. A Chi-Square test for feature selection is
primarily used in machine learning, information retrieval and BusinessTravel and PerformanceRating may
bioinformatics. Most common type of test done with Chi- theoretically appear to be important features, but have a very
Square statistics is the one that involves features, such as high p-value which indicates that they are not contributing to
experimental treatments and conditions within an the predictions. After preprocessing and feature selection, the
experiment, that researchers want to see if they are more data was scaled using the StandardScaler to ensure
prevalent in one group than another. Before applying the chi- standardization of the data before applying any machine
square test to compute the p-value, the categorical attributes learning techniques.
from the data must be label encoded.

5
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
Fig. 6. Experiment Design

Fig. 5. Attributes v/s p-Value (Chi-Square Test)

D. Experiment Design
The proposed methodology in Fig. 6 involves splitting
the dataset 80:20 into training and testing/validation data
keeping a constant random state as 0 after preprocessing,
exploratory data analysis, and feature selection. The training
subset of the data is used to train the various algorithms and
generate a model. Upon successful training, the model is
checked for accuracy against the testing data. The other
evaluation parameters used in the study are precision, recall,
F1-score, and AUC. Upon testing the models and recording
the predictions for each model, the predicted value and the
actual value is compared and the results are analyzed using Fig. 7. Optimum k- Value
confusion matrix and ROC curve. Since a variety of machine
highest precision of 84.21%. The algorithm that performed
learning algorithms and techniques can be applied to solve
the lowest was Decision Tree classifier, it not only had the
the employee attrition prediction problem, the classification
lowest accuracy but also showed very poor precision in
techniques considered for comparison in this study are
terms of prediction and underperformed in the other metrics
Logistic-Regression, Decision-Trees, Random-Forest-
as well. The confusion-matrix was created for all the
Classifier, Naïve-Bayes, K-Nearest-Neighbor and Support-
classifiers taken into consideration in this study in order to
Vector-Machines. The main goal is to identify the optimum
account for the amount of incorrect positive and
classifier with the most accurate results. The value of k in K-
incorrect negative results, which is crucial to assessing the
Nearest-Neighbor will be determined by the input data, since
performance of the model in its purest meaning [30][28]. The
data with more outliers or noise would most likely perform
confusion-matrix for each of the methods employed in this
better with larger values of k. Fig. 7 depicts the change in
study is shown from Fig. 8 to Fig. 13.
error with respect to k-values for training and testing data, As
observed, for k=5, the test error is the minimum, hence k is
considered as 5 in this study.
IV.RESULTS AND DISCUSSIONS
After exploration and preprocessing of the data, the most
impactful features were selected, the data was trained and
tested and the accuracy was generated of each model,
Logistic Regression had the highest accuracy among all the
classifiers studied, It was fine tuned to arrive at an accuracy
of 87.76%, Other algorithms also gave promising results, K-
Nearest Neighbor recorded an accuracy of 87.75% even
Fig. 9. Decision Tree Confusion
though Logistic Regression it is prone to overfitting, the Fig. 8. Logistic Regression
Matrix
results recorded are realistic with a precision of 78.26% and Confusion Matrix
KNN was the most precise in terms of prediction and
recorded the

6
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
Similarly, Decision Tree classifier has recorded the
lowest performance in all the metrics considered in this study
and is therefore, also rejected. Based on the metrics
considered, SVM with the linear kernel also shows
promising results when tested on the data, however, Logistic
Regression is considered as the ideal classifier for the data
under study as it outperforms the other classifiers on all the
validation metrics in terms of accuracy, precision, recall, F1
score and AUC and can accurately predict the employee
Fig. 10. Naïve Bayes Confusion attrition in an organization.
Matrix Fig. 11. Random Forest Confusion
Matrix
TABLE I. RESULT METRICS

Accu- F-1
Precision Recall
Classifier racy Score AUC
(%) (%)
(%) (%)
Logistic
87.76 78.26 36.73 50.00 0.8440
Regression
Decision
77.55 33.33 34.69 34.00 0.6041
Trees
Random
85.03 69.23 18.36 29.03 0.7830
Forest
Fig 12. K-Nearest Neighbour Fig 13. Scalar Vector Machine Naïve
80.27 40.82 40.82 40.82 0.7551
Confusion Matrix Confusion Matrix Bayes
K-Nearest
Fig. 14 shows the combined ROC curve of all the 87.75 84.21 32.65 47.06 0.6788
Neighbor
algorithms with their respective AUC scores. AUC is a
metric that aggregates the performance across all Support
categorization criteria. AUC may be interpreted as the Vector 84.35 55.17 32.65 41.03 0.7637
likelihood that the model rates a random positive case higher Machines
than a random negative example. It is observed that similar
to the above metrics, Logistic Regression has the highest A. Optimising the Model
AUC score of 0.8440. However, SVM and Random Forest As Logistic Regression outperformed the other
shows improved AUC score of 0.7637 and 0.7863 which algorithms considered in this study, using the ROC curve in
implies that they have potential in predicting employee Fig. 14, an attempt is made to optimize and overall improve
attrition correctly. the performance and efficiency of logistic-regression model.
As observed in Table 1, confusion matrices and Fig.14, Since the default threshold value considered while
LR shows the best results, with the highest AUC score of training the logistic regression algorithm is 0.5, based on the
0.844, followed by Random Forest Classifier with the AUC ROC curve and AUC value as observed in Fig. 14, the
score of 0.6788. KNN model trained with k=5 can be threshold value was changed to 0.4 in order to optimize the
considered as an equally efficient model just like Logistic results and performance of the running model. As observed
Regression, since it has similar accuracy to Logistic in Table 2 and Fig.15.
Regression and better precision of 84.21%, Recall and F-1
scores are also comparable, however, the AUC score of The accuracy has reduced upon changing the threshold
KNN is one of the lowest after Decision Tree classifier value, from 87.76% to 86.73%, however, the count of false-
among all the algorithms considered. negative values which are most critical in the study where
the true label is 1, but the predicted label is 0, i.e. the
employee would leave the organization, but the model
predicted that the employee would not leave, has reduced
from 31 to 26, which can be considered as an improvement
of the model. Hence, even though the accuracy has very
slightly reduced, the wrong or incorrect classifications has
improved by optimizing the threshold value.

TABLE II. OPTIMIZED RESULTS FOR LOGISTIC REGRESSION WITH

THRESHOLD VALUE AS 0.4

Accuracy Precision Recall F-1 Score

86.73% 63.89% 46.94% 54.11%

Fig. 14. ROC Curve and AUC

7
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
opportunities and hence this is also an important factor while
predicting attrition.

Fig. 15. Optimized-Confusion-Matrix for Logistic Regression

B. Discussion
A company should focus on both its organizational
culture and the work environment if they want to avoid
turnover. It is important to create a motivating work
environment that gives meaning and purpose so that
employees know what their contributions mean. When Fig. 16. Major factors that contribute towards attrition
employees get a feel like their input matters, they feel more Fig. 16 depicts the major factors that contribute towards
invested in staying with the organization for the long-term. attrition of an employee which should be the primarily
At any given time, a business can be losing more than 50% considered during hiring to prevent attrition. Employees in
of its employees. This can lead to inefficiencies, an inability the age group of 25-40 years having the monthly income
to meet goals and deadlines, and lackluster productivity. The between $2000-$3000 with 0-5 years of total working
causes of employee attrition are as varied as the individuals experience and 0-2 years at the company and 6-8 years in the
themselves. current role have a higher tendency to leave the organization.
In this paper, the factors that contributed the most
towards attrition was MonthlyIncome, other features that V. CONCLUSION
played a significant role in predicting attrition of an The current study helps to generate new insights
employee were DailyRate, Age, TotalWorkingYears, regarding employee attrition which cannot be generated by
YearsAtCompany, YearsInCurrentRole, DistanceFromHome merely conducting exit interviews with employee. Income,
and Overtime. Income acts as the biggest motivation for years in current role, age, total working years, and number of
employees working in an organization to continue working years in the same company were the most import factors for
there, if they are satisfied with their income the attrition employee attrition. Employee Attrition Prediction is a
would be reduced and they would be retained, Age also business strategy concerned with the processes of predicting
impacts attrition, as observed in Fig. 3, employees who are how many employees will leave an enterprise over the next
currently in the ages 25 to 40 are more prone to leave the year, and which groups of employees are more likely to be
organization as in the initial years of their career they are on their way out. This information can help companies
filled with the zeal to take on new challenges and risks evaluate their workforce, monitor the effectiveness of
thereby switch companies more often, also the retention programs, and plan for workforce reductions.
responsibilities are comparatively less which makes it easier
for them to leave more easily. This paper identified the factors that strongly affected the
attrition of an employee and the selected features were used
It is also observed from this study that, the more work to train the algorithms and test them against various inputs
experience an individual has in terms of number of years, which were alien to the model and the performance was
and the years they have spent in their current organization recorded to arrive at the optimum method to tackle the
and role would affect their chances of leaving the problem of employee attrition and aid in early prediction of
organization as well which is why it is also an important the same.
feature to consider during hiring and later for retention as
well. Various machine learning algorithms were implemented
out of which Logistic regression gave the most accurate
More often, an employee is unsatisfied if the organization results and outperformed the other methods on all the metrics
they are working for is geographically located far from their considered and the selected model was optimized to reduce
place of residence, as the time spent in commuting increases the misclassifications and can therefore be used for
which also affects the overall productivity of the employee predicting employee attrition most accurately.
and can also sometimes be considered as an added expense.
An employee does not voluntarily choose to work overtime The first step to reduce employee turnover is to find out
unless there is a personal motive in terms of gaining extra where attrition starts. When answering this question it's
remuneration, however if the employee is working overtime important for companies to make sure they're looking at all
but not out of choice it could lead to a poor productivity and possible sources that could lead to attrition- both voluntary
with time the employee would consider looking for other and involuntary- because otherwise they won't know where
they should focus their efforts. A more accurate and

8
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.
insightful prediction of employee attrition relies on a variety Eng. Technol., vol. 11, no. 12, pp. 3329–3341, 2020, [Online].
of factors. It is possible to predict the probability that an Available: https://doi.org/10.34218/IJARET.11.12.2020.313
[19] R. Punnoose and P. Ajit, “Prediction of Employee Turnover in
employee will leave their current job based on their Organizations using Machine Learning Algorithms”, Int. J. Adv.
personality, background, and professional values. It also Res. Artif. Intell., vol. 5, no. 9, pp. 22–26, 2016, doi:
relies on workplace strife, employer prestige, and other non- 10.14569/ijarai.2016.050904.
personal factors. [20] J. Hamidah, H. AbdulRazak, and A. O. Zulaiha, “Towards applying
data mining techniques for talent managements”, 2009 Int. Conf.
With the available dataset, this study optimized the Comput. Eng. Appl. IPCSIT, vol. 2, no. March 2015, pp. 476–481,
results of best performing model out of the six algorithms 2011.
used in the current study. In future with larger datasets, [21] N. Mansor, N. S. Sani, and M. Aliff, “Machine Learning for
Predicting Employee Attrition”, Int. J. Adv. Comput. Sci. Appl., vol.
employee segmentation can be studied to develop ‘at risk’ 12, no. 11, pp. 435–445, 2021, doi:
categories of employees and deep learning algorithm can 10.14569/IJACSA.2021.0121149.
also be employed to study employee attrition. [22] A. Patel, N. Pardeshi, S. Patil, S. Sutar, R. Sadafule, and S. Bhat,
“Employee Attrition Predictive Model Using Machine Learning”,
REFERENCES Int. Res. J. Eng. Technol., no. May, pp. 3855–3859, 2020, [Online].
Available: www.irjet.net
[1] IBM, “What is logistic regression?”, 2022.
[23] K. K. Mohbey, “Employee’s attrition prediction using machine
https://www.ibm.com/topics/logistic-regression (accessed Sep. 20,
learning approaches”, Mach. Learn. Deep Learn. Real-Time Appl.,
2022).
no. January, pp. 121–128, 2020, doi: 10.4018/978-1-7998-3095-
[2] H. H. Patel and P. Prajapati, “Study and Analysis of Decision Tree
5.ch005.
Based Classification Algorithms”, Int. J. Comput. Sci. Eng., vol. 6,
[24] A.A.D. Alao, “Analyzing Employee Attrition using Decision Tree
no. 10, pp. 74–78, 2018, doi: 10.26438/ijcse/v6i10.7478.
Algorithms”, Inf. Syst. Dev. Informatics, vol. 4, no. 1, pp. 17–28,
[3] L. Breiman, “Random Forests”, Mach. Learn., vol. 45, no. 1, pp. 5–
2013, [Online]. Available:
32, 2001, doi: 10.1023/A:1010933404324.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1012.294
[4] A. Sarica, A. Cerasa, and A. Quattrone, “Random Forest Algorithm
7&rep=rep1&type=pdf
for the Classification of Neuroimaging Data in Alzheimer’s Disease:
[25] Y. Zhao, M. K. Hryniewicki, F. Cheng, B. Fu, and X. Zhu,
A Systematic Review”, Front. Aging Neurosci., vol. 9, 2017, doi:
"Employee Turnover Prediction with Machine Learning: A Reliable
10.3389/fnagi.2017.00329.
Approach", vol. 869. Springer International Publishing, 2018. doi:
[5] I. Rish, “An Empirical Study of the Naïve Bayes Classifier An
10.1007/978-3-030-01057-7_56.
empirical study of the naive Bayes classifier”, Cc.Gatech.Edu, no.
[26] T.A. Assegie, “A Predictive Model for Improving Employee
January 2001, pp. 41–46, 2014, [Online]. Available:
Attrition Rate With K-Nearest Neighbor Classifier”, Int. J. of
https://www.cc.gatech.edu/~isbell/reading/papers/Rish.pdf
Research and Reviews in App. Sci., vol. 46, no. 1, pp. 78–84, 2021,
[6] S. Raschka, “Naive Bayes and Text Classification I - Introduction
[Online]. Available:
and Theory.” arXiv, 2014. doi: 10.48550/ARXIV.1410.5329.
www.arpapress.com/Volumes/Vol46Issue1/IJRRAS_46_1_09.pdf
[7] Z. Zhang, “Introduction to machine learning: K-nearest neighbors”
[27] IBM, IBM HR Analytics Employee Attrition & Performance. 2019.
Ann. Transl. Med., vol. 4, no. 11, pp. 1–7, 2016, doi:
[Online]. Available: https://github.com/IBM/employee-attrition-
10.21037/atm.2016.03.37.
aif360
[8] D. SBoswell, “An Introduction to Support Vector Machines”,
[28] M. Hasnain, M. F. Pasha, I. Ghani, M. Imran, M. Y. Alzahrani, and
Recent Adv. Trends Nonparametric Stat., pp. 3–17, 2002, doi:
R. Budiarto, “Evaluating Trust Prediction and Confusion Matrix
10.1016/B978-044451378-6/50001-6.
Measures for Web Services Ranking”, IEEE Access, vol. 8, pp.
[9] D. Singh, “A Literature Review on Employee Retention with Focus
90847–90861, 2020, doi: 10.1109/ACCESS.2020.2994222.
on Recent Trends”, Int. J. Sci. Res. Sci. Eng. Technol., no. May, pp.
425–431, 2019, doi: 10.32628/ijsrst195463.
[10] M. Pratt, M. Boudhane, and S. Cakula, “Employee Attrition
Estimation Using Random Forest Algorithm”, Balt. J. Mod.
Comput., vol. 9, no. 1, pp. 49–66, 2021, doi:
10.22364/BJMC.2021.9.1.04.
[11] F. Fallucchi, M. Coladangelo, R. Giuliano, and E. W. De Luca,
“Predicting Employee Attrition Using Machine Learning
Techniques”, Computers, vol. 9, no. 4, pp. 1–17, 2020, doi:
10.3390/computers9040086.
[12] R. Yedida, R. Reddy, R. Vahi, R. Jana, A. GV, and D. Kulkarni,
“Employee Attrition Prediction”, arXiv, 2018. doi:
10.48550/ARXIV.1806.10480.
[13] S. Al-Darraji, D. G. Honi, F. Fallucchi, A. I. Abdulsada, R.
Giuliano, and H. A. Abdulmalik, “Employee Attrition Prediction
Using Deep Neural Networks Salah”, Computers, vol. 10, no. 11,
pp. 1–11, 2021, doi: 10.3390/computers10110141.
[14] S. Gupta, “Employee Attrition Rate Prediction Using Machine
Learning”, Code Algorithms Pvt. Ltd, 2022.
https://www.enjoyalgorithms.com/blog/attrition-rate-prediction-
using-ml
[15] F. K. Alsheref, I. E. Fattoh, and W. M Ead, “Automated Prediction
of Employee Attrition Using Ensemble Model Based on Machine
Learning Algorithms”, Comput. Intell. Neurosci., vol. 2022, p.
7728668, 2022, doi: 10.1155/2022/7728668.
[16] S. Aggarwal, M. Singh, S. Chauhan, M. Sharma, and D. Jain,
“Employee Attrition Prediction Using Machine Learning
Comparative Study”, Smart Innov. Syst. Technol., vol. 265, no. 9,
pp. 453–466, 2022, doi: 10.1007/978-981-16-6482-3_45.
[17] D. R. S. Kamath, D. S. S. Jamsandekar, and D. P. G. Naik,
“Machine Learning Approach for Employee Attrition Analysis”, Int.
J. Trend Sci. Res. Dev., vol. Special Is, no. Special Issue-
FIIIIPM2019, pp. 62–67, 2019, doi: 10.31142/ijtsrd23065.
[18] M. Subhashini and R. Gopinath, “Employee Attrition Prediction in
Industry Using Machine Learning Techniques”, Int. J. Adv. Res.

9
Authorized licensed use limited to: Dr. D. Y. Patil Educational Complex Akurdi. Downloaded on July 23,2024 at 05:38:42 UTC from IEEE Xplore. Restrictions apply.

Salary Prediction-2
No ratings yet
Salary Prediction-2
26 pages
5 Ieee
No ratings yet
5 Ieee
6 pages
Employee Attrition Prediction
100% (1)
Employee Attrition Prediction
21 pages
Summer Internship Report
No ratings yet
Summer Internship Report
24 pages
Salary Prediction
No ratings yet
Salary Prediction
4 pages
MaWinPaPaMayPhyoAung - First Seminar
No ratings yet
MaWinPaPaMayPhyoAung - First Seminar
21 pages
Identification of Human Resource Analytics Using Machine Learning Algorithms
No ratings yet
Identification of Human Resource Analytics Using Machine Learning Algorithms
12 pages
Employee Churn Prediction Using Logistic Regression
No ratings yet
Employee Churn Prediction Using Logistic Regression
72 pages
ANLY 502 Final Report
No ratings yet
ANLY 502 Final Report
7 pages
Towards Understanding Employee Attrition Using Decision Tree
100% (1)
Towards Understanding Employee Attrition Using Decision Tree
4 pages
Ibm Attrition Practices
No ratings yet
Ibm Attrition Practices
7 pages
Employee Attrition Prediction Using Machine Learning Models: A Review Paper
No ratings yet
Employee Attrition Prediction Using Machine Learning Models: A Review Paper
27 pages
Attrition Project Mangal
No ratings yet
Attrition Project Mangal
75 pages
AIP - Aip 202501 0006
No ratings yet
AIP - Aip 202501 0006
16 pages
18 Intellisys Employee
No ratings yet
18 Intellisys Employee
22 pages
Industry 4 0, Smart Manufacturing, and Industrial Engineering Challenges
No ratings yet
Industry 4 0, Smart Manufacturing, and Industrial Engineering Challenges
389 pages
Employee Attrition Prediction Using Machine Learning
No ratings yet
Employee Attrition Prediction Using Machine Learning
9 pages
Mathematics 11 04677
No ratings yet
Mathematics 11 04677
25 pages
Evaluation of Machine Learning Models For Employee Churn
No ratings yet
Evaluation of Machine Learning Models For Employee Churn
5 pages
Ataiml 02.04 04
No ratings yet
Ataiml 02.04 04
14 pages
Employee Turnover Prediction
100% (1)
Employee Turnover Prediction
16 pages
Research Paper 102
No ratings yet
Research Paper 102
8 pages
Reportprediction of Employee Atrition Uisng Machine Learning
No ratings yet
Reportprediction of Employee Atrition Uisng Machine Learning
6 pages
Emloyee Attrition and Retention
No ratings yet
Emloyee Attrition and Retention
17 pages
HR Review1
No ratings yet
HR Review1
11 pages
10 1109@iadcc 2018 8692137
No ratings yet
10 1109@iadcc 2018 8692137
6 pages
DOCUMENTATION12
No ratings yet
DOCUMENTATION12
42 pages
IBM HR Analytics For Employee Attrition and Performance Prediction
No ratings yet
IBM HR Analytics For Employee Attrition and Performance Prediction
44 pages
Evaluating Employee Attrition - Design and Implementation
No ratings yet
Evaluating Employee Attrition - Design and Implementation
10 pages
IBM Analysis
No ratings yet
IBM Analysis
17 pages
Cdu 1121 09
No ratings yet
Cdu 1121 09
10 pages
Employee Turnover Prediction
No ratings yet
Employee Turnover Prediction
12 pages
ISE 527 Term Paper
No ratings yet
ISE 527 Term Paper
17 pages
Prediction of Employee Turnover
No ratings yet
Prediction of Employee Turnover
6 pages
Research Paper
No ratings yet
Research Paper
5 pages
ISE 527 IEEE Access LaTeX Template
No ratings yet
ISE 527 IEEE Access LaTeX Template
16 pages
HR Analytics - Employee Attrition Analysis Using Random Forest
No ratings yet
HR Analytics - Employee Attrition Analysis Using Random Forest
7 pages
Karpagam Sep Oct 2019 Article 6
No ratings yet
Karpagam Sep Oct 2019 Article 6
6 pages
DATA4800 Report
No ratings yet
DATA4800 Report
6 pages
Employee Attrition PREDICTION Using Machine Learning
No ratings yet
Employee Attrition PREDICTION Using Machine Learning
11 pages
Employee Attrition Rate Prediction Using Machine Learning Approach
No ratings yet
Employee Attrition Rate Prediction Using Machine Learning Approach
8 pages
Applsci 12 06424
No ratings yet
Applsci 12 06424
17 pages
Batch 16
No ratings yet
Batch 16
8 pages
Db15 Conference
No ratings yet
Db15 Conference
6 pages
Applsci 13 00267
No ratings yet
Applsci 13 00267
8 pages
11783-Article Text-8048-2-10-20221004
No ratings yet
11783-Article Text-8048-2-10-20221004
9 pages
Employee Turnover1
No ratings yet
Employee Turnover1
4 pages
Foreseeing Employee Attritions Using Div
No ratings yet
Foreseeing Employee Attritions Using Div
7 pages
Tentative Research Topic
No ratings yet
Tentative Research Topic
4 pages
Employee Attrition Miniblogs
100% (1)
Employee Attrition Miniblogs
15 pages
Data Mining
No ratings yet
Data Mining
17 pages
ISE527 Proposal
No ratings yet
ISE527 Proposal
3 pages
941-Article Text-9536-1-10-20240830
No ratings yet
941-Article Text-9536-1-10-20240830
12 pages
Employee Future Prediction
No ratings yet
Employee Future Prediction
3 pages
Predicting Employee Attrition Using XGBoost Machine Learning
No ratings yet
Predicting Employee Attrition Using XGBoost Machine Learning
8 pages
Attrition Prediction: Schandia@cit - Edu.in
No ratings yet
Attrition Prediction: Schandia@cit - Edu.in
1 page
Early Prediction of Employee Attrition Using Data Mining Techniques
No ratings yet
Early Prediction of Employee Attrition Using Data Mining Techniques
6 pages
Unleashing Potential of Employees Through Artificial Intelligence
No ratings yet
Unleashing Potential of Employees Through Artificial Intelligence
3 pages
Employee Attrition Prediction
No ratings yet
Employee Attrition Prediction
3 pages
PGP Machine Learning Brochure
No ratings yet
PGP Machine Learning Brochure
20 pages
Enhancing AI Systems With Agentic Workflows Patterns in Large Language Model
No ratings yet
Enhancing AI Systems With Agentic Workflows Patterns in Large Language Model
6 pages
Generative AI For Software Practitioners
No ratings yet
Generative AI For Software Practitioners
9 pages
Transforming Digital Employee Experience With Artificial Intelligence
No ratings yet
Transforming Digital Employee Experience With Artificial Intelligence
4 pages
Machine Learning and Flow Assurance in Oil and Gas Production
No ratings yet
Machine Learning and Flow Assurance in Oil and Gas Production
6 pages
Data Science Vijay1
No ratings yet
Data Science Vijay1
88 pages
A Hybrid Prediction Model Integrating Artificial Intelligence and Geospatial Analysis For Disaster Management
No ratings yet
A Hybrid Prediction Model Integrating Artificial Intelligence and Geospatial Analysis For Disaster Management
12 pages
Road Accident Prediction Model Using Data Mining Techniques
100% (1)
Road Accident Prediction Model Using Data Mining Techniques
6 pages
Shashank Iot ML
No ratings yet
Shashank Iot ML
32 pages
Publication 4
No ratings yet
Publication 4
21 pages
Enhancing Error Prediction in Machineries Through Sensor Data Fusion
No ratings yet
Enhancing Error Prediction in Machineries Through Sensor Data Fusion
78 pages
023-06-21-Evaluating Overtaking and Filtering Maneuver of Motorcyclists and Car Drivers Using Advanced Trajectory Data analysis-IJICSP
No ratings yet
023-06-21-Evaluating Overtaking and Filtering Maneuver of Motorcyclists and Car Drivers Using Advanced Trajectory Data analysis-IJICSP
18 pages
A Case Study On Data Classification Approach Using K-Nearest Neighbor
No ratings yet
A Case Study On Data Classification Approach Using K-Nearest Neighbor
7 pages
Role of Artificial Intelligence in Organizational Culture and Workplace
No ratings yet
Role of Artificial Intelligence in Organizational Culture and Workplace
5 pages
Spam Email. Classifier
No ratings yet
Spam Email. Classifier
44 pages
Towards A Responsible AI Metrics Catalogue A Collection of Metrics For AI Accountability
No ratings yet
Towards A Responsible AI Metrics Catalogue A Collection of Metrics For AI Accountability
12 pages
Custom Developer GPT For Ethical AI Solutions
No ratings yet
Custom Developer GPT For Ethical AI Solutions
2 pages
Pone.0302236 A Hybrid CNN-SVM Model For Enhanced Autism Diagnosis
No ratings yet
Pone.0302236 A Hybrid CNN-SVM Model For Enhanced Autism Diagnosis
20 pages
Unit 4 MCQ
No ratings yet
Unit 4 MCQ
10 pages
Biomedbench: A Benchmark Suite of Tinyml Biomedical Applications For Low-Power Wearables
No ratings yet
Biomedbench: A Benchmark Suite of Tinyml Biomedical Applications For Low-Power Wearables
7 pages
Exploring The Applications of Machine Learning in Healthcare
No ratings yet
Exploring The Applications of Machine Learning in Healthcare
16 pages
Handwritten Digit Recognition Using Machine Learning
No ratings yet
Handwritten Digit Recognition Using Machine Learning
5 pages
REPORT - STOCK PRICE PREDICTION - New
No ratings yet
REPORT - STOCK PRICE PREDICTION - New
40 pages
A Bibliometric View of AI Ethics Development
No ratings yet
A Bibliometric View of AI Ethics Development
5 pages
Detection of Advanced Malware by Machine Learning Techniques
No ratings yet
Detection of Advanced Malware by Machine Learning Techniques
8 pages
Intelligent Web Security: Machine Learning-Based SQL Injection Detection and Honeypot Integration
No ratings yet
Intelligent Web Security: Machine Learning-Based SQL Injection Detection and Honeypot Integration
7 pages
ML Questions
No ratings yet
ML Questions
6 pages
Chapter 4 - Kernel Theory
No ratings yet
Chapter 4 - Kernel Theory
9 pages
4026-Article Text-9944-1-10-20190730
No ratings yet
4026-Article Text-9944-1-10-20190730
8 pages
Final CC Pract No 5
No ratings yet
Final CC Pract No 5
10 pages
Developing A Desktop-Based Offline Quiz Application
No ratings yet
Developing A Desktop-Based Offline Quiz Application
6 pages
Performance Analysis of Deep Neural Network and Machine Learning Algorithms For Diabetes Prediction
No ratings yet
Performance Analysis of Deep Neural Network and Machine Learning Algorithms For Diabetes Prediction
6 pages
Digital Image Forensic Approach To Counter The JPEG Anti-Forensic Attacks
No ratings yet
Digital Image Forensic Approach To Counter The JPEG Anti-Forensic Attacks
12 pages
Impact of Machine Learning On The Productivity of Employees in Workplace
No ratings yet
Impact of Machine Learning On The Productivity of Employees in Workplace
5 pages
Explainable AI For Software Engineering
No ratings yet
Explainable AI For Software Engineering
2 pages
Divya - Detection of Face Spoofs With Raspberry Pi
No ratings yet
Divya - Detection of Face Spoofs With Raspberry Pi
34 pages
Expert Systems With Applications: Dana Bani-Hani, Mohammad Khasawneh
No ratings yet
Expert Systems With Applications: Dana Bani-Hani, Mohammad Khasawneh
14 pages
Water Levels Forecast in Thailand: A Case Study of Chao Phraya River
No ratings yet
Water Levels Forecast in Thailand: A Case Study of Chao Phraya River
6 pages
Audit TE Poster
No ratings yet
Audit TE Poster
1 page
A Self-Learning Approach To Single Image Super-Resolution: Min-Chun Yang and Yu-Chiang Frank Wang, Member, IEEE
No ratings yet
A Self-Learning Approach To Single Image Super-Resolution: Min-Chun Yang and Yu-Chiang Frank Wang, Member, IEEE
11 pages
Artificial Intelligence and Knowledge Processing: Methods and Applications
From Everand
Artificial Intelligence and Knowledge Processing: Methods and Applications
Hemachandran K.
No ratings yet
Generative Adversarial Networks with Industrial Use Cases: Learning How to Build GAN Applications for Retail, Healthcare, Telecom, Media, Education, and HRTech
From Everand
Generative Adversarial Networks with Industrial Use Cases: Learning How to Build GAN Applications for Retail, Healthcare, Telecom, Media, Education, and HRTech
Navin K Manaswi
No ratings yet
Introduction To Augmented Reality Hardware: Augmented Reality Will Change The Way We Live Now: 1, #1
From Everand
Introduction To Augmented Reality Hardware: Augmented Reality Will Change The Way We Live Now: 1, #1
Kaviyaraj R
No ratings yet

A Novel Optimized Approach For Machine Learning Techniques For Predicting Employee Attrition

Uploaded by

A Novel Optimized Approach For Machine Learning Techniques For Predicting Employee Attrition

Uploaded by

2022 International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON)

Karnataka, India. Dec 23-25, 2022

A Novel Optimized Approach for Machine

Learning Techniques for Predicting Employee

978-1-6654-5499-5/22/$31.00 ©2022 IEEE 1

C. Feature Selection As observed in Fig. 5, BusinessTravel,

Fig. 5. Attributes v/s p-Value (Chi-Square Test)

TABLE II. OPTIMIZED RESULTS FOR LOGISTIC REGRESSION WITH

Accuracy Precision Recall F-1 Score

Fig. 14. ROC Curve and AUC

Fig. 15. Optimized-Confusion-Matrix for Logistic Regression

You might also like