Documentation - Ishaan Mittal - Jio - Assessment
DOCUMENTATION
0  1  manage  15  male    no
1  2  admin   16  male    no
2  3  admin   12  female  no
3  4  admin    8  female  no
4  5  admin   15  male    no
To compare the distribution of job roles across gender groups with a stacked
bar chart, the dataframe is grouped by two columns, 'job' and 'gender'. The
size (count) of each group is calculated, and the results are organised into a
table with job roles in rows and genders in columns. Missing combinations are
filled with 0 using unstack(fill_value=0). The stacked bar chart is then
plotted by passing the parameters kind='bar' and stacked=True.
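The grouping and plotting steps described above can be sketched as follows. This is a minimal illustration, not the original notebook code: the column names 'job' and 'gender' come from the text, while the sample rows, the variable names df and counts, and the axis labels are assumptions made for the example. It also assumes matplotlib is installed, since pandas uses it for plotting.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample rows: only the column names 'job' and 'gender'
# come from the text, the values are made up for illustration.
df = pd.DataFrame({
    "job": ["admin", "admin", "manager", "custodial", "admin", "manager"],
    "gender": ["male", "female", "male", "male", "female", "male"],
})

# Count the rows in each (job, gender) group, then move 'gender' into columns.
# fill_value=0 replaces combinations that never occur in the data.
counts = df.groupby(["job", "gender"]).size().unstack(fill_value=0)
print(counts)

# One bar per job role, with the gender counts stacked on top of each other.
ax = counts.plot(kind="bar", stacked=True)
ax.set_xlabel("Job role")
ax.set_ylabel("Count")
plt.tight_layout()
```

Calling plt.show() afterwards displays the chart in an interactive session. Without fill_value=0, combinations that do not occur would appear as NaN in the table instead of 0.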
Made by Ishaan Mittal (Btech ECE’23 Jamia Millia Islamia)
              precision  recall  f1-score  support
accuracy                             0.93       95
macro avg          0.92    0.76      0.81       95
weighted avg       0.93    0.93      0.92       95
Recall:
➢ Class 0 (admin): Recall is good at 0.96, indicating that the
model correctly identifies 96% of the actual admin roles.
➢ Class 1 (manager): Recall is lower at 0.50, suggesting that the
model captures only half of the actual manager roles.
➢ Class 2 (custodial): Recall is reasonable at 0.81, meaning the
model correctly identifies 81% of the actual custodial roles.
F1-Score:
➢ Class 0 (admin): F1-score is high at 0.95, indicating that both
precision and recall are strong for admins.
➢ Class 1 (manager): F1-score is 0.67, held back by the low recall
of 0.50 for managers.
➢ Class 2 (custodial): F1-score is 0.81, showing a good balance
between precision and recall.
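To make the relationship between these numbers concrete, the sketch below computes an F1-score as the harmonic mean of precision and recall and recomputes the macro averages from the per-class values quoted above. The function name f1_score and the dictionaries are illustrative, and the precision of 1.0 used in the example call is a hypothetical value, not a figure taken from the report.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class values as quoted in the lists above.
recall = {"admin": 0.96, "manager": 0.50, "custodial": 0.81}
f1 = {"admin": 0.95, "manager": 0.67, "custodial": 0.81}

# Macro average: the unweighted mean over the three classes.
macro_recall = sum(recall.values()) / len(recall)
macro_f1 = sum(f1.values()) / len(f1)
print(round(macro_recall, 2), round(macro_f1, 2))  # prints 0.76 0.81

# Hypothetical example: even with a perfect precision of 1.0,
# a recall of 0.50 limits the F1-score to about 0.67.
print(round(f1_score(1.0, 0.50), 2))  # prints 0.67
```

The recomputed macro averages agree with the 'macro avg' row of the report. Because the harmonic mean is pulled toward the smaller of its two inputs, a class with low recall cannot reach a high F1-score however precise its predictions are.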
The Decision Tree Classifier performed well on the dataset, reaching 93%
accuracy. The feature "education" is the most important factor the model
considers when making predictions: a person's education level strongly
influences whether they are classified as an admin, manager, or custodial
worker. In practical terms, this suggests that education plays a large role in
job assignment in this dataset.
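A rough sketch of how such a feature ranking can be read from a fitted tree is shown below. It is not the original analysis: the toy rows, the numeric encoding of gender, and the names data, model and importances are assumptions made for illustration. Only the 'education' and 'job' names and the use of a decision tree come from the text. The example relies on scikit-learn's DecisionTreeClassifier and its feature_importances_ attribute.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: 'education' and 'job' are named in the text,
# the 0/1 gender encoding and all values are made up for illustration.
data = pd.DataFrame({
    "education": [8, 8, 12, 12, 15, 15, 16, 16, 19, 19],
    "gender":    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "job": ["custodial", "custodial", "admin", "admin", "admin",
            "admin", "admin", "admin", "manager", "manager"],
})

X = data[["education", "gender"]]
y = data["job"]

# Fit the tree; random_state makes tie-breaking between splits repeatable.
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# feature_importances_ sums to 1; a larger share means the feature
# contributed more to reducing impurity across the tree's splits.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

In this toy data the job label is fully determined by education, so the tree has no reason to split on gender. On real data the importances are usually spread over several features, and they describe what the fitted model relies on, not a causal effect.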