m2

Uploaded by

jaydaniel1113
Copyright
© © All Rights Reserved

6/2/2021 temp-162262094050084495

Open in Colab
(https://colab.research.google.com/github/JAYASURYAb/ML-project2/blob/master/Campus_recruitment.ipynb)

Importing Libraries
In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:

dataset = pd.read_csv('/content/Placement_Data_Full_Class .csv')

In [3]:

dataset

Out[3]:

sl_no gender ssc_p ssc_b hsc_p hsc_b hsc_s degree_p degree_t work

0 1 M 67.00 Others 91.00 Others Commerce 58.00 Sci&Tech

1 2 M 79.33 Central 78.33 Others Science 77.48 Sci&Tech Y

2 3 M 65.00 Central 68.00 Central Arts 64.00 Comm&Mgmt

3 4 M 56.00 Central 52.00 Central Science 52.00 Sci&Tech

4 5 M 85.80 Central 73.60 Central Commerce 73.30 Comm&Mgmt

... ... ... ... ... ... ... ... ... ...

210 211 M 80.60 Others 82.00 Others Commerce 77.60 Comm&Mgmt

211 212 M 58.00 Others 60.00 Others Science 72.00 Sci&Tech

212 213 M 67.00 Others 67.00 Others Commerce 73.00 Comm&Mgmt Y

213 214 F 74.00 Others 66.00 Others Commerce 58.00 Comm&Mgmt

214 215 M 62.00 Central 58.00 Others Science 53.00 Comm&Mgmt

215 rows × 15 columns

https://htmtopdf.herokuapp.com/ipynbviewer/temp/28bc1f836ea2273deddb4b773b060bee/Campus_recruitment.html?t=1622620941713 1/18

In [4]:

dataset.describe()

Out[4]:

sl_no ssc_p hsc_p degree_p etest_p mba_p salar

count 215.000000 215.000000 215.000000 215.000000 215.000000 215.000000 148.00000

mean 108.000000 67.303395 66.333163 66.370186 72.100558 62.278186 288655.40540

std 62.209324 10.827205 10.897509 7.358743 13.275956 5.833385 93457.45242

min 1.000000 40.890000 37.000000 50.000000 50.000000 51.210000 200000.00000

25% 54.500000 60.600000 60.900000 61.000000 60.000000 57.945000 240000.00000

50% 108.000000 67.000000 65.000000 66.000000 71.000000 62.000000 265000.00000

75% 161.500000 75.700000 73.000000 72.000000 83.500000 66.255000 300000.00000

max 215.000000 89.400000 97.700000 91.000000 98.000000 77.890000 940000.00000

In [5]:

dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 215 entries, 0 to 214
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sl_no 215 non-null int64
1 gender 215 non-null object
2 ssc_p 215 non-null float64
3 ssc_b 215 non-null object
4 hsc_p 215 non-null float64
5 hsc_b 215 non-null object
6 hsc_s 215 non-null object
7 degree_p 215 non-null float64
8 degree_t 215 non-null object
9 workex 215 non-null object
10 etest_p 215 non-null float64
11 specialisation 215 non-null object
12 mba_p 215 non-null float64
13 status 215 non-null object
14 salary 148 non-null float64
dtypes: float64(6), int64(1), object(8)
memory usage: 25.3+ KB

Only the salary column has null values: 215 - 148 = 67.

Therefore, there are 67 null values, which means 67 students were not hired.
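
The same count can be read directly from pandas instead of by subtraction. The following is a minimal sketch on a small made-up frame (`toy` and its values are hypothetical, not the placement data); it also checks that every missing salary belongs to a student who was not placed.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the placement data.
toy = pd.DataFrame({
    "status": ["Placed", "Not Placed", "Placed", "Not Placed"],
    "salary": [270000.0, np.nan, 250000.0, np.nan],
})

# Number of missing values in each column.
null_counts = toy.isnull().sum()
print(null_counts)

# True only if every row with a missing salary has status "Not Placed".
missing_only_unplaced = (toy.loc[toy["salary"].isnull(), "status"] == "Not Placed").all()
print(missing_only_unplaced)
```

On the real data, the equivalent call would be `dataset.isnull().sum()`.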

In [6]:

# Assign the result back instead of calling fillna(inplace=True) on a column selection
dataset['salary'] = dataset['salary'].fillna(0)


Missing values are filled with 0, so the salary column no longer contains nulls.

In [7]:

dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 215 entries, 0 to 214
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sl_no 215 non-null int64
1 gender 215 non-null object
2 ssc_p 215 non-null float64
3 ssc_b 215 non-null object
4 hsc_p 215 non-null float64
5 hsc_b 215 non-null object
6 hsc_s 215 non-null object
7 degree_p 215 non-null float64
8 degree_t 215 non-null object
9 workex 215 non-null object
10 etest_p 215 non-null float64
11 specialisation 215 non-null object
12 mba_p 215 non-null float64
13 status 215 non-null object
14 salary 215 non-null float64
dtypes: float64(6), int64(1), object(8)
memory usage: 25.3+ KB

Value counts for the object (categorical) columns


In [8]:

column = dataset.select_dtypes(include=['object'])
for col in column:
    display(dataset[col].value_counts())

M 139
F 76
Name: gender, dtype: int64

Central 116
Others 99
Name: ssc_b, dtype: int64

Others 131
Central 84
Name: hsc_b, dtype: int64

Commerce 113
Science 91
Arts 11
Name: hsc_s, dtype: int64

Comm&Mgmt 145
Sci&Tech 59
Others 11
Name: degree_t, dtype: int64

No 141
Yes 74
Name: workex, dtype: int64

Mkt&Fin 120
Mkt&HR 95
Name: specialisation, dtype: int64

Placed 148
Not Placed 67
Name: status, dtype: int64

Except for hsc_s and degree_t, which have 3 classes each, every categorical column has 2 classes. 148
students are placed and 67 students are not placed. Now the challenge is:

Which factor influenced a candidate in getting placed?

Exploring and Visualizing the Data by Feature

Gender

In [9]:

import seaborn as sns

/usr/local/lib/python3.6/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm


In [10]:

plt.style.use('seaborn-white')
f, ax = plt.subplots(1, 2, figsize=(18, 8))
# Left: share of each gender; right: placement status by gender
dataset['gender'].value_counts().plot.pie(explode=[0, 0.05], autopct='%1.1f%%', ax=ax[0], shadow=True)
ax[0].set_title('gender')
sns.countplot(x='gender', hue='status', data=dataset, ax=ax[1])
ax[1].set_title('Influence of gender on placement')
plt.show()
# Mean salary by gender (includes the 0 salaries filled in for unplaced students)
ax = sns.barplot(x="gender", y="salary", data=dataset)
plt.show()

So, we observe:

Almost twice as many male students as female students were placed.
On average, male students are offered a slightly higher salary than female students.
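
Because the missing salaries were filled with 0, a mean over the whole salary column mixes two things: how many students were placed and how much the placed students earn. The sketch below uses hypothetical toy values (not the notebook's data) to show how the mean over all students differs from the mean over placed students only.

```python
import pandas as pd

# Hypothetical toy data; salary is 0 for students who were not placed,
# mirroring the fill-with-0 step earlier in the notebook.
toy = pd.DataFrame({
    "gender": ["M", "M", "M", "F", "F"],
    "status": ["Placed", "Placed", "Not Placed", "Placed", "Not Placed"],
    "salary": [300000.0, 260000.0, 0.0, 270000.0, 0.0],
})

# Mean over everyone, zeros included.
mean_all = toy.groupby("gender")["salary"].mean()

# Mean over placed students only.
placed = toy[toy["status"] == "Placed"]
mean_placed = placed.groupby("gender")["salary"].mean()

print(mean_all)
print(mean_placed)
```

On the real data, passing `dataset[dataset['status'] == 'Placed']` to the salary bar plot would compare offered salaries only.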

Board of Education (ssc_b, hsc_b, hsc_s)

ssc_b: Secondary Education board (10th grade)
hsc_b: Higher Secondary Education board (12th grade)
hsc_s: Specialization in Higher Secondary Education


In [11]:

plt.figure(figsize=(10,8))
sns.countplot(x='ssc_b',hue='status',data=dataset)
sns.catplot(x='hsc_b',hue='hsc_s',col='status',data=dataset,kind='count')
plt.show()

So, we observe:

In ssc_b, more Central board students are placed than students from other boards.
In hsc_b, however, more students from other boards are placed than Central board students.
Therefore, the board does not appear to matter in placements.
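
Raw counts depend on how many students belong to each group, so a placement rate per group can make comparisons like the one above easier to read. This is a minimal sketch with hypothetical toy values; `pd.crosstab` with `normalize="index"` turns each row into proportions.

```python
import pandas as pd

# Hypothetical toy data, not the notebook's dataset.
toy = pd.DataFrame({
    "ssc_b": ["Central", "Central", "Central", "Others",
              "Others", "Others", "Others", "Central"],
    "status": ["Placed", "Placed", "Not Placed", "Placed",
               "Not Placed", "Placed", "Placed", "Not Placed"],
})

# Each row sums to 1: the share of placed and not placed students per board.
rates = pd.crosstab(toy["ssc_b"], toy["status"], normalize="index")
print(rates)
```

The same call works for any categorical column of the real data, for example `pd.crosstab(dataset['workex'], dataset['status'], normalize='index')`.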

Degree, Specialisation


In [12]:

plt.figure(figsize=(7,3))
sns.countplot(x="degree_t", hue='status',data=dataset)
plt.show()
ax = sns.barplot(x="degree_t", y="salary", data=dataset)
plt.show()
sns.countplot(x="specialisation", hue='status',data=dataset)
plt.show()
ax = sns.barplot(x="specialisation", y="salary", data=dataset)
plt.show()


So, here we observe:

Comm&Mgmt and Sci&Tech degree students are placed more often, and students from other degrees less often.
Salary-wise, Sci&Tech students are paid the most, Comm&Mgmt students come second, and students from other
degrees are paid the least.
Specialisation matters a lot in placements. Mkt&Fin students have more placements than Mkt&HR students,
and they are also paid more.

Percentage


In [13]:

plt.figure(figsize=(15, 15))
# One panel per exam stage: mean percentage by placement status and gender
ax = plt.subplot(221)
sns.barplot(x='status', y='ssc_p', hue='gender', data=dataset)
ax.set_title('Secondary school percentage')
ax = plt.subplot(222)
sns.barplot(x='status', y='hsc_p', hue='gender', data=dataset)
ax.set_title('Higher Secondary school percentage')
ax = plt.subplot(223)
sns.barplot(x='status', y='degree_p', hue='gender', data=dataset)
ax.set_title('Degree percentage')
ax = plt.subplot(224)
sns.barplot(x='status', y='mba_p', hue='gender', data=dataset)
ax.set_title('MBA percentage')
plt.show()


So, from the plots above we observe:

Female students have a higher percentage than male students across all four plots.
Students with higher percentages in 10th, 12th and degree have a better chance of placement.
A good MBA percentage does not guarantee placement.
Therefore, percentage does not appear to influence salary.

Work experience

In [14]:

plt.style.use('seaborn-white')
f, ax = plt.subplots(1, 2, figsize=(18, 8))
# Left: share of students with and without work experience; right: placement status by work experience
dataset['workex'].value_counts().plot.pie(explode=[0, 0.05], autopct='%1.1f%%', ax=ax[0], shadow=True)
ax[0].set_title('Work experience')
sns.countplot(x='workex', hue='status', data=dataset, ax=ax[1])
ax[1].set_title('Influence of experience on placement')
plt.show()

So, we observe:

More students have no work experience than have it.
In absolute numbers, more students without work experience got placed than students with work experience.
We can conclude that work experience is not decisive in the recruitment process, although having
work experience can improve the chances of getting placed.

Data preprocessing
In [15]:

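# Features: hsc_p, degree_p, workex, etest_p, specialisation, mba_p
x = dataset.iloc[:, [4, 7, 9, 10, 11, 12]].values
# Target: status (second-to-last column)
y = dataset.iloc[:, -2].values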
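
Positional indices are easy to misread, so it can help to resolve them to column names. The sketch below rebuilds the column order printed by `dataset.info()` earlier in this notebook; `layout` is a hypothetical empty frame used only to look up the names.

```python
import pandas as pd

# Column order as listed by dataset.info() in this notebook.
columns = ["sl_no", "gender", "ssc_p", "ssc_b", "hsc_p", "hsc_b", "hsc_s",
           "degree_p", "degree_t", "workex", "etest_p", "specialisation",
           "mba_p", "status", "salary"]

# Empty frame with the same layout, used only to map positions to names.
layout = pd.DataFrame(columns=columns)

feature_names = list(layout.columns[[4, 7, 9, 10, 11, 12]])
target_name = layout.columns[-2]
print(feature_names)
print(target_name)
```

Selecting by name, for example `dataset[feature_names]`, would keep working even if the column order of the CSV changed.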


In [16]:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One-hot encode workex (position 2) and specialisation (position 4) of x
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [2, 4])], remainder='passthrough')
x = np.array(ct.fit_transform(x))
from sklearn.preprocessing import LabelEncoder
# Encode the status labels as integers
le = LabelEncoder()
y = le.fit_transform(y)
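
The encoding step changes both the column order and the meaning of the labels, which matters when reading the arrays that follow. The sketch below runs the same two encoders on a hypothetical three-row frame: the one-hot columns come first, the passthrough columns follow, and `LabelEncoder` assigns integers in sorted label order. `sparse_threshold=0` is added here only to force a dense array in the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical three-row frame with one numeric and two categorical features.
toy = pd.DataFrame({
    "hsc_p": [91.0, 78.33, 68.0],
    "workex": ["No", "Yes", "No"],
    "specialisation": ["Mkt&HR", "Mkt&Fin", "Mkt&Fin"],
})

ct = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(), ["workex", "specialisation"])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
encoded = ct.fit_transform(toy)
# Columns: workex=No, workex=Yes, Mkt&Fin, Mkt&HR, hsc_p
print(encoded)

le = LabelEncoder()
labels = le.fit_transform(["Placed", "Not Placed", "Placed"])
print(list(le.classes_), list(labels))
```

With this ordering, label 1 stands for "Placed" and label 0 for "Not Placed".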

In [17]:

Out[17]:

array([[1.0, 0.0, 0.0, ..., 58.0, 55.0, 58.8],


[0.0, 1.0, 1.0, ..., 77.48, 86.5, 66.28],
[1.0, 0.0, 1.0, ..., 64.0, 75.0, 57.8],
...,
[0.0, 1.0, 1.0, ..., 73.0, 59.0, 69.72],
[1.0, 0.0, 0.0, ..., 58.0, 70.0, 60.23],
[1.0, 0.0, 0.0, ..., 53.0, 89.0, 60.22]], dtype=object)

In [18]:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)
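
With only 215 rows and an imbalanced target (148 placed, 67 not placed), a plain random split can shift the class balance between the training and test sets. One common option, not used in this notebook, is the `stratify` argument of `train_test_split`. The sketch below uses hypothetical placeholder features (`x_demo`) and labels with the same class counts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical labels with the notebook's class counts: 148 placed (1), 67 not placed (0).
y_demo = np.array([1] * 148 + [0] * 67)
x_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder features

x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.3, random_state=1, stratify=y_demo
)

# The share of placed students stays close to the overall share in both parts.
print(len(y_te), y_demo.mean(), y_tr.mean(), y_te.mean())
```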

In [19]:

x_train

Out[19]:

array([[0.0, 1.0, 1.0, ..., 67.0, 95.0, 64.86],


[0.0, 1.0, 1.0, ..., 64.27, 64.0, 66.23],
[1.0, 0.0, 1.0, ..., 61.4, 68.0, 66.88],
...,
[1.0, 0.0, 1.0, ..., 78.0, 95.5, 68.53],
[0.0, 1.0, 1.0, ..., 69.5, 56.0, 56.94],
[1.0, 0.0, 0.0, ..., 65.6, 58.0, 55.47]], dtype=object)


In [20]:

x_test


Out[20]:

array([[1.0, 0.0, 1.0, 0.0, 82.0, 69.0, 84.0, 58.31],


[0.0, 1.0, 1.0, 0.0, 78.0, 61.0, 88.56, 71.55],
[1.0, 0.0, 1.0, 0.0, 50.0, 54.0, 71.0, 65.69],
[1.0, 0.0, 0.0, 1.0, 90.0, 83.0, 80.0, 73.52],
[1.0, 0.0, 0.0, 1.0, 61.12, 56.2, 67.0, 62.65],
[0.0, 1.0, 1.0, 0.0, 65.0, 81.0, 88.0, 72.78],
[1.0, 0.0, 1.0, 0.0, 65.58, 72.11, 57.6, 56.66],
[1.0, 0.0, 1.0, 0.0, 60.5, 84.0, 98.0, 65.25],
[1.0, 0.0, 1.0, 0.0, 73.6, 73.3, 96.8, 55.5],
[1.0, 0.0, 0.0, 1.0, 53.0, 65.0, 64.0, 58.32],
[0.0, 1.0, 0.0, 1.0, 80.0, 78.0, 97.0, 70.48],
[1.0, 0.0, 1.0, 0.0, 68.0, 64.0, 93.0, 62.56],
[1.0, 0.0, 0.0, 1.0, 62.0, 54.0, 72.0, 55.41],
[0.0, 1.0, 1.0, 0.0, 73.0, 66.0, 70.0, 68.07],
[1.0, 0.0, 0.0, 1.0, 51.0, 57.5, 57.63, 62.72],
[1.0, 0.0, 0.0, 1.0, 61.0, 61.0, 58.0, 53.94],
[1.0, 0.0, 1.0, 0.0, 62.0, 64.0, 53.88, 54.97],
[1.0, 0.0, 1.0, 0.0, 62.5, 61.0, 93.91, 69.03],
[1.0, 0.0, 1.0, 0.0, 65.0, 75.0, 83.0, 58.87],
[1.0, 0.0, 0.0, 1.0, 58.0, 66.0, 53.7, 56.86],
[0.0, 1.0, 1.0, 0.0, 73.0, 81.0, 89.0, 69.7],
[1.0, 0.0, 1.0, 0.0, 60.0, 65.0, 87.55, 52.81],
[0.0, 1.0, 1.0, 0.0, 63.0, 70.0, 55.0, 62.0],
[1.0, 0.0, 0.0, 1.0, 66.0, 64.0, 68.0, 64.08],
[1.0, 0.0, 1.0, 0.0, 63.0, 64.0, 60.0, 61.87],
[1.0, 0.0, 0.0, 1.0, 78.0, 72.0, 71.0, 62.74],
[1.0, 0.0, 1.0, 0.0, 83.83, 71.72, 86.0, 59.75],
[0.0, 1.0, 1.0, 0.0, 66.8, 69.3, 80.4, 71.0],
[0.0, 1.0, 1.0, 0.0, 67.0, 70.0, 50.48, 77.89],
[1.0, 0.0, 1.0, 0.0, 64.2, 67.4, 59.0, 59.69],
[1.0, 0.0, 0.0, 1.0, 42.16, 61.26, 54.48, 65.48],
[1.0, 0.0, 0.0, 1.0, 60.33, 64.21, 63.0, 60.02],
[1.0, 0.0, 0.0, 1.0, 51.0, 52.0, 68.44, 62.77],
[0.0, 1.0, 1.0, 0.0, 68.4, 78.3, 60.0, 63.7],
[1.0, 0.0, 1.0, 0.0, 67.0, 58.0, 77.0, 51.29],
[1.0, 0.0, 0.0, 1.0, 91.0, 58.0, 55.0, 58.8],
[1.0, 0.0, 0.0, 1.0, 49.0, 58.0, 62.0, 60.59],
[1.0, 0.0, 1.0, 0.0, 86.0, 56.0, 57.0, 64.08],
[1.0, 0.0, 1.0, 0.0, 65.0, 60.0, 84.0, 64.15],
[0.0, 1.0, 1.0, 0.0, 76.5, 67.5, 73.35, 64.15],
[0.0, 1.0, 1.0, 0.0, 66.2, 65.6, 60.0, 62.54],
[1.0, 0.0, 0.0, 1.0, 70.0, 65.0, 88.0, 71.96],
[0.0, 1.0, 0.0, 1.0, 76.0, 72.0, 84.0, 58.95],
[0.0, 1.0, 1.0, 0.0, 60.0, 57.0, 78.0, 54.55],
[1.0, 0.0, 0.0, 1.0, 63.0, 66.0, 61.28, 60.11],
[1.0, 0.0, 0.0, 1.0, 67.0, 66.0, 68.0, 57.69],
[1.0, 0.0, 1.0, 0.0, 62.0, 68.0, 74.0, 57.99],
[1.0, 0.0, 1.0, 0.0, 77.0, 80.0, 60.0, 66.72],
[1.0, 0.0, 1.0, 0.0, 64.89, 70.67, 89.0, 60.39],
[1.0, 0.0, 0.0, 1.0, 58.0, 53.0, 89.0, 60.22],
[0.0, 1.0, 1.0, 0.0, 63.0, 60.0, 70.0, 53.2],
[1.0, 0.0, 1.0, 0.0, 90.9, 64.5, 86.04, 59.42],
[1.0, 0.0, 1.0, 0.0, 82.0, 63.0, 50.0, 59.47],
[1.0, 0.0, 0.0, 1.0, 62.0, 73.0, 58.0, 64.36],
[1.0, 0.0, 0.0, 1.0, 67.2, 60.0, 58.06, 69.28],
[1.0, 0.0, 1.0, 0.0, 90.0, 82.0, 92.0, 68.03],
[1.0, 0.0, 0.0, 1.0, 60.0, 65.0, 92.66, 62.92],
[1.0, 0.0, 1.0, 0.0, 78.5, 67.0, 68.71, 60.99],
[1.0, 0.0, 0.0, 1.0, 72.0, 78.0, 82.0, 71.43],
[1.0, 0.0, 1.0, 0.0, 62.0, 65.0, 62.0, 56.81],


[1.0, 0.0, 0.0, 1.0, 68.0, 69.0, 53.7, 55.01],
[1.0, 0.0, 0.0, 1.0, 60.0, 69.0, 55.5, 58.4],
[0.0, 1.0, 1.0, 0.0, 72.8, 66.6, 96.0, 70.85],
[1.0, 0.0, 1.0, 0.0, 75.0, 73.0, 80.0, 67.05],
[1.0, 0.0, 0.0, 1.0, 47.0, 50.0, 76.0, 54.96]], dtype=object)

In [21]:

y_train

Out[21]:

array([1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0,
1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0,
1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1,
1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1,
0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1])

In [22]:

y_test

Out[22]:

array([1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1,
1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1,
1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0])

Standardisation
In [23]:

from sklearn.preprocessing import StandardScaler
# Scale only the numeric columns (index 4 onward); the first four are one-hot encoded
sc = StandardScaler()
x_train[:, 4:] = sc.fit_transform(x_train[:, 4:])
x_test[:, 4:] = sc.transform(x_test[:, 4:])
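
The scaler is fitted on the training data only and then reused on the test data, so no test-set statistics leak into training. The sketch below illustrates this on hypothetical random percentages (`train` and `test` are made-up arrays): the test set is transformed with the training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric features, roughly on a percentage scale.
rng = np.random.default_rng(0)
train = rng.normal(loc=65.0, scale=10.0, size=(150, 4))
test = rng.normal(loc=65.0, scale=10.0, size=(65, 4))

sc = StandardScaler()
train_scaled = sc.fit_transform(train)  # learns mean and std from train
test_scaled = sc.transform(test)        # reuses the train statistics

# Train columns end up with mean 0 and std 1; test columns only approximately.
print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
print(test_scaled.mean(axis=0).round(3), test_scaled.std(axis=0).round(3))
```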

Comparing classification models

Logistic Regression
KNN
Decision Tree
Random Forest
Linear SVC
Kernel SVC


In [24]:

from sklearn.linear_model import LogisticRegression


from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

In [25]:

l_cla = LogisticRegression()
k_cla = KNeighborsClassifier(n_neighbors = 10)
d_cla = DecisionTreeClassifier()
r_cla = RandomForestClassifier(n_estimators = 200)
s_cla = SVC(kernel='linear')
ks_cla = SVC(kernel= 'rbf')

In [26]:

l_cla.fit(x_train, y_train)
k_cla.fit(x_train, y_train)
d_cla.fit(x_train, y_train)
r_cla.fit(x_train, y_train)
s_cla.fit(x_train, y_train)
ks_cla.fit(x_train, y_train)

Out[26]:

SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,


decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)

In [27]:

l_pred = l_cla.predict(x_test)
k_pred = k_cla.predict(x_test)
d_pred = d_cla.predict(x_test)
r_pred = r_cla.predict(x_test)
s_pred = s_cla.predict(x_test)
ks_pred = ks_cla.predict(x_test)

In [28]:

from sklearn.metrics import confusion_matrix

In [29]:

l_c = confusion_matrix(y_test, l_pred)


k_c = confusion_matrix(y_test, k_pred)
d_c = confusion_matrix(y_test, d_pred)
r_c = confusion_matrix(y_test, r_pred)
s_c = confusion_matrix(y_test, s_pred)
ks_c = confusion_matrix(y_test, ks_pred)


In [30]:

print(l_c)
print(k_c)
print(d_c)
print(r_c)
print(s_c)
print(ks_c)

[[15 5]
[ 2 43]]
[[13 7]
[ 3 42]]
[[14 6]
[11 34]]
[[15 5]
[ 1 44]]
[[13 7]
[ 1 44]]
[[10 10]
[ 0 45]]
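
The six matrices are printed in the order Logistic Regression, KNN, Decision Tree, Random Forest, Linear SVC, Kernel SVC. In scikit-learn's layout, rows are true classes and columns are predicted classes, so with labels 0 (not placed) and 1 (placed) each matrix reads `[[TN, FP], [FN, TP]]`. The sketch below copies the printed values and derives the accuracy and the recall on the "not placed" class; the dictionary keys are only labels added for this example.

```python
import numpy as np

# Confusion matrices copied from the printed output above.
matrices = {
    "Logistic Regression": [[15, 5], [2, 43]],
    "KNN": [[13, 7], [3, 42]],
    "Decision Tree": [[14, 6], [11, 34]],
    "Random Forest": [[15, 5], [1, 44]],
    "Linear SVC": [[13, 7], [1, 44]],
    "Kernel SVC": [[10, 10], [0, 45]],
}

results = {}
for name, m in matrices.items():
    cm = np.array(m)
    tn, fp, fn, tp = cm.ravel()
    results[name] = {
        "accuracy": (tp + tn) / cm.sum(),
        # Share of truly not placed students that the model identifies.
        "recall_not_placed": tn / (tn + fp),
    }
    print(name, round(results[name]["accuracy"], 4),
          round(results[name]["recall_not_placed"], 2))
```

Accuracy alone hides differences between the models: the Kernel SVC finds every placed student but only half of the 20 students who were not placed.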

In [31]:

from sklearn.metrics import accuracy_score


l_a = accuracy_score(y_test, l_pred)
k_a = accuracy_score(y_test, k_pred)
d_a = accuracy_score(y_test, d_pred)
r_a = accuracy_score(y_test, r_pred)
s_a = accuracy_score(y_test, s_pred)
ks_a = accuracy_score(y_test, ks_pred)

In [32]:

print('Logistic Regression: ' + str(l_a) + '\nKNN: ' + str(k_a) + '\nDecision Tree: ' + str(d_a) + '\nRandom Forest: ' + str(r_a) + '\nLinear SVC: ' + str(s_a) + '\nKernel SVC: ' + str(ks_a))

Logistic Regression: 0.8923076923076924


KNN: 0.8461538461538461
Decision Tree: 0.7384615384615385
Random Forest: 0.9076923076923077
Linear SVC: 0.8769230769230769
Kernel SVC: 0.8461538461538461

Random Forest shows the highest accuracy score, 90.77%.

So we can conclude that, of the six models compared, the Random Forest classifier is the best fit for our dataset on this test split.
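
The ranking above rests on a single 65-row test split, where a difference of one correct prediction moves accuracy by about 1.5 percentage points. A common way to check how stable such a ranking is, not done in this notebook, is k-fold cross-validation. The sketch below uses synthetic data from `make_classification` as a stand-in (an assumption, not the placement data) and compares only two of the six models to stay short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data with the same number of rows as the notebook's dataset.
x_demo, y_demo = make_classification(n_samples=215, n_features=8, random_state=1)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=1),
}

# Five accuracy scores per model, one for each held-out fold.
cv_scores = {
    name: cross_val_score(model, x_demo, y_demo, cv=5, scoring="accuracy")
    for name, model in models.items()
}
for name, scores in cv_scores.items():
    print(name, scores.mean(), scores.std())
```

If the mean scores of two models differ by less than their fold-to-fold spread, the ranking between them is probably not reliable.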

Conclusion


More male students got placed than female students (since more male students sat for placements).
Male students got higher salaries than female students.
Board of education doesn't matter in placements.
Students with higher percentages in 10th, 12th and degree have a better chance of placement, but MBA
percentage doesn't influence placement.
In absolute numbers, students with no work experience got placed more than students who had work experience.
Specialisation matters a lot in placements. Mkt&Fin students have more placements than Mkt&HR students,
and they are also paid more.
Sci&Tech students get a higher salary than Comm&Mgmt students and students from other degrees.

Thank you

Jayasurya B

