Pract5 1

The document discusses using a logistic regression model to predict whether individuals will purchase a product based on their age and estimated salary using Python and scikit-learn. Data is loaded from a CSV file and split into training and test sets. A logistic regression model is trained on the training set and used to make predictions on the test set. Model performance is evaluated using metrics like accuracy, precision, and recall.

Name :- Nisha Sunil Ambike

Roll No :- 02

Title :- Data Analytics II


In [36]: import pandas as pd
import numpy as np

In [37]: df=pd.read_csv("Social_Network_Ads.csv")

In [38]: df

Out[38]:
      User ID  Gender  Age  EstimatedSalary  Purchased
0    15624510    Male   19            19000          0
1    15810944    Male   35            20000          0
2    15668575  Female   26            43000          0
3    15603246  Female   27            57000          0
4    15804002    Male   19            76000          0
...       ...     ...  ...              ...        ...
395  15691863  Female   46            41000          1
396  15706071    Male   51            23000          1
397  15654296  Female   50            20000          1
398  15755018    Male   36            33000          0
399  15594041  Female   49            36000          1

400 rows × 5 columns

In [39]: df.isnull().sum()

Out[39]: User ID 0
Gender 0
Age 0
EstimatedSalary 0
Purchased 0
dtype: int64

In [40]: x = df[["Age", "EstimatedSalary"]]
y = df["Purchased"]

In [41]: x.isnull().sum()

Out[41]: Age 0
EstimatedSalary 0
dtype: int64

In [42]: y.isnull().sum()

Out[42]: 0

In [43]: from sklearn.model_selection import train_test_split


In [44]: x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

In [45]: x_test.shape

Out[45]: (120, 2)
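
The split above is reshuffled on every run, so the counts in later cells can change between runs. A minimal sketch of a reproducible, class-balanced split follows, assuming a fixed seed and stratification are wanted. Neither appears in the original notebook, and the synthetic frame `demo` is a hypothetical stand-in for Social_Network_Ads.csv.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption): 400 rows, two numeric features.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Age": rng.integers(18, 61, size=400),
    "EstimatedSalary": rng.integers(15000, 150001, size=400),
    "Purchased": (rng.random(400) < 0.36).astype(int),
})
x_demo = demo[["Age", "EstimatedSalary"]]
y_demo = demo["Purchased"]

# random_state fixes the shuffle; stratify keeps the class ratio similar in both parts.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)
print(x_tr.shape, x_te.shape)
```

With 400 rows and test_size=0.3 the shapes match the ones shown in the notebook (280 training rows, 120 test rows). The seed value 42 is arbitrary.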

In [46]: from sklearn.linear_model import LogisticRegression

In [47]: model = LogisticRegression()

In [48]: x_train

Out[48]:
     Age  EstimatedSalary
318   45            32000
23    45            22000
155   31            15000
40    27            17000
123   35            53000
...  ...              ...
184   33            60000
293   37            77000
75    34           112000
366   58            47000
355   60            34000

280 rows × 2 columns

In [49]: y_train

Out[49]: 318 1
23 1
155 0
40 0
123 0
..
184 0
293 0
75 1
366 1
355 1
Name: Purchased, Length: 280, dtype: int64

In [50]: model.fit(x_train,y_train)

Out[50]: LogisticRegression()
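
The model is fitted on raw Age (tens) and EstimatedSalary (tens of thousands). The confusion matrix later in the notebook shows that it never predicts class 1. One plausible cause, which this notebook does not confirm, is the large difference in feature scale. The sketch below standardises the features inside a pipeline. It is an illustration on synthetic data, not the author's method, and every name in it (`X_demo`, `scaled_model`, and so on) is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): purchase odds rise with age and salary.
rng = np.random.default_rng(0)
age = rng.integers(18, 61, size=400)
salary = rng.integers(15000, 150001, size=400)
score = 0.15 * (age - 40) + (salary - 80000) / 30000 + rng.normal(0, 0.5, size=400)
X_demo = np.column_stack([age, salary]).astype(float)
y_demo = (score > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo
)

# Standardise both features before fitting so neither dominates by magnitude.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression())
scaled_model.fit(X_tr, y_tr)
demo_pred = scaled_model.predict(X_te)
demo_accuracy = scaled_model.score(X_te, y_te)
print(np.unique(demo_pred), demo_accuracy)
```

Putting the scaler inside the pipeline means it is fitted on the training rows only and then reused unchanged on the test rows.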

In [51]: y_pred=model.predict(x_test)

In [52]: from sklearn.metrics import confusion_matrix


In [53]: cm=confusion_matrix(y_test,y_pred)

In [54]: print(cm)
[[77  0]
 [43  0]]

In [55]: print(f"TN value is {cm[0][0]}")
print(f"FP value is {cm[0][1]}")
print(f"FN value is {cm[1][0]}")
print(f"TP value is {cm[1][1]}")
TN value is 77
FP value is 0
FN value is 43
TP value is 0
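
The four counts are enough to work out the usual metrics by hand. The sketch below rebuilds the matrix from the values printed above and applies the standard definitions. The variable names are illustrative.

```python
import numpy as np

# Counts copied from the confusion matrix printed above.
cm_demo = np.array([[77, 0], [43, 0]])
tn, fp, fn, tp = cm_demo.ravel()

accuracy = (tp + tn) / cm_demo.sum()   # correct predictions / all predictions
error_rate = 1 - accuracy
recall = tp / (tp + fn)                # share of actual positives that were found
# Precision is TP / (TP + FP); here the denominator is 0, so it is undefined.
precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
print(accuracy, error_rate, recall, precision)
```

Because FP and TP are both 0, the model predicted class 0 for all 120 test rows. The accuracy therefore equals the share of class 0 in the test set (77 of 120).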

In [69]: from sklearn.metrics import confusion_matrix, classification_report, accuracy_score, precision_score, recall_score

In [57]: print(f"Accuracy score is {accuracy_score(y_test, y_pred)}")


Accuracy score is 0.6416666666666667

In [58]: print(f"Error rate is {1-accuracy_score(y_test, y_pred)}")


Error rate is 0.3583333333333333

In [73]: print(f"precision score is {precision_score(y_test, y_pred, zero_division=1)}")


precision score is 1.0

In [75]: print(f"Recall score is {recall_score(y_test, y_pred, zero_division=1)}")


Recall score is 0.0
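
A precision of 1.0 alongside a recall of 0.0 comes from `zero_division=1`. With no positive predictions, precision is 0/0, and scikit-learn returns whatever value `zero_division` specifies. The sketch below rebuilds labels with the same counts as the confusion matrix (77 negatives, 43 positives, all predicted 0) to show the effect. The label lists are reconstructed for illustration and are not the notebook's actual `y_test`.

```python
from sklearn.metrics import precision_score, recall_score

# Labels rebuilt from the confusion-matrix counts (assumption: order is irrelevant).
y_true_demo = [0] * 77 + [1] * 43
y_pred_demo = [0] * 120  # the model never predicts class 1

p_with_one = precision_score(y_true_demo, y_pred_demo, zero_division=1)
p_with_zero = precision_score(y_true_demo, y_pred_demo, zero_division=0)
r = recall_score(y_true_demo, y_pred_demo, zero_division=1)
print(p_with_one, p_with_zero, r)
```

The reported precision changes from 1.0 to 0.0 depending only on the `zero_division` argument, so neither number says anything about how the model handles positive cases. Recall is unaffected because its denominator (43 actual positives) is not zero.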

In [76]: print(classification_report(y_test, y_pred, zero_division=1))


              precision    recall  f1-score   support

           0       0.64      1.00      0.78        77
           1       1.00      0.00      0.00        43

    accuracy                           0.64       120
   macro avg       0.82      0.50      0.39       120
weighted avg       0.77      0.64      0.50       120
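
The two average rows can be reproduced from the per-class rows. The macro average is the plain mean over classes, and the weighted average weights each class by its support. The sketch below recomputes them from the confusion-matrix counts. The class 1 precision is set to 1.0 only because the report was generated with `zero_division=1`.

```python
import numpy as np

support = np.array([77, 43])
# Class 0: 77 of the 120 rows predicted as 0 are correct; class 1 precision is
# 0/0 and is reported as 1.0 because of zero_division=1.
precision = np.array([77 / 120, 1.0])
recall = np.array([77 / 77, 0 / 43])
f1 = 2 * precision * recall / (precision + recall)

macro = {name: vals.mean() for name, vals in
         [("precision", precision), ("recall", recall), ("f1", f1)]}
weighted = {name: np.average(vals, weights=support) for name, vals in
            [("precision", precision), ("recall", recall), ("f1", f1)]}
print({k: round(float(v), 2) for k, v in macro.items()})
print({k: round(float(v), 2) for k, v in weighted.items()})
```

The macro and weighted precision values (0.82 and 0.77) are inflated by the placeholder 1.0 for class 1. The F1 rows (0.39 and 0.50) give a more honest picture of a model that predicts only one class.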
