logistic-regressions

The document outlines the process of using logistic regression to classify handwritten digits from the sklearn digits dataset. It includes steps for loading the dataset, preprocessing the data, training the model, and evaluating its performance, achieving an accuracy of approximately 97.22%. Additionally, it provides visualizations of some test images along with their predicted labels.

Uploaded by hemaaccess
August 6, 2024

[1]: from sklearn.datasets import load_digits

digits = load_digits()
print(digits)

{'data': array([[ 0., 0., 5., …, 0., 0., 0.],
[ 0., 0., 0., …, 10., 0., 0.],
[ 0., 0., 0., …, 16., 9., 0.],
…,
[ 0., 0., 1., …, 6., 0., 0.],
[ 0., 0., 2., …, 12., 0., 0.],
[ 0., 0., 10., …, 12., 1., 0.]]), 'target': array([0, 1, 2, …, 8,
9, 8]), 'frame': None, 'feature_names': ['pixel_0_0', 'pixel_0_1', 'pixel_0_2',
'pixel_0_3', 'pixel_0_4', 'pixel_0_5', 'pixel_0_6', 'pixel_0_7', 'pixel_1_0',
'pixel_1_1', 'pixel_1_2', 'pixel_1_3', 'pixel_1_4', 'pixel_1_5', 'pixel_1_6',
'pixel_1_7', 'pixel_2_0', 'pixel_2_1', 'pixel_2_2', 'pixel_2_3', 'pixel_2_4',
'pixel_2_5', 'pixel_2_6', 'pixel_2_7', 'pixel_3_0', 'pixel_3_1', 'pixel_3_2',
'pixel_3_3', 'pixel_3_4', 'pixel_3_5', 'pixel_3_6', 'pixel_3_7', 'pixel_4_0',
'pixel_4_1', 'pixel_4_2', 'pixel_4_3', 'pixel_4_4', 'pixel_4_5', 'pixel_4_6',
'pixel_4_7', 'pixel_5_0', 'pixel_5_1', 'pixel_5_2', 'pixel_5_3', 'pixel_5_4',
'pixel_5_5', 'pixel_5_6', 'pixel_5_7', 'pixel_6_0', 'pixel_6_1', 'pixel_6_2',
'pixel_6_3', 'pixel_6_4', 'pixel_6_5', 'pixel_6_6', 'pixel_6_7', 'pixel_7_0',
'pixel_7_1', 'pixel_7_2', 'pixel_7_3', 'pixel_7_4', 'pixel_7_5', 'pixel_7_6',
'pixel_7_7'], 'target_names': array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), 'images':
array([[[ 0., 0., 5., …, 1., 0., 0.],
[ 0., 0., 13., …, 15., 5., 0.],
[ 0., 3., 15., …, 11., 8., 0.],
…,
[ 0., 4., 11., …, 12., 7., 0.],
[ 0., 2., 14., …, 12., 0., 0.],
[ 0., 0., 6., …, 0., 0., 0.]],

[[ 0., 0., 0., …, 5., 0., 0.],
[ 0., 0., 0., …, 9., 0., 0.],
[ 0., 0., 3., …, 6., 0., 0.],
…,
[ 0., 0., 1., …, 6., 0., 0.],
[ 0., 0., 1., …, 6., 0., 0.],
[ 0., 0., 0., …, 10., 0., 0.]],

[[ 0., 0., 0., …, 12., 0., 0.],
[ 0., 0., 3., …, 14., 0., 0.],
[ 0., 0., 8., …, 16., 0., 0.],
…,
[ 0., 9., 16., …, 0., 0., 0.],
[ 0., 3., 13., …, 11., 5., 0.],
[ 0., 0., 0., …, 16., 9., 0.]],

…,

[[ 0., 0., 1., …, 1., 0., 0.],
[ 0., 0., 13., …, 2., 1., 0.],
[ 0., 0., 16., …, 16., 5., 0.],
…,
[ 0., 0., 16., …, 15., 0., 0.],
[ 0., 0., 15., …, 16., 0., 0.],
[ 0., 0., 2., …, 6., 0., 0.]],

[[ 0., 0., 2., …, 0., 0., 0.],
[ 0., 0., 14., …, 15., 1., 0.],
[ 0., 4., 16., …, 16., 7., 0.],
…,
[ 0., 0., 0., …, 16., 2., 0.],
[ 0., 0., 4., …, 16., 2., 0.],
[ 0., 0., 5., …, 12., 0., 0.]],

[[ 0., 0., 10., …, 1., 0., 0.],
[ 0., 2., 16., …, 1., 0., 0.],
[ 0., 0., 15., …, 15., 0., 0.],
…,
[ 0., 4., 16., …, 16., 6., 0.],
[ 0., 8., 16., …, 16., 8., 0.],
[ 0., 1., 8., …, 12., 1., 0.]]]), 'DESCR': "..
_digits_dataset:\n\nOptical recognition of handwritten digits
dataset\n--------------------------------------------------\n\n**Data Set
Characteristics:**\n\n :Number of Instances: 1797\n :Number of Attributes:
64\n :Attribute Information: 8x8 image of integer pixels in the range
0..16.\n :Missing Attribute Values: None\n :Creator: E. Alpaydin (alpaydin
'@' boun.edu.tr)\n :Date: July; 1998\n\nThis is a copy of the test set of the
UCI ML hand-written digits datasets\nhttps://archive.ics.uci.edu/ml/datasets/Opt
ical+Recognition+of+Handwritten+Digits\n\nThe data set contains images of hand-
written digits: 10 classes where\neach class refers to a digit.\n\nPreprocessing
programs made available by NIST were used to extract\nnormalized bitmaps of
handwritten digits from a preprinted form. From a\ntotal of 43 people, 30
contributed to the training set and different 13\nto the test set. 32x32 bitmaps
are divided into nonoverlapping blocks of\n4x4 and the number of on pixels are
counted in each block. This generates\nan input matrix of 8x8 where each element
is an integer in the range\n0..16. This reduces dimensionality and gives
invariance to small\ndistortions.\n\nFor info on NIST preprocessing routines,
see M. D. Garris, J. L. Blue, G.\nT. Candela, D. L. Dimmick, J. Geist, P. J.
Grother, S. A. Janet, and C.\nL. Wilson, NIST Form-Based Handprint Recognition
System, NISTIR 5469,\n1994.\n\n.. topic:: References\n\n - C. Kaynak (1995)
Methods of Combining Multiple Classifiers and Their\n Applications to
Handwritten Digit Recognition, MSc Thesis, Institute of\n Graduate Studies in
Science and Engineering, Bogazici University.\n - E. Alpaydin, C. Kaynak (1998)
Cascading Classifiers, Kybernetika.\n - Ken Tang and Ponnuthurai N. Suganthan
and Xi Yao and A. Kai Qin.\n Linear dimensionalityreduction using relevance
weighted LDA. School of\n Electrical and Electronic Engineering Nanyang
Technological University.\n 2005.\n - Claudio Gentile. A New Approximate
Maximal Margin Classification\n Algorithm. NIPS. 2000.\n"}
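The DESCR text above says each 32x32 bitmap is split into non-overlapping 4x4 blocks and the "on" pixels in each block are counted, giving an 8x8 matrix of integers in the range 0..16. A minimal sketch of that block-counting step follows. The function name `downsample_bitmap` and the test bitmap are hypothetical, made up for illustration; this is not the NIST preprocessing code.

```python
def downsample_bitmap(bitmap, block=4):
    """Count the 'on' pixels in each non-overlapping block x block tile.

    `bitmap` is a square list of lists holding 0/1 values. This is an
    illustrative helper, not part of scikit-learn or the NIST tools.
    """
    size = len(bitmap)
    if size % block != 0 or any(len(row) != size for row in bitmap):
        raise ValueError("bitmap must be square with a side divisible by block")

    features = []
    for r in range(0, size, block):
        feature_row = []
        for c in range(0, size, block):
            # Sum the 0/1 pixels inside the tile whose top-left corner is (r, c)
            count = sum(
                bitmap[r + dr][c + dc]
                for dr in range(block)
                for dc in range(block)
            )
            feature_row.append(count)
        features.append(feature_row)
    return features


# Hypothetical 32x32 bitmap: a filled vertical bar covering columns 12..19
bitmap = [[1 if 12 <= c < 20 else 0 for c in range(32)] for _ in range(32)]

features = downsample_bitmap(bitmap)
print(len(features), len(features[0]))  # 8 8
print(features[0])                      # [0, 0, 0, 16, 16, 0, 0, 0]
```

A fully "on" 4x4 block contributes 16 and an empty one contributes 0, which is why the values in `digits.data` and `digits.images` above range from 0 to 16.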

[2]: from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt

# Load the dataset
digits = load_digits()

# Split the dataset into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# Scale the data: fit the scaler on the training set only,
# then apply the same transformation to the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit a logistic regression model
model = LogisticRegression()
model.fit(X_train_scaled, y_train)

# Predict the digits in the testing data
y_pred = model.predict(X_test_scaled)

# Evaluate the performance of the model
accuracy = accuracy_score(y_test, y_pred)
confusion_mat = confusion_matrix(y_test, y_pred)

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", confusion_mat)

# Visualize some of the images and their predicted labels
# (the unscaled X_test is used so the pixels display as in the original images)
fig, axes = plt.subplots(nrows=3, ncols=4, figsize=(11, 11))
for i, ax in enumerate(axes.flat):
    ax.imshow(X_test[i].reshape(8, 8), cmap='binary')
    ax.set(title=f"True: {y_test[i]}, Predicted: {y_pred[i]}")
plt.show()

Accuracy: 0.9722222222222222
Confusion Matrix:
[[33 0 0 0 0 0 0 0 0 0]
[ 0 28 0 0 0 0 0 0 0 0]
[ 0 0 33 0 0 0 0 0 0 0]
[ 0 0 0 33 0 1 0 0 0 0]
[ 0 1 0 0 45 0 0 0 0 0]
[ 0 0 0 0 0 44 1 0 0 2]
[ 0 0 0 0 0 1 34 0 0 0]
[ 0 0 0 0 0 0 0 33 0 1]
[ 0 0 0 0 0 1 0 0 29 0]
[ 0 0 0 1 0 0 0 0 1 38]]
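The confusion matrix above can be read row by row: in scikit-learn's convention, rows are true labels and columns are predicted labels, so the diagonal holds the correct predictions. The sketch below copies the printed matrix into a plain Python list and recomputes the accuracy and the per-digit recall from it. The variable names are illustrative, and no scikit-learn call is involved.

```python
# Confusion matrix copied from the output above
# (rows: true digit 0..9, columns: predicted digit 0..9)
confusion = [
    [33, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 28, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 33, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 33, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 45, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 44, 1, 0, 0, 2],
    [0, 0, 0, 0, 0, 1, 34, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 33, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 0, 29, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 1, 38],
]

n_classes = len(confusion)
total = sum(sum(row) for row in confusion)               # all test samples
correct = sum(confusion[i][i] for i in range(n_classes))  # diagonal entries

# Accuracy is the share of samples that fall on the diagonal
accuracy = correct / total
print(f"{correct}/{total} correct, accuracy = {accuracy:.4f}")  # 350/360 correct, accuracy = 0.9722

# Recall for a digit: correct predictions divided by the number of true samples (row sum)
recall = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
for digit, value in enumerate(recall):
    print(f"digit {digit}: recall = {value:.3f}")

# The digit the model misses most often, relative to how often it appears
hardest = min(range(n_classes), key=lambda i: recall[i])
print("lowest recall:", hardest)  # lowest recall: 5
```

Read this way, 350 of the 360 test images are classified correctly, and digit 5 has the lowest recall (44 of 47), with its errors going to 6 and 9.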
