Write a lab report on Linear Regression and Logistic Regression. Include the cost function differentiation and the code in the report.
Lab Experiment Name: Write a lab report on Linear Regression and Logistic Regression.
Include the cost function differentiation and the code in the report.
Student Details
Name: Shahedul Islam
ID: 213002178
Logistic regression is a fundamental statistical and machine learning algorithm used for binary classification
problems. Unlike linear regression, which predicts continuous numerical values, logistic regression predicts the
probability of an outcome belonging to one of two classes. For example, it can be used to determine whether a
patient has a specific disease (1) or not (0) based on their medical features.
In this lab, we implemented logistic regression to classify individuals as diabetic or non-diabetic based on a
medical dataset. The dataset includes features such as glucose levels, blood pressure, BMI, and insulin levels,
along with the target variable (Outcome), which indicates whether the patient has diabetes.
Objective
1. Linear Regression
Mathematical Foundation
Linear Regression aims to model the relationship between a dependent variable y and one or more independent
variables X. The model is expressed as:
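In one common notation (the symbols are assumptions, since the report does not fix them elsewhere), with n features x_1, ..., x_n, parameters θ_0, ..., θ_n, and the convention x_0 = 1, the hypothesis can be written as:

```latex
\[
h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n = \theta^{T} x
\]
```

Here θ_0 is the intercept and each remaining θ_j is the coefficient of feature x_j.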
Cost Function
The cost function for Linear Regression is the Mean Squared Error (MSE):
Where:
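A standard way to write this cost and its differentiation is sketched here. The notation is assumed rather than taken from the report: m is the number of training examples, x^(i) and y^(i) are the feature vector and target of example i, h_θ(x) = θ^T x is the linear hypothesis with x_0 = 1, and the factor 1/2 is a common convention that cancels when differentiating.

```latex
\begin{align*}
J(\theta) &= \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
\frac{\partial J}{\partial \theta_j}
  &= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right)
     \frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} \\
  &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\end{align*}
```

The last step uses the fact that the derivative of θ^T x with respect to θ_j is simply x_j.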
Gradient Descent
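Gradient descent repeatedly moves the parameters a small step against the gradient of the cost: θ := θ − α ∇J(θ), where α is the learning rate. For the MSE cost with a 1/(2m) factor, the gradient in matrix form is X^T(Xθ − y)/m. The following is a minimal NumPy sketch of that update, not the implementation used in this lab; the function name `gradient_descent_linear`, the learning rate, the iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def gradient_descent_linear(X, y, alpha=0.1, n_iters=5000):
    """Fit linear regression by batch gradient descent (hypothetical helper)."""
    m = X.shape[0]
    # Prepend a column of ones so theta[0] acts as the intercept.
    Xb = np.column_stack([np.ones(m), X])
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        residual = Xb @ theta - y      # h_theta(x) - y for every example
        grad = Xb.T @ residual / m     # gradient of the (1/2m)-scaled MSE
        theta -= alpha * grad          # step against the gradient
    return theta

# Toy data generated from y = 1 + 2x (not the car dataset).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 1.0 + 2.0 * X[:, 0]
theta = gradient_descent_linear(X, y)
print(theta)  # approximately [1. 2.]
```

On real data such as car prices, features with very different scales would typically need standardizing first, otherwise a single learning rate may converge slowly or diverge.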
Implementation
Dataset
We use the provided car dataset (Car_Raw_Data.csv), with Price as the dependent variable and Mileage, EngineV, and Age (derived from Year) as the independent variables.
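# Importing necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
# Dataset preparation
df = pd.read_csv('Car_Raw_Data.csv')
# Preprocessing
df['Age'] = 2024 - df['Year'] # Calculate the age of the car
# Drop rows with missing values in the columns used (assumed cleaning step)
df = df.dropna(subset=['Price', 'Mileage', 'EngineV', 'Age'])
X = df[['Mileage', 'EngineV', 'Age']] # Independent variables
y = df['Price'] # Dependent variable
# Train/test split (test_size and random_state are assumed values)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model training
model = LinearRegression()
model.fit(X_train, y_train)
# Predictions
y_pred = model.predict(X_test)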
# Results
print("\nModel Coefficients:")
print(model.coef_)
print(f"Intercept: {model.intercept_}")
2. Logistic Regression
Mathematical Foundation
Logistic Regression is used for classification tasks. It uses the sigmoid function to map predictions to
probabilities:
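With z denoting the linear combination θ^T x (notation assumed here, with x_0 = 1 for the intercept), the sigmoid and the resulting hypothesis can be written as:

```latex
\[
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
h_\theta(x) = \sigma(\theta^{T} x) = P(y = 1 \mid x; \theta)
\]
```

Because σ(z) always lies strictly between 0 and 1, the output can be read as the probability of the positive class; a typical decision rule predicts class 1 when h_θ(x) ≥ 0.5.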
Cost Function
The cost function for logistic regression is:
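The usual choice is the binary cross-entropy (log loss). Assuming m training examples, labels y^(i) in {0, 1}, and the hypothesis h_θ(x) = σ(θ^T x), it can be written as:

```latex
\[
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\left[ y^{(i)} \log h_\theta(x^{(i)})
     + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]
\]
```

Only one of the two terms is active for each example: the first when y^(i) = 1 and the second when y^(i) = 0, so confident wrong predictions are penalized heavily.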
Gradient Descent
The gradients are calculated as:
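A sketch of the differentiation, assuming the binary cross-entropy cost J(θ) averaged over m examples and writing h^(i) for σ(θ^T x^(i)) (the shorthand is an assumption). It relies on the identity σ'(z) = σ(z)(1 − σ(z)):

```latex
\begin{align*}
\frac{\partial h^{(i)}}{\partial \theta_j}
  &= h^{(i)} \left(1 - h^{(i)}\right) x_j^{(i)} \\
\frac{\partial J}{\partial \theta_j}
  &= -\frac{1}{m} \sum_{i=1}^{m}
     \left[ \frac{y^{(i)}}{h^{(i)}} - \frac{1 - y^{(i)}}{1 - h^{(i)}} \right]
     h^{(i)} \left(1 - h^{(i)}\right) x_j^{(i)} \\
  &= -\frac{1}{m} \sum_{i=1}^{m}
     \left[ y^{(i)} \left(1 - h^{(i)}\right) - \left(1 - y^{(i)}\right) h^{(i)} \right] x_j^{(i)} \\
  &= \frac{1}{m} \sum_{i=1}^{m} \left( h^{(i)} - y^{(i)} \right) x_j^{(i)}
\end{align*}
```

Each parameter is then updated as θ_j := θ_j − α ∂J/∂θ_j, where α is the learning rate. The final expression has the same form as the linear regression gradient; only the hypothesis differs.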
Implementation
Dataset
We use the diabetes dataset (diabetes.csv) to predict Outcome (dependent variable) from medical features such as glucose level, blood pressure, BMI, and insulin level (independent variables).
# Importing necessary libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import matplotlib.pyplot as plt
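# Dataset preparation
df = pd.read_csv('diabetes.csv')
# Missing value imputation (assumed strategy: fill with column medians)
df = df.fillna(df.median(numeric_only=True))
# Features and target
X = df.drop(columns=['Outcome'])
y = df['Outcome']
# Train/test split (test_size and random_state are assumed values)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model training (max_iter raised as an assumption to help convergence on unscaled features)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Making predictions
y_pred = model.predict(X_test)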
# Evaluation Metrics
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
# Displaying results
print(f"\nAccuracy: {accuracy:.2f}")
print("\nConfusion Matrix:")
print(conf_matrix)
print("\nClassification Report:")
print(class_report)
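As a from-scratch counterpart to the scikit-learn pipeline above, the gradient descent update for logistic regression can be sketched in NumPy. This is an illustrative sketch, not the code used to produce the lab results: the helper names `sigmoid`, `fit_logistic_gd`, and `predict_proba`, the learning rate, the iteration count, and the toy data are all assumptions. The gradient used is X^T(h − y)/m, where h holds the sigmoid outputs.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, alpha=0.1, n_iters=1000):
    """Fit logistic regression by batch gradient descent (hypothetical helper)."""
    m = X.shape[0]
    Xb = np.column_stack([np.ones(m), X])   # intercept column
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        h = sigmoid(Xb @ theta)             # predicted probabilities
        grad = Xb.T @ (h - y) / m           # gradient of the cross-entropy cost
        theta -= alpha * grad
    return theta

def predict_proba(X, theta):
    """Return P(y = 1 | x) for each row of X."""
    Xb = np.column_stack([np.ones(X.shape[0]), X])
    return sigmoid(Xb @ theta)

# Toy one-feature data centered at zero (not the diabetes dataset).
X_toy = np.array([[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
theta = fit_logistic_gd(X_toy, y_toy)
y_hat = (predict_proba(X_toy, theta) >= 0.5).astype(int)
print(y_hat)  # [0 0 0 1 1 1]
```

Unlike scikit-learn's `LogisticRegression`, this sketch applies no regularization, so on perfectly separable data the weights keep growing with more iterations even though the predicted classes stay the same.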
Discussion
The logistic regression implementation provided a practical introduction to classification modeling. The
pipeline—from data preprocessing to evaluation—highlights best practices, including missing value imputation
and performance measurement. However, further steps such as hyperparameter tuning, feature scaling, and
handling class imbalance could enhance the model’s predictive capabilities.