3-LinearRegression Formula Based

The document analyzes a dataset containing head size and brain weight measurements for 237 individuals. Simple linear regression is performed to find the relationship between head size and brain weight, yielding a slope of 0.2634 and intercept of 325.57. The coefficient of determination (R^2) is calculated as 0.6393, indicating a moderately strong linear relationship.
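In standard notation, with predictor x (head size) and response y (brain weight), the least-squares estimates computed below are:

b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2},
\qquad b_0 = \bar{y} - b_1 \bar{x},
\qquad R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}

where \hat{y}_i = b_0 + b_1 x_i is the fitted value.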


In [23]: %matplotlib inline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Download the HeadBrain dataset from Kaggle (headbrain.csv)

In [24]: data=pd.read_csv('headbrain.csv')
print(data.shape)
data.head()

(237, 4)
Out[24]:
   Gender  Age Range  Head Size(cm^3)  Brain Weight(grams)
0       1          1             4512                 1530
1       1          1             3738                 1297
2       1          1             4261                 1335
3       1          1             3777                 1282
4       1          1             4177                 1590

In [25]: X = data['Head Size(cm^3)'].values
Y = data['Brain Weight(grams)'].values

In [26]: # Mean of X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)

# total number of values
n = len(X)

# accumulate the numerator and denominator of the slope formula
numer = 0
denom = 0
for i in range(n):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
b1 = numer / denom            # slope
b0 = mean_y - (b1 * mean_x)   # intercept

print(b1, b0)

0.26342933948939945 325.57342104944223
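The loop above follows the formula directly, but the same estimates fall out of a vectorized computation. A minimal sketch (the names b1_vec and b0_vec are illustrative, not from the notebook):

b1_vec = np.sum((X - mean_x) * (Y - mean_y)) / np.sum((X - mean_x) ** 2)
b0_vec = mean_y - b1_vec * mean_x
print(b1_vec, b0_vec)   # same slope and intercept as above
# np.polyfit(X, Y, 1) returns the same pair (slope first)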

In [29]: # Let's plot the regression line against the data
max_x = np.max(X) + 100
min_x = np.min(X) - 100   # extend the line slightly beyond the data range

x = np.linspace(min_x, max_x, 1000)
y = b0 + b1 * x

plt.plot(x, y, label='Regression Line')
plt.scatter(X, Y, label='Scatter Plot')

plt.xlabel('Head Size (cm^3)')
plt.ylabel('Brain Weight (grams)')
plt.legend()
plt.show()

In [30]: # To see how good the model is, let's calculate R squared
# ss_t is the total sum of squares
# ss_r is the residual sum of squares
ss_t = 0
ss_r = 0
for i in range(n):
    y_pred = b0 + b1 * X[i]
    ss_t += (Y[i] - mean_y) ** 2
    ss_r += (Y[i] - y_pred) ** 2
r2 = 1 - (ss_r / ss_t)
print(r2)

0.6393117199570003
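A quick cross-check: for simple linear regression, R^2 equals the squared Pearson correlation between X and Y, so the value above can be verified directly with NumPy:

r_xy = np.corrcoef(X, Y)[0, 1]   # Pearson correlation of head size and brain weight
print(r_xy ** 2)                 # ≈ 0.6393, matching r2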

In [32]: # Now let's see how the same model is implemented with the ML library scikit-learn
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# scikit-learn expects a 2-D feature array of shape (n_samples, n_features),
# so reshape the 1-D vector
X = X.reshape((n, 1))

# Create and fit the model
reg = LinearRegression()
reg = reg.fit(X, Y)

Y_pred = reg.predict(X)

mse = mean_squared_error(Y, Y_pred)
rmse = np.sqrt(mse)
r2_score = reg.score(X, Y)

print(rmse)
print(r2_score)
72.1206213783709
0.639311719957
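As a usage sketch, the fitted model can predict brain weight for a new head size. The value 4000 cm^3 below is hypothetical, chosen only for illustration; since X was reshaped to 2-D, a single sample must have shape (1, 1):

new_head_size = np.array([[4000]])   # hypothetical head size in cm^3
print(reg.predict(new_head_size))    # ≈ b0 + b1 * 4000 ≈ 1379 grams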
