R 9. Regression

The document outlines a Python script for performing regression analysis on housing data, including importing necessary libraries and data, visualizing the data with scatter plots, and running the regression analysis. It specifies the independent variable as house size and the dependent variable as house price, and provides details on the regression output including key parameters like slope, intercept, and R-squared value. The script also includes notes on potential multicollinearity issues in the data.

Uploaded by

nishant.sharma
Copyright
© All Rights Reserved
# Step 1: Importing Required Python Packages / Libraries
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Step 2: Importing datafile from Local Computer
# Remember for this, you need to copy file path
# add file name and extension
# Use forward slash in path
data = pd.read_excel(
    "C:/Users/drank/Desktop/Python/Python for Finance - Notebook Files"
    "/81 Running a Regression in Python/Python 32/Housing.xlsx")

# Step 3: Viewing and Cropping Data (Optional)
data

    House Price  House Size (sq.ft.)  State  Number of Rooms  Year of Construction
0       1116000                 1940     IN                8                  2002
1        860000                 1300     IN                5                  1982
2        818400                 1420     IN                6                  1987
3       1000000                 1680     IN                7                  2000
4        640000                 1270     IN                5                  1995
5       1010000                 1850     IN                7                  1998
6        600000                 1000     IN                4                  2015
7        700000                 1100     LA                4                  2014
8       1100000                 1600     LA                7                  2017
9        570000                 1000     NY                5                  1997
10       860000                 2150     NY                9                  1997
11      1085000                 1900     NY                8                  2000
12      1250000                 2200     NY                9                  2014
13       850000                 1100     TX                4                  2017
14       640000                  860     TX                4                  1997
15       900000                 1325     TX                6                  1997
16       730000                 1350     TX                6                  2000
17       750000                 1600     TX                6                  1992
18       650000                  950     TX                2                  1987
19       680000                 1250     TX                4                  2000

data[['House Price', 'House Size (sq.ft.)']]

    House Price  House Size (sq.ft.)
0       1116000                 1940
1        860000                 1300
2        818400                 1420
3       1000000                 1680
4        640000                 1270
5       1010000                 1850
6        600000                 1000
7        700000                 1100
8       1100000                 1600
9        570000                 1000
10       860000                 2150
11      1085000                 1900
12      1250000                 2200
13       850000                 1100
14       640000                  860
15       900000                 1325
16       730000                 1350
17       750000                 1600
18       650000                  950
19       680000                 1250

# Step 4: Specifying x and y
# x is the independent variable (house size), y is the dependent variable (house price)
x = data['House Size (sq.ft.)']
y = data['House Price']

x

0     1940
1     1300
2     1420
3     1680
4     1270
5     1850
6     1000
7     1100
8     1600
9     1000
10    2150
11    1900
12    2200
13    1100
14     860
15    1325
16    1350
17    1600
18     950
19    1250
Name: House Size (sq.ft.), dtype: int64

y

0     1116000
1      860000
2      818400
3     1000000
4      640000
5     1010000
6      600000
7      700000
8     1100000
9      570000
10     860000
11    1085000
12    1250000
13     850000
14     640000
15     900000
16     730000
17     750000
18     650000
19     680000
Name: House Price, dtype: int64

# Step 5: Making Scatter Plot (Optional)
plt.scatter(x, y)

[Scatter plot: House Price against House Size (sq.ft.), default axis limits]

# Adjusting axis of Scatter Plot
plt.scatter(x, y)
plt.axis([0, 2500, 0, 1500000])

(0.0, 2500.0, 0.0, 1500000.0)

[Scatter plot: x-axis from 0 to 2500, y-axis from 0 to 1,500,000]

# Labeling axis of Scatter Plot
plt.scatter(x, y)
plt.axis([0, 2500, 0, 1500000])
plt.ylabel('House Price')
plt.xlabel('House Size(sq.ft.)')

Text(0.5, 0, 'House Size(sq.ft.)')

[Scatter plot: x-axis labeled "House Size(sq.ft.)", y-axis labeled "House Price"]

# Step 6: Running Regression Analysis
# add_constant adds the intercept column; OLS takes the dependent variable first
x1 = sm.add_constant(x)
reg = sm.OLS(y, x1).fit()

# Step 7: Finding Regression Output
reg.summary()

                            OLS Regression Results
==============================================================================
Dep. Variable:            House Price   R-squared:                       0.678
Model:                            OLS   Adj. R-squared:                  0.660
Method:                 Least Squares   F-statistic:                     37.95
Date:                Sat, 26 Nov 2022   Prob (F-statistic):           8.13e-06
Time:                        02:28:43   Log-Likelihood:                -260.43
No. Observations:                  20   AIC:                             524.9
Df Residuals:                      18   BIC:                             526.8
Df Model:                           1
Covariance Type:            nonrobust
=======================================================================================
                          coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------------
const                2.608e+05   9.76e+04      2.673      0.016    5.58e+04    4.66e+05
House Size (sq.ft.)   401.9163     65.243      6.160      0.000     264.846     538.987
==============================================================================
Omnibus:                        1.238   Durbin-Watson:                   1.810
Prob(Omnibus):                  0.538   Jarque-Bera (JB):                0.715
Skew:                          -0.459   Prob(JB):                        0.699
Kurtosis:                       2.884   Cond. No.                     5.66e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 5.66e+03. This might indicate that there are strong multicollinearity or other numerical problems.

# Key Regression Parameters
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

slope
401.91628631922595

intercept
260806.2360560964

r_value
0.8235775534696924

r_value**2
0.678279986579124

p_value
8.129642377231308e-06

std_err
65.24299510636492
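As a cross-check on the figures above, here is a minimal sketch that recomputes the slope, intercept and R-squared from the closed-form least-squares formulas, using only the standard library. It is not part of the original notebook. The helper names simple_ols, design_condition_number and standardize are hypothetical, and the two lists are typed in from the data listing rather than read from Housing.xlsx. The sketch also recomputes the condition number, on the assumption that the reported value is the square root of the ratio of the largest to the smallest eigenvalue of X'X for the design matrix [1, x]. With a single regressor there is nothing for house size to be collinear with, so the large value in note [2] most likely reflects the difference in scale between the constant column and the house sizes.

```python
import math

# House sizes (sq.ft.) and prices, typed in from the data listing.
sizes = [1940, 1300, 1420, 1680, 1270, 1850, 1000, 1100, 1600, 1000,
         2150, 1900, 2200, 1100, 860, 1325, 1350, 1600, 950, 1250]
prices = [1116000, 860000, 818400, 1000000, 640000, 1010000, 600000,
          700000, 1100000, 570000, 860000, 1085000, 1250000, 850000,
          640000, 900000, 730000, 750000, 650000, 680000]


def simple_ols(x, y):
    """Return (slope, intercept, r_squared) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def design_condition_number(x):
    """sqrt(lambda_max / lambda_min) of X'X for the design matrix [1, x]."""
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    # X'X = [[n, sx], [sx, sxx]]; closed-form eigenvalues of a symmetric 2x2.
    trace = n + sxx
    det = n * sxx - sx * sx
    root = math.sqrt((n - sxx) ** 2 + 4 * sx * sx)  # never negative
    lam_max = (trace + root) / 2
    lam_min = det / lam_max  # eigenvalue product equals the determinant
    return math.sqrt(lam_max / lam_min)


def standardize(x):
    """Center x and divide by its population standard deviation."""
    n = len(x)
    mean_x = sum(x) / n
    sd = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    return [(xi - mean_x) / sd for xi in x]


slope, intercept, r_squared = simple_ols(sizes, prices)
print(f"slope     = {slope:.4f}")       # about 401.9163
print(f"intercept = {intercept:.2f}")   # about 260806.24
print(f"R-squared = {r_squared:.3f}")   # about 0.678

print(f"cond (raw x)          = {design_condition_number(sizes):.0f}")
print(f"cond (standardized x) = {design_condition_number(standardize(sizes)):.2f}")
```

On this data the raw condition number comes out near 5.66e+03, in line with the summary, and drops to about 1 once house size is standardized. Rescaling x changes the slope's units but leaves R-squared and the fitted values unchanged.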
