DMV - 3 - Jupyter Notebook

Uploaded by

Anushka Jadhav


10/6/24, 7:24 PM DMV_3 - Jupyter Notebook

In [1]: import pandas as pd

In [2]: df = pd.read_csv('Housing.csv')

In [3]: df.columns = df.columns.str.strip()

df.columns = df.columns.str.replace(' ', '_')
df.columns = df.columns.str.replace('[^A-Za-z0-9_]', '', regex=True)
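The three column-cleaning steps in cell 3 can be tried in isolation. The sketch below wraps them in a hypothetical helper, `clean_columns`, and runs it on a toy frame whose messy column names are made up for illustration; they are not taken from Housing.csv.

```python
import pandas as pd

# Toy frame with deliberately messy, made-up column names (an assumption,
# not the real Housing.csv header).
toy = pd.DataFrame({" Price (INR) ": [1], "main road": ["yes"], "area": [100]})


def clean_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with stripped, underscored, alphanumeric column names."""
    out = frame.copy()
    out.columns = (
        out.columns.str.strip()                          # trim surrounding whitespace
        .str.replace(" ", "_", regex=False)              # inner spaces -> underscores
        .str.replace(r"[^A-Za-z0-9_]", "", regex=True)   # drop everything else
    )
    return out


cleaned = clean_columns(toy)
print(list(cleaned.columns))  # ['Price_INR', 'main_road', 'area']
```

Returning a copy keeps the original frame untouched, which makes the helper safe to re-run in a notebook.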

In [4]: df.head()

Out[4]:
      price  area  bedrooms  bathrooms  stories  mainroad  guestroom  basement  hotwaterheating  airconditioning  parking  prefarea  furnishingstatus
0  13300000  7420         4          2        3       yes         no        no               no              yes        2       yes         furnished
1  12250000  8960         4          4        4       yes         no        no               no              yes        3        no         furnished
2  12250000  9960         3          2        2       yes         no       yes               no               no        2       yes    semi-furnished
3  12215000  7500         4          2        2       yes         no       yes               no              yes        3       yes         furnished
4  11410000  7420         4          1        2       yes        yes       yes               no              yes        2        no         furnished

In [5]: df.tail()

Out[5]:
       price  area  bedrooms  bathrooms  stories  mainroad  guestroom  basement  hotwaterheating  airconditioning  parking  prefarea  furnishingstatus
540  1820000  3000         2          1        1       yes         no       yes               no               no        2        no       unfurnished
541  1767150  2400         3          1        1        no         no        no               no               no        0        no    semi-furnished
542  1750000  3620         2          1        1       yes         no        no               no               no        0        no       unfurnished
543  1750000  2910         3          1        1        no         no        no               no               no        0        no         furnished
544  1750000  3850         3          1        2       yes         no        no               no               no        0        no       unfurnished

In [6]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 545 entries, 0 to 544
Data columns (total 13 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   price             545 non-null    int64
 1   area              545 non-null    int64
 2   bedrooms          545 non-null    int64
 3   bathrooms         545 non-null    int64
 4   stories           545 non-null    int64
 5   mainroad          545 non-null    object
 6   guestroom         545 non-null    object
 7   basement          545 non-null    object
 8   hotwaterheating   545 non-null    object
 9   airconditioning   545 non-null    object
 10  parking           545 non-null    int64
 11  prefarea          545 non-null    object
 12  furnishingstatus  545 non-null    object
dtypes: int64(6), object(7)
memory usage: 55.5+ KB

In [7]: df.describe()

Out[7]:
              price          area    bedrooms   bathrooms     stories     parking
count  5.450000e+02    545.000000  545.000000  545.000000  545.000000  545.000000
mean   4.766729e+06   5150.541284    2.965138    1.286239    1.805505    0.693578
std    1.870440e+06   2170.141023    0.738064    0.502470    0.867492    0.861586
min    1.750000e+06   1650.000000    1.000000    1.000000    1.000000    0.000000
25%    3.430000e+06   3600.000000    2.000000    1.000000    1.000000    0.000000
50%    4.340000e+06   4600.000000    3.000000    1.000000    2.000000    0.000000
75%    5.740000e+06   6360.000000    3.000000    2.000000    2.000000    1.000000
max    1.330000e+07  16200.000000    6.000000    4.000000    4.000000    3.000000

In [8]: df.shape

Out[8]: (545, 13)

localhost:8888/notebooks/BE_PRACTICALS/DMV_3.ipynb 1/2

In [9]: df.columns

Out[9]: Index(['price', 'area', 'bedrooms', 'bathrooms', 'stories', 'mainroad',
               'guestroom', 'basement', 'hotwaterheating', 'airconditioning',
               'parking', 'prefarea', 'furnishingstatus'],
              dtype='object')

In [10]: df.isnull().sum()

Out[10]: price               0
         area                0
         bedrooms            0
         bathrooms           0
         stories             0
         mainroad            0
         guestroom           0
         basement            0
         hotwaterheating     0
         airconditioning     0
         parking             0
         prefarea            0
         furnishingstatus    0
         dtype: int64

In [16]: Categorical_Column = ['mainroad', 'guestroom', 'basement', 'hotwaterheating', 'airconditioning', 'prefarea', 'furnishingstatus']
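Typing the categorical column names by hand is easy to get wrong, since a misspelled name only fails later when it is used. As a hedged alternative, the list can be derived from the dtypes reported by `df.info()`. The sketch below uses a small made-up frame, and it assumes every non-numeric column should be treated as categorical, which holds for the dtypes shown above but may not hold for other datasets.

```python
import pandas as pd

# Tiny stand-in for the housing frame; the values are illustrative only.
toy = pd.DataFrame({
    "price": [13300000, 12250000],
    "mainroad": ["yes", "yes"],
    "furnishingstatus": ["furnished", "semi-furnished"],
})

# Assumption: any column that is not numeric is categorical.
categorical = toy.select_dtypes(exclude="number").columns.tolist()
print(categorical)  # ['mainroad', 'furnishingstatus']
```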

In [19]: filtered_data = df[df['price'] > 100000]

print("Filtered data: ", filtered_data.head())

Filtered data:       price  area  bedrooms  bathrooms  stories  mainroad  guestroom  basement  \
0  13300000  7420         4          2        3       yes         no        no
1  12250000  8960         4          4        4       yes         no        no
2  12250000  9960         3          2        2       yes         no       yes
3  12215000  7500         4          2        2       yes         no       yes
4  11410000  7420         4          1        2       yes        yes       yes

   hotwaterheating  airconditioning  parking  prefarea  furnishingstatus
0               no              yes        2       yes         furnished
1               no              yes        3        no         furnished
2               no               no        2       yes    semi-furnished
3               no              yes        3       yes         furnished
4               no              yes        2        no         furnished
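The `describe()` output reports a minimum price of 1.75e6, so a cutoff of 100000 sits below every row and the filter would be expected to keep all 545 records. A quick way to confirm what a boolean filter actually removed is to compare row counts before and after. The sketch below does this on made-up prices, not the real data; the names `threshold`, `mask`, and `dropped` are illustrative.

```python
import pandas as pd

# Made-up prices for illustration; the real values live in Housing.csv.
toy = pd.DataFrame({"price": [1_750_000, 4_340_000, 13_300_000]})

threshold = 100_000              # same cutoff as the filtering cell
mask = toy["price"] > threshold  # boolean Series, one entry per row
filtered = toy[mask]

# Count how many rows the filter dropped.
dropped = len(toy) - len(filtered)
print(f"kept {len(filtered)} of {len(toy)} rows, dropped {dropped}")
```

If the count of dropped rows is zero, the threshold is probably not doing any work and may need to be raised.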

In [21]: categorical_cols = ['mainroad', 'guestroom', 'basement', 'hotwaterheating', 'airconditioning', 'prefarea', 'furnishingstatus']

df = pd.get_dummies(df, columns=categorical_cols, drop_first=True)
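To see what `drop_first=True` does, here is a minimal sketch on a toy frame (the values are illustrative). For each encoded column, `pd.get_dummies` creates one indicator per category and then drops the first category in sorted order, so a yes/no column becomes a single `_yes` indicator and the three furnishing levels become two indicators.

```python
import pandas as pd

# Toy data mirroring two of the housing columns; rows are made up.
toy = pd.DataFrame({
    "mainroad": ["yes", "no", "yes"],
    "furnishingstatus": ["furnished", "semi-furnished", "unfurnished"],
})

encoded = pd.get_dummies(
    toy,
    columns=["mainroad", "furnishingstatus"],
    drop_first=True,  # drop the first (sorted) level of each column
)

print(list(encoded.columns))
# ['mainroad_yes', 'furnishingstatus_semi-furnished', 'furnishingstatus_unfurnished']
```

The dropped level ("no", "furnished") becomes the implicit baseline: a row with all indicators false for a column belongs to that level.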

In [23]: Q1 = df['price'].quantile(0.25)
Q3 = df['price'].quantile(0.75)
IQR = Q3 - Q1

lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

data_no_outliers = df[(df['price'] >= lower_bound) & (df['price'] <= upper_bound)]
print("Data after removing outliers:\n", data_no_outliers.describe())

Data after removing outliers:
              price          area    bedrooms   bathrooms     stories  \
count  5.300000e+02    530.000000  530.000000  530.000000  530.000000
mean   4.600663e+06   5061.518868    2.943396    1.260377    1.788679
std    1.596119e+06   2075.449479    0.730515    0.464359    0.861190
min    1.750000e+06   1650.000000    1.000000    1.000000    1.000000
25%    3.430000e+06   3547.500000    2.000000    1.000000    1.000000
50%    4.270000e+06   4500.000000    3.000000    1.000000    2.000000
75%    5.600000e+06   6315.750000    3.000000    1.000000    2.000000
max    9.100000e+06  15600.000000    6.000000    3.000000    4.000000

          parking
count  530.000000
mean     0.664151
std      0.843320
min      0.000000
25%      0.000000
50%      0.000000
75%      1.000000
max      3.000000
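With the quartiles reported by `describe()` (Q1 = 3.43e6, Q3 = 5.74e6), the IQR is 2.31e6, which gives bounds of about -35,000 and 9,205,000. This is consistent with the maximum of 9.1e6 and the drop from 545 to 530 rows shown above. The same rule can be packaged as a small reusable function. The sketch below is one possible shape for it, checked on made-up numbers; `iqr_bounds` and its parameter `k` are hypothetical names, not part of pandas.

```python
import pandas as pd


def iqr_bounds(series: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up values with one obvious outlier (100).
toy = pd.DataFrame({"price": [10, 12, 13, 14, 15, 100]})

low, high = iqr_bounds(toy["price"])
# between() is inclusive on both ends, matching the >= / <= test above.
kept = toy[toy["price"].between(low, high)]

print(low, high)   # 8.5 18.5
print(len(kept))   # 5
```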

