DMV - 3 - Jupyter Notebook
In [2]: import pandas as pd
df = pd.read_csv('Housing.csv')
In [4]: df.head()
Out[4]:
price area bedrooms bathrooms stories mainroad guestroom basement hotwaterheating airconditioning parking prefarea furnishingstatus
In [5]: df.tail()
Out[5]:
price area bedrooms bathrooms stories mainroad guestroom basement hotwaterheating airconditioning parking prefarea furnishingstatus
In [6]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 545 entries, 0 to 544
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 price 545 non-null int64
1 area 545 non-null int64
2 bedrooms 545 non-null int64
3 bathrooms 545 non-null int64
4 stories 545 non-null int64
5 mainroad 545 non-null object
6 guestroom 545 non-null object
7 basement 545 non-null object
8 hotwaterheating 545 non-null object
9 airconditioning 545 non-null object
10 parking 545 non-null int64
11 prefarea 545 non-null object
12 furnishingstatus 545 non-null object
dtypes: int64(6), object(7)
memory usage: 55.5+ KB
In [7]: df.describe()
Out[7]:
price area bedrooms bathrooms stories parking
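The `describe()` header above lists only six of the thirteen columns because, by default, `describe()` summarises numeric columns only; the seven `object` columns reported by `df.info()` are skipped. The sketch below illustrates this on a small frame built from the five rows and eight columns printed in the notebook's "Filtered data" output. The names `sample`, `numeric_summary` and `categorical_summary` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Five rows and eight columns copied from the notebook's "Filtered data"
# output; the remaining columns are not reproduced here.
sample = pd.DataFrame({
    "price": [13300000, 12250000, 12250000, 12215000, 11410000],
    "area": [7420, 8960, 9960, 7500, 7420],
    "bedrooms": [4, 4, 3, 4, 4],
    "bathrooms": [2, 4, 2, 2, 1],
    "stories": [3, 4, 2, 2, 2],
    "mainroad": ["yes", "yes", "yes", "yes", "yes"],
    "guestroom": ["no", "no", "no", "no", "yes"],
    "basement": ["no", "no", "yes", "yes", "yes"],
})

# Default describe(): only the numeric columns are summarised.
numeric_summary = sample.describe()

# Summarise the non-numeric columns separately (count, unique, top, freq).
categorical_summary = sample.select_dtypes(exclude="number").describe()

print(numeric_summary.columns.tolist())
print(categorical_summary.columns.tolist())
```

On the full dataset the same pattern would presumably be applied to `df` directly.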
In [8]: df.shape
Out[8]: (545, 13)
localhost:8888/notebooks/BE_PRACTICALS/DMV_3.ipynb 1/2
10/6/24, 7:24 PM DMV_3 - Jupyter Notebook
In [9]: df.columns
In [10]: df.isnull().sum()
Out[10]: price 0
area 0
bedrooms 0
bathrooms 0
stories 0
mainroad 0
guestroom 0
basement 0
hotwaterheating 0
airconditioning 0
parking 0
prefarea 0
furnishingstatus 0
dtype: int64
Filtered data:
      price  area  bedrooms  bathrooms  stories  mainroad  guestroom  basement  \
0  13300000  7420         4          2        3       yes         no        no
1  12250000  8960         4          4        4       yes         no        no
2  12250000  9960         3          2        2       yes         no       yes
3  12215000  7500         4          2        2       yes         no       yes
4  11410000  7420         4          1        2       yes        yes       yes
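The cell that produced the "Filtered data" output is not present in this export, so the actual filter condition is unknown. As a rough sketch of how such a filter is commonly written in pandas, the example below applies a boolean mask to the five printed rows. The threshold of 12,000,000 and the `mainroad == 'yes'` condition are assumptions chosen purely for illustration, and `rows`, `mask` and `filtered` are hypothetical names.

```python
import pandas as pd

# The five rows shown in the "Filtered data" output (subset of columns).
rows = pd.DataFrame({
    "price": [13300000, 12250000, 12250000, 12215000, 11410000],
    "area": [7420, 8960, 9960, 7500, 7420],
    "mainroad": ["yes", "yes", "yes", "yes", "yes"],
})

# Hypothetical condition: expensive houses located on a main road.
# Each comparison yields a boolean Series; & combines them element-wise,
# so each comparison must be wrapped in parentheses.
mask = (rows["price"] > 12_000_000) & (rows["mainroad"] == "yes")
filtered = rows[mask]

print("Filtered data:\n", filtered)
```

With this assumed threshold the last row (price 11410000) is excluded and the first four are kept.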
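In [23]: Q1 = df['price'].quantile(0.25)  # first quartile (25th percentile)
Q3 = df['price'].quantile(0.75)  # third quartile (75th percentile)
IQR = Q3 - Q1  # interquartile range
# Prices more than 1.5 * IQR beyond the quartiles are treated as outliers
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
# Keep only rows whose price lies within the bounds (inclusive)
data_no_outliers = df[(df['price'] >= lower_bound) & (df['price'] <= upper_bound)]
print("Data after removing outliers:\n", data_no_outliers.describe())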
parking
count 530.000000
mean 0.664151
std 0.843320
min 0.000000
25% 0.000000
50% 0.000000
75% 1.000000
max 3.000000
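The `count` of 530 above, compared with the 545 rows reported by `df.info()`, indicates that the IQR rule dropped 15 rows. To make the arithmetic behind the bounds easier to follow, here is a small worked sketch of the same rule on a made-up series of eight values. The helper `iqr_bounds` and the toy data are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

def iqr_bounds(values: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) bounds at k * IQR beyond the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Toy data (not from Housing.csv): seven similar values and one extreme value.
prices = pd.Series([10, 12, 13, 14, 15, 16, 18, 100], name="price")

lower, upper = iqr_bounds(prices)

# between() is inclusive on both ends, matching the >= / <= test in the notebook.
kept = prices[prices.between(lower, upper)]

print(lower, upper)   # 7.125 22.125
print(kept.tolist())  # [10, 12, 13, 14, 15, 16, 18]
```

Here Q1 = 12.75 and Q3 = 16.5 (pandas interpolates linearly between neighbouring values by default), so IQR = 3.75 and the bounds are 12.75 - 5.625 = 7.125 and 16.5 + 5.625 = 22.125; only the value 100 falls outside.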