Assignment 1

The document analyzes the iris dataset (150 observations, 5 variables) using pandas in Python. It loads the dataset from a CSV file, inspects the first and last rows, and reports the shape, the column data types, and descriptive statistics. It also checks for null or missing values and finds none.

Uploaded by

Akshata Chopade

10/02/2023, 10:19 iris datase 1st-Copy1 - Jupyter Notebook

In [1]:

import pandas as pd

In [2]:

df = pd.read_csv("/home/ubuntu/Downloads/iris.csv")
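The path in the cell above is specific to the author's machine. As a minimal sketch for trying `pd.read_csv` without that file, the same call can read from an in-memory buffer. The three data rows are copied from the `df.head()` output in this notebook; the comma delimiter and the header row are assumptions about how `iris.csv` is laid out, and `csv_text` and `sample` are illustrative names.

```python
import io

import pandas as pd

# Assumed layout: comma-separated values with a header row.
# The data rows are the first three rows shown by df.head().
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
"""

# read_csv accepts a file-like object as well as a path.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)
print(list(sample.columns))
```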

In [3]:

df.head()

Out[3]:

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

In [4]:

df.tail()

Out[4]:

     sepal_length  sepal_width  petal_length  petal_width    species
145           6.7          3.0           5.2          2.3  virginica
146           6.3          2.5           5.0          1.9  virginica
147           6.5          3.0           5.2          2.0  virginica
148           6.2          3.4           5.4          2.3  virginica
149           5.9          3.0           5.1          1.8  virginica

In [6]:

df.index

Out[6]:

RangeIndex(start=0, stop=150, step=1)

In [7]:

df.columns

Out[7]:

Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width',
       'species'],
      dtype='object')

localhost:8888/notebooks/Akshata T2 29/iris datase 1st-Copy1.ipynb 1/6



In [8]:

df.shape

Out[8]:

(150, 5)

In [9]:

df.dtypes

Out[9]:

sepal_length    float64
sepal_width     float64
petal_length    float64
petal_width     float64
species          object
dtype: object

In [10]:

df.columns.values

Out[10]:

array(['sepal_length', 'sepal_width', 'petal_length', 'petal_width',
       'species'], dtype=object)

In [11]:

df.describe()

Out[11]:

       sepal_length  sepal_width  petal_length  petal_width
count    150.000000   150.000000    150.000000   150.000000
mean       5.843333     3.054000      3.758667     1.198667
std        0.828066     0.433594      1.764420     0.763161
min        4.300000     2.000000      1.000000     0.100000
25%        5.100000     2.800000      1.600000     0.300000
50%        5.800000     3.000000      4.350000     1.300000
75%        6.400000     3.300000      5.100000     1.800000
max        7.900000     4.400000      6.900000     2.500000
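The rows of `describe()` map onto ordinary Series methods: `50%` is the median, `25%` and `75%` are quantiles, and `std` is the sample standard deviation (pandas uses `ddof=1` by default). The sketch below checks this on a small illustrative subset, the five `sepal_length` values shown by `df.head()`, so its numbers are not the full-dataset figures in the table above.

```python
import math

import pandas as pd

# Illustrative subset: sepal_length of the five rows shown by df.head().
sepal_length = pd.Series([5.1, 4.9, 4.7, 4.6, 5.0], name="sepal_length")

summary = sepal_length.describe()

# The "50%" row is the median of the column.
median_matches = math.isclose(summary["50%"], sepal_length.median())

# The "std" row is the sample standard deviation (ddof=1).
std_matches = math.isclose(summary["std"], sepal_length.std(ddof=1))

# The "25%" row is the 0.25 quantile of the same column.
q1_matches = math.isclose(summary["25%"], sepal_length.quantile(0.25))

print(median_matches, std_matches, q1_matches)
```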


In [12]:

df.describe(include="all")

Out[12]:

        sepal_length  sepal_width  petal_length  petal_width species
count     150.000000   150.000000    150.000000   150.000000     150
unique           NaN          NaN           NaN          NaN       3
top              NaN          NaN           NaN          NaN  setosa
freq             NaN          NaN           NaN          NaN      50
mean        5.843333     3.054000      3.758667     1.198667     NaN
std         0.828066     0.433594      1.764420     0.763161     NaN
min         4.300000     2.000000      1.000000     0.100000     NaN
25%         5.100000     2.800000      1.600000     0.300000     NaN
50%         5.800000     3.000000      4.350000     1.300000     NaN
75%         6.400000     3.300000      5.100000     1.800000     NaN
max         7.900000     4.400000      6.900000     2.500000     NaN
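For the `species` column, `unique`, `top` and `freq` can be cross-checked with `nunique()` and `value_counts()`. With 150 rows, 3 unique values and a highest frequency of 50, every species must occur exactly 50 times, so the reported `top` is one of three tied labels. The sketch below uses an illustrative subset (five `setosa` and three `virginica` labels taken from the `head()` and `tail()` rows), not the full column.

```python
import pandas as pd

# Illustrative subset: five head() labels and three tail() labels.
species = pd.Series(["setosa"] * 5 + ["virginica"] * 3, name="species")

summary = species.describe()
counts = species.value_counts()

# "unique" is the number of distinct labels.
unique_matches = summary["unique"] == species.nunique()

# "top" is the most frequent label and "freq" is how often it occurs.
top_matches = summary["top"] == counts.idxmax()
freq_matches = summary["freq"] == counts.max()

print(unique_matches, top_matches, freq_matches)
```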

In [13]:

df.isnull()

Out[13]:

     sepal_length  sepal_width  petal_length  petal_width  species
0           False        False         False        False    False
1           False        False         False        False    False
2           False        False         False        False    False
3           False        False         False        False    False
4           False        False         False        False    False
...           ...          ...           ...          ...      ...
145         False        False         False        False    False
146         False        False         False        False    False
147         False        False         False        False    False
148         False        False         False        False    False
149         False        False         False        False    False

150 rows × 5 columns


In [14]:

df.isna()

Out[14]:

     sepal_length  sepal_width  petal_length  petal_width  species
0           False        False         False        False    False
1           False        False         False        False    False
2           False        False         False        False    False
3           False        False         False        False    False
4           False        False         False        False    False
...           ...          ...           ...          ...      ...
145         False        False         False        False    False
146         False        False         False        False    False
147         False        False         False        False    False
148         False        False         False        False    False
149         False        False         False        False    False

150 rows × 5 columns

In [15]:

df.notnull()

Out[15]:

     sepal_length  sepal_width  petal_length  petal_width  species
0            True         True          True         True     True
1            True         True          True         True     True
2            True         True          True         True     True
3            True         True          True         True     True
4            True         True          True         True     True
...           ...          ...           ...          ...      ...
145          True         True          True         True     True
146          True         True          True         True     True
147          True         True          True         True     True
148          True         True          True         True     True
149          True         True          True         True     True

150 rows × 5 columns


In [16]:

df.notna()

Out[16]:

     sepal_length  sepal_width  petal_length  petal_width  species
0            True         True          True         True     True
1            True         True          True         True     True
2            True         True          True         True     True
3            True         True          True         True     True
4            True         True          True         True     True
...           ...          ...           ...          ...      ...
145          True         True          True         True     True
146          True         True          True         True     True
147          True         True          True         True     True
148          True         True          True         True     True
149          True         True          True         True     True

150 rows × 5 columns

In [17]:

df.isnull().sum()

Out[17]:

sepal_length    0
sepal_width     0
petal_length    0
petal_width     0
species         0
dtype: int64
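The four mask cells above overlap: `isnull()` is an alias of `isna()`, and `notnull()`/`notna()` return the element-wise complement. The sketch below confirms this on a small frame. The frame is hypothetical: it reuses three rows from `df.head()` but replaces one `sepal_width` value with `NaN` so that the counts are not all zero (the actual dataset has no missing values).

```python
import numpy as np
import pandas as pd

# Hypothetical sample: three head() rows with one value deliberately
# replaced by NaN. The real dataset contains no missing values.
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7],
    "sepal_width": [3.5, np.nan, 3.2],
    "species": ["setosa", "setosa", "setosa"],
})

# isnull() and isna() are aliases, so the masks are identical.
same_mask = sample.isnull().equals(sample.isna())

# notna() is the element-wise complement of isna().
complement = sample.notna().equals(~sample.isna())

# Per-column counts, then one total for the whole frame.
missing_per_column = sample.isna().sum()
total_missing = int(missing_per_column.sum())

print(same_mask, complement, total_missing)
```

A single total such as `total_missing` is often easier to scan than a 150-row Boolean table.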

In [18]:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
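The `+` in `memory usage: 6.0+ KB` marks the figure as a lower bound: for `object` columns pandas counts only the references, not the Python strings they point to, unless a deep inspection is requested (for example with `df.info(memory_usage="deep")`). The sketch below compares the two counts on a small illustrative frame; `sample`, `shallow_bytes` and `deep_bytes` are illustrative names, and the `species` column is forced to `object` dtype to mirror the notebook.

```python
import pandas as pd

# Illustrative frame; species is stored as object dtype, as in the notebook.
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7],
    "species": pd.Series(["setosa", "setosa", "setosa"], dtype=object),
})

# Shallow count: object columns contribute only their references.
shallow_bytes = int(sample.memory_usage().sum())

# Deep count: also measures the string objects themselves.
deep_bytes = int(sample.memory_usage(deep=True).sum())

print(shallow_bytes, deep_bytes)
```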


In [19]:

df.isnull().any()

Out[19]:

sepal_length    False
sepal_width     False
petal_length    False
petal_width     False
species         False
dtype: bool

In [20]:

df.iloc[3]

Out[20]:

sepal_length       4.6
sepal_width        3.1
petal_length       1.5
petal_width        0.2
species         setosa
Name: 3, dtype: object

In [21]:

df[0:3]

Out[21]:

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
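The last two cells select rows in different ways: `df.iloc[3]` returns one row as a Series, while `df[0:3]` is a positional slice that excludes the end position. The sketch below compares these with the explicit `iloc` and `loc` forms on the five rows shown by `df.head()`; `head` and the other variable names are illustrative. Note that `loc` slices by label and includes the end label, so `loc[0:2]` returns the same three rows here.

```python
import pandas as pd

# The five rows shown by df.head() in this notebook.
head = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 4.6, 5.0],
    "sepal_width": [3.5, 3.0, 3.2, 3.1, 3.6],
    "petal_length": [1.4, 1.4, 1.3, 1.5, 1.4],
    "petal_width": [0.2, 0.2, 0.2, 0.2, 0.2],
    "species": ["setosa"] * 5,
})

row = head.iloc[3]        # one row by position, returned as a Series
by_slice = head[0:3]      # positional slice, end position excluded
by_iloc = head.iloc[0:3]  # explicit positional form, same rows
by_loc = head.loc[0:2]    # label-based slice, end label included

print(row["sepal_length"], len(by_slice), len(by_loc))
```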

In [ ]:
