DSBDA-Assignment-3 - Jupyter Notebook http://localhost:8888/notebooks/DSBDA-Assignment-3...
In [1]: import pandas as pd
import numpy as np
In [3]: df = pd.read_csv("StudentsPerformance.csv")
In [4]: df
Out[4]:
     gender  race/ethnicity  parental level of education  lunch         test preparation course  math score  reading score  writing score
0    female  group B         bachelor's degree            standard      none                     72          72             74
1    female  group C         some college                 standard      completed                69          90             88
2    female  group B         master's degree              standard      none                     90          95             93
3    male    group A         associate's degree           free/reduced  none                     47          57             44
4    male    group C         some college                 standard      none                     76          78             75
...  ...     ...             ...                          ...           ...                      ...         ...            ...
995  female  group E         master's degree              standard      completed                88          99             95
996  male    group C         high school                  free/reduced  none                     62          55             55
997  female  group C         high school                  free/reduced  completed                59          71             65
998  female  group D         some college                 standard      completed                68          78             77
999  female  group D         some college                 free/reduced  none                     77          86             86

1000 rows × 8 columns
In [5]: df.head()
Out[5]:
   gender  race/ethnicity  parental level of education  lunch         test preparation course  math score  reading score  writing score
0  female  group B         bachelor's degree            standard      none                     72          72             74
1  female  group C         some college                 standard      completed                69          90             88
2  female  group B         master's degree              standard      none                     90          95             93
3  male    group A         associate's degree           free/reduced  none                     47          57             44
4  male    group C         some college                 standard      none                     76          78             75
1 of 3 20/02/25, 11:03
In [6]: df.tail()
Out[6]:
     gender  race/ethnicity  parental level of education  lunch         test preparation course  math score  reading score  writing score
995  female  group E         master's degree              standard      completed                88          99             95
996  male    group C         high school                  free/reduced  none                     62          55             55
997  female  group C         high school                  free/reduced  completed                59          71             65
998  female  group D         some college                 standard      completed                68          78             77
999  female  group D         some college                 free/reduced  none                     77          86             86
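Beyond `head()` and `tail()`, a quick structural check can confirm the frame loaded as expected. The following is a minimal sketch on a hypothetical three-row sample (`sample` is an assumed name; it reuses a few column names and the first three rows shown above rather than reading StudentsPerformance.csv):

```python
import pandas as pd

# Hypothetical three-row sample built from the first rows shown above;
# the notebook itself reads the full StudentsPerformance.csv file.
sample = pd.DataFrame({
    "gender": ["female", "female", "female"],
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
})

shape = sample.shape                 # (number of rows, number of columns)
missing = sample.isnull().sum()      # count of missing values per column
numeric_cols = sample.select_dtypes(include="number").columns.tolist()

print(shape)
print(missing)
print(numeric_cols)
```

On the full dataset the same three calls would report the row count, any missing values, and which columns are numeric scores.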
In [8]: statistics = df.groupby('gender')['math score'].agg(['mean','median','min','max','std'])
statistics
Out[8]:
        mean       median  min  max  std
gender
female  63.633205  65.0    0    100  15.491453
male    68.728216  69.0    27   100  14.356277
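To make the grouped summary easier to verify by hand, here is a small sketch of the same kind of aggregation on a hypothetical six-row frame (`scores` and its values are illustrative assumptions, not taken from the CSV). Note that pandas computes `std` as the sample standard deviation (`ddof=1`).

```python
import pandas as pd

# Hypothetical six-row sample; the values are illustrative only.
scores = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male", "male"],
    "math score": [60, 70, 80, 50, 70, 90],
})

# One row per gender, one column per summary statistic.
summary = scores.groupby("gender")["math score"].agg(
    ["mean", "median", "min", "max", "std"]
)
print(summary)
```

Both groups have the same mean here, but the wider spread of the second group shows up in `min`, `max`, and `std`.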
In [9]: data = pd.read_csv("Iris.csv")
In [10]: data
Out[10]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
0 1 5.1 3.5 1.4 0.2 Iris-setosa
1 2 4.9 3.0 1.4 0.2 Iris-setosa
2 3 4.7 3.2 1.3 0.2 Iris-setosa
3 4 4.6 3.1 1.5 0.2 Iris-setosa
4 5 5.0 3.6 1.4 0.2 Iris-setosa
... ... ... ... ... ... ...
145 146 6.7 3.0 5.2 2.3 Iris-virginica
146 147 6.3 2.5 5.0 1.9 Iris-virginica
147 148 6.5 3.0 5.2 2.0 Iris-virginica
148 149 6.2 3.4 5.4 2.3 Iris-virginica
149 150 5.9 3.0 5.1 1.8 Iris-virginica
150 rows × 6 columns
In [12]: selected_species = ['Iris-setosa','Iris-versicolor','Iris-virginica']
filtered_df = data[data['Species'].isin(selected_species)]
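Because `selected_species` lists all three species present in the Iris data, the `isin` filter above should keep every row. A minimal sketch of how `isin` behaves, using a hypothetical four-row frame (`iris` and its measurements are illustrative assumptions):

```python
import pandas as pd

# Hypothetical miniature of the Iris table; measurements are illustrative.
iris = pd.DataFrame({
    "Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"],
    "PetalLengthCm": [1.4, 4.7, 5.2, 1.3],
})

all_species = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Selecting every species keeps all rows; a narrower list drops the rest.
kept_all = iris[iris["Species"].isin(all_species)]
only_setosa = iris[iris["Species"].isin(["Iris-setosa"])]

print(len(kept_all), len(only_setosa))
```

The filter only changes the result when the list names a strict subset of the species.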
In [14]: species_stats = filtered_df.groupby('Species').agg(['quantile','mean','median','min','max'])
# 'quantile' with no argument uses q=0.5, so it matches 'median'
print("\nBasic Statistical Details for Selected Species :\n",species_stats)
Basic Statistical Details for Selected Species :
                 Id
                 quantile  mean   median  min  max
Species
Iris-setosa      25.5      25.5   25.5    1    50
Iris-versicolor  75.5      75.5   75.5    51   100
Iris-virginica   125.5     125.5  125.5   101  150

                 SepalLengthCm
                 quantile  mean   median  min  max
Species
Iris-setosa      5.0       5.006  5.0     4.3  5.8
Iris-versicolor  5.9       5.936  5.9     4.9  7.0
Iris-virginica   6.5       6.588  6.5     4.9  7.9

                 ...

                 PetalLengthCm
                 quantile  mean   median  min  max
Species
Iris-setosa      1.50      1.464  1.50    1.0  1.9
Iris-versicolor  4.35      4.260  4.35    3.0  5.1
Iris-virginica   5.55      5.552  5.55    4.5  6.9

                 PetalWidthCm
                 quantile  mean   median  min  max
Species
Iris-setosa      0.2       0.244  0.2     0.1  0.6
Iris-versicolor  1.3       1.326  1.3     1.0  1.8
Iris-virginica   2.0       2.026  2.0     1.4  2.5

[3 rows x 25 columns]
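In the output above the `quantile` and `median` columns are identical, since `quantile` defaults to `q=0.5`, and the `Id` column is summarized even though it is only a row label. One possible refinement, sketched on a hypothetical six-row frame (`iris`, the species labels `a` and `b`, and all values are illustrative assumptions), drops `Id` and asks for the 25th and 75th percentiles explicitly:

```python
import pandas as pd

# Hypothetical six-row frame; labels and values are illustrative only.
iris = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "PetalLengthCm": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "Species": ["a", "a", "a", "b", "b", "b"],
})

# Drop the identifier column before grouping so it is not summarized.
grouped = iris.drop(columns="Id").groupby("Species")["PetalLengthCm"]

# With no argument, 'quantile' is the 0.5 quantile, i.e. the median.
default_q = grouped.agg(["quantile", "median"])

# Explicit quartiles: one column per requested quantile after unstacking.
quartiles = grouped.quantile([0.25, 0.75]).unstack()

print(default_q)
print(quartiles)
```

This keeps the summary to genuinely different statistics per species instead of repeating the median under two names.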