
DSBDA Assignment 3 Jupyter Notebook

The document is a Jupyter Notebook containing data analysis on student performance and the Iris dataset. It includes importing libraries, reading CSV files, and performing group statistics on student math scores by gender, as well as statistical details for selected Iris species. The analysis provides insights into the scores and characteristics of the datasets.

Uploaded by sumeet

DSBDA-Assignment-3 - Jupyter Notebook http://localhost:8888/notebooks/DSBDA-Assignment-3...

In [1]: import pandas as pd
        import numpy as np

In [3]: df = pd.read_csv("StudentsPerformance.csv")

In [4]: df

Out[4]:
     gender  race/ethnicity  parental level of education  lunch         test preparation course  math score  reading score  writing score
0    female  group B         bachelor's degree            standard      none                     72          72             74
1    female  group C         some college                 standard      completed                69          90             88
2    female  group B         master's degree              standard      none                     90          95             93
3    male    group A         associate's degree           free/reduced  none                     47          57             44
4    male    group C         some college                 standard      none                     76          78             75
...  ...     ...             ...                          ...           ...                      ...         ...            ...
995  female  group E         master's degree              standard      completed                88          99             95
996  male    group C         high school                  free/reduced  none                     62          55             55
997  female  group C         high school                  free/reduced  completed                59          71             65
998  female  group D         some college                 standard      completed                68          78             77
999  female  group D         some college                 free/reduced  none                     77          86             86

1000 rows × 8 columns

In [5]: df.head()

Out[5]:
     gender  race/ethnicity  parental level of education  lunch         test preparation course  math score  reading score  writing score
0    female  group B         bachelor's degree            standard      none                     72          72             74
1    female  group C         some college                 standard      completed                69          90             88
2    female  group B         master's degree              standard      none                     90          95             93
3    male    group A         associate's degree           free/reduced  none                     47          57             44
4    male    group C         some college                 standard      none                     76          78             75

1 of 3 20/02/25, 11:03

In [6]: df.tail()

Out[6]:
     gender  race/ethnicity  parental level of education  lunch         test preparation course  math score  reading score  writing score
995  female  group E         master's degree              standard      completed                88          99             95
996  male    group C         high school                  free/reduced  none                     62          55             55
997  female  group C         high school                  free/reduced  completed                59          71             65
998  female  group D         some college                 standard      completed                68          78             77
999  female  group D         some college                 free/reduced  none                     77          86             86

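In [8]: # The end of the next line was cut off in the export; the list of
        # aggregations is reconstructed from the column headers of Out[8].
        statistics = df.groupby('gender')['math score'].agg(['mean','median','min','max','std'])
        statistics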

Out[8]:
        mean       median  min  max  std
gender
female  63.633205  65.0    0    100  15.491453
male    68.728216  69.0    27   100  14.356277
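
The grouped summary in cell 8 can be tried without the CSV file. The following is a minimal sketch, assuming only that pandas is installed: the six rows are copied from the head and tail output shown earlier (rows 0 to 4 and 996), so the numbers will not match Out[8], and `summarize_by_group` is a hypothetical helper name that does not appear in the notebook.

```python
import pandas as pd

# Six rows copied from the head/tail output above; this is not the full dataset.
sample = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male", "male"],
    "math score": [72, 69, 90, 47, 76, 62],
})


def summarize_by_group(frame, group_col, value_col):
    """Hypothetical helper: per-group mean, median, min, max and sample std."""
    return frame.groupby(group_col)[value_col].agg(
        ["mean", "median", "min", "max", "std"]
    )


summary = summarize_by_group(sample, "gender", "math score")
print(summary)
```

Note that `std` here is the sample standard deviation (pandas uses `ddof=1` by default), which is also what Out[8] reports.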

In [9]: data = pd.read_csv("Iris.csv")

In [10]: data

Out[10]:
     Id   SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm  Species
0    1    5.1            3.5           1.4            0.2           Iris-setosa
1    2    4.9            3.0           1.4            0.2           Iris-setosa
2    3    4.7            3.2           1.3            0.2           Iris-setosa
3    4    4.6            3.1           1.5            0.2           Iris-setosa
4    5    5.0            3.6           1.4            0.2           Iris-setosa
...  ...  ...            ...           ...            ...           ...
145  146  6.7            3.0           5.2            2.3           Iris-virginica
146  147  6.3            2.5           5.0            1.9           Iris-virginica
147  148  6.5            3.0           5.2            2.0           Iris-virginica
148  149  6.2            3.4           5.4            2.3           Iris-virginica
149  150  5.9            3.0           5.1            1.8           Iris-virginica

150 rows × 6 columns


In [12]: selected_species = ['Iris-setosa','Iris-versicolor','Iris-virginica']
         filtered_df = data[data['Species'].isin(selected_species)]
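
Because `selected_species` lists all three species, the `isin` filter in cell 12 appears to keep every one of the 150 rows (the Id ranges in the statistics output are consistent with this). The following is a small sketch of how the mask behaves when only a subset is listed. It assumes pandas is installed, uses four rows copied from Out[10], and the names `iris_sample`, `mask`, `setosa_only` and `all_rows` are illustrative only.

```python
import pandas as pd

# Four rows copied from the Iris output above (rows 0, 1, 145, 146).
iris_sample = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 6.7, 6.3],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
})

# Boolean mask: True where Species is one of the listed values.
mask = iris_sample["Species"].isin(["Iris-setosa"])
setosa_only = iris_sample[mask]

# Listing every species that occurs in the column keeps every row.
all_rows = iris_sample[iris_sample["Species"].isin(iris_sample["Species"].unique())]

print(len(setosa_only), len(all_rows))
```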

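In [14]: # Both lines below were cut off in the export; the aggregation list is
         # reconstructed from the column headers in the printed output.
         # 'quantile' is called with its default q=0.5, so it repeats the median.
         species_stats = filtered_df.groupby('Species').agg(['quantile','mean','median','min','max'])
         print("\nBasic Statistical Details for Selected Species :\n",species_stats)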

Basic Statistical Details for Selected Species :

Id
                 quantile  mean   median  min  max
Species
Iris-setosa      25.5      25.5   25.5    1    50
Iris-versicolor  75.5      75.5   75.5    51   100
Iris-virginica   125.5     125.5  125.5   101  150

SepalLengthCm
                 quantile  mean   median  min  max
Species
Iris-setosa      5.0       5.006  5.0     4.3  5.8
Iris-versicolor  5.9       5.936  5.9     4.9  7.0
Iris-virginica   6.5       6.588  6.5     4.9  7.9

...

PetalLengthCm
                 quantile  mean   median  min  max
Species
Iris-setosa      1.50      1.464  1.50    1.0  1.9
Iris-versicolor  4.35      4.260  4.35    3.0  5.1
Iris-virginica   5.55      5.552  5.55    4.5  6.9

PetalWidthCm
                 quantile  mean   median  min  max
Species
Iris-setosa      0.2       0.244  0.2     0.1  0.6
Iris-versicolor  1.3       1.326  1.3     1.0  1.8
Iris-virginica   2.0       2.026  2.0     1.4  2.5

[3 rows x 25 columns]
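
In the output above, the quantile column equals the median column for every species, because `quantile` defaults to `q=0.5`. The Id column is also summarized even though it is only a row identifier. If quartiles are wanted instead, one possible variant is sketched below. It assumes pandas is installed, uses ten rows copied from Out[10], and `q25` and `q75` are hypothetical helper names, not part of the notebook.

```python
import pandas as pd

# Ten rows copied from the Iris output above (rows 0-4 and 145-149).
iris_sample = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 146, 147, 148, 149, 150],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4, 5.2, 5.0, 5.2, 5.4, 5.1],
    "Species": ["Iris-setosa"] * 5 + ["Iris-virginica"] * 5,
})


def q25(series):
    """Hypothetical helper: first quartile of a column."""
    return series.quantile(0.25)


def q75(series):
    """Hypothetical helper: third quartile of a column."""
    return series.quantile(0.75)


# Drop the identifier column so it is not summarized, then aggregate.
quartile_stats = (
    iris_sample.drop(columns="Id")
    .groupby("Species")
    .agg([q25, "median", q75])
)
print(quartile_stats)
```

The function names become the second level of the column labels, so the result is indexed as `quartile_stats[("PetalLengthCm", "q25")]`.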

In [ ]:
