9/12/2021 Untitled5 - Jupyter Notebook
In [1]: import pandas as pd
import numpy as np
In [2]: survey=pd.read_csv("Survey-1.csv")
survey
Out[2]:
    ID  Gender  Age  Class      Major                   Grad Intention  GPA  Employment  Salary  Social Networking  Satisfaction  Spending  Computer  Text Messages
0   1   Female  20   Junior     Other                   Yes             2.9  Full-Time   50.0    1                  3             350       Laptop    20
1   2   Male    23   Senior     Management              Yes             3.6  Part-Time   25.0    1                  4             360       Laptop    5
2   3   Male    21   Junior     Other                   Yes             2.5  Part-Time   45.0    2                  4             600       Laptop    20
3   4   Male    21   Junior     CIS                     Yes             2.5  Full-Time   40.0    4                  6             600       Laptop    25
4   5   Male    23   Senior     Other                   Undecided       2.8  Unemployed  40.0    2                  4             500       Laptop    10
... ... ...     ...  ...        ...                     ...             ...  ...         ...     ...                ...           ...       ...       ...
57  58  Female  21   Senior     International Business  No              2.4  Part-Time   40.0    1                  3             1000      Laptop    1
58  59  Female  20   Junior     CIS                     No              2.9  Part-Time   40.0    2                  4             350       Laptop    25
59  60  Female  20   Sophomore  CIS                     No              2.5  Part-Time   55.0    1                  4             500       Laptop    50
60  61  Female  23   Senior     Accounting              Yes             3.5  Part-Time   30.0    2                  3             490       Laptop    5
61  62  Female  23   Senior     Economics/Finance       No              3.2  Part-Time   70.0    2                  3             250       Laptop
62 rows × 14 columns
In [3]: import matplotlib.pyplot as plt
https://hub.gke2.mybinder.org/user/ipython-ipython-in-depth-e6cpxq3c/notebooks/binder/Untitled5.ipynb?kernel_name=python3# 1/11
In [4]: survey.describe()
Out[4]:
ID Age GPA Salary Social Networking Satisfaction Spending Text Messages
count 62.000000 62.000000 62.000000 62.000000 62.000000 62.000000 62.000000 62.000000
mean 31.500000 21.129032 3.129032 48.548387 1.516129 3.741935 482.016129 246.209677
std 18.041619 1.431311 0.377388 12.080912 0.844305 1.213793 221.953805 214.465950
min 1.000000 18.000000 2.300000 25.000000 0.000000 1.000000 100.000000 0.000000
25% 16.250000 20.000000 2.900000 40.000000 1.000000 3.000000 312.500000 100.000000
50% 31.500000 21.000000 3.150000 50.000000 1.000000 4.000000 500.000000 200.000000
75% 46.750000 22.000000 3.400000 55.000000 2.000000 4.000000 600.000000 300.000000
max 62.000000 26.000000 3.900000 80.000000 4.000000 6.000000 1400.000000 900.000000
In [5]: survey.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 62 entries, 0 to 61
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ID 62 non-null int64
1 Gender 62 non-null object
2 Age 62 non-null int64
3 Class 62 non-null object
4 Major 62 non-null object
5 Grad Intention 62 non-null object
6 GPA 62 non-null float64
7 Employment 62 non-null object
8 Salary 62 non-null float64
9 Social Networking 62 non-null int64
10 Satisfaction 62 non-null int64
11 Spending 62 non-null int64
12 Computer 62 non-null object
13 Text Messages 62 non-null int64
dtypes: float64(2), int64(6), object(6)
memory usage: 6.9+ KB
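The describe() call in In [4] summarizes only the numeric columns, while info() lists six object (string) columns. A minimal sketch of summarizing the string columns as well, assuming a small hypothetical stand-in frame named `sample` in place of `survey`:

```python
import pandas as pd

# Hypothetical stand-in rows; in the notebook this would be the `survey` DataFrame.
sample = pd.DataFrame({
    "Gender": ["Female", "Male", "Male", "Male", "Female"],
    "Class": ["Junior", "Senior", "Junior", "Junior", "Senior"],
    "GPA": [2.9, 3.6, 2.5, 2.5, 3.5],
})

# Excluding numeric dtypes leaves the string columns; for those, describe()
# reports count, number of unique values, the most frequent value (top),
# and how often it occurs (freq).
categorical_summary = sample.describe(exclude=["number"])
print(categorical_summary)
```

In the notebook itself the same call would be `survey.describe(exclude=["number"])`.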
In [7]: data_crosstab = pd.crosstab(survey['Gender'], survey['Major'], margins=False)
In [8]: print(data_crosstab)
Major Accounting CIS Economics/Finance International Business \
Gender
Female 3 3 7 4
Male 4 1 4 2
Major Management Other Retailing/Marketing Undecided
Gender
Female 4 3 9 0
Male 6 4 5 3
In [9]: data_crosstab = pd.crosstab(survey['Gender'], survey['Grad Intention'], margins=False)
print(data_crosstab)
Grad Intention No Undecided Yes
Gender
Female 9 13 11
Male 3 9 17
In [10]: data_crosstab = pd.crosstab(survey['Gender'], survey['Employment'], margins=False)
print(data_crosstab)
Employment Full-Time Part-Time Unemployed
Gender
Female 3 24 6
Male 7 19 3
In [11]: data_crosstab = pd.crosstab(survey['Gender'], survey['Computer'], margins=False)
print(data_crosstab)
Computer Desktop Laptop Tablet
Gender
Female 2 29 2
Male 3 26 0
In [12]: data_crosstab = pd.crosstab(survey['Gender'], survey['Grad Intention'], margins=True)
print(data_crosstab)
Grad Intention No Undecided Yes All
Gender
Female 9 13 11 33
Male 3 9 17 29
All 12 22 28 62
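Raw counts are hard to compare when the groups differ in size (33 women, 29 men). A minimal sketch that converts the Gender by Grad Intention counts printed above into row percentages; the frame `counts` is rebuilt by hand from that output and is not part of the original notebook:

```python
import pandas as pd

# Counts copied from the Gender x Grad Intention crosstab printed above.
counts = pd.DataFrame(
    {"No": [9, 3], "Undecided": [13, 9], "Yes": [11, 17]},
    index=pd.Index(["Female", "Male"], name="Gender"),
)

# Divide each row by its own total so that every row sums to 100%.
row_pct = counts.div(counts.sum(axis=1), axis=0).mul(100).round(1)
print(row_pct)
```

With the raw data loaded, `pd.crosstab(survey['Gender'], survey['Grad Intention'], normalize='index')` should give the same proportions (as fractions rather than percentages) in one step.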
In [13]: conda install seaborn
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.9.2
latest version: 4.10.3
Please update conda by running
$ conda update -n base conda
# All requested packages already installed.
Note: you may need to restart the kernel to use updated packages.
In [18]: sns.distplot(survey['GPA'])
/srv/conda/envs/notebook/lib/python3.6/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
Out[18]: <AxesSubplot:xlabel='GPA', ylabel='Density'>
In [19]: sns.distplot(survey['Salary'])
/srv/conda/envs/notebook/lib/python3.6/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
Out[19]: <AxesSubplot:xlabel='Salary', ylabel='Density'>
In [20]: sns.distplot(survey['Spending'])
/srv/conda/envs/notebook/lib/python3.6/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
Out[20]: <AxesSubplot:xlabel='Spending', ylabel='Density'>
In [21]: sns.distplot(survey['Text Messages'])
/srv/conda/envs/notebook/lib/python3.6/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
Out[21]: <AxesSubplot:xlabel='Text Messages', ylabel='Density'>
In [22]: from scipy import stats
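In [*]: # stats.shapiro is the Shapiro-Wilk normality test, not a one-sample t test.
# one_sample_data is not defined earlier in this notebook and must be assigned
# (for example, to one numeric column of survey) before this cell can run.
w_statistic, p_value = stats.shapiro(one_sample_data)
print('Shapiro-Wilk test \nW statistic: {0} p value: {1} '.format(w_statistic, p_value))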
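A self-contained sketch of the Shapiro-Wilk call, since `one_sample_data` is never defined in the notebook. The array `sample_values` is randomly generated and purely hypothetical; it stands in for a numeric survey column such as `survey['GPA']`, and the 0.05 threshold is a conventional choice, not one taken from the notebook:

```python
import numpy as np
from scipy import stats

# Hypothetical data: 62 draws from a normal distribution, standing in for
# one numeric column of the survey.
rng = np.random.default_rng(0)
sample_values = rng.normal(loc=3.1, scale=0.4, size=62)

# shapiro returns the W statistic and a p value; the null hypothesis is that
# the sample comes from a normal distribution.
w_statistic, p_value = stats.shapiro(sample_values)
print("Shapiro-Wilk W: {0:.3f}, p value: {1:.3f}".format(w_statistic, p_value))

# A small p value (below alpha) is evidence against normality. The p value is
# used as returned; halving it is not part of this test.
alpha = 0.05
rejects_normality = p_value < alpha
```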
In [16]: import seaborn as sns