
Pandas PD: Import As

The document loads and explores two datasets: 1) The Titanic dataset with 891 passengers and 12 variables including Survived, Pclass, Name, Sex, Age, etc. Various commands are used to view the shape, head, tail and info of the dataframe. 2) The Iris dataset with 150 observations and 5 variables related to iris flowers including sepal length, petal width, and target species. The head of this dataframe is also viewed.

Uploaded by

Razin

eda

October 11, 2023

[2]: import pandas as pd

[3]: df=pd.read_csv("titanic_dataset.csv")

[5]: df.head()

[5]:    PassengerId  Survived  Pclass  \
     0            1         0       3
     1            2         1       1
     2            3         1       3
     3            4         1       1
     4            5         0       3

                                                   Name     Sex   Age  SibSp  \
     0                          Braund, Mr. Owen Harris    male  22.0      1
     1  Cumings, Mrs. John Bradley (Florence Briggs Th…  female  38.0      1
     2                           Heikkinen, Miss. Laina  female  26.0      0
     3     Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1
     4                         Allen, Mr. William Henry    male  35.0      0

        Parch            Ticket     Fare  Cabin  Embarked
     0      0         A/5 21171   7.2500    NaN         S
     1      0          PC 17599  71.2833    C85         C
     2      0  STON/O2. 3101282   7.9250    NaN         S
     3      0            113803  53.1000   C123         S
     4      0            373450   8.0500    NaN         S

[6]: df.head(8)

[6]: PassengerId Survived Pclass \


0 1 0 3
1 2 1 1
2 3 1 3
3 4 1 1
4 5 0 3
5 6 0 3
6 7 0 1
7 8 0 3

Name Sex Age SibSp \


0 Braund, Mr. Owen Harris male 22.0 1
1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1
2 Heikkinen, Miss. Laina female 26.0 0
3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1
4 Allen, Mr. William Henry male 35.0 0
5 Moran, Mr. James male NaN 0
6 McCarthy, Mr. Timothy J male 54.0 0
7 Palsson, Master. Gosta Leonard male 2.0 3

Parch Ticket Fare Cabin Embarked


0 0 A/5 21171 7.2500 NaN S
1 0 PC 17599 71.2833 C85 C
2 0 STON/O2. 3101282 7.9250 NaN S
3 0 113803 53.1000 C123 S
4 0 373450 8.0500 NaN S
5 0 330877 8.4583 NaN Q
6 0 17463 51.8625 E46 S
7 1 349909 21.0750 NaN S

[7]: df.tail()

[7]: PassengerId Survived Pclass Name \


886 887 0 2 Montvila, Rev. Juozas
887 888 1 1 Graham, Miss. Margaret Edith
888 889 0 3 Johnston, Miss. Catherine Helen "Carrie"
889 890 1 1 Behr, Mr. Karl Howell
890 891 0 3 Dooley, Mr. Patrick

Sex Age SibSp Parch Ticket Fare Cabin Embarked


886 male 27.0 0 0 211536 13.00 NaN S
887 female 19.0 0 0 112053 30.00 B42 S
888 female NaN 1 2 W./C. 6607 23.45 NaN S
889 male 26.0 0 0 111369 30.00 C148 C
890 male 32.0 0 0 370376 7.75 NaN Q

[8]: df.tail(10)

[8]: PassengerId Survived Pclass Name \


881 882 0 3 Markun, Mr. Johann
882 883 0 3 Dahlberg, Miss. Gerda Ulrika
883 884 0 2 Banfield, Mr. Frederick James
884 885 0 3 Sutehall, Mr. Henry Jr
885 886 0 3 Rice, Mrs. William (Margaret Norton)
886 887 0 2 Montvila, Rev. Juozas
887 888 1 1 Graham, Miss. Margaret Edith
888 889 0 3 Johnston, Miss. Catherine Helen "Carrie"
889 890 1 1 Behr, Mr. Karl Howell
890 891 0 3 Dooley, Mr. Patrick

Sex Age SibSp Parch Ticket Fare Cabin Embarked


881 male 33.0 0 0 349257 7.8958 NaN S
882 female 22.0 0 0 7552 10.5167 NaN S
883 male 28.0 0 0 C.A./SOTON 34068 10.5000 NaN S
884 male 25.0 0 0 SOTON/OQ 392076 7.0500 NaN S
885 female 39.0 0 5 382652 29.1250 NaN Q
886 male 27.0 0 0 211536 13.0000 NaN S
887 female 19.0 0 0 112053 30.0000 B42 S
888 female NaN 1 2 W./C. 6607 23.4500 NaN S
889 male 26.0 0 0 111369 30.0000 C148 C
890 male 32.0 0 0 370376 7.7500 NaN Q

[9]: df.shape

[9]: (891, 12)
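
Note: a minimal sketch of the inspection calls used so far, run on a small hypothetical stand-in frame (`toy`) built from the first rows printed above rather than on the CSV file itself. `head()` and `tail()` default to 5 rows, and `shape` is an attribute, so it takes no parentheses.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic file; values copied from df.head(8) above.
toy = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6, 7, 8],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 0],
    "Pclass": [3, 1, 3, 1, 3, 3, 1, 3],
})

first_five = toy.head()       # default n=5
first_three = toy.head(3)     # explicit row count
last_two = toy.tail(2)        # counted from the end
n_rows, n_cols = toy.shape    # (rows, columns) tuple; an attribute, not a method
```

On the real data the same attribute returned (891, 12), i.e. 891 passengers and 12 columns.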

[7]: df2=pd.read_csv("iris_dataset.csv")
df2

[7]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
.. … … … …
145 6.7 3.0 5.2 2.3
146 6.3 2.5 5.0 1.9
147 6.5 3.0 5.2 2.0
148 6.2 3.4 5.4 2.3
149 5.9 3.0 5.1 1.8

target
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
.. …
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica

[150 rows x 5 columns]

[12]: df2.head(10)

[12]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
5 5.4 3.9 1.7 0.4
6 4.6 3.4 1.4 0.3
7 5.0 3.4 1.5 0.2
8 4.4 2.9 1.4 0.2
9 4.9 3.1 1.5 0.1

target
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
5 Iris-setosa
6 Iris-setosa
7 Iris-setosa
8 Iris-setosa
9 Iris-setosa

[13]: df2.shape

[13]: (150, 5)

[14]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Pclass 891 non-null int64
3 Name 891 non-null object
4 Sex 891 non-null object
5 Age 714 non-null float64
6 SibSp 891 non-null int64
7 Parch 891 non-null int64
8 Ticket 891 non-null object
9 Fare 891 non-null float64
10 Cabin 204 non-null object
11 Embarked 889 non-null object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB

[16]: df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sepal length (cm) 150 non-null float64
1 sepal width (cm) 150 non-null float64
2 petal length (cm) 150 non-null float64
3 petal width (cm) 150 non-null float64
4 target 150 non-null object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
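
Note: `info()` prints its report rather than returning it. When the same facts are needed as values (for example, to pick out the numeric columns), something like the following sketch may help. The frame `toy` is a hypothetical three-column stand-in, not the notebook's data.

```python
import pandas as pd

# Hypothetical stand-in with one integer, one float and one string column.
toy = pd.DataFrame({
    "Pclass": [3, 1],
    "Fare": [7.25, 71.2833],
    "Sex": ["male", "female"],
})

column_types = toy.dtypes                        # one dtype per column
numeric = toy.select_dtypes(include="number")    # keeps only the int and float columns
non_null = toy.notna().sum()                     # the "Non-Null Count" column of info()
```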

[17]: df.describe()

[17]: PassengerId Survived Pclass Age SibSp \


count 891.000000 891.000000 891.000000 714.000000 891.000000
mean 446.000000 0.383838 2.308642 29.699118 0.523008
std 257.353842 0.486592 0.836071 14.526497 1.102743
min 1.000000 0.000000 1.000000 0.420000 0.000000
25% 223.500000 0.000000 2.000000 20.125000 0.000000
50% 446.000000 0.000000 3.000000 28.000000 0.000000
75% 668.500000 1.000000 3.000000 38.000000 1.000000
max 891.000000 1.000000 3.000000 80.000000 8.000000

Parch Fare
count 891.000000 891.000000
mean 0.381594 32.204208
std 0.806057 49.693429
min 0.000000 0.000000
25% 0.000000 7.910400
50% 0.000000 14.454200
75% 0.000000 31.000000
max 6.000000 512.329200

[18]: df2.describe()

[18]: sepal length (cm) sepal width (cm) petal length (cm) \
count 150.000000 150.000000 150.000000
mean 5.843333 3.054000 3.758667
std 0.828066 0.433594 1.764420
min 4.300000 2.000000 1.000000
25% 5.100000 2.800000 1.600000
50% 5.800000 3.000000 4.350000
75% 6.400000 3.300000 5.100000
max 7.900000 4.400000 6.900000

petal width (cm)


count 150.000000
mean 1.198667
std 0.763161
min 0.100000
25% 0.300000
50% 1.300000
75% 1.800000
max 2.500000
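
Note: in the Titanic summary above, the Age count is 714 rather than 891 because `describe()` skips missing values, and only numeric columns appear by default. The sketch below illustrates both points on two hypothetical Series whose values are taken from the first eight rows of the Titanic output.

```python
import numpy as np
import pandas as pd

# Values from the first eight Titanic rows shown earlier; one Age is missing.
ages = pd.Series([22.0, 38.0, 26.0, 35.0, 35.0, np.nan, 54.0, 2.0], name="Age")
sex = pd.Series(["male", "female", "female", "female",
                 "male", "male", "male", "male"], name="Sex")

age_summary = ages.describe()    # count, mean, std, min, quartiles, max
n_non_null = age_summary["count"]  # NaN is not counted
mean_age = ages.mean()           # also skips NaN

sex_summary = sex.describe()     # for text data: count, unique, top, freq
```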

[19]: df.isnull()

[19]: PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket \
0 False False False False False False False False False
1 False False False False False False False False False
2 False False False False False False False False False
3 False False False False False False False False False
4 False False False False False False False False False
.. … … … … … … … … …
886 False False False False False False False False False
887 False False False False False False False False False
888 False False False False False True False False False
889 False False False False False False False False False
890 False False False False False False False False False

Fare Cabin Embarked


0 False True False
1 False False False
2 False True False
3 False False False
4 False True False
.. … … …
886 False True False
887 False False False
888 False True False
889 False False False
890 False True False

[891 rows x 12 columns]

[20]: df.isnull().sum()

[20]: PassengerId 0
Survived 0
Pclass 0
Name 0
Sex 0
Age 177
SibSp 0
Parch 0
Ticket 0
Fare 0
Cabin 687
Embarked 2
dtype: int64

[21]: df2.isnull().sum()

[21]: sepal length (cm)    0
      sepal width (cm)     0
      petal length (cm)    0
      petal width (cm)     0
      target               0
      dtype: int64
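
Note: raw counts are easier to judge as shares. In the Titanic output above, Age is missing in 177 of 891 rows (about 20%) and Cabin in 687 of 891 (about 77%). Because `isna()` returns booleans, taking the mean gives that share directly. A sketch on a hypothetical four-row frame:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in with a few missing entries.
toy = pd.DataFrame({
    "Age": [22.0, 38.0, np.nan, 35.0],
    "Cabin": [None, "C85", None, "C123"],
    "Embarked": ["S", "C", "S", "S"],
})

missing_count = toy.isna().sum()    # number of missing values per column
missing_share = toy.isna().mean()   # fraction missing per column (True counts as 1)
```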

[22]: df.drop_duplicates()

[22]: PassengerId Survived Pclass \


0 1 0 3
1 2 1 1
2 3 1 3
3 4 1 1
4 5 0 3
.. … … …
886 887 0 2
887 888 1 1
888 889 0 3
889 890 1 1
890 891 0 3

Name Sex Age SibSp \


0 Braund, Mr. Owen Harris male 22.0 1
1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1
2 Heikkinen, Miss. Laina female 26.0 0
3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1
4 Allen, Mr. William Henry male 35.0 0
.. … … … …
886 Montvila, Rev. Juozas male 27.0 0
887 Graham, Miss. Margaret Edith female 19.0 0
888 Johnston, Miss. Catherine Helen "Carrie" female NaN 1
889 Behr, Mr. Karl Howell male 26.0 0
890 Dooley, Mr. Patrick male 32.0 0

Parch Ticket Fare Cabin Embarked


0 0 A/5 21171 7.2500 NaN S
1 0 PC 17599 71.2833 C85 C
2 0 STON/O2. 3101282 7.9250 NaN S
3 0 113803 53.1000 C123 S
4 0 373450 8.0500 NaN S
.. … … … … …
886 0 211536 13.0000 NaN S
887 0 112053 30.0000 B42 S
888 2 W./C. 6607 23.4500 NaN S
889 0 111369 30.0000 C148 C
890 0 370376 7.7500 NaN Q

[891 rows x 12 columns]

[23]: df2.drop_duplicates()

[23]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
.. … … … …
145 6.7 3.0 5.2 2.3
146 6.3 2.5 5.0 1.9
147 6.5 3.0 5.2 2.0
148 6.2 3.4 5.4 2.3
149 5.9 3.0 5.1 1.8

target
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
.. …
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica

[147 rows x 5 columns]
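
Note: the call above returns a new 147-row frame, which means the Iris data contains 3 rows that repeat an earlier row. `df2` itself is not modified, since the result is never assigned. The sketch below shows how to count duplicates first and how to keep the result, using a hypothetical five-row frame.

```python
import pandas as pd

# Hypothetical stand-in: rows 2 and 4 repeat rows 1 and 3.
toy = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9, 4.9, 5.8, 5.8],
    "target": ["Iris-setosa", "Iris-setosa", "Iris-setosa",
               "Iris-virginica", "Iris-virginica"],
})

n_dupes = int(toy.duplicated().sum())   # rows identical to an earlier row
deduped = toy.drop_duplicates()         # returns a new frame
rows_after_call = len(toy)              # the original is unchanged

# Assign the result back to keep it; reset_index renumbers the rows from 0.
toy = toy.drop_duplicates().reset_index(drop=True)
```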

[24]: df.value_counts('Sex')

[24]: Sex
male 577
female 314
dtype: int64

[26]: df2.value_counts('target')

[26]: target
Iris-setosa 50
Iris-versicolor 50
Iris-virginica 50
dtype: int64
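
Note: counts can also be reported as proportions. For the Sex column above, 577 of 891 is about 65% male and 314 of 891 about 35% female. A sketch with the real `normalize` parameter on a hypothetical four-value Series:

```python
import pandas as pd

# Hypothetical stand-in for the Sex column.
sex = pd.Series(["male", "female", "male", "male"], name="Sex")

counts = sex.value_counts()                  # absolute frequencies, most common first
shares = sex.value_counts(normalize=True)    # relative frequencies that sum to 1
```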

[27]: df2.value_counts('sepal length (cm)')>3.5

[27]: sepal length (cm)


5.0 True
6.3 True
5.1 True
5.7 True
6.7 True
6.4 True
5.8 True
5.5 True
5.4 True
6.1 True
5.6 True
6.0 True
4.9 True
4.8 True
6.5 True
6.2 True
5.2 True
4.6 True
7.7 True
6.9 True
7.2 False
5.9 False
4.4 False
6.8 False
4.7 False
6.6 False
7.4 False
7.6 False
7.3 False
4.3 False
7.1 False
7.0 False
5.3 False
4.5 False
7.9 False
dtype: bool

[28]: df.value_counts('Age')>18

[28]: Age
24.00 True
22.00 True
18.00 True
30.00 True
28.00 True
…
20.50 False
14.50 False
12.00 False
0.92 False
80.00 False
Length: 88, dtype: bool
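
Note: the two expressions above compare the counts returned by `value_counts` with the threshold, not the values themselves. True therefore means "this value occurs more than 3.5 (or 18) times", which is why 7.9 and 80.00 show False. If the intent was to select rows whose value exceeds a threshold (an assumption, since the notebook does not say), a row-wise mask does that. The sketch uses a hypothetical seven-row frame and the Series form of `value_counts`.

```python
import pandas as pd

# Hypothetical stand-in: 5.0 occurs four times, 6.3 twice, 7.9 once.
lengths = pd.DataFrame({"sepal length (cm)": [5.0, 5.0, 5.0, 5.0, 6.3, 6.3, 7.9]})
col = lengths["sepal length (cm)"]

counts = col.value_counts()
frequent = counts > 3.5       # compares the counts: which values occur more than 3.5 times

row_mask = col > 5.5          # compares the values themselves, one boolean per row
n_long = int(row_mask.sum())  # number of rows above the threshold
long_rows = lengths[row_mask] # the matching rows
```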

[29]: df.sample()

[29]: PassengerId Survived Pclass Name Sex Age \


655 656 0 2 Hickman, Mr. Leonard Mark male 24.0

SibSp Parch Ticket Fare Cabin Embarked


655 2 0 S.O.C. 14879 73.5 NaN S

[10]: df2.sample()

[10]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
80 5.5 2.4 3.8 1.1

target
80 Iris-versicolor

[30]: df.sample(5)

[30]: PassengerId Survived Pclass Name \


843 844 0 3 Lemberopolous, Mr. Peter L
255 256 1 3 Touma, Mrs. Darwis (Hanne Youssef Razi)
732 733 0 2 Knight, Mr. Robert J
201 202 0 3 Sage, Mr. Frederick
361 362 0 2 del Carlo, Mr. Sebastiano

Sex Age SibSp Parch Ticket Fare Cabin Embarked


843 male 34.5 0 0 2683 6.4375 NaN C
255 female 29.0 0 2 2650 15.2458 NaN C
732 male NaN 0 0 239855 0.0000 NaN S
201 male NaN 8 2 CA. 2343 69.5500 NaN S
361 male 29.0 1 0 SC/PARIS 2167 27.7208 NaN C

[31]: df.sample(axis=1)

[31]: Fare
0 7.2500
1 71.2833
2 7.9250
3 53.1000
4 8.0500
.. …
886 13.0000
887 30.0000
888 23.4500
889 30.0000
890 7.7500

[891 rows x 1 columns]
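
Note: `sample()` draws different rows (or, with `axis=1`, a different column) on every run, so the outputs above will not repeat. Passing `random_state` makes the draw repeatable. A sketch on a hypothetical ten-row frame:

```python
import pandas as pd

# Hypothetical stand-in with ten rows and two columns.
toy = pd.DataFrame({
    "PassengerId": range(1, 11),
    "Fare": [7.25, 71.28, 7.93, 53.10, 8.05, 8.46, 51.86, 21.08, 11.13, 30.07],
})

draw_a = toy.sample(n=3, random_state=0)          # three random rows
draw_b = toy.sample(n=3, random_state=0)          # same seed, same rows
half = toy.sample(frac=0.5, random_state=0)       # a fraction instead of a count
one_column = toy.sample(n=1, axis=1, random_state=0)  # one random column, all rows
```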

[32]: df.nlargest(5,'Age')

[32]: PassengerId Survived Pclass Name \


630 631 1 1 Barkworth, Mr. Algernon Henry Wilson
851 852 0 3 Svensson, Mr. Johan
96 97 0 1 Goldschmidt, Mr. George B
493 494 0 1 Artagaveytia, Mr. Ramon
116 117 0 3 Connors, Mr. Patrick

Sex Age SibSp Parch Ticket Fare Cabin Embarked


630 male 80.0 0 0 27042 30.0000 A23 S
851 male 74.0 0 0 347060 7.7750 NaN S
96 male 71.0 0 0 PC 17754 34.6542 A5 C
493 male 71.0 0 0 PC 17609 49.5042 NaN C
116 male 70.5 0 0 370369 7.7500 NaN Q

[9]: df2.nlargest(5,'sepal length (cm)')

[9]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
131 7.9 3.8 6.4 2.0
117 7.7 3.8 6.7 2.2
118 7.7 2.6 6.9 2.3
122 7.7 2.8 6.7 2.0
135 7.7 3.0 6.1 2.3

target
131 Iris-virginica
117 Iris-virginica
118 Iris-virginica
122 Iris-virginica
135 Iris-virginica

[33]: df.nsmallest(5,'Age')

[33]: PassengerId Survived Pclass Name Sex \


803 804 1 3 Thomas, Master. Assad Alexander male
755 756 1 2 Hamalainen, Master. Viljo male
469 470 1 3 Baclini, Miss. Helene Barbara female
644 645 1 3 Baclini, Miss. Eugenie female
78 79 1 2 Caldwell, Master. Alden Gates male

Age SibSp Parch Ticket Fare Cabin Embarked


803 0.42 0 1 2625 8.5167 NaN C
755 0.67 1 1 250649 14.5000 NaN S
469 0.75 2 1 2666 19.2583 NaN C
644 0.75 2 1 2666 19.2583 NaN C
78 0.83 0 2 248738 29.0000 NaN S

[8]: df2.nsmallest(5,'sepal length (cm)')

[8]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
13 4.3 3.0 1.1 0.1
8 4.4 2.9 1.4 0.2
38 4.4 3.0 1.3 0.2
42 4.4 3.2 1.3 0.2
41 4.5 2.3 1.3 0.3

target
13 Iris-setosa
8 Iris-setosa
38 Iris-setosa
42 Iris-setosa
41 Iris-setosa
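
Note: `nlargest(n, col)` gives the same rows as sorting by `col` in descending order and taking the first `n`, and it ignores missing values. When ties straddle the cut-off, `keep="all"` returns every tied row. A sketch on a hypothetical five-row frame:

```python
import pandas as pd

# Hypothetical stand-in: two passengers share age 71, one age is missing.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Age": [71.0, 80.0, 71.0, None, 74.0],
})

top2 = toy.nlargest(2, "Age")                                   # ages 80 and 74
same_rows = toy.sort_values("Age", ascending=False).head(2)     # NaN sorts last by default
top3_with_ties = toy.nlargest(3, "Age", keep="all")             # both 71s are kept
youngest = toy.nsmallest(1, "Age")                              # NaN is skipped
```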

[34]: df.loc[[0,1,2]]

[34]: PassengerId Survived Pclass \


0 1 0 3
1 2 1 1
2 3 1 3

Name Sex Age SibSp \


0 Braund, Mr. Owen Harris male 22.0 1
1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1
2 Heikkinen, Miss. Laina female 26.0 0

Parch Ticket Fare Cabin Embarked


0 0 A/5 21171 7.2500 NaN S
1 0 PC 17599 71.2833 C85 C
2 0 STON/O2. 3101282 7.9250 NaN S

[35]: df.loc[[3,4,5],['PassengerId','Survived','Name']]

[35]:    PassengerId  Survived                                          Name
      3            4         1  Futrelle, Mrs. Jacques Heath (Lily May Peel)
      4            5         0                      Allen, Mr. William Henry
      5            6         0                              Moran, Mr. James

[36]: df.loc[0:2, ['PassengerId', 'Survived', 'Name']]

[36]:    PassengerId  Survived                                             Name
      0            1         0                          Braund, Mr. Owen Harris
      1            2         1  Cumings, Mrs. John Bradley (Florence Briggs Th…
      2            3         1                           Heikkinen, Miss. Laina

[37]: df.loc[df['Age'] > 35.5]

[37]: PassengerId Survived Pclass \


1 2 1 1
6 7 0 1
11 12 1 1
13 14 0 3
15 16 1 2
.. … … …
865 866 1 2
871 872 1 1
873 874 0 3
879 880 1 1
885 886 0 3

Name Sex Age SibSp \


1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1
6 McCarthy, Mr. Timothy J male 54.0 0
11 Bonnell, Miss. Elizabeth female 58.0 0
13 Andersson, Mr. Anders Johan male 39.0 1
15 Hewlett, Mrs. (Mary D Kingcome) female 55.0 0
.. … … … …
865 Bystrom, Mrs. (Karolina) female 42.0 0
871 Beckwith, Mrs. Richard Leonard (Sallie Monypeny) female 47.0 1
873 Vander Cruyssen, Mr. Victor male 47.0 0
879 Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) female 56.0 0
885 Rice, Mrs. William (Margaret Norton) female 39.0 0

Parch Ticket Fare Cabin Embarked


1 0 PC 17599 71.2833 C85 C
6 0 17463 51.8625 E46 S
11 0 113783 26.5500 C103 S
13 5 347082 31.2750 NaN S
15 0 248706 16.0000 NaN S
.. … … … … …
865 0 236852 13.0000 NaN S
871 1 11751 52.5542 D35 S
873 0 345765 9.0000 NaN S
879 1 11767 83.1583 C50 C
885 5 382652 29.1250 NaN Q

[217 rows x 12 columns]

[38]: df.loc[(df['Survived'] == 1) & (df['Age'] > 30.0)]

[38]: PassengerId Survived Pclass \


1 2 1 1
3 4 1 1
11 12 1 1
15 16 1 2
21 22 1 2
.. … … …
857 858 1 1
862 863 1 1
865 866 1 2
871 872 1 1
879 880 1 1

Name Sex Age SibSp \
1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1
3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1
11 Bonnell, Miss. Elizabeth female 58.0 0
15 Hewlett, Mrs. (Mary D Kingcome) female 55.0 0
21 Beesley, Mr. Lawrence male 34.0 0
.. … … … …
857 Daly, Mr. Peter Denis male 51.0 0
862 Swift, Mrs. Frederick Joel (Margaret Welles Ba… female 48.0 0
865 Bystrom, Mrs. (Karolina) female 42.0 0
871 Beckwith, Mrs. Richard Leonard (Sallie Monypeny) female 47.0 1
879 Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) female 56.0 0

Parch Ticket Fare Cabin Embarked


1 0 PC 17599 71.2833 C85 C
3 0 113803 53.1000 C123 S
11 0 113783 26.5500 C103 S
15 0 248706 16.0000 NaN S
21 0 248698 13.0000 D56 S
.. … … … … …
857 0 113055 26.5500 E17 S
862 0 17466 25.9292 D17 S
865 0 236852 13.0000 NaN S
871 1 11751 52.5542 D35 S
879 1 11767 83.1583 C50 C

[124 rows x 12 columns]
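
Note: each condition in a combined mask needs its own parentheses because `&` binds more tightly than `==` and `>`. Rows with a missing Age never match, since a comparison with NaN is False. The sketch below uses a hypothetical six-row frame; `query` and `between` are alternative spellings the notebook itself does not use.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in; the last Age is missing.
toy = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0, 0],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0, np.nan],
})

# Same pattern as the cell above: both conditions must hold.
older_survivors = toy[(toy["Survived"] == 1) & (toy["Age"] > 30.0)]

# Equivalent filter written as a query string.
same_rows = toy.query("Survived == 1 and Age > 30.0")

# NaN > 30.0 evaluates to False, so the last row is excluded.
nan_row_matches = bool((toy["Age"] > 30.0).iloc[-1])

# Range filter; both ends are included by default.
mid_aged = toy[toy["Age"].between(25, 36)]
```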

[39]: df.iloc[[0, 2, 3]]

[39]: PassengerId Survived Pclass \


0 1 0 3
2 3 1 3
3 4 1 1

Name Sex Age SibSp Parch \


0 Braund, Mr. Owen Harris male 22.0 1 0
2 Heikkinen, Miss. Laina female 26.0 0 0
3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0

Ticket Fare Cabin Embarked


0 A/5 21171 7.250 NaN S
2 STON/O2. 3101282 7.925 NaN S
3 113803 53.100 C123 S

[5]: df.iloc[[0, 2], [0, 1]]

[5]: PassengerId Survived
0 1 0
2 3 1

[4]: df.iloc[1:5, 2:5]

[4]: Pclass Name Sex


1 1 Cumings, Mrs. John Bradley (Florence Briggs Th… female
2 3 Heikkinen, Miss. Laina female
3 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female
4 3 Allen, Mr. William Henry male
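
Note: `loc` selects by label and includes the end of a slice, which is why `df.loc[0:2]` returned three rows. `iloc` selects by position and excludes the end, which is why `df.iloc[1:5, 2:5]` returned rows 1 to 4. With the default 0-based index the two look alike, so the sketch below uses hypothetical non-default labels to make the difference visible.

```python
import pandas as pd

# Hypothetical stand-in with index labels 10..14 instead of 0..4.
toy = pd.DataFrame(
    {"PassengerId": [1, 2, 3, 4, 5], "Survived": [0, 1, 1, 1, 0]},
    index=[10, 11, 12, 13, 14],
)

by_label = toy.loc[10:12]        # labels 10, 11, 12 (end included)
by_position = toy.iloc[0:2]      # positions 0 and 1 (end excluded)

cell_by_label = toy.loc[11, "Survived"]   # row label 11, column name
cell_by_position = toy.iloc[1, 1]         # second row, second column
```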

[40]: df.dropna()

[40]: PassengerId Survived Pclass \


1 2 1 1
3 4 1 1
6 7 0 1
10 11 1 3
11 12 1 1
.. … … …
871 872 1 1
872 873 0 1
879 880 1 1
887 888 1 1
889 890 1 1

Name Sex Age SibSp \


1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1
3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1
6 McCarthy, Mr. Timothy J male 54.0 0
10 Sandstrom, Miss. Marguerite Rut female 4.0 1
11 Bonnell, Miss. Elizabeth female 58.0 0
.. … … … …
871 Beckwith, Mrs. Richard Leonard (Sallie Monypeny) female 47.0 1
872 Carlsson, Mr. Frans Olof male 33.0 0
879 Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) female 56.0 0
887 Graham, Miss. Margaret Edith female 19.0 0
889 Behr, Mr. Karl Howell male 26.0 0

Parch Ticket Fare Cabin Embarked


1 0 PC 17599 71.2833 C85 C
3 0 113803 53.1000 C123 S
6 0 17463 51.8625 E46 S
10 1 PP 9549 16.7000 G6 S
11 0 113783 26.5500 C103 S
.. … … … … …
871 1 11751 52.5542 D35 S
872 0 695 5.0000 B51 B53 B55 S
879 1 11767 83.1583 C50 C
887 0 112053 30.0000 B42 S
889 0 111369 30.0000 C148 C

[183 rows x 12 columns]
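
Note: a plain `dropna()` keeps only rows with no missing value in any column. Here that leaves 183 of 891 rows, largely because Cabin is missing in 687 rows. The sketch below shows some less drastic options on a hypothetical four-row frame. Filling Age with the median is one common choice, not something this notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in with gaps in every column.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [None, "C85", None, "C123"],
    "Embarked": ["S", "C", None, "S"],
})

complete = toy.dropna()                    # rows with no missing value at all
age_known = toy.dropna(subset=["Age"])     # only require Age to be present
without_cabin = toy.drop(columns=["Cabin"])  # drop the sparse column instead of rows

# Fill missing ages with the median of the known ages (returns a new frame).
filled = toy.assign(Age=toy["Age"].fillna(toy["Age"].median()))
```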

[41]: df.isna().any()

[41]: PassengerId    False
      Survived       False
      Pclass         False
      Name           False
      Sex            False
      Age             True
      SibSp          False
      Parch          False
      Ticket         False
      Fare           False
      Cabin           True
      Embarked        True
      dtype: bool

[13]: df2.isnull()

[13]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
0 False False False False
1 False False False False
2 False False False False
3 False False False False
4 False False False False
.. … … … …
145 False False False False
146 False False False False
147 False False False False
148 False False False False
149 False False False False

target
0 False
1 False
2 False
3 False
4 False
.. …
145 False
146 False
147 False
148 False
149 False

[150 rows x 5 columns]

[14]: df2.isnull().sum()

[14]: sepal length (cm)    0
      sepal width (cm)     0
      petal length (cm)    0
      petal width (cm)     0
      target               0
      dtype: int64

[15]: df2.tail()

[15]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
145 6.7 3.0 5.2 2.3
146 6.3 2.5 5.0 1.9
147 6.5 3.0 5.2 2.0
148 6.2 3.4 5.4 2.3
149 5.9 3.0 5.1 1.8

target
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica

[12]: df2.dropna()

[12]: sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
.. … … … …
145 6.7 3.0 5.2 2.3
146 6.3 2.5 5.0 1.9
147 6.5 3.0 5.2 2.0
148 6.2 3.4 5.4 2.3
149 5.9 3.0 5.1 1.8

target
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
.. …
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica

[150 rows x 5 columns]
