DataFrame Notes1
OUTPUT:
Empty DataFrame
Columns: []
Index: []
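The output above is what pandas prints for a DataFrame that has no rows and no columns. The code that produced it is not shown in these notes; a minimal sketch that gives the same kind of result, assuming the DataFrame is created with no arguments, could be:

```python
import pandas as pd

# Calling the constructor with no arguments creates a DataFrame
# that has no rows and no columns.
df = pd.DataFrame()
print(df)

# The 'empty' attribute and 'shape' confirm that nothing is stored in it.
print(df.empty)
print(df.shape)
```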
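2. Creation of DataFrame from NumPy arrays:
Let us create a DataFrame from NumPy arrays. A single array gives a
DataFrame with one column. When several arrays are passed in a list,
each array becomes one row.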
import numpy as np
import pandas as pd
ar1 = np.array([1, 2, 3, 4]) #First array created containing 4 integers
ar2 = np.array([10, 20, 30, 40]) #Second array created containing 4 integers
ar3 = np.array([-23, -43, 67, 90]) #Third array created containing 4 integers
df = pd.DataFrame(ar1) #DataFrame created from the first array
print(df)
OUTPUT:
   0
0  1
1  2
2  3
3  4
---------------------
import numpy as np
import pandas as pd
ar1 = np.array([1, 2, 3, 4]) #First array created containing 4 integers
ar2 = np.array([10, 20, 30, 40]) #Second array created containing 4 integers
ar3 = np.array([-23, -43, 67, 90]) #Third array created containing 4 integers
df = pd.DataFrame([ar1, ar2]) #DataFrame created from the first two arrays
print(df)
OUTPUT:
    0   1   2   3
0   1   2   3   4
1  10  20  30  40
---------------------
import numpy as np
import pandas as pd
ar1 = np.array([1, 2, 3, 4]) #First array created containing 4 integers
ar2 = np.array([10, 20, 30, 40]) #Second array created containing 4 integers
ar3 = np.array([-23, -43, 67, 90]) #Third array created containing 4 integers
df = pd.DataFrame([ar1, ar2, ar3]) #DataFrame created from all three arrays
print(df)
OUTPUT:
    0   1   2   3
0   1   2   3   4
1  10  20  30  40
2 -23 -43  67  90
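In the outputs above, every array passed in the list ends up as one row. If the arrays are meant to be columns instead, one option is to pass them in a dictionary. The sketch below assumes the column labels 'A', 'B' and 'C', which are placeholders and not part of the original notes:

```python
import numpy as np
import pandas as pd

ar1 = np.array([1, 2, 3, 4])
ar2 = np.array([10, 20, 30, 40])
ar3 = np.array([-23, -43, 67, 90])

# Each dictionary key becomes a column label and each array becomes
# the data of that column. 'A', 'B', 'C' are hypothetical labels.
df_cols = pd.DataFrame({'A': ar1, 'B': ar2, 'C': ar3})
print(df_cols)
```

Here the DataFrame has 4 rows and 3 columns, the opposite of the list-of-arrays form.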
OUTPUT:
    0
0  11
1  22
2  33
3  44
4  55
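The output above has the shape pandas gives for a DataFrame built from a plain list of five numbers, with default row and column labels. The code for it is missing from these notes; a minimal sketch that would print such a result, assuming the same list that Practical 2 uses, is:

```python
import pandas as pd

# A simple (flat) list becomes a single column labelled 0.
# Row labels default to 0, 1, 2, ... because no index is given.
df = pd.DataFrame([11, 22, 33, 44, 55])
print(df)
```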
Practical 2: To create a DataFrame from a simple list by passing
appropriate column headings and row index.
import pandas as pd
df = pd.DataFrame([11, 22, 33, 44, 55], index=['R1',
'R2','R3','R4','R5'], columns=['C1'])
print(df)
OUTPUT:
    C1
R1  11
R2  22
R3  33
R4  44
R5  55
Practical 3: To create a DataFrame from a nested list.
import pandas as pd
df = pd.DataFrame([[21, 'X', 'A'], [32, 'IX', 'B'], [23, 'X', 'A'],
[12, 'XI','A']])
print(df)
OUTPUT:
    0   1  2
0  21   X  A
1  32  IX  B
2  23   X  A
3  12  XI  A
Practical 4: To create a DataFrame from a nested list by passing
appropriate column headings and row index.
import pandas as pd
df = pd.DataFrame([[21, 'X', 'A'], [32, 'IX', 'B'], [23, 'X', 'A'],[12,
'XI','A']], index= ['Rec1', 'Rec2', 'Rec3', 'Rec4'], columns =
["Rno", "Class", "Sec"])
print(df)
OUTPUT:
      Rno Class Sec
Rec1   21     X   A
Rec2   32    IX   B
Rec3   23     X   A
Rec4   12    XI   A
OUTPUT:
    0
0  10
1  20
2  30
3  40
Here, the DataFrame has as many rows as there are elements in the
Series, but only one column.
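The code for this single-Series case is not shown in these notes. A minimal sketch that gives a one-column DataFrame like the output above, assuming the Series is the same S1 used in the later examples, is:

```python
import pandas as pd

S1 = pd.Series([10, 20, 30, 40])

# Passing the Series itself (not inside a list) makes it a column.
# An unnamed Series gets the default column label 0.
df = pd.DataFrame(S1)
print(df)
print(df.shape)   # (number of rows, number of columns)
```

Note the difference from the examples that follow: a Series passed directly becomes a column, while Series passed inside a list become rows.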
Example 2: Creation of DataFrame from two Series.
import pandas as pd
S1 = pd.Series([10, 20, 30, 40])
S2 = pd.Series([11, 22, 33, 44])
S3 = pd.Series([34, 44, 54, 24])
df = pd.DataFrame([S1, S2], index = ['R1', 'R2'])
print(df)
OUTPUT:
     0   1   2   3
R1  10  20  30  40
R2  11  22  33  44
Example 3: Creation of DataFrame from three Series.
import pandas as pd
S1 = pd.Series([10, 20, 30, 40])
S2 = pd.Series([11, 22, 33, 44])
S3 = pd.Series([34, 44, 54, 24])
df = pd.DataFrame([S1, S2, S3],index = ['R1', 'R2', 'R3'])
print(df)
OUTPUT:
     0   1   2   3
R1  10  20  30  40
R2  11  22  33  44
R3  34  44  54  24
To create a DataFrame from more than one Series, we pass the
Series together in a list, as shown above.
NOTE: If a particular Series does not have a value for a label,
NaN is inserted at that position in the DataFrame. For example:
import pandas as pd
S1 = pd.Series([10, 20, 30, 40])
S2 = pd.Series([11, 22, 33, 44])
S3 = pd.Series([34, 44, 54])
df = pd.DataFrame([S1, S2, S3],index = ['R1', 'R2', 'R3'])
print(df)
OUTPUT:
       0     1     2     3
R1  10.0  20.0  30.0  40.0
R2  11.0  22.0  33.0  44.0
R3  34.0  44.0  54.0   NaN
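The same rule applies when the Series carry their own labels: values are matched by label, not by position. The sketch below uses made-up labels 'a', 'b' and 'c' (they are not from the original notes) to show where the NaN appears and how to locate it with isna():

```python
import pandas as pd

S1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
S2 = pd.Series([11, 22], index=['a', 'c'])   # no value for label 'b'

# Each Series becomes a row; the column labels are the combined
# labels of all the Series.
df = pd.DataFrame([S1, S2], index=['R1', 'R2'])
print(df)

# isna() marks the positions where a value is missing.
print(df.isna())
```

Here row R2 has NaN under column 'b' because S2 has no entry with that label.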
Operations on rows and columns in DataFrames
We can perform some basic operations on the rows and columns of
a DataFrame, such as:
1. Adding a New Column to a DataFrame:
We can easily add a new column to a DataFrame. Let us see the
example given below:
import pandas as pd
df = pd.DataFrame([{'Ram':25, 'Anil':29, 'Simple':28},
{'Ram':21, 'Anil':25, 'Simple':23},{'Ram':23, 'Anil':18,
'Simple':26}],index=['R1','R2','R3'])
print(df)
df['Amit']=[18, 22, 25] #Adding column to DataFrame
print(df)
df['Parth']=[28, 12, 30] #Adding column to DataFrame
print(df)
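OUTPUT:
    Ram  Anil  Simple
R1   25    29      28
R2   21    25      23
R3   23    18      26
    Ram  Anil  Simple  Amit
R1   25    29      28    18
R2   21    25      23    22
R3   23    18      26    25
    Ram  Anil  Simple  Amit  Parth
R1   25    29      28    18     28
R2   21    25      23    22     12
R3   23    18      26    25     30
NOTE: The list assigned to a new column must contain exactly as
many values as the DataFrame has rows. Otherwise pandas raises
an error such as:
ValueError: Length of values does not match length of index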
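The ValueError above is the kind of error pandas raises when the list assigned to a new column is shorter or longer than the number of rows. The sketch below reproduces it on purpose, assuming a small DataFrame with three rows; the exact wording of the message can differ slightly between pandas versions:

```python
import pandas as pd

df = pd.DataFrame({'Ram': [25, 21, 23], 'Anil': [29, 25, 18]},
                  index=['R1', 'R2', 'R3'])

try:
    # Only 2 values are given for a DataFrame that has 3 rows.
    df['Amit'] = [18, 22]
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# The failed assignment does not add the column.
print(df.columns.tolist())
```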
2. Adding a New Row to a DataFrame:
We can add a new row to a DataFrame using DataFrame.loc[ ]
with a new row label. Let us see the example given below:
import pandas as pd
df = pd.DataFrame([{'Ram':25, 'Anil':29, 'Simple':28},
{'Ram':21, 'Anil':25, 'Simple':23}, {'Ram':23, 'Anil':18,
'Simple':26}], index=['R1', 'R2', 'R3'])
print(df)
df.loc['R4']=[12, 22, 10] #Adding new row
print(df)
OUTPUT:
    Ram  Anil  Simple
R1   25    29      28
R2   21    25      23
R3   23    18      26
    Ram  Anil  Simple
R1   25    29      28
R2   21    25      23
R3   23    18      26
R4   12    22      10
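One point worth keeping in mind with DataFrame.loc[ ]: if the row label already exists, the values of that row are replaced instead of a new row being added. A short sketch, reusing the same data as the example above, shows both cases:

```python
import pandas as pd

df = pd.DataFrame({'Ram': [25, 21, 23], 'Anil': [29, 25, 18],
                   'Simple': [28, 23, 26]}, index=['R1', 'R2', 'R3'])

df.loc['R2'] = [0, 0, 0]      # 'R2' already exists, so its values are replaced
df.loc['R4'] = [12, 22, 10]   # 'R4' is a new label, so a row is added
print(df)
```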
OUTPUT:
R1    28
R2    23
R3    26
Name: Simple, dtype: int64
----------------------------------------------------
    Ram  Anil
R1   25    29
R2   21    25
R3   23    18
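The output above shows a column named Simple printed as a Series, followed by a DataFrame that no longer contains that column. The code is missing from these notes. One way to get this result, assuming the column was removed with DataFrame.pop(), is sketched below:

```python
import pandas as pd

df = pd.DataFrame({'Ram': [25, 21, 23], 'Anil': [29, 25, 18],
                   'Simple': [28, 23, 26]}, index=['R1', 'R2', 'R3'])

# pop() deletes the column from df and returns it as a Series.
removed = df.pop('Simple')
print(removed)
print("----------------------------------------------------")
print(df)
```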
2. drop( ): This method deletes an entire column from a
DataFrame. To delete a column, the parameter axis is assigned
the value 1. Let us see the example given below:
import pandas as pd
df = pd.DataFrame({'Ram': [25, 21, 23], 'Anil':[29, 25, 18],
'Simple':[28, 23, 26]},index=['R1', 'R2', 'R3'])
print(df)
print("----------------------------------------------------")
df=df.drop('Simple', axis=1) #Deleting column from dataframe
print(df)
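OUTPUT:
    Ram  Anil  Simple
R1   25    29      28
R2   21    25      23
R3   23    18      26
----------------------------------------------------
    Ram  Anil
R1   25    29
R2   21    25
R3   23    18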
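drop( ) can also delete rows, and it can delete several labels at once. The sketch below reuses the same data to show both variations; the variable names no_r2 and only_ram are chosen only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({'Ram': [25, 21, 23], 'Anil': [29, 25, 18],
                   'Simple': [28, 23, 26]}, index=['R1', 'R2', 'R3'])

# axis=0 deletes a row, identified by its row label.
no_r2 = df.drop('R2', axis=0)
print(no_r2)

# A list of labels deletes several columns in one call.
only_ram = df.drop(['Anil', 'Simple'], axis=1)
print(only_ram)

# drop() returns a new DataFrame, so df itself is unchanged
# unless the result is assigned back to df.
print(df)
```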
OUTPUT:
-----------------------------------------------------
         Ram  Anil  Simple
Maths     25    29      28
Science   21    25      23
English   23    18      26
The elements can be indexed in descending order also.
The indexing starts with zero for the first element and the index is fixed.
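These two statements read like a comparison between a pandas Series, whose labels can be chosen freely, and a positional sequence such as a Python list, whose positions always start at zero. That reading is an assumption, since the surrounding table is not available in these notes. A small sketch under that assumption:

```python
import pandas as pd

values = [10, 20, 30]

# In a Python list the positions are fixed: the first element is at 0.
print(values[0])

# In a Series the labels can be set by us, for example in descending order.
s = pd.Series(values, index=[3, 2, 1])
print(s)

# .loc[] looks a value up by its label, so label 3 gives the first value.
print(s.loc[3])
```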