Ch 2-Data Handling using Pandas - I NCERT Solved 2024-25
Index 0 1 2 3
10 23 56 17
>>>import numpy as np
>>>a=np.array([10,23,56,17])
>>>print(a)
Output : [10 23 56 17]
• Multi-dimensional Array (2-D), also known as a Matrix (multiple
rows/columns). Two-dimensional (2-D) arrays are created by passing nested
lists to the array() function.
Indices 0 1 2
0 23 56 17
1 41 67 34
2 45 72 45
3 65 37 56
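A minimal sketch of how the 2-D array in the table above could be built from nested lists. The variable name matrix is an assumption used only for illustration.

```python
import numpy as np

# Each inner list becomes one row of the 2-D array (values from the table above).
matrix = np.array([[23, 56, 17],
                   [41, 67, 34],
                   [45, 72, 45],
                   [65, 37, 56]])

print(matrix.ndim)    # number of dimensions
print(matrix.shape)   # (rows, columns)
print(matrix[1, 2])   # element at row index 1, column index 2
```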
Example :
# array1 is a 1-D array, so nothing follows the comma in its shape tuple
>>> array1.shape
(3,)
>>> array2.shape
(4,)
>>> array3.shape
(3, 2)
The output (3, 2) means array3 has 3 rows and 2 columns.
iii) ndarray.size: It gives the total number of elements of the array. This is
equal to the product of the elements of shape.
Example:
>>> array1.size
3
>>> array3.size
6
iv) ndarray.dtype: is the data type of the elements of the array. All the
elements of an array are of same data type. Common data types are int32,
int64, float32, float64, U32, etc.
Example :
>>> array1.dtype
dtype('int32')
>>> array2.dtype
dtype('<U32')
>>> array3.dtype
dtype('float64')
v) ndarray.itemsize: It specifies the size in bytes of each element of the
array. Data type int32 and float32 means each element of the array occupies 32
bits in memory. 8 bits form a byte. Thus, an array of elements of type int32 has
itemsize 32/8=4 bytes. Likewise, int64/float64 means each item has itemsize
64/8=8 bytes.
Example :
>>> array1.itemsize
4 # memory allocated to integer
Chapter 2: Data Handling with Pandas-I
Prepared by: Taposh Karmakar, AIR FORCE SCHOOL JORHAT | Mobile: 7002070313
Latest update on: 7 April 2024
Class Notes: 2024-25
>>> array2.itemsize
128 # memory allocated to string
>>> array3.itemsize
8 #memory allocated to float type
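The attribute examples above use array1, array2 and array3 without showing how they were created. The definitions below are assumptions chosen to match the shapes and data types shown; they are a sketch, not necessarily the original arrays. Note that the default integer type depends on the platform, so array1 may report int64 (itemsize 8) instead of int32 (itemsize 4).

```python
import numpy as np

# Hypothetical definitions chosen to match the shapes and dtypes shown above.
array1 = np.array([10, 20, 30])                     # 1-D array of integers
array2 = np.array([5, -7.4, 'a', 7.2])              # mixed input is stored as strings
array3 = np.array([[2.4, 3], [4.91, 7], [0, -1]])   # 2-D array of floats

# Print the attributes discussed above for each array.
for name, arr in [("array1", array1), ("array2", array2), ("array3", array3)]:
    print(name, arr.ndim, arr.shape, arr.size, arr.dtype, arr.itemsize)
```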
11. Write the different ways of creating a NumPy array.
Ans. 1. We can specify the data type (integer, float, etc.) while creating an array by
passing dtype as an argument to array(). This converts the data automatically
to the specified type. In the following example, a nested list of integers is
passed to the array() function. Since the data type has been declared as float, the
integers are converted to floating-point numbers.
>>> array4 = np.array( [ [1,2], [3,4] ], dtype=float)
>>> array4
array([[1., 2.],
       [3., 4.]])
4. We can create an array with numbers in a given range and sequence using
the arange() function. This function is analogous to the range()
function of Python.
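A short sketch of arange() follows. The variable names a, b and c are illustrative only. As with range(), the stop value is excluded.

```python
import numpy as np

a = np.arange(5)               # 0 up to (but not including) 5
b = np.arange(-2, 24, 4)       # start=-2, stop=24 (excluded), step=4
c = np.arange(0.0, 1.0, 0.25)  # unlike range(), float steps are allowed

print(a)
print(b)
print(c)
```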
Slicing is done for 2-D arrays. For this, let us create a 2-D array called array9
having 3 rows and 4 columns.
>>> array9 = np.array([[ -7, 0, 10, 20],[ -5, 1, 40, 200],
[ -1, 1, 4, 30]])
# Access all the elements in the 3rd column
>>> array9[0:3,2]
array([10, 40, 4])
Note that we are specifying rows in the range 0:3 because the end value of the
range is excluded.
# Access elements of 2nd and 3rd row from 1st and 2nd column
>>> array9[1:3,0:2]
array([[-5, 1],
[-1, 1]])
If row indices are not specified, it means all the rows are to be considered.
Likewise, if column indices are not specified, all the columns are to be
considered. Thus, the statement to access all the elements in the 3rd column
can also be written as:
>>>array9[:,2]
array([10, 40, 4])
13. Write the basic Arithmetic Operation on NumPy Array with examples.
Ans. To perform a basic arithmetic operation like addition, subtraction,
multiplication, division etc. on two arrays, the operation is done on each
corresponding pair of elements.
>>> array1 = np.array([[3,6],[4,2]])
>>> array2 = np.array([[10,20],[15,12]])
>>> array1 + array2
array([[13, 26],
[19, 14]])
#Subtraction
>>> array1 - array2
array([[ -7, -14],
[-11, -10]])
#Multiplication
>>> array1 * array2
array([[ 30, 120],
[ 60, 24]])
#Matrix Multiplication
>>> array1 @ array2
array([[120, 132],
[ 70, 104]])
#Exponentiation
>>> array1 ** 3
array([[ 27, 216],
[ 64, 8]], dtype=int32)
#Division
>>> array2 / array1
array([[3.33333333, 3.33333333],
[3.75 , 6. ]])
#Element-wise remainder of division (modulus)
>>> array2 % array1
array([[1, 2],
[3, 0]], dtype=int32)
It is important to note that for element-wise operations on two arrays, both arrays
must have the same shape. That is, array1.shape must be equal to array2.shape.
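A small sketch of what happens when the shapes differ. The arrays a and b and the variable mismatch_error are assumptions made for this illustration.

```python
import numpy as np

a = np.array([[3, 6], [4, 2]])     # shape (2, 2)
b = np.zeros((2, 3), dtype=int)    # shape (2, 3)

try:
    a + b                          # shapes (2, 2) and (2, 3) do not match
except ValueError as err:
    mismatch_error = str(err)
    print("Error:", mismatch_error)
```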
14. What is Concatenating Arrays?
Ans. Concatenation means joining two or more arrays.
Concatenating 1-D arrays means appending the sequences one after another.
The numpy.concatenate() function can be used to concatenate two or more
2-D arrays either row-wise or column-wise.
All the dimensions of the arrays to be concatenated must match exactly
except for the dimension or axis along which they need to be joined. Any
mismatch in the dimensions results in an error. By default, the
concatenation of the arrays happens along axis=0.
Example :
>>> array1 = np.array([[10, 20], [-30,40]])
>>> array2 = np.zeros((2, 3), dtype=array1.dtype)
>>> array1
array([[ 10, 20],
[-30, 40]])
>>> array2
array([[0, 0, 0],
[0, 0, 0]])
>>> array1.shape
(2, 2)
>>> array2.shape
(2, 3)
>>> np.concatenate((array1,array2), axis=1)
array([[ 10, 20, 0, 0, 0],
[-30, 40, 0, 0, 0]])
>>> np.concatenate((array1,array2), axis=0)
10 23 56 17 52 61
#OUTPUT
Series([], dtype: float64)
23. How to create non-empty series using range() method?
Ans. To create non-empty series, specify parameters for data and indexes.
Syntax:
pandas.Series(data, index=idx)
#OUTPUT
0 a
1 b
2 c
3 d
dtype: object
#OUTPUT
100 a
101 b
102 c
dtype: object
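A sketch of creating a Series with default and with user-defined index labels. The values and the use of range(100, 103) for the labels are assumptions chosen to resemble the outputs above.

```python
import pandas as pd

letters = ['a', 'b', 'c']

s_default = pd.Series(letters)                        # labels 0, 1, 2
s_custom = pd.Series(letters, index=range(100, 103))  # labels 100, 101, 102

print(s_default)
print(s_custom)
```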
25. How to Create a Series from dict?
Ans. A dict can be passed as input. If no index is specified, the dictionary
keys are used to construct the index. Current pandas versions keep the keys in
insertion order; older versions sorted them.
If index is passed, the values in data corresponding to the labels in the index
will be pulled out.
# Create a Series from dict without index
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print (s)
#OUTPUT
a 0.0
b 1.0
c 2.0
dtype: float64
#OUTPUT
b 1.0
c 2.0
d NaN # Index order is preserved and the missing element is filled with NaN (Not a Number).
a 0.0
dtype: float64
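The output above could come from code along the following lines. This is a sketch that reuses the dict from the earlier example; the index list is an assumption read off the output.

```python
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}

# Labels are looked up in the dict; 'd' has no entry, so its value is NaN.
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)
```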
#OUTPUT
Jan 31
Feb 28
Mar 31
Apr 30
dtype: int64
#OUTPUT
Jan 31.0
Feb 28.0
Mar 31.0
Apr 30.0
dtype: float64
Solution:
2.75 + 9.75 = 12.5
12.5 + 9.75 = 22.25
22.25 + 9.75 = 32.0
32.0 + 9.75 = 41.75
0 a 1
1 b 2
2 c 3
3 d 4
4 e 5
#OUTPUT
Series Elements :
a 1
b 2
c 3
d 4
e 5
dtype: int64
Accessing Third Element : 3
28. How to access the elements from Series using Slices?
Ans. Slice means subset of data.
Slicing is a powerful way to retrieve subsets of data from a pandas object.
We can define which part of the series is to be sliced by specifying the start
and end parameters [start :end] with the series name.
By default, slicing a Series object works on positions, not on index
labels.
A step can also be given, in the form [start : end : step].
When we use positional indices for slicing, the value at the end index
position is excluded, i.e., only (end - start) data values of the
series are extracted. But if index labels are used for slicing, the value
at the end label is also included.
1 or -4 b 2
2 or -3 c 3
3 or -2 d 4
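The difference between positional and label-based slicing can be sketched as follows. The Series s mirrors the a to e example above; the variable names are illustrative, and iloc/loc are used to make the two kinds of slicing explicit.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

by_position = s.iloc[1:3]   # positions 1 and 2; end position 3 is excluded
by_label = s.loc['b':'d']   # labels b, c and d; end label is included
every_other = s.iloc[::2]   # start : end : step, here every second element

print(by_position)
print(by_label)
print(every_other)
```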
Example :
>>> import pandas as pd
>>> seriesCapCntry = pd.Series(['New Delhi', 'Washington DC',
'London','Paris'], index=['India', 'USA', 'UK', 'France'])
>>> seriesCapCntry.name = 'Capitals'
>>> seriesCapCntry
India New Delhi
USA Washington DC
UK London
France Paris
Name: Capitals, dtype: object
>>> seriesCapCntry.index.name = 'Countries'
>>> seriesCapCntry
Countries
India New Delhi
USA Washington DC
UK London
France Paris
Name: Capitals, dtype: object
>>> seriesCapCntry.values
array(['New Delhi', 'Washington DC', 'London', 'Paris'],
dtype=object)
>>> seriesCapCntry.size
4
>>> seriesCapCntry.empty
False
31. Write the various operations that can be performed on a Series object.
Ans. The various operations on a Series object are –
Modifying elements of a Series object
head() and tail() functions
Vector operations on a Series object
32. Write a program to modify element(s) of series object in Python shell?
Ans. >>> import pandas as pd
>>> obj1=pd.Series([4,2,7])
>>> obj1
0 4
1 2
2 7
>>> obj1[1]=8
>>> obj1
0 4
1 8
2 7
33. Write the methods of using head(), tail() and count() function on
pandas series object.
Ans. The head() function fetches the first n rows from a pandas object.
The tail() function returns the last n rows from a pandas object.
The count() function returns the number of non-NaN values in the Series.
If you do not provide any value for n, then head() and tail() return the first
5 and last 5 rows respectively of the pandas object.
34. Write a python program where n value is not provided in head() in
pandas series object.
Ans. >>> import pandas as pd
>>> obj3=pd.Series([10,15,20,25,30,35,40,45,50,55,60,65,70,75])
>>> obj3.head() #returns first 5 rows
Output:
0 10
1 15
2 20
3 25
4 30
Output:
0 10
1 15
2 20
3 25
4 30
5 35
6 40
36. Write a python program where n value is not provided in tail() in
pandas series object.
Ans. >>> import pandas as pd
>>> obj3=pd.Series([10,15,20,25,30,35,40,45,50,55,60,65,70,75])
>>> obj3.tail() #returns last 5 rows
Output:
9 55
10 60
11 65
12 70
13 75
37. Write a python program where n value is provided in tail() in pandas
series object.
Ans. >>> import pandas as pd
>>> obj3=pd.Series([10,15,20,25,30,35,40,45,50,55,60,65,70,75])
>>> obj3.tail(3) #returns last 3 rows
Output:
11 65
12 70
13 75
38. Write a python program to count the number of elements in pandas
series object.
Ans. >>> import pandas as pd
>>> import numpy as np
>>> s1=pd.Series([12,15,10])
>>> s2=pd.Series([12,np.nan,10])
>>> s1.count()
3
>>> s2.count()
2
39. What do you mean by Vector or Mathematical Operations on Series
objects?
Ans. A vector operation means that when you apply a function or expression to a
Series, it is applied individually to each item of the object.
Some of the vector/mathematical operations are –
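A minimal sketch of vector operations on Series. The definitions of seriesA and seriesB are assumptions, chosen so that their product agrees with the output shown in this answer. Operations between two Series align on index labels, and labels present in only one Series give NaN.

```python
import pandas as pd

# Assumed definitions, chosen to agree with the outputs in this answer.
seriesA = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
seriesB = pd.Series([10, 20, -10, -50, 100], index=['z', 'y', 'a', 'c', 'e'])

doubled = seriesA * 2                         # applied to every element
total = seriesA + seriesB                     # aligned on labels, NaN where unmatched
filled = seriesA.add(seriesB, fill_value=0)   # missing values treated as 0

print(doubled)
print(total)
print(filled)
```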
>>>seriesA * seriesB
a -10.0
b NaN
c -150.0
d NaN
e 500.0
y NaN
z NaN
dtype: float64
>>> seriesA/seriesB
a -0.10
#Output:
Empty DataFrame
Columns: []
Index: []
42. How to Create a DataFrame from List?
Ans. A basic DataFrame can be created by passing a list as the data parameter.
Syntax:
pandas.DataFrame(data)
#Create a DataFrame with parameter as list data
import pandas as pd
data = [1,2,3,4,5] # List
df = pd.DataFrame(data)
print(df)
#Output:
0
0 1
1 2
2 3
3 4
4 5
#Create a DataFrame with parameter as list data and column name
import pandas as pd
data = [['Freya',10],['Mohak',12],['Dwivedi',13]]
df = pd.DataFrame(data, columns=['Name','Age'])
print (df)
#Output:
Name Age
0 Freya 10
1 Mohak 12
2 Dwivedi 13
#Create a DataFrame from Dict of ndarrays with Column name
import pandas as pd
data = {'Name':['Freya', 'Mohak'],'Age':[9,10]}
df = pd.DataFrame(data)
print (df)
#Output:
Name Age
0 Freya 9
1 Mohak 10
#Output:
Name Age
0 Freya 9
1 Mohak 10
43. How to Create a DataFrame from Series?
Ans. Consider the following three Series:
>>> import pandas as pd
>>> seriesA = pd.Series([1,2,3,4,5],
index = ['a', 'b', 'c', 'd', 'e'])
>>> seriesB = pd.Series ([1000,2000,-1000,-5000,1000],
index = ['a', 'b', 'c', 'd', 'e'])
>>> seriesC = pd.Series([10,20,-10,-50,100],
index = ['z', 'y', 'a', 'c', 'e'])
Here, labels in the series object become the column names in the DataFrame
object (dFrame7) and each series becomes a row in the DataFrame.
Here, different series do not have the same set of labels. But, the number of
columns in a DataFrame equals to distinct labels in all the series. So, if a
particular series does not have a corresponding value for a label, NaN is inserted
in the DataFrame column.
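The statement that builds dFrame7 is not shown above. A sketch of one way to do it, assuming the three Series are passed as a list to pd.DataFrame():

```python
import pandas as pd

seriesA = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
seriesB = pd.Series([1000, 2000, -1000, -5000, 1000],
                    index=['a', 'b', 'c', 'd', 'e'])
seriesC = pd.Series([10, 20, -10, -50, 100],
                    index=['z', 'y', 'a', 'c', 'e'])

# Each Series becomes one row; the union of all labels becomes the columns.
dFrame7 = pd.DataFrame([seriesA, seriesB, seriesC])
print(dFrame7)
```

The result has 3 rows and 7 columns (a, b, c, d, e, z, y). For example, seriesC has no label 'b', so that cell holds NaN.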
44. How to create a DataFrame from a Dictionary of Series?
Ans. A dictionary of series can also be used to create a DataFrame.
Example :
>>> import pandas as pd
>>> ResultSheet={'Arnab': pd.Series([90, 91, 97],
index=['Maths','Science','Hindi']),
'Ramit': pd.Series([92, 81, 96],
index=['Maths','Science','Hindi']),
'Samridhi': pd.Series([89, 91, 88],
index=['Maths','Science','Hindi']),
'Riya': pd.Series([81, 71, 67],
index=['Maths','Science','Hindi']),
'Mallika': pd.Series([94, 95, 99],
index=['Maths','Science','Hindi'])}
>>> ResultDF = pd.DataFrame(ResultSheet)
>>> ResultDF
For example:
Name Qualification
0 Jai Msc
1 Princi MA
2 Gaurav MCA
3 Anuj Phd
Output:
Name Height Qualification Address
0 Jai 5.1 Msc Delhi
1 Princi 6.2 MA Bangalore
2 Gaurav 5.1 Msc Chennai
3 Anuj 5.2 Msc Patna
v) Column Rename :
To rename a column in a pandas DataFrame, we can either
use the rename() function or assign a list of
new column names to the columns attribute.
Syntax 1 : Rename Single Column
df.rename(columns={'Old_Col_Name':'New_Col_Name'}, inplace=True)
Output :
PID PNAME
A P01 Pen
B P02 Pencil
C P03 Eraser
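Both renaming approaches can be sketched on a small product table like the one above. The new column names ProductID and ProductName are assumptions used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'PID': ['P01', 'P02', 'P03'],
                   'PNAME': ['Pen', 'Pencil', 'Eraser']},
                  index=['A', 'B', 'C'])

# Approach 1: rename() with a mapping of old name to new name.
renamed = df.rename(columns={'PNAME': 'ProductName'})

# Approach 2: assign a full list of new names to the columns attribute.
df2 = df.copy()
df2.columns = ['ProductID', 'ProductName']

print(renamed)
print(df2)
```

Without inplace=True, rename() returns a new DataFrame and leaves df unchanged.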
df1              df2              Result
   Col              Col              Col
0  First    +    0  Third    =    0  Third
1  Second        1  Fourth        1  Fourth
                                  2  First
                                  3  Second
>>> ResultDF.loc['Science']
Arnab 91
Ramit 81
Samridhi 91
Riya 71
Mallika 95
Name: Science, dtype: int64
# When a single column label is passed, it returns the column as a Series.
>>> ResultDF.loc[:,'Arnab']
Maths 90
Science 91
Hindi 97
Name: Arnab, dtype: int64
# To read more than one row from a DataFrame, a list of row labels is used
>>> ResultDF.loc[['Science', 'Hindi']]
Arnab Ramit Samridhi Riya Mallika
Science 91 81 91 71 95
Hindi 97 96 88 67 99
>>> DF2
C2 C5
R4 10 20.0
R2 30 NaN
R5 40 50.0
# When DF1 is appended to DF2, the rows of DF2 precede the rows of DF1.
To make the column labels appear in sorted order, we can set the parameter
sort=True.
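DataFrame.append() was removed in pandas 2.0, so the sketch below uses pd.concat(), which accepts the same sort parameter. dF2 is rebuilt from the output shown above; dF1 is a hypothetical DataFrame made up for this illustration.

```python
import numpy as np
import pandas as pd

# dF2 as displayed above.
dF2 = pd.DataFrame({'C2': [10, 30, 40], 'C5': [20, np.nan, 50]},
                   index=['R4', 'R2', 'R5'])
# dF1 is hypothetical; its contents are not shown in this section.
dF1 = pd.DataFrame({'C1': [1, 4], 'C2': [2, 5]}, index=['R1', 'R2'])

# Rows of dF2 come first; sort=True orders the column labels.
combined = pd.concat([dF2, dF1], sort=True)
print(combined)
```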
>>> ForestAreaDF.index
Index(['GeoArea', 'VeryDense', 'ModeratelyDense',
'OpenForest'], dtype='object')
>>> ForestAreaDF.columns
Index(['Assam', 'Kerala', 'Delhi'], dtype='object')
>>> ForestAreaDF.dtypes
Assam int64
Kerala int64
Delhi float64
dtype: object
>>> ForestAreaDF.values
array([[7.8438e+04, 3.8852e+04, 1.4830e+03],
[2.7970e+03, 1.6630e+03, 6.7200e+00],
[1.0192e+04, 9.4070e+03, 5.6240e+01],
[1.5116e+04, 9.2510e+03, 1.2945e+02]])
>>> ForestAreaDF.shape
(4, 3)
>>> ForestAreaDF.size
12
>>> ForestAreaDF.T
GeoArea VeryDense ModeratelyDense OpenForest
Assam 78438.0 2797.00 10192.00 15116.00
Kerala 38852.0 1663.00 9407.00 9251.00
Delhi 1483.0 6.72 56.24 129.45
>>> ForestAreaDF.head(2)
Assam Kerala Delhi
GeoArea 78438 38852 1483.00
VeryDense 2797 1663 6.72
>>> ForestAreaDF.tail(2)
Assam Kerala Delhi
ModeratelyDense 10192 9407 56.24
OpenForest 15116 9251 129.45
54. Write a program to create and open “Student.csv” file using Pandas.
Note :
Missing values from the CSV file shall be treated as NaN (Not a Number)
in Pandas dataframe.
The read_csv() method automatically takes the first row of the CSV
file and assigns it as the dataframe header.
The parameter sep specifies whether the values are separated by a
comma, semicolon, tab, or any other character. The default value for
sep is a comma.
The parameter header specifies the number of the row whose values are
to be used as the column names. It also marks the start of the data to be
fetched. header=0 implies that column names are inferred from the first
line of the file. By default, header=0.
We can exclusively specify column names using the parameter names while
creating the DataFrame using the read_csv() function.
df1=pd.read_csv("D:\\Python\\student.csv", sep=",",
names=['RNo','StdName','Sub1'])
print(df1)
RNo StdName Sub1
0 1 Anil 89.0
1 2 Bunty 68.0
2 3 Harish 98.0
3 4 Gautom NaN
4 5 Krishna 91.0
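The same call can be tried without a file on disk by reading from an in-memory string. In this sketch the CSV text is an assumption reconstructed from the output above, and io.StringIO stands in for the file path.

```python
import io
import pandas as pd

# Assumed file contents: no header row, and one missing mark for Gautom.
csv_text = "1,Anil,89\n2,Bunty,68\n3,Harish,98\n4,Gautom,\n5,Krishna,91\n"

# Because names is given, the first line is read as data, not as a header.
df1 = pd.read_csv(io.StringIO(csv_text), sep=",",
                  names=['RNo', 'StdName', 'Sub1'])
print(df1)
```

The empty field in the fourth row is read as NaN, which is why the Sub1 column is shown with float values.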
Output:
AdmNo StudName dob Class
0 101 Anita 12-05-2002 XI
1 102 Anil 22-09-2001 XII
2 103 Bijay 15-04-2002 XI
3 104 Chiran 25-01-2001 XII
#Create csv file using to_csv()
df.to_csv("D:\\Python\\StudEnrollment.csv")
or,
df.to_csv(path_or_buf='D:/Python/StudEnrollment.csv', sep=',')
This creates a file by the name StudEnrollment.csv in the folder D:/Python on
the hard disk. When we open this file in any text editor or a spreadsheet, we
will find the above data along with the row labels and the column headers,
separated by comma.
Output :
# To open StudEnrollment.csv in Excel Spreadsheet –
In case, we do not want the column names to be saved to the file we may use
the parameter header=False.
Another parameter index=False is used when we do not want the row labels
to be written to the file on disk. For Example -
df.to_csv( 'd:/python/Student.txt', sep = '@', header =
False, index= False)
66. How to write data into an SQL table from a DataFrame using sqlalchemy?
Ans. Writing data into an SQL table from a DataFrame:
The to_sql() method is used to write data from a DataFrame into an SQL
table.
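A minimal sketch of to_sql() follows. To stay self-contained it uses an in-memory SQLite database through Python's built-in sqlite3 module, which to_sql() also accepts; this is a substitution, not the sqlalchemy setup the question refers to. With MySQL, an SQLAlchemy engine would be passed in place of conn. The table name student and the sample data are assumptions.

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({'Name': ['Freya', 'Mohak'], 'Age': [9, 10]})

conn = sqlite3.connect(':memory:')   # temporary database, nothing is saved to disk

# Write the DataFrame to a table; index=False skips the row labels.
df.to_sql('student', conn, index=False, if_exists='replace')

# Read the table back to check what was written.
back = pd.read_sql_query('SELECT * FROM student', conn)
conn.close()
print(back)
```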
67. Write Menu-driven program to demonstrate four major operations
performed on a table through MySQL-Python connectivity.
Ans.
def menu():
    c='y'
    while (c=='y'):
        print ("1. Add record")
        print ("2. Update record ")
        print ("3. Delete record")
        print("4. Display records")
        print("5. Exiting")
        choice=int(input("Enter your choice: "))
        if choice == 1:
            adddata()
        elif choice== 2:
            #updatedata()
            udata()
        elif choice== 3:
            deldata()
        elif choice== 4:
            fetchdata()
        elif choice == 5:
            print("Exiting")
            break
        else:
            print("wrong input")
        c=input("Do you want to continue or not: ")
def fetchdata():
    import mysql.connector
    try:
        db = mysql.connector.connect(user='root', password='root',
                                     host='localhost', database='test')
        print("Database Connected")
        cursor = db.cursor()
        sql = "SELECT * FROM student"
        cursor.execute(sql)
        results = cursor.fetchall()
        print ("Name","\t","Stipend","\t","Stream","\t",
               "Average Marks","\t",
               "Grade","\t", "Class")
        print ("~~~~","\t","~~~~~~~","\t","~~~~~~","\t",
               "~~~~~~~~~~~~~","\t",
               "~~~~~","\t", "~~~~~")
        for cols in results:
            nm = cols[0]
            st = cols[1]
            stream =cols[2]
            av=cols[3]
            gd=cols[4]
            cl=cols[5]
            print(nm,"\t",st,"\t\t",stream,"\t",av,"\t\t",gd,"\t",cl)
        print("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n")
    except:
        print ("Error: unable to fetch data\n")
    db.close()
def adddata():
    import mysql.connector
    nm=input("Enter Name : ")
    stipend=int(input('Enter Stipend : '))
    stream=input("Stream: ")
    avgmark=float(input("Enter Average Marks : "))
    grade=input("Enter Grade : ")
    cls=int(input('Enter Class : '))
    db = mysql.connector.connect(user='root', password='root',
                                 host='localhost', database='test')
    cursor = db.cursor()
    sql="INSERT INTO student VALUES ( '%s' ,'%d','%s','%f','%s','%d')"\
    try:
        sql = "Update student set stipend=%d where name='%s'" % (tempst,temp)
        cursor.execute(sql)
        db.commit()
def deldata():
    import mysql.connector
    try:
        db = mysql.connector.connect(user='root',
             password='root', host='localhost',database='test')
        cursor = db.cursor()
        sql = "SELECT * FROM student"
        cursor.execute(sql)
        results = cursor.fetchall()
        for cols in results:
            nm = cols[0]
            st = cols[1]
            stream =cols[2]
            av=cols[3]
            gd=cols[4]
            cl=cols[5]
            print ("Name =%s, Stipend=%f, Stream=%s, Average Marks=%f, Grade=%s, Class=%d" % (nm,st,stream,av,gd,cl ))
    except:
        print ("Error: unable to fetch data")
    try:
        sql = "delete from student where name='%s'" % (temp)
        ans=input("Are you sure you want to delete the record : ")
        if ans=='yes' or ans=='YES':
            cursor.execute(sql)
            db.commit()
    except Exception as e:
        print (e)
    try:
        db = mysql.connector.connect(user='root',
             password='root', host='localhost',database='test')
        cursor = db.cursor()
        sql = "SELECT * FROM student"
        cursor.execute(sql)
        results = cursor.fetchall()
        for row in results:
            nm = row[0]
            st = row[1]
            stream =row[2]
            av=row[3]
menu()
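The program above builds its SQL statements with the % operator, which breaks on names containing quotes and allows SQL injection. A safer pattern is to pass the values separately to execute(). The sketch below shows the idea with Python's built-in sqlite3 module so that it runs without a MySQL server. The two-column table and the sample record are assumptions. With mysql.connector the same pattern applies, but the placeholder is %s instead of ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary database for the demonstration
cur = conn.cursor()
cur.execute("CREATE TABLE student (name TEXT, stipend INTEGER)")

# Values are passed as a tuple; the driver handles quoting safely.
cur.execute("INSERT INTO student VALUES (?, ?)", ("Asha", 400))
cur.execute("UPDATE student SET stipend = ? WHERE name = ?", (450, "Asha"))
conn.commit()

cur.execute("SELECT name, stipend FROM student WHERE name = ?", ("Asha",))
rows = cur.fetchall()
conn.close()
print(rows)
```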
1.
What is a Series and how is it different from a 1-D array, a list and a
dictionary?
Ans. A Series is a one-dimensional array having a sequence of values of any
data type (int, float, list, string, etc). By default series have numeric data
labels starting from zero.
Series vs List
Series vs Dictionary
c.
d={"Samir":1,"Manisha":2,"Dhara":3,"Shreya":4,"Kusum":5}
friends=pd.Series(d)
print(friends)
d.
import pandas as pd
#Method 1
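MTseries=pd.Series(dtype=int)
print(MTseries)
#Method 2
MTseries=pd.Series([],dtype=int)
print(MTseries)
#The empty attribute only reports whether the Series is empty
print(MTseries.empty)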
e.
import numpy as np
import pandas as pd
MonthDays=np.array([31,28,31,30,31,30,31,31,30,31,30,31])
month=pd.Series(MonthDays,index=np.arange(1,13))
print(month)
6. Using the Series created in Question 5, write commands for the following:
a. Set all the values of Vowels to 10 and display the Series.
b. Divide all values of Vowels by 2 and display the Series.
c. Create another series Vowels1 having 5 elements with index labels
'a', 'e', 'i', 'o' and 'u' having values [2,5,6,3,8] respectively.
d. Add Vowels and Vowels1 and assign the result to Vowels3.
e. Subtract, Multiply and Divide Vowels by Vowels1.
f. Alter the labels of Vowels1 to ['A', 'E', 'I', 'O', 'U'].
Ans. a.
import pandas as pd
vow=pd.Series(0,index=['a','e','i','o','u'])
#Method 1
vow.loc['a':'u']=10
print(vow)
#Method 2
vow.iloc[0:5]=10
print(vow)
b.
import pandas as pd
vow=pd.Series(0,index=['a','e','i','o','u'])
vow.iloc[0:5]=10
print(vow/2)
c.
import pandas as pd
vow=pd.Series(0,index=['a','e','i','o','u'])
vow1=pd.Series([2,5,6,3,8],index=['a','e','i','o','u'])
print(vow1)
d.
import pandas as pd
vow=pd.Series(0,index=['a','e','i','o','u'])
vow1=pd.Series([2,5,6,3,8],index=['a','e','i','o','u'])
vow3=vow+vow1
print(vow3)
e.
import pandas as pd
vow=pd.Series(0,index=['a','e','i','o','u'])
vow1=pd.Series([2,5,6,3,8],index=['a','e','i','o','u'])
print(vow-vow1)
print(vow*vow1)
print(vow/vow1)
f.
import pandas as pd
vow1=pd.Series([2,5,6,3,8],index=['a','e','i','o','u'])
vow1.index=['A','E','I','O','U']
print(vow1)
7. Using the Series created in Question 5, write commands for the following:
a. Find the dimensions, size and values of the Series EngAlph, Vowels,
Friends, MTseries, MonthDays.
b. Rename the Series MTseries as SeriesEmpty.
c. Name the index of the Series MonthDays as monthno and that of
#EngAlph
EngAlph=pd.Series(['a','b','c','d','e','f','g','h','i','j','k','l','m',
'n','o','p','q','r','s','t','u','v','w','x','y','z'])
print("Size:",EngAlph.size)
print("Dimension:",EngAlph.ndim)
print("Values:",EngAlph.values)
#vowels
vow=pd.Series(0,index=['a','e','i','o','u'])
print("Size:",vow.size)
print("Dimension:",vow.ndim)
print("Values:",vow.values)
#Friends
d={"Samir":1,"Manisha":2,"Dhara":3,"Shreya":4,"Kusum":5}
friends=pd.Series(d)
print("Size:",friends.size)
print("Dimension:",friends.ndim)
print("Values:",friends.values)
#MTseries
MTseries=pd.Series(dtype=int)
print("Empty:",MTseries.empty)
print("Size:",MTseries.size)
print("Dimension:",MTseries.ndim)
print("Values:",MTseries.values)
#MonthDays
MonthDays=np.array([31,28,31,30,31,30,31,31,30,31,30,31])
month=pd.Series(MonthDays,index=np.arange(1,13))
print("Size:",month.size)
print("Dimension:",month.ndim)
print("Values:",month.values)
b.
import pandas as pd
MTseries=pd.Series(dtype=int)
print(MTseries)
MTseries=MTseries.rename("SeriesEmpty")
print(MTseries)
c.
import numpy as np
#Friends
d={"Samir":1,"Manisha":2,"Dhara":3,"Shreya":4,"Kusum":5}
friends=pd.Series(d)
friends.index.name="Fname"
print(friends.index)
#MonthDays
MonthDays=np.array([31,28,31,30,31,30,31,31,30,31,30,31])
month=pd.Series(MonthDays,index=np.arange(1,13))
month.index.name="MonthNo"
print(month.index)
d.
import numpy as np
import pandas as pd
#Friends
d={"Samir":1,"Manisha":2,"Dhara":3,"Shreya":4,"Kusum":5}
friends=pd.Series(d)
friends.index.name="Fname"
print(friends.iloc[2:0:-1])
e.
import numpy as np
import pandas as pd
#EngAlph
EngAlph=pd.Series(['a','b','c','d','e','f','g','h','i','j','k','l','m',
'n','o','p','q','r','s','t','u','v','w','x','y','z'])
print(EngAlph.iloc[4:16])
f.
import numpy as np
import pandas as pd
#EngAlph
EngAlph=pd.Series(['a','b','c','d','e','f','g','h','i','j','k','l','m',
'n','o','p','q','r','s','t','u','v','w','x','y','z'])
print(EngAlph.head(10))
g.
import numpy as np
import pandas as pd
#EngAlph
EngAlph=pd.Series(['a','b','c','d','e','f','g','h','i','j','k','l','m',
'n','o','p','q','r','s','t','u','v','w','x','y','z'])
print(EngAlph.tail(10))
h.
Refer Answer (b)
8. Using the Series created in Question 5, write commands for the following:
a. Display the names of the months 3 through 7 from the Series
#MonthDays
MonthDays=np.array([31,28,31,30,31,30,31,31,30,31,30,31])
month=pd.Series(MonthDays,index=np.arange(1,13))
month=pd.Series(MonthDays,index=["Jan","Feb","Mar","Apr","May","June",
"July","Aug","Sept","Oct","Nov","Dec"])
#Method 1
print(month.iloc[2:7])
#Method 2
print(month.loc['Mar':'July'])
#Method 3
print(month['Mar':'July'])
b.
import numpy as np
import pandas as pd
#MonthDays
MonthDays=np.array([31,28,31,30,31,30,31,31,30,31,30,31])
month=pd.Series(MonthDays,index=np.arange(1,13))
month=pd.Series(MonthDays,index=["Jan","Feb","Mar","Apr","May","June",
"July","Aug","Sept","Oct","Nov","Dec"])
#Method 1
print(month[::-1])
#Method 2
print(month.iloc[::-1])
#Method 3
print(month.loc["Dec":"Jan":-1])
9. Create the following DataFrame Sales containing year wise sales figures
for five sales persons in INR. Use the years as column labels, and sales
person names as row labels.
2014 2015 2016 2017
#Method 2
print(Sales.iloc[-2:])
f.
#Method 1
print(Sales[[2014,2015]])
#Method 2
print(Sales[Sales.columns[0:2]])
#Method 3
print(Sales.iloc[:, 0:2] )
g.
import pandas as pd
dict1={2018:[160000,110000,500000,340000,900000]}
sales2=pd.DataFrame(dict1,index=["Madhu","Kusum","Kinshuk","Ankit",
"Shruti"])
print(sales2)
h.
print(sales2.empty)
11. Use the DataFrame created in Question 9 above to do the following:
a) Append the DataFrame Sales2 to the DataFrame Sales.
b) Change the DataFrame Sales such that it becomes its transpose.
c) Display the sales made by all sales persons in the year 2017.
d) Display the sales made by Madhu and Ankit in the year 2017 and
2018.
e) Display the sales made by Shruti in 2016.
f) Add data to Sales for salesman Sumeet where the sales made are
[196.2, 37800, 52000, 78438, 38852] in the years [2014, 2015, 2016,
2017, 2018] respectively.
g) Delete the data for the year 2014 from the DataFrame Sales.
h) Delete the data for sales man Kinshuk from the DataFrame Sales.
i) Change the name of the salesperson Ankit to Vivaan and Madhu to
Shailesh.
j) Update the sale made by Shailesh in 2018 to 100000.
k) Write the values of DataFrame Sales to a comma separated file
SalesFigures.csv on the disk. Do not write the row labels and column
labels.
l) Read the data in the file SalesFigures.csv into a DataFrame
SalesRetrieved and Display it. Now update the row labels and column
labels of SalesRetrieved to be the same as that of Sales.
Ans. a.
import pandas as pd
#Sales1
d = {2014:[100.5,150.8,200.9,30000,4000],
2015:[12000,18000,22000,30000,45000],
2016:[20000,50000,70000,10000,125000],
2017:[50000,60000,70000,80000,90000]}
Sales1=pd.DataFrame(d,index=['Madhu',"Kusum","Kinshuk","Ankit","Shruti"])
#Sales 2
dict1={2018:[160000,110000,500000,340000,900000]}
Sales2=pd.DataFrame(dict1,index=["Madhu","Kusum","Kinshuk","Ankit","Shruti"])
#Appending DataFrames
#DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
Sales1=pd.concat([Sales1,Sales2])
print(Sales1)
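Appending row-wise stacks the rows of Sales2 below those of Sales1, so each salesperson appears twice and the missing years are filled with NaN. Because Sales2 holds a new year for the same salespersons, joining column-wise may be closer to what is wanted. A sketch with shortened data (only two of the original year columns are used here):

```python
import pandas as pd

names = ["Madhu", "Kusum", "Kinshuk", "Ankit", "Shruti"]
Sales1 = pd.DataFrame({2016: [20000, 50000, 70000, 10000, 125000],
                       2017: [50000, 60000, 70000, 80000, 90000]}, index=names)
Sales2 = pd.DataFrame({2018: [160000, 110000, 500000, 340000, 900000]},
                      index=names)

by_rows = pd.concat([Sales1, Sales2])             # 10 rows, NaN for missing years
by_columns = pd.concat([Sales1, Sales2], axis=1)  # 5 rows, one column per year

print(by_rows)
print(by_columns)
```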
b.
import pandas as pd
d = {2014:[100.5,150.8,200.9,30000,4000],
2015:[12000,18000,22000,30000,45000],
2016:[20000,50000,70000,10000,125000],
2017:[50000,60000,70000,80000,90000]}
Sales=pd.DataFrame(d,index=['Madhu',"Kusum","Kinshuk","Ankit","Shruti"])
print(Sales.T)
c.
#Method 1
print(Sales[2017])
#Method 2
print(Sales.loc[:,2017])
d.
import pandas as pd
d = {2014:[100.5,150.8,200.9,30000,4000],
2015:[12000,18000,22000,30000,45000],
2016:[20000,50000,70000,10000,125000],
2017:[50000,60000,70000,80000,90000]}
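Sales=pd.DataFrame(d,index=['Madhu',"Kusum","Kinshuk","Ankit","Shruti"])
#Add the 2018 figures (from Sales2) as a column so that the label 2018 exists
Sales[2018]=[160000,110000,500000,340000,900000]
#Method 1
print(Sales.loc[['Madhu','Ankit'], [2017,2018]])
#Method 2
print(Sales.loc[Sales.index.isin(["Madhu","Ankit"]),[2017,2018]])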
e.
print(Sales.loc[Sales.index=='Shruti',2016])
f.
Sales.loc["Sumeet"]=[196.2,37800,52000,78438,38852]
print(Sales)
g.
#Temporary deletion (returns a new DataFrame; Sales is unchanged)
Sales.drop(columns=2014)
#Permanent deletion
Sales.drop(columns=2014, inplace=True)
print(Sales)
h.
Sales.drop("Kinshuk",axis=0)
Sales.drop("Kinshuk")
i.
Sales=Sales.rename({"Ankit":"Vivaan","Madhu":"Shailesh"},
axis="index")
print(Sales)
j.
Sales.loc[Sales.index=="Shailesh",2018]=100000
print(Sales)
k.
Sales.to_csv("d:/salesFigures.csv",index=False,header=False)
l.
salesretrieved=pd.read_csv("d:/salesFigures.csv",
names=['2015','2016','2017','2018'])
print(salesretrieved)