Data Frame 100 Questions

The most important questions on pandas. This will help you in your board exams.

Uploaded by

sonisahab9107
WORKSHEET: Data Handling Using Pandas
www.python4csip.com

What will be the output of the following code?
    import pandas as pd
    s1 = pd.Series([1, 2, 2, 7, 'Sachin', 77.5])
    print(s1.head())
    print(s1.head(3))
Ans:
    0         1
    1         2
    2         2
    3         7
    4    Sachin
    dtype: object
    0    1
    1    2
    2    2
    dtype: object

Write a program in python to find maximum value over index in Data frame.
Ans:
    # importing pandas as pd
    import pandas as pd
    # Creating the dataframe
    df = pd.DataFrame({"A": [4, 5, 2, 6],
                       "B": […, 2, 5, 8],    # first value of B is illegible in the source
                       "C": [1, 8, 66, 4]})
    # Print the dataframe
    print(df)
    # applying idxmax() function
    print(df.idxmax(axis=0))

What is the purpose of the following statements?
    1. df.columns
    2. [statement missing in source]
    3. [statement missing in source]
    4. [statement missing in source]
    5. df.iloc[:-4, :]
Ans:
    1. It displays the names of the columns of the dataframe.
    2. It will display all columns except the last 5 columns.
    3. [partly illegible in source] ... row index 2 to 7.
    4. It will display the entire dataframe with all rows and columns.
    5. It will display all rows except the last 4 rows.

Write a python program to sort the following data according to ascending order of Age.
    Name      Age   Designation
    Sanjeev   37    Manager
    Keshav    42    Clerk
    Rahul     38    Accountant
Ans:
    import pandas as pd
    name = pd.Series(['Sanjeev', 'Keshav', 'Rahul'])
    age = pd.Series([37, 42, 38])
    designation = pd.Series(['Manager', 'Clerk', 'Accountant'])
    d1 = {'Name': name, 'Age': age, 'Designation': designation}
    df = pd.DataFrame(d1)
    print(df)
    df1 = df.sort_values(by='Age')
    print(df1)

Write a python program to sort the following data according to descending order of Name.
    Name      Age   Designation
    Sanjeev   37    Manager
    Keshav    42    Clerk
    Rahul     38    Accountant
Ans:
    import pandas as pd
    name = pd.Series(['Sanjeev', 'Keshav', 'Rahul'])
    age = pd.Series([37, 42, 38])
    designation = pd.Series(['Manager', 'Clerk', 'Accountant'])
    d1 = {'Name': name, 'Age': age, 'Designation': designation}
    df = pd.DataFrame(d1)
    print(df)
    df2 = df.sort_values(by='Name', ascending=False)
    print(df2)

Which of the following things can be data in Pandas?
    1. A python dictionary
    2. An nd array
    3. A scalar value
    4. All of the above
Ans: 4. All of the above

All pandas data structures are ___ mutable but not always ___ mutable.
    1. Size, value
    2. Semantic, size
    3. Value, size
    4. None of the above
Ans: 3. Value, size

What is the output of the following program?
    import pandas as pd
    df = pd.DataFrame(index=[0, 1, 2, 3, 4, 5], columns=['one', 'two'])
    print df['one'].sum()
Ans: It will produce an error. (In Python 3, print without parentheses is a syntax error.)

Data and index in an nd array must be of same length.
    1. True
    2. False
Ans: 1. True

What will be the output of the following code?
    Users.groupby('occupation').age.mean()
    1. Get mean age of occupation
    2. Groups users by mean age
    3. Groups user by age and occupation
    4. None
Ans: 1. Get mean age of occupation

Which object do you get after reading a CSV file using pandas.read_csv()?
    1. Dataframe
    2. Nd array
    3. Char Vector
    4. None
Ans: 1. Dataframe

What will be the output of df.iloc[3:7, 3:6]?
Ans: It will display the rows with index 3 to 6 and the columns with index 3 to 5 of the dataframe df.

How to select the rows where 'age' is missing?
    1. df[df['age'].isnull]
    2. [option truncated in source]
    3. [option truncated in source]
    4. None
Ans: 4. None, as the right answer is df[df['age'].isnull()]

Consider the following record in dataframe IPL:
    Player           Team                    Category   BidPrice
    Hardik Pandya    Mumbai Indians          Batsman    13
    K L Rahul        Kings Eleven            Batsman    12
    Andre Russel     Kolkata Knight Riders   Batsman    7
    Jasprit Bumrah   Mumbai Indians          Bowler     10
    Virat Kohli      RCB                     Batsman    17
    Rohit Sharma     Mumbai Indians          Batsman    15

Retrieve first 2 and last 3 rows using python program.
Ans:
    import pandas as pd
    d = {'Player': ['Hardik Pandya', 'K L Rahul', 'Andre Russel', 'Jasprit Bumrah', 'Virat Kohli', 'Rohit Sharma'],
         'Team': ['Mumbai Indians', 'Kings Eleven', 'Kolkata Knight Riders', 'Mumbai Indians', 'RCB', 'Mumbai Indians'],
         'Category': ['Batsman', 'Batsman', 'Batsman', 'Bowler', 'Batsman', 'Batsman'],
         'BidPrice': [13, 12, 7, 10, 17, 15],
         'Runs': [1000, 2400, 900, 200, 3600, 3700]}
    df = pd.DataFrame(d)
    print(df)
    print(df.iloc[:2, :])
    print(df.iloc[-3:, :])

Write a command to find the most expensive player.
Ans: print(df[df['BidPrice'] == df['BidPrice'].max()])

Write a command to print total players per team.
Ans: print(df.groupby('Team').Player.count())

Write a command to find the player who had the highest BidPrice from each team.
Ans:
    val = df.groupby('Team')
    print(val[['Player', 'BidPrice']].max())
Note: max() is applied to each column independently, so the Player shown for a team is the alphabetically last name, not necessarily the player with the highest bid. df.loc[df.groupby('Team')['BidPrice'].idxmax()] returns the actual rows.

Write a command to find average runs of each team.
Ans: print(df.groupby(['Team']).Runs.mean())

Write a command to sort all players according to BidPrice.
Ans: print(df.sort_values(by='BidPrice'))

We need to define an index in pandas.
    1. True
    2. False
Ans: 2. False

Who is a data scientist?
    1. Mathematician
    2. Statistician
    3. Software Programmer
    4. All of the above
Ans: 4. All of the above

What is the built-in database used for python?
    1. Mysql
    2. Pysqlite
    3. Sqlite3
    4. Pysqln
Ans: 3. Sqlite3

How can you drop columns in python that contain NaN?
Ans: df1.dropna(axis=1)

How can you drop all rows that contain NaN?
Ans: df1.dropna(axis=0)

A Series is ___ array, which is labelled and ___ type.
Ans: One dimensional array, homogeneous

Minimum number of arguments we require to pass in pandas series: [options illegible in source]
Ans: 1. 0

What do we pass in a data frame in pandas?
    1. Integer
    2. String
    3. Pandas series
    4. All
Ans: 4. All

How many rows will the resultant data frame have?
    import pandas as pd
    df1 = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'value': [1, 2, 3, 4]})
    df2 = pd.DataFrame({'key': ['a', 'b', 'e', 'b'], 'value': [5, 6, 7, 8]})
    df1.merge(df2, on='key', how='outer')
    1. 3    2. 4    3. 5    4. 6
Ans: 4. 6

How many rows will the resultant data frame have?
    import pandas as pd
    df1 = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'value': [1, 2, 3, 4]})
    df2 = pd.DataFrame({'key': ['a', 'b', 'e', 'b'], 'value': [5, 6, 7, 8]})
    df1.merge(df2, on='key', how='inner')
    1. 3    2. 4    3. 5    4. 6
Ans: [missing in source]

How many rows will the resultant data frame have?
    import pandas as pd
    df1 = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'value': [1, 2, 3, 4]})
    df2 = pd.DataFrame({'key': ['a', 'b', 'e', 'b'], 'value': [5, 6, 7, 8]})
    df1.merge(df2, on='key', how='right')
    1. 3    2. 4    3. 5    4. 6
Ans: 2. 4

How many rows will the resultant data frame have?
    import pandas as pd
    df1 = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'value': [1, 2, 3, 4]})
    df2 = pd.DataFrame({'key': ['a', 'b', 'e', 'b'], 'value': [5, 6, 7, 8]})
    df1.merge(df2, on='key', how='left')
    1. 3    2. 4    3. 5    4. 6
Ans: 3. 5

___ method is used to delete the series and also return the series as a result.
Ans: [missing in source]

___ is an interactive way to quickly summarize large amount of data.
Ans: Pivoting

___ method is used to rename the existing indexes in a data frame.
Ans: rename

___ attribute can prohibit creating a new data frame in the sort_values() method.
Ans: inplace

Write a program in python to calculate the sum of marks in CS subject in a given dataset:
    'CS': [45, 55, 78, 95, 99, 97], 'IP': [87, 89, 98, 94, 78, 77]
Ans:
    import pandas as pd
    d1 = {'CS': [45, 55, 78, 95, 99, 97], 'IP': [87, 89, 98, 94, 78, 77]}
    df = pd.DataFrame(d1)
    print(df['CS'].sum())

Write a python program to create a data frame from the list given below:
    [[79, 92], [86, 96], [85, 91], [80, 99]]
Ans:
    import pandas as pd
    l = [[10, 20], [20, 30], [30, 40]]
    df = pd.DataFrame(l, columns=['CS', 'IP'])
    print(df)

How can you find the total number of rows and columns in a data frame?
Ans: df.shape

[Table: columns MaxTemp, MinTemp, City and RainFall, with rows for Delhi, Guwahati, Chennai, Bengaluru, Mumbai and Jaipur; most values are illegible in the source.]

Consider the above data frame as df. Write a command to compute the sum of every column of the data frame.
Ans: print(df.sum(axis=0))

Based on the above data frame df, write a command to compute the mean of column MaxTemp.
Ans: print(df['MaxTemp'].mean())

Based on the above data frame df, write a command to compute the average MinTemp and RainFall for the first 4 rows.
Ans: df[['MinTemp', 'RainFall']][:4].mean()

Which method is used to read the data from a MySQL database through Data Frame?
Ans: read_sql_query()

Which method is used to perform a query in MySQL through Data Frame?
Ans: execute()

What will be the output of the following code?
    import pandas as pd
    df = pd.DataFrame([45, 50, 41, 56], index=[True, False, True, False])
    print(df.iloc[True])
Ans: It will display an error message like "Cannot index by location index with a non-integer key", because iloc accepts only integer positions.

Write a program in python to join two data frames.
Ans:
    import pandas as pd
    xiia = {'sub': ['eng', 'mat', 'ip', 'phy', 'che', …], 'id': ['302', '041', '065', '042', '043', '044']}
    xiic = {'sub': ['eng', 'mat', 'ip', …], … : [… '55', '056', '057']}
    df1 = pd.DataFrame(xiia)
    print(df1)
    df2 = pd.DataFrame(xiic)
    print(df2)
    print(df1.merge(df2, on=…))
    print(df1.merge(df2, on=…))
(Parts of both dictionaries and of the merge calls are truncated in the source.)

What is a Series? Explain with the help of an example.
Ans: A pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, python objects etc.). The axis labels are collectively called the index.
    import pandas as pd
    data = pd.Series([1, 2, 3, 4, 5])
    print(data)

Hitesh wants to display the last four rows of the dataframe df and has written the following code:
    df.tail()
But the last 5 rows are being displayed. Identify the error and rewrite the correct code so that the last 4 rows get displayed.
Ans: If tail() doesn't receive any argument, then by default the last 5 rows will be displayed. The correct code is df.tail(4)

Write the command to add a new column in the last place (3rd place) named "Salary" from the list of values Sal=[10000,15000,20000] in an existing dataframe named EMP, assumed to already have 2 columns.
Ans: EMP['Salary'] = Sal

Consider the following python code and write the output:
    import pandas as pd
    K = pd.Series([2, 4, 6, 8, 10, 12, 14])
    print(K.quantile([0.50, 0.75]))
Ans:
    0.50     8.0
    0.75    11.0
    dtype: float64

Write a small python code to drop a row from a dataframe labeled as 0.
Ans: df = df.drop(0)

What is Pivoting? Name any two functions of Pandas which support pivoting.
Ans: Pivoting is a technique to quickly summarize a large amount of data so that the data can be viewed from a different perspective. A pivot table can be used to apply an aggregate function such as count.
Functions for pivoting are: pivot() and pivot_table()

Write a python code to create a dataframe with appropriate headings from the list given below:
    [['S101', 'Amy', 70], ['S102', 'Risha', 69], ['S104', 'Susan', 75], ['S105', 'George', 82]]
Ans:
    import pandas as pd
    L = [['S101', 'Amy', 70], ['S102', 'Risha', 69], ['S104', 'Susan', 75], ['S105', 'George', 82]]
    df = pd.DataFrame(L, columns=['ID', 'Name', 'Points'])
    print(df)

Consider the following dataframe, and answer the questions given below:
    import pandas as pd
    df = pd.DataFrame({"Quarter1": [2000, 4000, 5000, 4400, 10000],
                       "Quarter2": [5800, 2500, 5400, 3000, 2900],
                       "Quarter3": [20000, 16000, 7000, 3600, 8200],
                       "Quarter4": [1400, 3700, 1700, 2000, 6000]})

Write the code to find the mean value from the above dataframe df over the index and column axis. (Skip NaN values)
Ans:
    print(df.mean(axis=0, skipna=True))
    print(df.mean(axis=1, skipna=True))

Use the sum() function to find the sum of all the values over the index axis.
Ans: print(df.sum(axis=0))

Find the median of the dataframe df.
Ans: print(df.median())

[Question text illegible in source. The code builds two dataframes from the same list of dictionaries with different column lists; the data and column lists are partly illegible, and the output shown implies a = 10, 6 and b = 20, 32.]
    df1 = pd.DataFrame(data, columns=…)
    df2 = pd.DataFrame(data, columns=…)
    print(df1)
    print(df2)
Ans:
        a   b
    0  10  20
    1   6  32
        a  b1
    0  10 NaN
    1   6 NaN

Given the dataframes df1 and df2 created by the following code, write the commands for the tasks below:
    import pandas as pd
    x1 = [[10, 150], [40, 451], [15, 302], [40, 703]]
    df1 = pd.DataFrame(x1, columns=['mark1', 'mark2'])
    x2 = [[30, 20], [20, 25], [20, 30], [5, 30]]
    df2 = pd.DataFrame(x2, columns=['mark1', 'mark2'])
    print(df1)
    print(df2)
To add dataframes df1 and df2:
    print(df1.add(df2))
To subtract df2 from df1:
    print(df1.sub(df2))
To change the index label of df1 from 0 to zero and from 1 to one:
    df1 = df1.rename(index={0: 'zero', 1: 'one'})

What will be the output of the following python code?
    import pandas as pd
    d = {'Student': ['Ali', 'Ali', 'Tom', 'Tom'],
         'House': ['Red', 'Red', 'Blue', 'Blue'],
         'Points': [50, 70, 60, 80]}
    df = pd.DataFrame(d)
    df1 = df.pivot_table(index='Student', columns='House', values='Points', aggfunc='sum')
    print(df1)
Ans:
    House     Blue    Red
    Student
    Ali        NaN  120.0
    Tom      140.0    NaN

For the given code fill in the blanks so that we get the desired output with the maximum value for Quantity and the average value for Cost:
    import pandas as pd
    import numpy as np
    d = {…: ['Apple', 'Pear', 'Banana', 'Grapes'],    # first key is illegible in the source
         'Quantity': [100, 150, 200, 250],
         'Cost': [1000, 1500, 1200, 900]}
    df = pd.DataFrame(d)
    df1 = ________
    print(df1)
Desired output:
    Quantity     250.0
    Cost        1150.0
    dtype: float64
Ans: df1 = pd.Series([df['Quantity'].max(), df['Cost'].mean()], index=['Quantity', 'Cost'])

Find the output for the following program code:
    import pandas as pd
    df1 = pd.DataFrame({'Icecream': ['Vanila', 'ButterScotch', 'Caramel'],
                        'Cookies': ['Goodday', 'Britannia', 'Oreo']})
    df2 = pd.DataFrame({'Chocolate': ['DairyMilk', 'Kitkat'],
                        'Icecream': ['Vanila', 'Butterscotch'],
                        'Cookies': ['Hide and Seek', 'Britannia']})
    df2.reindex_like(df1)
    print(df2)
Ans:
       Chocolate      Icecream        Cookies
    0  DairyMilk        Vanila  Hide and Seek
    1     Kitkat  Butterscotch      Britannia

A dictionary Smarks contains the following data:
    Smarks = {…: ['rashmi', 'harsh', 'priya'], 'grade': ['A1', 'A2', …]}
(The first key and the last grade are illegible in the source.)
Write a statement to create a DataFrame called df. Assume that pandas has been imported as pd.
Ans: df = pd.DataFrame(Smarks, index=[1, 2, 3])

In pandas, S is a series with the following result:
    S = pd.Series([5, 10, 15, 20, 25])
The series object is automatically indexed as 0, 1, 2, 3, 4. Write a statement to assign the index a, b, c, d, e explicitly.
Ans: S = pd.Series([5, 10, 15, 20, 25], index=['a', 'b', 'c', 'd', 'e'])

Write a python statement to delete the 3rd and 5th rows from dataframe df.
Ans: df1 = df.drop(index=[2, 4], axis=0) or df1 = df.drop([2, 4])

Given the two dataframes df1 and df2 as given below:
[Tables df1 and df2, each with columns First, Second and Third; the values are illegible in the source.]
Write the commands to do the following on the dataframes:
To add dataframes df1 and df2.
Ans: print(df1.add(df2))

[Two further tasks are largely illegible in the source: one concerns sorting in descending order, the other displaying the rows that satisfy a condition.]

Consider the following dataframe student_df:
    Name     Class   Marks
    Anamay   XI      95
    Aditi    XI      82
    Mehak    XI      65
    Kriti    XI      45

Write a statement to get the minimum value of the column Marks.
Ans: print(student_df['Marks'].min())

Write a small python code to add a row to a dataframe.
Ans:
    import pandas as pd
    student_df = pd.DataFrame({'Name': ['Anamay', 'Aditi', 'Mehak', 'Kriti'],
                               'Class': ['XI', 'XI', 'XI', 'XI'],
                               'Marks': [95, 82, 65, 45]}, index=[1, 2, 3, 4])
    data = {'Name': 'Sohail', …}
    newstd = pd.DataFrame(data, index=…)
    student_df = student_df…
(The rest of this answer is truncated in the source.)

Jitesh wants to sort a DataFrame df. He has written the following code.
    df = pd.DataFrame({"a": [13, 24, 43, 4], "b": [51, 26, 37, 48]})
    print(df)
    df.sort_values('a')
    print(df)
He is getting an output which shows the original DataFrame and not the sorted DataFrame. Identify the error and suggest the correction so that the sorted DataFrame is printed.
Ans: sort_values() returns a new sorted dataframe and leaves the original unmodified. The correct statement is:
    df.sort_values('a', inplace=True)

Write a command to display the name of the company and the highest car price from a DataFrame having data about cars.
Ans:
    import pandas as pd
    car = {'Name': ['Innova', 'Tavera', 'Royal', 'Scorpio'],
           'Price': [300000, 800000, 250000, 650000]}
    df = pd.DataFrame(car, index=[1, 2, 3, 4])
    print(df[df.Price == df.Price.max()])

Write a command in python to print the total number of records in the DataFrame.
Ans: print(df.count())
Note: count() reports the number of non-NaN values in each column; len(df) gives the number of rows.

Consider a DataFrame df created using the dictionary given below, and answer the questions given below:
    exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Lara', 'Kevin', 'Jonas'],
                 'score': [12.5, …, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
                 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 4],
                 'qualify': […, 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
(The second score and the first qualify value are illegible in the source.)

Write a command to display the rows having NaN values.
Ans: [missing in source]

Write a command to create a pivot table based on the "qualify" column and display the sum of the score and attempts columns.
Ans: print(df.pivot_table(columns='qualify', values=['score', 'attempts'], aggfunc='sum'))

[A further question about the students who have qualified is illegible in the source.]

Write a command to change the indices to 'zero', 'one', 'two', 'three' and 'four' respectively.
Ans: df = df.rename(index={0: 'zero', 1: 'one', 2: 'two', 3: 'three', 4: 'four'})

Write a command to compute the mean of every column of the data frame.
Ans: print(df.mean(axis=0))

Write a command to add one more row to the data frame with data [5, 12, 33, 3].
Ans:
    df2 = {'col1': 5, 'col2': 12, 'col3': 33, 'col4': 3}
    df = df.append(df2, ignore_index=True)
Note: DataFrame.append() was removed in pandas 2.0; there, use df = pd.concat([df, pd.DataFrame([df2])], ignore_index=True).

[Table of employees with columns including Name, Dept, Status and Salary; the values are largely illegible in the source.]
Consider the above data frame as df.

Write a Python code to calculate the average salary of the Regular employees and the Contract employees separately.
Ans: print(df.groupby('Status')['Salary'].mean())

Write a Python code to print the dataframe in descending order of Salary.
Ans:
    df = df.sort_values(by='Salary', ascending=False)
    print(df)

Write a Python code to update the Salary of all Contract employees to Rs 19000.
Ans: df.loc[df.Status == 'Contract', 'Salary'] = 19000

Write a Python code to count the total number of employees in each department.
Ans: print(df.groupby('Dept').count().Name)

Write a Python code to display the maximum salary of the "Contract" staff.
Ans: print(df[df['Status'] == 'Contract']['Salary'].max())

Write a Python code to display the 4th record.
Ans: print(df.iloc[3:4, :])

Write a Python code to delete the column Status.
Ans: del df['Status']

Write a Python code to display the maximum salary of the 'IT' department.
Ans: print(df[df.Dept == 'IT']['Salary'].max())

Write a Python code to delete the 1st and the last record.
Ans: df = df.drop([0, 4])

Consider a dataframe as follows:
[Table of integers, including negative values such as -29 and -63; largely illegible in the source.]
Write a Python code to:
Replace all negative numbers with 0.
Ans: df[df < 0] = 0
Count the number of elements which are greater than 50.
Ans: print(df[df > 50].count().sum())
Count the number of even numbers and the number of odd numbers.
Ans:
    print('No of Even Numbers:', df[df % 2 == 0].count().sum())
    print('No of Odd Numbers:', df[df % 2 == 1].count().sum())

Consider the following data frame df:
    employee   Sales    Quarter   State
    Sahay      125600   1         Delhi
    George     235600   1         Tamil Nadu
    Priya      213400   1         Kerala
    Manila     189000   1         Haryana
    Raina      456000   1         West Bengal
    Manila     172000   2         Haryana
    Priya      201400   2         Kerala

Write a Python program to create the above dataframe.
Ans:
    import pandas as pd
    data = {'employee': ['Sahay', 'George', 'Priya', 'Manila', 'Raina', 'Manila', 'Priya'],
            'Sales': [125600, 235600, 213400, 189000, 456000, 172000, 201400],
            'Quarter': [1, 1, 1, 1, 1, 2, 2],
            'State': ['Delhi', 'Tamil Nadu', 'Kerala', 'Haryana', 'West Bengal', 'Haryana', 'Kerala']}
    df = pd.DataFrame(data)
    print(df)

Write a Python program to find total sales per state.
Ans: print(df.groupby('State')['Sales'].sum())

Write a Python program to find total sales per employee.
Ans: print(df.groupby('employee')['Sales'].sum())

Write a Python program to find average sales both employee-wise and state-wise.
Ans: print(df.groupby(['employee', 'State'])['Sales'].mean())

Write a Python program to find the mean, median and minimum sale state-wise.
Ans:
    print(df.groupby('State')['Sales'].min())
    print(df.groupby('State')['Sales'].mean())
    print(df.groupby('State')['Sales'].median())

Write a Python program to find maximum sales quarter-wise.
Ans: print(df.groupby('Quarter')['Sales'].max())

Write a Python program to create a pivot table with State as the index, Sales as the values and calculating the maximum Sales in each State.
Ans: print(df.pivot_table(index='State', values='Sales', aggfunc='max'))
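
The state-wise statistics asked for in the last exercise can also be computed in a single pass. The following is a minimal sketch, assuming the sales data of that exercise; the variable names and the use of agg() with a list of function names are choices made here for illustration, not part of the worksheet's answers.

```python
import pandas as pd

# Sales data as listed in the worksheet's last exercise.
data = {
    "employee": ["Sahay", "George", "Priya", "Manila", "Raina", "Manila", "Priya"],
    "Sales": [125600, 235600, 213400, 189000, 456000, 172000, 201400],
    "Quarter": [1, 1, 1, 1, 1, 2, 2],
    "State": ["Delhi", "Tamil Nadu", "Kerala", "Haryana", "West Bengal", "Haryana", "Kerala"],
}
df = pd.DataFrame(data)

# Select the Sales column first so that only numeric data is aggregated.
sales_by_state = df.groupby("State")["Sales"]

# One row per state, one column per statistic.
stats_by_state = sales_by_state.agg(["sum", "mean", "median", "min"])

# Pivot table with State as index and the maximum Sales per state.
max_by_state = df.pivot_table(index="State", values="Sales", aggfunc="max")

print(stats_by_state)
print(max_by_state)
```

Selecting the column before aggregating avoids errors that mean() and median() can raise on text columns such as employee in recent pandas versions.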

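The merge questions in this worksheet ("How many rows will the resultant data frame have?") can be checked by running each join type and counting the rows. This is a minimal sketch; the two frames are an assumed reconstruction of the partly garbled listings (keys a, b, c, d against keys a, b, e, b), and the name row_counts is introduced here only for illustration.

```python
import pandas as pd

# Assumed reconstruction of the two frames used in the merge questions.
df1 = pd.DataFrame({"key": ["a", "b", "c", "d"], "value": [1, 2, 3, 4]})
df2 = pd.DataFrame({"key": ["a", "b", "e", "b"], "value": [5, 6, 7, 8]})

# Count the rows produced by each join type.
row_counts = {
    how: len(df1.merge(df2, on="key", how=how))
    for how in ("inner", "left", "right", "outer")
}
print(row_counts)
```

Key 'b' occurs twice in df2, so the single 'b' row of df1 is matched twice; this is why the joins return more rows than a count of distinct keys would suggest.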