23HCS4142 PDF

This document is a practical file for a Data Analysis and Visualization course using Python at Deen Dayal Upadhyaya College. It includes various programming tasks using NumPy and Pandas, such as creating arrays, performing statistical analysis, handling missing values, and visualizing data with plots. The file also contains instructions for working with Excel files and the Iris dataset to demonstrate data manipulation and visualization techniques.

Uploaded by

Rohan Singh

COMPUTER SCIENCE 1

PRACTICAL FILE

Data Analysis and Visualization using Python
(DSE:01)

Deen Dayal Upadhyaya College

(University of Delhi)
Sector-3, Dwarka · New Delhi-110078

Submitted To: Prof. Arpita Sharma, Prof. Deepak Mittal (CS Department)
Submitted By: Rohan Singh, Roll no. 23HCS4142, BSC CS(H)
Q1. Write programs in Python using NumPy library to do the following:

a) Create a two-dimensional array, ARR1 having random values from 0 to 1. Compute the
mean, standard deviation, and variance of ARR1 along the second axis.
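
A minimal sketch for part (a). The array shape and the seed are assumptions chosen for illustration; `axis=1` selects the second axis, so each statistic is computed per row.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed is an assumption, for repeatability
ARR1 = rng.random((4, 5))             # uniform values in [0, 1); shape is illustrative

# axis=1 is the second axis: one result per row
row_mean = ARR1.mean(axis=1)
row_std = ARR1.std(axis=1)
row_var = ARR1.var(axis=1)

print(row_mean, row_std, row_var, sep="\n")
```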

b) Create a 2-dimensional array of size m x n integer elements, also print the shape, type and
data type of the array and then reshape it into an n x m array, where n and m are user
inputs given at the run time.
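
One possible sketch for part (b). The helper name `build_and_reshape` and the fill values are hypothetical; at run time the two sizes could be read with `int(input(...))`, but they are fixed here so the example runs unattended.

```python
import numpy as np

def build_and_reshape(m, n):
    """Return an m x n integer array and the same data reshaped to n x m."""
    arr = np.arange(m * n).reshape(m, n)   # fill values are illustrative
    print("shape:", arr.shape)             # (m, n)
    print("type:", type(arr))              # numpy.ndarray
    print("data type:", arr.dtype)         # an integer dtype
    return arr, arr.reshape(n, m)

arr, reshaped = build_and_reshape(2, 3)
print(reshaped)
```

Note that `reshape` keeps the elements in the same order and only changes the shape; it is not a transpose.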

c) Test whether the elements of a given 1D array are zero, non-zero and NaN. Record the
indices of these elements in three separate arrays.
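
A short sketch for part (c), using assumed sample data. Because `NaN != 0` evaluates to True, NaN entries have to be excluded explicitly from the non-zero set.

```python
import numpy as np

arr = np.array([0.0, 3.5, np.nan, 0.0, -2.0, np.nan, 7.0])  # sample data (assumed)

nan_idx = np.flatnonzero(np.isnan(arr))
zero_idx = np.flatnonzero(arr == 0)
# Exclude NaN here, otherwise it would be counted as non-zero
nonzero_idx = np.flatnonzero((arr != 0) & ~np.isnan(arr))

print(zero_idx, nonzero_idx, nan_idx)
```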

d) Create three random arrays of the same size: Array1, Array2 and Array3. Subtract Array2
from Array3 and store in Array4. Create another array Array5 having two times the
values in Array1. Find Covariance and Correlation of Array1 with Array4 and Array5
respectively.
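
A sketch for part (d), assuming arrays of length 10 and reading "respectively" as covariance of Array1 with Array4 and correlation of Array1 with Array5.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # seed and size are assumptions
Array1 = rng.random(10)
Array2 = rng.random(10)
Array3 = rng.random(10)

Array4 = Array3 - Array2
Array5 = 2 * Array1

# np.cov and np.corrcoef return 2x2 matrices; [0, 1] is the cross term
cov_1_4 = np.cov(Array1, Array4)[0, 1]
corr_1_5 = np.corrcoef(Array1, Array5)[0, 1]

print(cov_1_4, corr_1_5)
```

Since Array5 is an exact positive multiple of Array1, the correlation comes out as 1 up to floating-point error.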

e) Create two random arrays of the same size 10: Array1, and Array2. Find the sum of the
first half of both the arrays and product of the second half of both the arrays.
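
The wording of part (e) is open to interpretation. The sketch below assumes it means the element-wise sum of the two first halves and the element-wise product of the two second halves.

```python
import numpy as np

rng = np.random.default_rng(seed=2)   # seed is an assumption
Array1 = rng.random(10)
Array2 = rng.random(10)

half = Array1.size // 2
first_half_sum = Array1[:half] + Array2[:half]        # element-wise sum
second_half_product = Array1[half:] * Array2[half:]   # element-wise product

print(first_half_sum, second_half_product, sep="\n")
```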

f) Create an array with random values. Determine the size of the memory occupied by the
array.
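
For part (f), the `nbytes` attribute reports the size of the array's data buffer. The shape below is an assumption.

```python
import numpy as np

arr = np.random.default_rng(seed=3).random((3, 4))   # float64 values

# nbytes = number of elements * bytes per element (object overhead excluded)
print(arr.size, arr.itemsize, arr.nbytes)
```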

g) Create a 2-dimensional array of size m x n having integer elements in the range (10,100).
Write statements to swap any two rows, reverse a specified column and store updated
array in another variable
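
A sketch for part (g), with the array size, the swapped rows (0 and 2) and the reversed column (1) all chosen arbitrarily. The work is done on a copy so the original array stays unchanged.

```python
import numpy as np

rng = np.random.default_rng(seed=4)         # seed is an assumption
m, n = 4, 3                                 # illustrative size
arr = rng.integers(10, 100, size=(m, n))    # 10 <= value < 100

updated = arr.copy()
updated[[0, 2]] = updated[[2, 0]]           # swap rows 0 and 2
updated[:, 1] = updated[::-1, 1].copy()     # reverse column 1

print(arr, updated, sep="\n\n")
```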

Q2. Do the following using PANDAS Series:

a. Create a series with 5 elements. Display the series sorted on index and also sorted on
values separately
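
A minimal sketch for part (a) with assumed values and labels:

```python
import pandas as pd

s = pd.Series([40, 10, 50, 20, 30], index=["d", "a", "e", "b", "c"])

by_index = s.sort_index()    # ordered by label
by_value = s.sort_values()   # ordered by value, ascending

print(by_index, by_value, sep="\n\n")
```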

b. Create a series with N elements with some duplicate values. Find the minimum and
maximum ranks assigned to the values using ‘first’ and ‘max’ methods
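
A sketch for part (b) with assumed sample values. With `method="first"`, ties are ranked in order of appearance; with `method="max"`, every tied value receives the highest rank in its group.

```python
import pandas as pd

s = pd.Series([7, 3, 7, 1, 3, 7])   # sample values with duplicates (assumed)

rank_first = s.rank(method="first")
rank_max = s.rank(method="max")

print(rank_first.tolist())   # [4.0, 2.0, 5.0, 1.0, 3.0, 6.0]
print(rank_max.tolist())     # [6.0, 3.0, 6.0, 1.0, 3.0, 6.0]
print(rank_first.min(), rank_first.max(), rank_max.min(), rank_max.max())
```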

c. Display the index value of the minimum and maximum element of a Series
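
For part (c), `idxmin` and `idxmax` return the index labels of the extreme values. The data is assumed.

```python
import pandas as pd

s = pd.Series([12, 5, 30, 8], index=["w", "x", "y", "z"])

min_label = s.idxmin()
max_label = s.idxmax()

print(min_label, max_label)   # x y
```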

Q3. Create a data frame having at least 3 columns and 50 rows to store numeric data
generated using a random function. Replace 10% of the values by null values whose
index positions are generated using random function.
Do the following:

a. Identify and count missing values in a data frame
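
One way to build the Q3 frame and answer part (a). The column names and seed are assumptions. The null positions are drawn without replacement, so exactly 10% of the 150 cells become NaN.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=5)     # seed is an assumption
data = rng.random((50, 3))

# Choose 10% of the flat cell positions at random and blank them out
flat_pos = rng.choice(data.size, size=data.size // 10, replace=False)
data.flat[flat_pos] = np.nan

df = pd.DataFrame(data, columns=["A", "B", "C"])   # column names assumed

missing_mask = df.isna()                 # True where a value is missing
missing_per_column = missing_mask.sum()
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("total missing:", total_missing)
```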

b. Drop the column having more than 5 null values.
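
A sketch for part (b) on a small stand-in frame (the data is assumed), where only column B has more than 5 nulls:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "B": [np.nan] * 6 + [1.0, 2.0],
    "C": np.arange(8, dtype=float),
})

null_counts = df.isna().sum()
trimmed = df.drop(columns=null_counts[null_counts > 5].index)

print(trimmed.columns.tolist())   # ['A', 'C']
```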

c. Identify the row label having maximum of the sum of all values in a row and drop that
row.
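
A sketch for part (c) with assumed data and row labels. `sum(axis=1)` skips NaN values by default.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1.0, 9.0, 2.0], "B": [4.0, 8.0, np.nan], "C": [0.5, 7.0, 3.0]},
    index=["r1", "r2", "r3"],
)

row_sums = df.sum(axis=1)          # NaN is ignored in each row sum
max_label = row_sums.idxmax()      # label of the row with the largest sum
remaining = df.drop(index=max_label)

print(max_label)                   # r2
print(remaining)
```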

d. Sort the data frame on the basis of the first column.

e. Remove all duplicates from the first column.
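
Parts (d) and (e) can be sketched on a small assumed frame. `drop_duplicates` keeps the first occurrence of each value unless told otherwise.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 1, 3, 2, 1], "B": [10, 20, 30, 40, 50]})
first_col = df.columns[0]

sorted_df = df.sort_values(by=first_col)          # part (d)
deduped = df.drop_duplicates(subset=first_col)    # part (e), keeps first hit

print(sorted_df, deduped, sep="\n\n")
```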

f. Find the correlation between first and second column and covariance between second and
third column.

g. Discretize the second column and create 5 bins.
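
A sketch for parts (f) and (g) on assumed data. `Series.corr` computes the Pearson coefficient by default, `Series.cov` the sample covariance, and `pd.cut` with `bins=5` creates five equal-width intervals.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "C": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
})

corr_ab = df["A"].corr(df["B"])      # first vs second column
cov_bc = df["B"].cov(df["C"])        # second vs third column
binned = pd.cut(df["B"], bins=5)     # five equal-width bins

print(corr_ab, cov_bc)
print(binned.value_counts(sort=False))
```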

Q4. Consider two excel files having attendance of two workshops, each of duration 5 days.
Each file has three fields ‘Name’, ‘Date’, ‘Duration’ (in minutes) where names may be
repetitive within a file. Note that duration may take one of three values (30, 40, 50) only.
Import the data into two data frames and do the following:

a. Perform merging of the two data frames to find the names of students who had attended
both workshops.
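
A sketch for part (a). The two frames below are hypothetical stand-ins for what `pd.read_excel` would return from the attendance files. Names are de-duplicated first so the inner merge yields each student once.

```python
import pandas as pd

# Hypothetical attendance data; real data would come from pd.read_excel(...)
w1 = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Asha", "Meena"],
    "Date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"],
    "Duration": [30, 40, 50, 30],
})
w2 = pd.DataFrame({
    "Name": ["Ravi", "Kabir", "Ravi", "Asha"],
    "Date": ["2024-02-01", "2024-02-01", "2024-02-02", "2024-02-02"],
    "Duration": [50, 30, 40, 40],
})

names1 = w1[["Name"]].drop_duplicates()
names2 = w2[["Name"]].drop_duplicates()
both = names1.merge(names2, on="Name", how="inner")

print(both["Name"].tolist())
```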

b. Find names of all students who have attended a single workshop only.
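
For part (b), one option is an outer merge with `indicator=True`, which adds a `_merge` column saying whether each name came from the left frame, the right frame, or both. The name lists are hypothetical.

```python
import pandas as pd

# Hypothetical unique attendee names per workshop
names1 = pd.DataFrame({"Name": ["Asha", "Ravi", "Meena"]})
names2 = pd.DataFrame({"Name": ["Ravi", "Kabir", "Asha"]})

merged = names1.merge(names2, on="Name", how="outer", indicator=True)
single = merged.loc[merged["_merge"] != "both", "Name"]

print(single.tolist())
```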

c. Merge two data frames row-wise and find the total number of records in the data frame.

d. Merge two data frames row-wise and use two columns viz. names and dates as multi-row
indexes. Generate descriptive statistics for this hierarchical data frame.
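
Parts (c) and (d) can be sketched together with `pd.concat` for the row-wise merge and `set_index` for the two-level index. The frames are hypothetical stand-ins for the Excel data.

```python
import pandas as pd

# Hypothetical attendance data
w1 = pd.DataFrame({"Name": ["Asha", "Ravi"],
                   "Date": ["2024-01-01", "2024-01-02"],
                   "Duration": [30, 40]})
w2 = pd.DataFrame({"Name": ["Ravi", "Kabir"],
                   "Date": ["2024-02-01", "2024-02-01"],
                   "Duration": [50, 30]})

combined = pd.concat([w1, w2], ignore_index=True)   # part (c)
total_records = len(combined)

hier = combined.set_index(["Name", "Date"]).sort_index()   # part (d)
stats = hier.describe()

print(total_records)
print(stats)
```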

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download IRIS
data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn datasets)

a. Load data into a pandas data frame. Use the DataFrame.info() method to look at the info on
datatypes in the dataset.

b. Find the number of missing values in each column (Check number of null values in a
column using df.isnull().sum())
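
A sketch for parts (a) and (b). To stay self-contained it uses a few stand-in rows with assumed column names and deliberately inserted NaN values; the full data could instead be loaded with `sklearn.datasets.load_iris(as_frame=True).frame`.

```python
import numpy as np
import pandas as pd

# Stand-in rows, not the real Iris data; NaN values added for illustration
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, np.nan, 6.3],
    "petal_width": [0.2, np.nan, np.nan, 2.5],
    "species": ["setosa", "setosa", "versicolor", "virginica"],
})

df.info()                      # prints dtypes and non-null counts
missing = df.isnull().sum()    # missing values per column
print(missing)
```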

c. Plot bar chart to show the frequency of each class label in the data.

d. Draw a scatter plot for Petal Length vs Sepal Length and fit a regression line
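
The regression line in part (d) can be obtained with a degree-1 `np.polyfit`. The values below are stand-ins built to be exactly linear, not real Iris measurements, and the plotting calls are left as comments.

```python
import numpy as np

# Stand-in values (assumed), constructed so that petal = 2 * sepal - 8
sepal_length = np.array([5.0, 5.5, 6.0, 6.5, 7.0])
petal_length = 2.0 * sepal_length - 8.0

# Least-squares straight line: petal_length ~ slope * sepal_length + intercept
slope, intercept = np.polyfit(sepal_length, petal_length, deg=1)
fitted = slope * sepal_length + intercept

# With matplotlib, plt.scatter(sepal_length, petal_length) followed by
# plt.plot(sepal_length, fitted) would draw the points and the line.
print(slope, intercept)
```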

e. Plot density distribution for feature Petal width.

f. Use a pair plot to show pairwise bivariate distribution in the Iris Dataset.

g. Draw heatmap for any two numeric attributes

h. Compute mean, mode, median, standard deviation, confidence interval and standard error
for each numeric feature
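
A sketch for part (h) using pandas only. The data is a small stand-in, and the 95% confidence interval uses the normal approximation (z = 1.96), which is an assumption; a t-based interval would be wider for small samples.

```python
import pandas as pd

# Stand-in numeric features (assumed values)
df = pd.DataFrame({
    "sepal_length": [4.0, 6.0, 6.0, 8.0],
    "petal_width": [1.0, 1.0, 2.0, 4.0],
})

summary = pd.DataFrame({
    "mean": df.mean(),
    "median": df.median(),
    "mode": df.mode().iloc[0],   # first mode of each column
    "std": df.std(),             # sample standard deviation (ddof=1)
    "sem": df.sem(),             # standard error of the mean
})
# Normal-approximation 95% confidence interval for the mean
summary["ci_low"] = summary["mean"] - 1.96 * summary["sem"]
summary["ci_high"] = summary["mean"] + 1.96 * summary["sem"]

print(summary)
```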

i. Compute correlation coefficients between each pair of features and plot heatmap
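
For part (i), `DataFrame.corr()` returns the pairwise Pearson correlation matrix. The stand-in columns below are assumed and built to give correlations of exactly +1 and -1.

```python
import pandas as pd

# Stand-in features (assumed values)
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],   # rises with a
    "c": [4.0, 3.0, 2.0, 1.0],   # falls as a rises
})

corr = df.corr()
print(corr)

# The matrix could then be drawn with, for example,
# seaborn.heatmap(corr, annot=True)
```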

Q6. Using the Titanic dataset, do the following:

a. Clean the data by dropping the column which has the largest number of missing values.
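
The sub-questions of Q6 refer to passengers, class, fare and survival, so this sketch assumes a Titanic-style passenger table. The column names and rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical passenger rows
df = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0],
    "Cabin": [np.nan, np.nan, "C85"],
    "Fare": [7.25, 13.0, 71.3],
})

worst = df.isna().sum().idxmax()     # column with the most missing values
cleaned = df.drop(columns=worst)

print(worst)                         # Cabin
print(cleaned.columns.tolist())      # ['Age', 'Fare']
```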

b. Find total number of passengers with age more than 30

c. Find total fare paid by passengers of second class
d. Compare number of survivors of each passenger class
e. Compute descriptive statistics for age attribute gender wise
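
Parts (b) to (e) can be sketched with boolean masks and `groupby`. The column names (`Pclass`, `Survived`, `Sex`, `Age`, `Fare`) and all rows are assumptions, not the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical passenger rows
df = pd.DataFrame({
    "Pclass": [1, 2, 3, 2, 1, 3],
    "Survived": [1, 0, 0, 1, 1, 0],
    "Sex": ["female", "male", "male", "female", "male", "female"],
    "Age": [38.0, 45.0, 22.0, 31.0, 29.0, np.nan],
    "Fare": [71.0, 13.0, 8.0, 26.0, 30.0, 7.0],
})

over_30 = int((df["Age"] > 30).sum())                         # part (b)
second_class_fare = df.loc[df["Pclass"] == 2, "Fare"].sum()   # part (c)
survivors_by_class = df.groupby("Pclass")["Survived"].sum()   # part (d)
age_by_sex = df.groupby("Sex")["Age"].describe()              # part (e)

print(over_30, second_class_fare)
print(survivors_by_class)
print(age_by_sex)
```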

f. Draw a scatter plot for passenger fare paid by Female and Male passengers separately

g. Compare density distribution for features age and passenger fare

h. Draw the pie chart for three groups labelled as class 1, class 2, class 3 respectively
displayed in different colors. The occurrence of each group converted into percentage
should be displayed in the pie chart. Appropriately Label the chart.

i. Find % of survived passengers for each class and answer the question “Did class play a
role in survival?”
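
For part (i), the mean of a 0/1 `Survived` column within each class gives the survival fraction. The rows and column names are hypothetical, so the toy numbers illustrate the method only and do not answer the question for the real data.

```python
import pandas as pd

# Hypothetical passenger rows
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 2, 3, 3, 3, 3],
    "Survived": [1, 1, 1, 0, 0, 0, 1, 0],
})

survival_pct = df.groupby("Pclass")["Survived"].mean() * 100
print(survival_pct)
```

Clearly different percentages across classes on the full dataset would suggest that class played a role.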

Q7. Consider the following data frame containing a family name, gender of the family
member and her/his monthly income in each record.

a. Calculate and display familywise gross monthly income.

b. Display the highest and lowest monthly income for each family name.
c. Calculate and display monthly income of all members earning income less than Rs.
80000.00.
d. Display total number of females along with their average monthly income.
e. Delete rows with Monthly income less than the average income of all members.
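
The data frame referred to in Q7 did not survive extraction, so the sketch below uses hypothetical records and column names (`Name`, `Gender`, `MonthlyIncome`) to illustrate parts (a) to (e).

```python
import pandas as pd

# Hypothetical records; the original table is not available
df = pd.DataFrame({
    "Name": ["Shah", "Vats", "Shah", "Kumar", "Vats"],
    "Gender": ["Male", "Female", "Female", "Male", "Male"],
    "MonthlyIncome": [100000, 60000, 90000, 70000, 120000],
})

gross = df.groupby("Name")["MonthlyIncome"].sum()                     # (a)
high_low = df.groupby("Name")["MonthlyIncome"].agg(["max", "min"])    # (b)
below_80k = df[df["MonthlyIncome"] < 80000]                           # (c)

females = df[df["Gender"] == "Female"]                                # (d)
female_count = len(females)
female_avg = females["MonthlyIncome"].mean()

average = df["MonthlyIncome"].mean()                                  # (e)
kept = df[df["MonthlyIncome"] >= average]   # rows below average removed

print(gross, high_low, below_80k, sep="\n\n")
print(female_count, female_avg)
print(kept)
```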