Data Science Practical Manual

Uploaded by

Medha Bandi
Copyright
© All Rights Reserved

1. Write a program to find the mean absolute deviation for the given data set.

[26,46,56,45,19,22,24].

PROCEDURE:
Step 1: Calculate the mean.
Step 2: Calculate the distance of each data point from the mean, taking the
absolute value of each difference.
Step 3: Calculate the mean of the distances.
OUTPUT:
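The three steps above can be sketched in Python as follows. This is a minimal illustration, not the manual's original program; the function name mean_absolute_deviation is chosen here for clarity.

```python
def mean_absolute_deviation(values):
    """Return the mean of the absolute distances from the mean."""
    mean = sum(values) / len(values)                 # Step 1: mean
    distances = [abs(x - mean) for x in values]      # Step 2: absolute distances
    return sum(distances) / len(distances)           # Step 3: mean of the distances


data = [26, 46, 56, 45, 19, 22, 24]
mean = sum(data) / len(data)
mad = mean_absolute_deviation(data)

print(f"Mean: {mean}")                               # Mean: 34.0
print(f"Mean absolute deviation: {mad:.4f}")         # Mean absolute deviation: 12.8571
```

For this data set the mean is 238 / 7 = 34 and the absolute distances sum to 90, so the mean absolute deviation is 90 / 7.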
2. Write a program to find standard deviation for the following data set.

There are 39 plants in the garden. A few plants were selected randomly and their heights
in cm were recorded as follows: 1,2,3,5,8. Calculate the standard deviation of their
heights.

PROCEDURE:
Step 1: Calculate the mean by adding up all the data values and dividing the sum by the
number of values.
Step 2: Subtract the mean from every value.
Step 3: Square each of the differences.
Step 4: Find the average of the squared differences from Step 3 to get the
variance.
Step 5: Lastly, find the square root of the variance. That is the standard deviation.
OUTPUT:

DATA SET   STEP 1: MEAN   STEP 2: DISTANCE   STEP 3: SQUARE OF DISTANCE   STEP 4: VARIANCE   STEP 5: STANDARD DEVIATION
1          3.8            2.8                7.84                         6.16               2.4819347
2                         1.8                3.24
3                         0.8                0.64
5                         1.2                1.44
8                         4.2                17.64
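The five steps can be sketched in Python as below. This is an illustrative sketch rather than the manual's own program. It follows the procedure as written, dividing by the number of values (the population formula). The cross-check against statistics.pstdev and the comparison with statistics.stdev, which divides by N - 1, are additions made here.

```python
import math
import statistics


def population_std_dev(values):
    """Follow Steps 1-5: mean, differences, squares, variance, square root."""
    mean = sum(values) / len(values)                  # Step 1
    differences = [x - mean for x in values]          # Step 2
    squares = [d ** 2 for d in differences]           # Step 3
    variance = sum(squares) / len(squares)            # Step 4 (divides by N)
    return mean, variance, math.sqrt(variance)        # Step 5


heights = [1, 2, 3, 5, 8]
mean, variance, std_dev = population_std_dev(heights)

print(f"Mean: {mean:.2f}")                            # Mean: 3.80
print(f"Variance: {variance:.2f}")                    # Variance: 6.16
print(f"Standard deviation: {std_dev:.7f}")           # Standard deviation: 2.4819347

# Cross-check with the standard library (population formula).
print(f"statistics.pstdev: {statistics.pstdev(heights):.7f}")
# For comparison only: the sample formula divides by N - 1 instead of N.
print(f"statistics.stdev:  {statistics.stdev(heights):.7f}")
```

Because the heights come from a random selection of plants, the sample formula may be the more appropriate choice in practice; it gives a slightly larger value than the one obtained by the procedure above.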
3. Write a program to collect data. Analyse it and interpret the result. Consider
the following data set for the statistical problem-solving process.

Consider that you have a food event in your residential society. Perform detailed
analysis and interpret what should be the top five cuisines that most people in the
society prefer for this event.

PROCEDURE:
Step 1: Formulate Statistical Investigative Questions
Step 2: Collect/Consider the Data
Step 3: Analyse the Data
Step 4: Interpret the Data
OUTPUT:
Data collected from each block of the apartment:

Consolidated data for analysis:

Data interpretation:
1. How many are interested in South Indian Cuisine?
26
2. How many people are interested in Chinese cuisine from block 3?
3
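
The consolidation and interpretation steps could be sketched in Python as follows. The survey tables themselves are not reproduced in this copy of the manual, so the responses below are hypothetical placeholders, as are the block names and the helper name consolidate. They are not the manual's data and will not reproduce the counts 26 and 3 given above.

```python
from collections import Counter

# HYPOTHETICAL responses for illustration only (not the manual's survey data).
responses_by_block = {
    "Block 1": ["South Indian", "Chinese", "South Indian", "Italian"],
    "Block 2": ["North Indian", "South Indian", "Chinese", "Mexican"],
    "Block 3": ["Chinese", "Continental", "North Indian", "South Indian"],
}


def consolidate(responses):
    """Step 3: combine the per-block responses into one tally per cuisine."""
    totals = Counter()
    for votes in responses.values():
        totals.update(votes)
    return totals


totals = consolidate(responses_by_block)

# Step 4: interpret the consolidated data.
top_five = totals.most_common(5)
south_indian_total = totals["South Indian"]
chinese_block_3 = responses_by_block["Block 3"].count("Chinese")

print("Top five cuisines:", top_five)
print("Interested in South Indian cuisine:", south_indian_total)
print("Interested in Chinese cuisine from Block 3:", chinese_block_3)
```

Replacing the placeholder dictionary with the responses actually collected from each block gives the consolidated counts and the top five cuisines for the event.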
4. Write a program to apply the central limit theorem to the following data.
In a country in the Middle East region, the recorded weights of the male population
follow a normal distribution. The mean and the standard deviation are 70 kg and 15
kg, respectively. If a person wants to examine the records of 50 males from the population,
what would be the mean and the standard deviation of the chosen sample?
PROCEDURE:
Step 1: Draw groups of people at random from the population. Each group is called a sample.
We will draw multiple samples in this case, each consisting of 50 people as in the problem.
Step 2: Calculate the individual mean of each sample set.
Step 3: Calculate the mean of these sample means.
Step 4: Plot a histogram of the sample mean weights. It will
resemble a normal distribution.
The formula for the central limit theorem is:

μx̄ = μ
σx̄ = σ / √n

μ = Population mean
σ = Population standard deviation
μx̄ = Mean of the sample means
σx̄ = Standard deviation of the sample means (standard error)
n = Sample size
OUTPUT:
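A minimal Python sketch of this exercise is given below. It computes the mean and standard deviation of the sample mean from the population values in the problem (70 kg, 15 kg, n = 50) and then checks them with a small simulation. The simulation settings (2000 samples, seed 0) and the function names are assumptions made here for illustration, not part of the manual.

```python
import math
import random
import statistics

MU = 70.0      # population mean in kg (from the problem statement)
SIGMA = 15.0   # population standard deviation in kg (from the problem statement)
N = 50         # sample size (from the problem statement)


def clt_parameters(mu, sigma, n):
    """Mean and standard deviation of the sample mean: mu and sigma / sqrt(n)."""
    return mu, sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, num_samples=2000, seed=0):
    """Steps 1-2: draw many random samples of size n and record each sample mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


mean_of_means, std_error = clt_parameters(MU, SIGMA, N)
print(f"Mean of the sample mean: {mean_of_means:.1f} kg")             # 70.0 kg
print(f"Standard deviation of the sample mean: {std_error:.4f} kg")   # 2.1213 kg

# Step 3: the mean of the simulated sample means should be close to 70,
# and their spread should be close to 15 / sqrt(50).
sample_means = simulate_sample_means(MU, SIGMA, N)
print(f"Simulated mean of sample means: {statistics.fmean(sample_means):.3f}")
print(f"Simulated std of sample means:  {statistics.pstdev(sample_means):.3f}")
```

Plotting a histogram of sample_means (Step 4) would show the bell shape described in the procedure; the plot is left out here to keep the sketch free of third-party libraries.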
5. Write a program to find the quartiles for the following dataset with an odd number of values.
34 24 43 5 58 81 29 90 22 67 32 88 57 34 43 44 91 24 62
PROCEDURE:
Step 1: Sort in Ascending Order
Step 2: Find N
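Step 3: Calculate Lower Quartile (Q1)
Position of Lower Quartile (Q1) = (N + 1) × 1/4
Step 4: Calculate Middle Quartile (Q2)
Position of Middle Quartile (Q2) = (N + 1) × 2/4
Step 5: Calculate Upper Quartile (Q3)
Position of Upper Quartile (Q3) = (N + 1) × 3/4
Read each quartile value from the sorted data at the calculated position.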
OUTPUT:

N = 19

POSITION   DATASET   SORTED DATA
1          34        5
2          24        22
3          43        24
4          5         24
5          58        29
6          81        32
7          29        34
8          90        34
9          22        43
10         67        43
11         32        44
12         88        57
13         57        58
14         34        62
15         43        67
16         44        81
17         91        88
18         24        90
19         62        91

           Q1   Q2   Q3
POSITION   5    10   15
DATA       29   43   67

INTERQUARTILE RANGE: Q3 - Q1 = 67 - 29 = 38
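
The position method above can be sketched in Python as follows. The quartile values are read from the sorted data at positions 5, 10 and 15, which gives Q1 = 29, Q2 = 43 and Q3 = 67. The helper value_at_position and its linear interpolation for fractional positions are assumptions added here; the manual only covers the case where the positions are whole numbers.

```python
import statistics


def value_at_position(sorted_values, position):
    """Return the value at a 1-based position in sorted data.

    Fractional positions are handled by linear interpolation between the two
    neighbouring values (an assumption; not needed for N = 19).
    """
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


def quartiles(values):
    """Q1, Q2, Q3 at positions (N + 1) * k / 4 in the sorted data."""
    ordered = sorted(values)                            # Step 1
    n = len(ordered)                                    # Step 2
    return tuple(value_at_position(ordered, (n + 1) * k / 4) for k in (1, 2, 3))


data = [34, 24, 43, 5, 58, 81, 29, 90, 22, 67, 32, 88, 57, 34, 43, 44, 91, 24, 62]
q1, q2, q3 = quartiles(data)

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")               # Q1 = 29, Q2 = 43, Q3 = 67
print(f"Interquartile range = {q3 - q1}")               # Interquartile range = 38

# Cross-check: the standard library's default 'exclusive' method uses the same positions.
print(statistics.quantiles(data, n=4))                  # [29.0, 43.0, 67.0]
```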
6. Write a program to find the quartiles for the following dataset with an even number of values.
54 28 76 64 41 83 19 71 37 58
PROCEDURE:
Step 1: Sort in Ascending Order
Step 2: Find N
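Step 3: Calculate Middle Quartile (Q2), that is, the median of the dataset
Position of Middle Quartile (Q2) = (N + 1)/2. For even N, average the values at positions N/2 and N/2 + 1.
Step 4: Split the dataset into a first half and a second half
Step 5: Calculate Lower Quartile (Q1) as the median of the first half of the dataset.
Position of Lower Quartile (Q1) = (n + 1)/2, where n is the number of values in the first half
Step 6: Calculate Upper Quartile (Q3) as the median of the second half of the dataset.
Position of Upper Quartile (Q3) = (n + 1)/2, where n is the number of values in the second half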
OUTPUT:

N = 10

POSITION   DATASET   SORTED DATA
1          54        19
2          28        28
3          76        37
4          64        41
5          41        54
6          83        58
7          19        64
8          71        71
9          37        76
10         58        83

Q2
POSITION   5.5 (between positions 5 and 6)
DATA       (54 + 58)/2 = 56

POSITION IN HALF   FIRST HALF   LAST HALF
1                  19           58
2                  28           64
3                  37           71
4                  41           76
5                  54           83

           Q1   Q3
POSITION   3    3
DATA       37   71

INTERQUARTILE RANGE: Q3 - Q1 = 71 - 37 = 34
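
The split-into-halves method for this dataset can be sketched in Python as below. The function name quartiles_by_halves is chosen here for illustration. The handling of an odd number of values (leaving the median out of both halves) is an assumption, since this exercise only covers the even case.

```python
import statistics


def quartiles_by_halves(values):
    """Q2 is the median; Q1 and Q3 are the medians of the lower and upper halves."""
    ordered = sorted(values)                         # Step 1
    n = len(ordered)                                 # Step 2
    q2 = statistics.median(ordered)                  # Step 3
    half = n // 2                                    # Step 4
    first_half = ordered[:half]
    # Assumption for odd n: the median itself is left out of both halves.
    second_half = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    q1 = statistics.median(first_half)               # Step 5
    q3 = statistics.median(second_half)              # Step 6
    return q1, q2, q3


data = [54, 28, 76, 64, 41, 83, 19, 71, 37, 58]
q1, q2, q3 = quartiles_by_halves(data)

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")            # Q1 = 37, Q2 = 56.0, Q3 = 71
print(f"Interquartile range = {q3 - q1}")            # Interquartile range = 34
```

Library routines such as statistics.quantiles use interpolation rules that can give different quartiles for even-sized data, so their results need not match this method.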
7. Write a program to find the decile for the following data set.
4 9 10 10 12 13 88 90 91 96 99 100 16 49 49 52 55 58 60 60 63 64 65 65 65 73 75 81 83
84 86 17 26 27 33 38 42 43 46
PROCEDURE:
Step 1: Arrange the data set in ascending order.
Step 2: Assign a position to each data point.
Step 3: Calculate the position of each decile using the formula
Position of Di = (N + 1) * i / 10
Step 4: Calculate the deciles from D1 to D9
OUTPUT:
n = 39

Position   Data Set
1          4
2          9
3          10
4          10
5          12
6          13
7          16
8          17
9          26
10         27
11         33
12         38
13         42
14         43
15         46
16         49
17         49
18         52
19         55
20         58
21         60
22         60
23         63
24         64
25         65
26         65
27         65
28         73
29         75
30         81
31         83
32         84
33         86
34         88
35         90
36         91
37         96
38         99
39         100

Decile   Position   Data
D1       4          10
D2       8          17
D3       12         38
D4       16         49
D5       20         58
D6       24         64
D7       28         73
D8       32         84
D9       36         91
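
The decile procedure can be sketched in Python as follows. With n = 39, every position (N + 1) * i / 10 is a whole number, so each decile is read directly from the sorted data. The function name deciles and the linear interpolation used for fractional positions are assumptions added here for datasets where the positions are not whole numbers.

```python
import statistics


def deciles(values):
    """Return {i: (position, value)} for D1..D9 using position (N + 1) * i / 10."""
    ordered = sorted(values)                            # Step 1
    n = len(ordered)
    result = {}
    for i in range(1, 10):                              # Step 4: D1 to D9
        position = (n + 1) * i / 10                     # Step 3: 1-based position
        lower = int(position)
        fraction = position - lower
        value = ordered[lower - 1]
        if fraction:
            # Assumption: interpolate linearly between neighbours.
            value += fraction * (ordered[lower] - ordered[lower - 1])
        result[i] = (position, value)
    return result


data = [4, 9, 10, 10, 12, 13, 88, 90, 91, 96, 99, 100, 16, 49, 49, 52, 55, 58,
        60, 60, 63, 64, 65, 65, 65, 73, 75, 81, 83, 84, 86, 17, 26, 27, 33, 38,
        42, 43, 46]

decile_table = deciles(data)
for i, (position, value) in decile_table.items():
    print(f"D{i}: position {position:.0f}, value {value}")
# First line printed: D1: position 4, value 10
# Last line printed:  D9: position 36, value 91

# Cross-check: the standard library's default 'exclusive' method uses the same positions.
print(statistics.quantiles(data, n=10))
```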
