Data Science Practical Manual
[26,46,56,45,19,22,24].
PROCEDURE:
Step 1: Calculate the mean.
Step 2: Calculate the distance of each data point from the mean, taking the absolute
value of each difference.
Step 3: Calculate the mean of the distances.
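The three steps above can be sketched in Python as follows. This is a minimal illustration, assuming the list given at the start of this exercise is the data set; the function name mean_absolute_deviation is chosen here for illustration and does not come from the manual.

```python
def mean_absolute_deviation(values):
    """Return the mean of the absolute distances from the mean."""
    mean = sum(values) / len(values)               # Step 1: mean
    distances = [abs(v - mean) for v in values]    # Step 2: absolute distances
    return sum(distances) / len(distances)         # Step 3: mean of the distances


data = [26, 46, 56, 45, 19, 22, 24]
print(round(mean_absolute_deviation(data), 2))     # prints 12.86
```

For this list the mean is 34, the absolute distances add up to 90, and 90 / 7 is approximately 12.86.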
OUTPUT:
2. Write a program to find standard deviation for the following data set.
There are 39 plants in the garden. A few plants were selected randomly and their heights
in cm were recorded as follows: 1,2,3,5,8. Calculate the standard deviation of their
heights.
PROCEDURE:
Step 1: Calculate the mean by adding up all the data values and dividing the total by the
number of values.
Step 2: Subtract the mean from every value.
Step 3: Square each of the differences.
Step 4: Find the average of the squared differences calculated in Step 3 to find the
variance.
Step 5: Lastly, find the square root of the variance. That is the standard deviation.
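A minimal Python sketch of the five steps above, applied to the plant heights from the question. The function name population_std is an illustrative choice. It divides by the number of values, as Step 4 describes; a sample standard deviation would divide by one less than that and give a larger result.

```python
import math


def population_std(values):
    """Standard deviation computed by the five steps of the procedure."""
    mean = sum(values) / len(values)           # Step 1: mean
    differences = [v - mean for v in values]   # Step 2: subtract the mean
    squares = [d * d for d in differences]     # Step 3: square the differences
    variance = sum(squares) / len(squares)     # Step 4: average of the squares
    return math.sqrt(variance)                 # Step 5: square root of variance


heights = [1, 2, 3, 5, 8]
print(round(population_std(heights), 7))       # prints 2.4819347
```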
OUTPUT:
           STEP 1:      STEP 2:      STEP 3:                  STEP 4:     STEP 5:
           CALCULATE    CALCULATE    CALCULATE SQUARE                     STANDARD
DATA SET   MEAN         DISTANCE     OF DISTANCE              VARIANCE    DEVIATION
1          3.8          2.8          7.84                     6.16        2.4819347
2                       1.8          3.24
3                       0.8          0.64
5                       1.2          1.44
8                       4.2          17.64
3. Write a program to collect data. Analyse it and interpret the result. Consider
the following data set for the statistical problem-solving process.
Consider that you have a food event in your residential society. Perform detailed
analysis and interpret what should be the top five cuisines that most people in the
society prefer for this event.
PROCEDURE:
Step 1: Formulate Statistical Investigative Questions
Step 2: Collect/Consider the Data
Step 3: Analyse the Data
Step 4: Interpret the Data
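One possible way to carry out the analysis and interpretation steps in Python is to tally the responses with collections.Counter. The responses list below is a HYPOTHETICAL placeholder made up only to show the shape of the data; it is not the survey collected in this manual, and the helper names top_cuisines and count_for are likewise illustrative.

```python
from collections import Counter

# HYPOTHETICAL (block number, preferred cuisine) pairs, placeholders only.
# Replace them with the responses actually collected from each block.
responses = [
    (1, "South Indian"), (1, "Chinese"), (1, "North Indian"),
    (2, "South Indian"), (2, "Italian"), (2, "Continental"),
    (3, "Chinese"), (3, "South Indian"), (3, "Mexican"),
]


def top_cuisines(responses, k=5):
    """Return the k most frequently preferred cuisines with their counts."""
    counts = Counter(cuisine for _, cuisine in responses)
    return counts.most_common(k)


def count_for(responses, cuisine, block=None):
    """Count the votes for one cuisine, optionally restricted to one block."""
    return sum(
        1 for b, c in responses
        if c == cuisine and (block is None or b == block)
    )


print(top_cuisines(responses))                  # top five cuisines overall
print(count_for(responses, "South Indian"))     # interest in one cuisine
print(count_for(responses, "Chinese", block=3)) # one cuisine within one block
```

The last two calls correspond to the kind of interpretation questions asked in this exercise, such as how many people prefer a given cuisine overall or within a single block.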
OUTPUT:
Data collected from each block of the apartment:
Data interpretation:
1. How many are interested in South Indian Cuisine?
26
2. How many people are interested in Chinese cuisine from block 3?
3
4. Write a program to apply the central limit theorem to the following data.
In a country in the Middle East region, the recorded weights of the male population
follow a normal distribution. The mean and the standard deviation are 70 kg and 15
kg, respectively. If a person wants to examine the records of 50 males from the
population, what would be the mean and the standard deviation of the chosen sample?
PROCEDURE:
Step 1: Draw groups of people at random from your area. We will call this a sample.
We will draw multiple samples in this case, each consisting of 30 people.
Step 2: Calculate the individual mean of each sample set.
Step 3: Calculate the mean of these sample means.
Step 4: Plot a histogram of the sample means. It will resemble a normal
distribution.
The formulas for the central limit theorem are:
μx̄ = μ
σx̄ = σ / √n
where
μ = Population mean
σ = Population standard deviation
μx̄ = Sample mean (mean of the sample means)
σx̄ = Sample standard deviation (standard deviation of the sample means)
n = Sample size
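Under the central limit theorem, the sample means have mean μ and standard deviation σ / √n. The sketch below applies this to the values in the question (μ = 70 kg, σ = 15 kg, n = 50) and then checks the result with a small simulation. The function names, the number of simulated samples and the random seed are assumptions made for illustration.

```python
import math
import random
import statistics

POP_MEAN = 70.0    # population mean in kg, from the question
POP_SD = 15.0      # population standard deviation in kg, from the question
SAMPLE_SIZE = 50   # number of males in the chosen sample


def sampling_distribution(pop_mean, pop_sd, n):
    """Mean and standard deviation of the sample mean for samples of size n."""
    return pop_mean, pop_sd / math.sqrt(n)


def simulate_sample_means(pop_mean, pop_sd, n, num_samples=2000, seed=0):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(sum(sample) / n)
    return means


mean, sd = sampling_distribution(POP_MEAN, POP_SD, SAMPLE_SIZE)
print(mean, round(sd, 2))                      # prints 70.0 2.12

simulated = simulate_sample_means(POP_MEAN, POP_SD, SAMPLE_SIZE)
print(statistics.mean(simulated), statistics.stdev(simulated))
```

The simulated values should land close to 70 and 2.12; a histogram of the simulated list would show the bell shape described in Step 4.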
OUTPUT:
5. Write a program to find the quartile for the following odd dataset.
34 24 43 5 58 81 29 90 22 67 32 88 57 34 43 44 91 24 62
PROCEDURE:
Step 1: Sort in Ascending Order
Step 2: Find N
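Step 3: Calculate Lower Quartile (Q1)
Position of Lower Quartile (Q1) = (N+1) × 1/4
Step 4: Calculate Middle Quartile (Q2)
Position of Middle Quartile (Q2) = (N+1) × 2/4
Step 5: Calculate Upper Quartile (Q3)
Position of Upper Quartile (Q3) = (N+1) × 3/4
Each quartile is the value found at that position in the sorted data.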
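A minimal Python sketch of this procedure for the dataset in the question. The helper names are illustrative. For N = 19 the three positions are whole numbers, so each quartile is read directly from the sorted list; the linear interpolation used for fractional positions is an assumption added here and is not part of the procedure above.

```python
def quartile_positions(n):
    """1-based positions of Q1, Q2 and Q3 using (N+1) x k/4."""
    return [(n + 1) * k / 4 for k in (1, 2, 3)]


def value_at(sorted_values, position):
    """Value at a 1-based position in an already sorted list.

    Fractional positions are interpolated linearly between the two
    neighbouring values (an assumption, not taken from the manual).
    """
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


data = [34, 24, 43, 5, 58, 81, 29, 90, 22, 67,
        32, 88, 57, 34, 43, 44, 91, 24, 62]
ordered = sorted(data)                               # Step 1
positions = quartile_positions(len(ordered))         # Steps 2 to 5
q1, q2, q3 = (value_at(ordered, p) for p in positions)
print(positions)                                     # prints [5.0, 10.0, 15.0]
print(q1, q2, q3, q3 - q1)                           # prints 29 43 67 38
```

Note that the positions index the sorted data, not the dataset in its original order.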
OUTPUT:
N = 19

POSITION   DATASET   SORTED DATA
1          34        5
2          24        22
3          43        24
4          5         24
5          58        29
6          81        32
7          29        34
8          90        34
9          22        43
10         67        43
11         32        44
12         88        57
13         57        58
14         34        62
15         43        67
16         44        81
17         91        88
18         24        90
19         62        91

           Q1    Q2    Q3
POSITION   5     10    15
DATA       29    43    67

INTER QUARTILE RANGE: Q3-Q1 = 67-29 = 38
6. Write a program to find the quartile for the following even dataset.
54 28 76 64 41 83 19 71 37 58
PROCEDURE:
Step 1: Sort in Ascending Order
Step 2: Find N
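Step 3: Calculate Middle Quartile (Q2), that is, the median of the dataset
Position of Middle Quartile (Q2) = (N+1)/2, which for even N falls between positions
N/2 and (N/2)+1; Q2 is the mean of the two values at those positions.
Step 4: Split the dataset into a first half and a second half
Step 5: Calculate Lower Quartile (Q1) as the median of the first half of the dataset.
Position of Lower Quartile (Q1) = (n+1)/2 within the first half, where n is the number
of values in that half
Step 6: Calculate Upper Quartile (Q3) as the median of the second half of the dataset.
Position of Upper Quartile (Q3) = (n+1)/2 within the second half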
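A minimal Python sketch of this median-of-halves procedure for the dataset in the question. The function names median and quartiles_even are illustrative, and the sketch assumes the dataset has an even number of values so that it splits cleanly into two halves.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    # Even count: mean of the two middle values
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles_even(values):
    """Return (Q1, Q2, Q3) for a dataset with an even number of values."""
    ordered = sorted(values)               # Step 1
    half = len(ordered) // 2               # Steps 2 and 4: N and the split point
    q2 = median(ordered)                   # Step 3: median of the whole dataset
    q1 = median(ordered[:half])            # Step 5: median of the first half
    q3 = median(ordered[half:])            # Step 6: median of the second half
    return q1, q2, q3


data = [54, 28, 76, 64, 41, 83, 19, 71, 37, 58]
print(quartiles_even(data))                # prints (37, 56.0, 71)
```

Here the sorted halves are 19, 28, 37, 41, 54 and 58, 64, 71, 76, 83, so Q1 and Q3 are their middle values and Q2 is (54 + 58) / 2.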
OUTPUT:
N = 10

POSITION   DATASET   SORTED DATA
1          54        19
2          28        28
3          76        37
4          64        41
5          41        54
6          83        58
7          19        64
8          71        71
9          37        76
10         58        83

           Q2
POSITION   5.5 (BETWEEN 5 AND 6)
DATA       56 = (54+58)/2