Oe Statistics Notes
2022
Continuation of Module II: Univariate Data Analysis and Bivariate Data Analysis
Correlation Analysis: We consider only two variables, X and Y. Our intuition is that the
two variables are related, and we want to know the extent of their relationship. X and Y could be related in
a number of ways; however, we focus on the extent of the linear relationship between them. By a linear relationship we mean
that an equal increase/decrease in X brings about a proportional increase/decrease in Y. Another way of
understanding the word ‘linear’ is that a plot of X against Y on a graph sheet yields points that lie more or less around a straight
line.
· The measure of correlation is called the correlation coefficient.
· The direction of change is indicated by its sign.
Types of correlation:
Positive Correlation: The correlation is said to be positive if the values of the two variables change in the same
direction,
i.e., if X and Y both move in the same direction (↑x ↑y or ↓x ↓y), they are said to be positively correlated.
Negative Correlation: The correlation is said to be negative if the values of the two variables change in opposite
directions (↑x ↓y or ↓x ↑y).
More examples:
Positive relationship:
Negative relationships:
Note: If the direction is positive, the sign of the correlation coefficient is + and if the direction is negative, the sign
is −.
Zero correlation:
If X and Y are not linearly related, then ‘r’ between them will be zero. In such a situation X and Y are said to be uncorrelated.
Methods of measurement of correlation:
Quantifying the relationship between variables is essential to benefit from the study of correlation.
There are various methods of measuring correlation.
Note: We are only learning the SCATTER PLOT and KARL PEARSON’S COEFFICIENT OF
CORRELATION / PRODUCT MOMENT CORRELATION.
Scatter Diagram:
A scatter plot gives a visual representation of the type of correlation existing
between X and Y in bivariate data. To construct a scatter plot, the two variables X and Y are taken
along the coordinate axes of a graph.
A scatter diagram is a graph of the observed points, where each point represents a pair of values (X, Y) as
its coordinates. It portrays the relationship between the two variables graphically.
Remark:
If r is calculated using the formula for any bivariate data, we may get a sensible-looking value of ‘r’ even when the two
variables have no logical connection. It is wrong to interpret such a value. Such a correlation is termed nonsense correlation.
Ex: footwear size of a person and monthly income;
marks of students and heights of students.
Karl Pearson's Coefficient of Correlation or Product Moment Correlation Coefficient
It is a numerical measure of the linear relationship between two variables (X and Y).
The formula was developed by Professor Karl Pearson.
It is denoted by ‘r’.
The product moment correlation coefficient between two variables x and y is defined as
$r = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$
where cov(x, y) is the covariance between x and y,
$\sigma_x$ = standard deviation of x, $\sigma_y$ = standard deviation of y, and
$\mathrm{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}$
Remarks:
1.‘r’ lies between -1 and 1. (-1 ≤ r ≤ 1).
2. ‘r’ is location and scale invariant ( ‘ r’ is independent of change in origin and scale).
1 and 2 are the properties of ‘r’.
Sl. No.   Pattern of points on the scatter plot   Value of r
1         Perfect positive correlation            r = +1
2         Perfect negative correlation            r = -1
3         Strong/High positive correlation        r > 0.7
4         Strong/High negative correlation        r < -0.7 (values below -0.7, for example -0.84 or -0.79)
5         Weak/Low positive correlation           0.3 ≤ r ≤ 0.7 (any value from 0.3 to 0.7, inclusive)
6         Weak/Low negative correlation           -0.7 ≤ r ≤ -0.3 (any value from -0.7 to -0.3, inclusive)
Problem1:
Find Karl Pearson’s coefficient of correlation between capital employed and profit obtained from the following
data.
Capital Employed (Rs. In Crore) 10 20 30 40 50 60 70 80 90 100
Profit (Rs. In Crore) 2 4 8 5 10 15 14 20 22 50
Solution: Let us assume that capital employed is variable X and profit is variable Y.
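Karl Pearson's correlation coefficient is given by
$r = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$
where cov(x, y) is the covariance between x and y,
$\sigma_x$ = standard deviation of x, $\sigma_y$ = standard deviation of y, and
$\mathrm{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}$
Here n = 10, $\bar{x}$ = 55, $\bar{y}$ = 15 and $\sum x_i y_i$ = 11500, so cov(x, y) = 1150 − 825 = 325.
Also $\sigma_x$ ≈ 28.72 and $\sigma_y$ ≈ 13.28, so $\sigma_x \sigma_y$ ≈ 381.48.
r = cov(x, y) / ($\sigma_x \sigma_y$) = 325 / 381.48 = 0.8519 ≈ 0.85
Since r = 0.8519, there is a strong positive relationship between X and Y; in other words, capital employed
and profit are strongly positively correlated.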
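The manual calculation for Problem 1 can be cross-checked with a short Python sketch. It applies the covariance and standard deviation formulas given above (dividing by n); the function name `pearson_r` is only an illustrative choice and is not part of the course material.

```python
from math import sqrt

# Data from Problem 1 (Rs. in crore)
capital = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # X
profit = [2, 4, 8, 5, 10, 15, 14, 20, 22, 50]        # Y

def pearson_r(x, y):
    """Product moment correlation using the 1/n formulas from the notes."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # cov(x, y) = (1/n) * sum(x_i * y_i) - mean_x * mean_y
    cov_xy = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    sd_x = sqrt(sum(a * a for a in x) / n - mean_x ** 2)
    sd_y = sqrt(sum(b * b for b in y) / n - mean_y ** 2)
    return cov_xy / (sd_x * sd_y)

r = pearson_r(capital, profit)
print(round(r, 4))  # 0.8519
```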
COMPUTER:
Scatter Diagram: select data – Insert tab – Recommended Charts – All Charts – Scatter – fill in all
the chart elements.
Correlation coefficient: Data tab – Data Analysis – Correlation – OK. A small dialog box will open:
a) Input range: select both the columns, including the headings/labels
b) The data is arranged by column
c) Check the box for labels in the first row
d) Output range: click on the text box next to it and then select an empty cell on the worksheet
Click OK and identify the correlation coefficient between the two variables.
Regression
Introduction
In this discussion, we shall learn how to express the relationship mathematically, so that we can predict or
forecast one of the variables if we know the value of the other variable.
Let the two variables under consideration be x and y. Generally, x is the independent variable and y is the
dependent variable.
Independent variable:
The independent variable is the one that is manipulated or changed, and whose effect is measured and compared.
It helps in predicting or forecasting the values of the dependent variable.
The independent variable is also known as the regressor / predictor / input variable.
Dependent variable:
The dependent variable is the one that measures the effect of the independent variable on the test
units.
We can also say that the dependent variable is one whose value depends on the
independent variable.
It is the variable that can be predicted with the help of the independent variable.
The dependent variable is also known as the regressed / response / output variable.
Examples :
a) the annual turnover of a company and the bonus given to the employees
b) the weight of the mother and the weight of the new born babies
c) the gross salary of the lecturers in a college and the income tax deduction
Note:
In any real-life situation, a number of variables could be involved. Many independent variables
may affect another variable.
For example:
The number of units of a certain product sold would depend on the brand, the price, the memory, the features,
etc.
The sales of a new 2-wheeler will depend upon the mileage, the capacity of the engine, the advertisement
campaign and many more.
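We are interested in establishing the mathematical relationship between x and y, and we focus only on the linear form
y = a + bx
The regression equation of y on x is
$(y - \bar{y}) = b_{yx} (x - \bar{x})$
where $b_{yx} = \frac{\mathrm{cov}(x, y)}{\sigma_x^2} = r \frac{\sigma_y}{\sigma_x}$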
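Solution: carry out the problem until you have cov(x, y) and the standard deviations of x and y.
$(y - \bar{y}) = b_{yx} (x - \bar{x})$
where $b_{yx} = \frac{\mathrm{cov}(x, y)}{\sigma_x^2} = r \frac{\sigma_y}{\sigma_x}$; here $\bar{x}$ = 55, $\bar{y}$ = 15 and $b_{yx}$ = 325/825 ≈ 0.39.
y – 15 = 0.39 (x – 55)
y – 15 = 0.39x – 21.45
y = 0.39x – 21.45 + 15
y = 0.39x – 6.45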
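The regression line can also be checked numerically. The sketch below assumes the worked lines above refer to the capital/profit data of Problem 1 (the means 55 and 15 match that data); `regression_y_on_x` is a hypothetical helper name. It keeps the slope unrounded, so its intercept differs slightly from the hand calculation, where the slope is first rounded to 0.39 (15 − 0.39 × 55 = −6.45).

```python
# Assumption: the data are those of Problem 1 (capital employed vs profit).
capital = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # x
profit = [2, 4, 8, 5, 10, 15, 14, 20, 22, 50]        # y

def regression_y_on_x(x, y):
    """Return (a, b) for the line y = a + b*x, with b = cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum(p * q for p, q in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(p * p for p in x) / n - mean_x ** 2
    b = cov_xy / var_x
    a = mean_y - b * mean_x  # the line passes through (mean_x, mean_y)
    return a, b

a, b = regression_y_on_x(capital, profit)
print(f"y = {a:.2f} + {b:.2f}x")  # y = -6.67 + 0.39x
```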
Interpretation of Intercept
Coefficient of Determination
$R^2 = r^2 \times 100$ (expressed as a percentage)
EXCEL
1. For each of the following scatter plot choose the correlation which you think is the most appropriate from
the following:
Correlation is a) weak positive b) weak negative c) strong positive d) strong negative e) Zero
Strong Negative
Weak Negative
No Correlation
Strong Positive
2. Mr. Jhon calculated a correlation coefficient r = 0.95. Comment on the strength and direction of r.
Direction: the positive sign indicates an upward direction.
Strength: 0.95 indicates a strong correlation.
There is a strong positive correlation between the two variables.
3. A teacher of a certain college in Bengaluru would like to examine the relationship between the number of
problems students solved during the semester and their final exam marks. He randomly selects 10 students
for the study and asks them to keep track of the number of problems completed during the course of the
semester. At the end of the semester the following data is obtained:
Problems Solved    50 58 64 65 76 77 78 84 85 91
Final Exam Marks   61 68 66 66 72 73 72 73 76 75
a) Draw the scatter diagram and comment (either on a graph sheet or on the computer).
b) Compute the product moment correlation coefficient (find r), both manually and using the computer.
Writing the formula is a must.
Excel output:
                   Problems Solved   Final Exam Marks
Problems Solved    1
Final Exam Marks   0.944391864       1
Here r is 0.94.
There is a strong positive correlation between the number of problems solved and the final exam marks.
4. The following table gives the length (in mm) and weight (in gram) of a species of aquarium fish.
Length (X in mm) 8 10 15 17 20 22 24 25
Weight (Y in g) 25 30 32 35 37 40 42 45
a) Compute Karl Pearson’s correlation coefficient. What is the direction and strength of correlation
between the two variables? ( Find r )
Length (X in mm) Weight (Y in g)
Length (X in mm) 1
Weight (Y in g) 0.982681 1
r = 0.98
Since r = 0.98: direction – the positive sign indicates an upward direction
Strength – 0.98, strong
There is a strong positive correlation between length and weight.
b) Regression equation: $(y - \bar{y}) = b_{yx} (x - \bar{x})$. Solve and get it in the form given below:
Y = 17.53 + 1.03 X
c) Prediction of Y when X = 23:
Y = 17.53 + 1.03 × 23 = 41.22 g
d) Find the coefficient of determination and give a suitable interpretation.
$R^2 = r^2 \times 100$ = 96.56%; 96.56% of the change in Y (weight) is explained by the change in X (length).
Excel output: R Square 0.965663
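The regression, prediction and coefficient of determination for the fish data can be reproduced with the sketch below. It uses the 1/n formulas from the notes; the variable names are illustrative. Because the code does not round the slope and intercept before predicting, the prediction comes out near 41.31 g rather than the 41.22 g obtained with the rounded coefficients 17.53 and 1.03.

```python
from math import sqrt

length = [8, 10, 15, 17, 20, 22, 24, 25]   # X in mm
weight = [25, 30, 32, 35, 37, 40, 42, 45]  # Y in g

n = len(length)
mean_x = sum(length) / n
mean_y = sum(weight) / n
cov_xy = sum(x * y for x, y in zip(length, weight)) / n - mean_x * mean_y
var_x = sum(x * x for x in length) / n - mean_x ** 2
var_y = sum(y * y for y in weight) / n - mean_y ** 2

slope = cov_xy / var_x                  # b_yx
intercept = mean_y - slope * mean_x     # a in y = a + b*x
r = cov_xy / sqrt(var_x * var_y)        # correlation coefficient
predicted = intercept + slope * 23      # weight (g) when length = 23 mm
r_squared_pct = r ** 2 * 100            # coefficient of determination in %

print(f"y = {intercept:.2f} + {slope:.2f}x")
print(f"r = {r:.4f}, R^2 = {r_squared_pct:.2f}%")
print(f"predicted weight at 23 mm: {predicted:.2f} g")
```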
5. The following data give the increase in temperature (X in °C) and the mortality rate (Y in number
of deaths per 1000) of a species of bug.
X 20 32 43 44 39 46
Y 42 72 50 90 45 48
a) Find the product moment correlation coefficient between X and Y and interpret the result.
b) Obtain the regression equation of Y on X and predict the mortality rate (Y in number
of deaths per 1000) when the temperature (X in °C) is 45.
c) Find the coefficient of determination and give a suitable interpretation.
6. In a bivariate data on x and y, the two variances are 39 and 9. The covariance is -14.5. Find the coefficient
of correlation.
$r = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} = \frac{-14.5}{\sqrt{39} \times \sqrt{9}} \approx -0.774$
7. In a bivariate data with 10 observations on x and y, ∑x = 56, ∑y = 138, ∑x² = 1357, ∑y² = 2136, and
∑xy = 836. Find the coefficient of correlation.
$r = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$
where cov(x, y) is the covariance between x and y,
$\sigma_x$ = standard deviation of x, $\sigma_y$ = standard deviation of y, and
$\mathrm{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}$
8. In a bivariate data with 10 observations on x and y, ∑x = 10, ∑y = 210, ∑x² = 14, ∑y² = 5340, and
∑xy = 180. Find the coefficient of correlation.
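Problems 6 to 8 give summary values instead of raw data. A minimal sketch of how r could be computed from such summaries is shown below; the helper names `r_from_moments` and `r_from_sums` are hypothetical, and the variance formula $\sigma_x^2 = \frac{1}{n}\sum x^2 - \bar{x}^2$ is assumed to match the 1/n convention used in the covariance formula above.

```python
from math import sqrt

def r_from_moments(cov_xy, var_x, var_y):
    """r = cov(x, y) / (sd_x * sd_y), given the covariance and both variances."""
    return cov_xy / sqrt(var_x * var_y)

def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """r from the raw sums, using the 1/n formulas."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    cov_xy = sum_xy / n - mean_x * mean_y
    var_x = sum_x2 / n - mean_x ** 2
    var_y = sum_y2 / n - mean_y ** 2
    return r_from_moments(cov_xy, var_x, var_y)

r6 = r_from_moments(-14.5, 39, 9)                  # Problem 6
r7 = r_from_sums(10, 56, 138, 1357, 2136, 836)     # Problem 7
r8 = r_from_sums(10, 10, 210, 14, 5340, 180)       # Problem 8
print(round(r6, 3), round(r7, 3), round(r8, 3))
```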
*********************
P(A) = 1/6 = 0.1667
P(B) = 2/6 = 0.3333
P(C) = 3/6 = 0.5
Problem 2: In a single throw of two fair dice, find the probability that the sum of the numbers on the
dice is
a) five b) eight
Solution: Random experiment: Throwing two dice
Sample space S = { (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6)
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6)
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6) }
Exhaustive cases = 36
a) Favourable cases for a sum of five: (1, 4), (2, 3), (3, 2), (4, 1), i.e. 4 cases
P(sum is five) = 4/36 = 0.111
b) Favourable cases for a sum of eight: (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), i.e. 5 cases
P(sum is eight) = 5/36 = 0.139
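The counting in the two-dice problem can be verified by listing the sample space in code. This is only an illustrative sketch; `prob_sum` is a hypothetical helper that counts favourable outcomes and divides by the 36 exhaustive cases.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die)
sample_space = list(product(range(1, 7), repeat=2))

def prob_sum(target):
    """P(sum of the two dice equals target) = favourable / exhaustive cases."""
    favourable = [outcome for outcome in sample_space if sum(outcome) == target]
    return Fraction(len(favourable), len(sample_space))

p_five = prob_sum(5)
p_eight = prob_sum(8)
print(p_five, round(float(p_five), 3))    # 1/9 0.111
print(p_eight, round(float(p_eight), 3))  # 5/36 0.139
```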
Important concepts:
Combination: A combination is simply a manner of selecting some objects from a given set of objects in such a way
that the order of their selection doesn’t matter.
A bag contains 4 green marbles and 5 blue marbles.
GREEN BLUE
1. Out of 4 green marbles, you are selecting 1 green marble = 4C1
2. Out of 4 green marbles, you are selecting 2 green marbles = 4C2
3. Out of 5 blue marbles, you are selecting 3 blue marbles = 5C3
In general, selecting r objects out of n is written nCr. For example, 4C4 = 1.
0! = 1
1! = 1
2! = 2 × 1! = 2 × 1 = 2
3! = 3 × 2! = 3 × 2 × 1 = 6
4! = 4 × 3! = 4 × 3 × 2! = 4 × 3 × 2 × 1 = 24
5! = 5 × 4! = 5 × 4 × 3 × 2 × 1 = 120
… 10!
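Formula
$^{n}C_{r} = \frac{n!}{(n-r)!\, r!}$
For example:
1. $^{4}C_{1} = \frac{4!}{(4-1)!\, 1!} = \frac{4!}{3!\, 1!} = \frac{4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1) \times 1} = \frac{24}{6} = 4$
2. $^{10}C_{8} = \frac{10!}{(10-8)!\, 8!} = \frac{10!}{2!\, 8!} = \frac{10 \times 9 \times 8!}{2!\, 8!} = \frac{10 \times 9}{2 \times 1} = 5 \times 9 = 45$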
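As a check on the hand calculations, the formula can be evaluated in Python. The sketch below writes the formula out with factorials (`n_c_r` is an illustrative name) and compares it with the standard library function `math.comb`, which is available from Python 3.8.

```python
from math import comb, factorial

def n_c_r(n, r):
    """nCr = n! / ((n - r)! * r!), written out exactly as in the formula."""
    return factorial(n) // (factorial(n - r) * factorial(r))

print(n_c_r(4, 1))    # 4
print(n_c_r(10, 8))   # 45
print(n_c_r(5, 2))    # 10
print(comb(10, 8))    # 45, the built-in gives the same value
```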
Using calculator:
Steps: Enter the expression in given format nCr
Consider the example 5C2
Problem 3: A card is drawn at random from a pack of playing cards. What is the probability of getting
a) an ace b) a king c) a red card d) a club
b) E2: getting a king
Favourable cases: 4C1 = 4
P(E2) = Number of favourable cases / Total number of exhaustive cases
= 4C1 / 52C1
= 4/52
a) A1: getting 3 queens (when three cards are drawn)
Favourable cases: 4C3 = 4
P(A1) = Number of favourable cases / Total number of exhaustive cases
= 4C3 / 52C3
= 4/22100 = 0.00018
Note: 0 ≤ P(A) ≤ 1
Problem 5. A bag has 2 white balls and 4 red balls. One ball is picked at random. Find the probability of getting
a) a white ball b) a red ball
Solution: Random experiment: Picking a ball from a bag
Total number of balls = 2 white + 4 red = 6 balls
Exhaustive cases: 6C1 = 6
a) B1: getting a white ball
Favourable cases: 2C1 = 2
P(B1) = Number of favourable cases / Total number of exhaustive cases
= 2C1 / 6C1
= 2/6 = 0.333
Keywords: either of the events, A or B, at least one of the events → P(A ∪ B)
Example: Suppose an experiment has the outcomes 1, 2, 3, … , 12 where each outcome has an equal chance of
occurring.
Let event A = numbers less than or equal to 6 = {1, 2, 3, 4, 5, 6} and
event B = numbers from 6 to 9 = {6, 7, 8, 9}.
A intersection B = A ∩ B = A and B = {6}. Here A and B are non-mutually exclusive events.
A union B = A ∪ B = A or B = either A or B = at least one of A or B = {1, 2, 3, 4, 5, 6, 7, 8, 9}
The Venn diagram is as follows:
Problem 1: A card is drawn from a pack of playing cards. What is the probability of getting
a) a diamond or a king
b) a red card or a queen
("or" is the keyword – P(A ∪ B))
Solution: Random experiment: A card is drawn
Exhaustive cases: 52C1
a) A: getting a diamond
B: getting a king
Here A and B are non-mutually exclusive events,
because there is one sample point common to both events, i.e. the king of diamonds.
P(diamond or king) = P(A ∪ B) = P(A) + P(B) – P(A ∩ B), using the addition theorem
= (13C1/52C1) + (4C1/52C1) – (1C1/52C1)
= (13/52) + (4/52) – (1/52)
= 16/52
= 4/13
b) A: getting a red card
B: getting a queen
Here A and B are non-mutually exclusive events,
because there are two sample points common to both events, i.e. the two red queens.
P(red or queen) = P(A ∪ B) = (26/52) + (4/52) – (2/52) = 28/52 = 7/13
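The addition theorem for this card problem can be verified by building a 52-card deck in code and comparing P(A) + P(B) − P(A ∩ B) with a direct count of A ∪ B. This is an illustrative sketch; the deck representation and helper names are assumptions, not part of the notes.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]  # 52 cards

def prob(event):
    """Favourable cases / exhaustive cases for a single card drawn at random."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_diamond(card): return card[1] == "diamonds"
def is_king(card): return card[0] == "K"
def is_red(card): return card[1] in ("diamonds", "hearts")
def is_queen(card): return card[0] == "Q"

# Addition theorem: P(A or B) = P(A) + P(B) - P(A and B)
p_diamond_or_king = (prob(is_diamond) + prob(is_king)
                     - prob(lambda c: is_diamond(c) and is_king(c)))
p_red_or_queen = (prob(is_red) + prob(is_queen)
                  - prob(lambda c: is_red(c) and is_queen(c)))

print(p_diamond_or_king)  # 4/13
print(p_red_or_queen)     # 7/13
```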
Note: If A and B are two mutually exclusive events, then P(A ∪ B) = P(A) + P(B)
Example:
Problem 1: A card is drawn from a pack of playing cards. What is the probability of getting
a) a queen or a king
b) a red card or a black card
a) A: getting a queen
B: getting a king
Here A and B are mutually exclusive events,
because there is no sample point common to both events.
P(queen or king) = P(A) + P(B) = (4/52) + (4/52) = 8/52 = 2/13
Solution: Random experiment: Picking three cards from a pack of playing cards
Exhaustive cases: 52C3
b) A1: getting 1 queen, 1 king and 1 jack
Favourable cases: 4C1 × 4C1 × 4C1
Problem 5. A bag has 2 white balls and 4 red balls. Three balls are picked at random. Find the probability of
getting
a) 1 white and 2 red b) 1 red and 2 white
Solution: Random experiment: Picking three balls from a bag
Exhaustive cases: 6C3 = 20
a) B1: 1 white and 2 red
Favourable cases: 2C1 × 4C2 = 2 × 6 = 12
P(B1) = 12/20 = 0.6
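Both parts of the three-ball problem follow the same pattern: multiply the combinations for each colour and divide by the total number of ways to choose the balls. A hedged sketch follows; `prob_white_red` is a hypothetical helper written for this bag of 2 white and 4 red balls.

```python
from fractions import Fraction
from math import comb

WHITE, RED = 2, 4  # contents of the bag

def prob_white_red(n_white, n_red):
    """P(drawing exactly n_white white and n_red red balls without replacement)."""
    favourable = comb(WHITE, n_white) * comb(RED, n_red)
    exhaustive = comb(WHITE + RED, n_white + n_red)
    return Fraction(favourable, exhaustive)

p_a = prob_white_red(1, 2)  # a) 1 white and 2 red
p_b = prob_white_red(2, 1)  # b) 1 red and 2 white
print(p_a, p_b)  # 3/5 1/5
```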
Solution:
Random experiment: Drawing two cards successively
i) With replacement
A: getting jack in first draw P(A) = 4C1/52C1
B: getting jack in second draw P(B) = 4C1/52C1
Problem 2: Two cards are drawn successively (one after the other) from a pack of playing cards.
Find the probability of getting two red cards under the following conditions:
i) with replacement
ii) without replacement
i) P(getting a red card in the first draw and a red card in the second draw)
(here A and B are independent)
P(A ∩ B) = P(A) . P(B)
= (26C1/52C1) . (26C1/52C1)
Problem 3: Two cards are drawn successively (one after the other) from a pack of playing cards.
Find the probability of getting a red card in the first draw and a black card in the second draw under the following
conditions:
i) with replacement
ii) without replacement
i) P(getting a red card in the first draw and a black card in the second draw)
(here A and B are independent)
P(A ∩ B) = P(A) . P(B)
= (26C1/52C1) . (26C1/52C1)
P(getting a king in the first draw and a number divisible by 5 in the second draw), without replacement
(here B is dependent on A)
P(A ∩ B) = P(A) . P(B|A)
= (4C1/52C1) . (8C1/51C1)
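The difference between the independent case (with replacement) and the dependent case (without replacement) can be seen by evaluating the multiplication theorem with exact fractions. The sketch below is illustrative only; the without-replacement value for two red cards uses P(B|A) = 25/51, since one red card has already been removed.

```python
from fractions import Fraction

# Two red cards, with replacement: the draws are independent
p_red_red_with = Fraction(26, 52) * Fraction(26, 52)

# Two red cards, without replacement: P(A) * P(B|A), one red card already gone
p_red_red_without = Fraction(26, 52) * Fraction(25, 51)

# King first, then a number divisible by 5 (the four 5s and four 10s),
# without replacement: P(A) * P(B|A) with 51 cards left
p_king_then_div5 = Fraction(4, 52) * Fraction(8, 51)

print(p_red_red_with)     # 1/4
print(p_red_red_without)  # 25/102
print(p_king_then_div5)   # 8/663
```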
Conditional Probability ( Refer multiplication theorem)
The conditional probability of occurrence of event A given event B has already occurred is defined as
P(A|B) = P(A ∩ B) / P(B) ; P(B)>0
The conditional probability of occurrence of event B given event A has already occurred is defined as
P(B|A) = P(A ∩ B) / P(A) ; P(A)>0
Problem1 : A sample of 500 respondents were selected in a large metropolitan area to study consumer behavior
with the following results
a) Suppose the respondent chosen is female. What is the probability that she does not enjoy shopping for
clothing?
b) Suppose the respondent chosen enjoys shopping for clothing. What is the probability that the individual is
male?
c) Are enjoying shopping for clothing and gender of the individual independent?
Solution : Random Experiment : Selecting a respondent in a survey
a) A: Respondent is female
B: Respondent does not enjoy shopping
The conditional probability that the respondent doesn’t enjoy shopping (B) given that the respondent is female (A) is given by
P(B|A) = P(A ∩ B) / P(A) ; P(A) > 0
Method 1: P(A ∩ B) = 36/500 and P(A) = 260/500
P(B|A) = P(A ∩ B) / P(A) = (36/500) / (260/500) = 36/260 = 0.138
Method 2 (direct method): P(B|A) = 36/260 = 0.138
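The two methods for part a) can be compared in a few lines of Python. Only the three counts quoted in the solution (500 respondents, 260 females, 36 females who do not enjoy shopping) are used; the variable names are illustrative.

```python
from fractions import Fraction

total = 500              # respondents in the survey
female = 260             # A: respondent is female
female_not_enjoy = 36    # A and B: female and does not enjoy shopping

# Method 1: P(B|A) = P(A and B) / P(A)
p_a = Fraction(female, total)
p_a_and_b = Fraction(female_not_enjoy, total)
p_b_given_a = p_a_and_b / p_a

# Method 2 (direct): restrict attention to the 260 female respondents
p_direct = Fraction(female_not_enjoy, female)

print(p_b_given_a, round(float(p_b_given_a), 3))  # 9/65 0.138
```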
Bayes’ Theorem
Suppose E1, E2, …, En are mutually exclusive and exhaustive events. Let A be any event that can occur along with any of the Ei’s.
Then
$P(E_i | A) = \frac{P(A | E_i) P(E_i)}{P(A | E_1) P(E_1) + P(A | E_2) P(E_2) + \dots + P(A | E_n) P(E_n)}$
(Venn diagram: the sample space partitioned into E1, E2, …, En, with the event A overlapping the Ei’s.)
Problem 1: A company has two machines to produce CDs. Machine M1 produces 45% of the production and
machine M2 produces 55%. The defective rate for machine M1 is 8% and for machine M2 is 10%.
a) What is the probability that an item selected at random is defective?
b) If a defective item is drawn, what is the probability that it is from machine M1?
Solution:
E1 = CDs produced by machine M1
E2 = CDs produced by machine M2
Here E1 and E2 are mutually exclusive events
Let A be the event of defective item produced
P(E1) = 45% =0.45 ; P(E2) = 55% =0.55
P(A|E1) = 8% =0.08 i.e probability of item being defective from machine M1
P(A|E2) = 10% =0.10 i. e probability of item being defective from machine M2
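P(A ∩ E1) = P(A|E1) P(E1) = 0.08 × 0.45 = 0.036 (using the multiplication theorem)
P(A ∩ E2) = P(A|E2) P(E2) = 0.10 × 0.55 = 0.055
a) P(A) = P(A ∩ E1) + P(A ∩ E2) = 0.036 + 0.055 = 0.091
b) P(E1|A) = P(A ∩ E1) / P(A) = 0.036 / 0.091 = 0.396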
P(E3|A) = ? Probability that the bullet is drawn from machine 3 given that it is defective
From Bayes’ theorem we have
$P(E_3 | A) = \frac{P(A | E_3) P(E_3)}{P(A | E_1) P(E_1) + P(A | E_2) P(E_2) + P(A | E_3) P(E_3)} = 0.316$
(Substitute and solve the problem)
Homework
Problem 3: In a certain assembly plant, three machines A, B and C make 30%, 45% and 25% of the products
respectively. It is known from past experience that 2%, 3% and 2% of the products made by
each machine respectively are defective. Suppose a finished product is randomly selected. a) What is the
probability that it is defective? b) If a product was chosen randomly and found to be defective, what is the
probability that it was made by machine C?
Solution :
E1 : products produced by machine A ; P( E1) = 30% = 0.3
E2 : products produced by machine B; P( E2) = 45% = 0.45
E3 : products produced by machine C ; P( E3) = 25 % = 0.25
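The total probability of a defective item and the Bayes posterior can be computed with one small function. The sketch below is an illustration under the stated percentages; `bayes` is a hypothetical helper that takes the prior probabilities P(Ei) and the conditional probabilities P(A|Ei).

```python
def bayes(priors, likelihoods):
    """Return (P(A), [P(E_i | A) for each i]) from P(E_i) and P(A | E_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A and E_i)
    total = sum(joint)                                    # P(A), total probability
    posteriors = [j / total for j in joint]               # Bayes' theorem
    return total, posteriors

# Problem 1: CD machines M1 and M2
p_def_cd, post_cd = bayes([0.45, 0.55], [0.08, 0.10])
print(round(p_def_cd, 3), round(post_cd[0], 3))    # 0.091 0.396

# Homework Problem 3: machines A, B and C
p_def_plant, post_plant = bayes([0.30, 0.45, 0.25], [0.02, 0.03, 0.02])
print(round(p_def_plant, 4), round(post_plant[2], 3))  # 0.0245 0.204
```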
1. Which average will you prefer in the following situation? "To obtain the average speed of Joe when he drives a car at 20
mph for the half of the journey and 30 mph for second half."(0.5 Points)
a. Arithmetic Mean
b. Geometric Mean
c. Harmonic Mean
2. What is the strength and direction in a scatter plot indicated below? (2 points)
3. The given stem and leaf plot displays the number of visitors in a mall during a month. What is the median number of visitors? (1
Point)
0 | 1 2 3 5 6 7
1 | 5 7 9
2 | 1 1 3 4 5 5 6 6 6 6 7 7 7 8 9 9
3 | 0 0 5 7
2 The life (in hours) of 15 bulbs of a certain brand was recorded, and it was found that Q1 = 900 hours and D8 =
1500 hours. Interpret Q1 and D8 in the given scenario. (2 points)
3 The following boxplot shows the ages of students in a school. Comment on the skewness. (0.5 points)
4 Data is collected on the increase in temperature (x) and the sales of ice cream (y). It was found that the correlation
coefficient (r) between temperature and sales is 0.85 and the regression equation is y = 2.5 + 0.7x.
Answer the following
a. Identify independent and dependent variable.
b. Predict y when x = 40
c. Find coefficient of determination. (3 points)
5 If the coefficient of variation of product A is 17% and the coefficient of variation of product B is 20%, then
we can conclude that product A is ________ than product B. (1 point)
a. More consistent
b. Less consistent
4. The given stem and leaf plot displays the number of birds visiting a feeder in a
day
5. The front row in a movie theatre has 23 seats. If you were asked to sit in the seat that occupied
the median position, in which seat would you have to sit? 12th seat, since (23 + 1)/2 = 12
6. The mean weight of five complete computer stations is 167.2 pounds. The weights of four of
the computer stations are 158.4 pounds, 162.8 pounds, 165 pounds, and 178.2 pounds
respectively. What is the weight of the fifth computer station?
Mean = Sum of the weights of the five computer stations / 5
Fifth weight = 5 × mean – (sum of the weights of the four computer stations) = 836 – 664.4 = 171.6 pounds
7. The mean salary of 20 women employees is 15 (in 1000’s) and the mean salary of 30 male
employees is 25 (in 1000’s). Then the average salary of the women and men employees is
______________ (in 1000’s). Combined mean = (20 × 15 + 30 × 25) / (20 + 30) = 1050 / 50 = 21
8. The mean pocket expenses of a I B.Sc student for June, July and August is Rs 500. The
pocket expenses for the months of September and October are Rs 350 and Rs 400 respectively. What
would be the mean expenses of the student over the five months?
Method 1: Mean of (June, July, August) = Sum of (June, July, August) / 3
500 = Sum of (June, July, August) / 3
Sum of (June, July, August) = 1500
Mean of the five months = (1500 + 350 + 400) / 5 = 2250 / 5 = Rs 450
9. The following data represent the number of pop-up advertisements received by 10 families
during the past month.
43 37 35 30 41 23 33 31 06 21
i) Calculate the mean, median and mode number of advertisements received by each family
during the month.
ii) Which measure of central tendency provides best measure of center for this data? Explain.
Median (because the data has an outlier, i.e. 6, and there is no mode)
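The three measures for the pop-up advertisement data can be checked with Python's standard `statistics` module. This is a minimal sketch; `multimode` returns every value tied for the highest frequency, so a result listing all ten values means no value repeats and there is no mode.

```python
from statistics import mean, median, multimode

# Number of pop-up advertisements received by the 10 families
ads = [43, 37, 35, 30, 41, 23, 33, 31, 6, 21]

ads_mean = mean(ads)        # 300 / 10
ads_median = median(ads)    # average of the 5th and 6th sorted values
ads_modes = multimode(ads)  # all values tie at frequency 1, so no mode

print(ads_mean, ads_median)      # 30 32.0
print(len(ads_modes) == len(ads))  # True: every value occurs exactly once
```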
10. The following are the life (in hours) of 15 bulbs of a certain brand. Find the values of the three
quartiles, P68 and D4.
850 997 878 900 902 730 900 950 975 1000 1050 750 885 1100 1075
Interpret your answers. What can you say about the values of the 40th percentile and 5th decile?
11. A person invested a certain amount in Reliance stock, which gained 10% in the 1st year, 20%
in the 2nd year and 30% in the 3rd year. What is the average rate of gain?
- Geometric mean (the data are rates in %)
12. What would be the average speed of a swimmer if he swims the first 5 min. at 15 km/hr and
another 5 min. at 10 km/hr? Arithmetic mean (the two speeds are held for equal times, so the average speed is (15 + 10)/2 = 12.5 km/hr; the harmonic mean applies when equal distances are covered)
13. A man drove his car from home to his office at a speed of 55 km per hour and returned home at a
speed of 80 km per hour. What was his average speed for the whole trip? Harmonic mean (equal distances covered at speeds in km/hr)
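The geometric and harmonic means for problems 11 and 13 can be evaluated with the standard `statistics` module (`geometric_mean` needs Python 3.8 or later). The sketch assumes the usual convention of averaging the yearly growth factors 1.10, 1.20 and 1.30 for the stock, and equal distances each way for the car trip.

```python
from statistics import geometric_mean, harmonic_mean

# Problem 11: average rate of gain from yearly growth factors
growth_factors = [1.10, 1.20, 1.30]
avg_gain_pct = (geometric_mean(growth_factors) - 1) * 100

# Problem 13: same distance at 55 km/hr and back at 80 km/hr
avg_trip_speed = harmonic_mean([55, 80])

print(f"average gain: {avg_gain_pct:.2f}% per year")    # about 19.72%
print(f"average speed: {avg_trip_speed:.2f} km/hr")     # about 65.19 km/hr
```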
Mount Carmel College, Autonomous, Bengaluru
Department of Statistics and Analytics
Open Electives – Elementary Statistics using Excel
Worksheet – Measures of Dispersion
Date
i) What is the age of the youngest student? 8 ii) What is the age of the oldest student? 17
iii) What is the range of the data? 9 iv) What is the upper quartile? 14 (Q3)
v) What is the inter-quartile range? 4 (Q3 – Q1) vi) Comment on the skewness of the data. Left skewed
2. The following double box plot represents the average prices of two brands of Sneakers.
Answer the following questions
a) Which brand of sneakers has a price range of $10? Nike
b) Which is the median value of Nike Sneakers? 47
c) Which is the value of lower quartile of Reebok Sneakers? 34
d) Comment on the skewness of Reebok Sneakers. Right Skewed
4. Sketch a box-and-whisker plot for the data from a certain school giving the average score of 15
students attending remedial classes before appearing for their board exams. Also check for outliers.
Find skewness and comment.
9.66 5.90 8.02 5.79 8.73 0.82 8.01 8.35 0.49 6.68 15.64 4.08 6.17 9.91 5.47
5. The amount of rainfall in a city during a particular season for 6 days are given as
Monday Tuesday Wednesday Thursday Friday Saturday
17.8 cm 19.2 cm 16.3 cm 12.5 cm 12.8cm 11.4cm
a) Find the range, variance and its coefficient.
b) Construct the box and whisker plot and comment on the distribution.
c) Evaluate skewness and comment.
6. For a summer job, you were working in the quality control department of a computer company
that manufactures computer parts. The specific part whose quality you are to evaluate is supposed to be 8
micrometers in thickness. You obtained samples of five of these parts manufactured by the day shift and five parts
manufactured by the night shift workers. Here are the findings:
Day Shift 7.9, 8.0, 8.2, 8.3, 7.8 Night Shift 2, 4, 12, 14, 20
a) Find (i) the mean thickness for each shift, (ii) the standard deviation for each shift.
b) Which crew is doing a better job? Why? – Compare the coefficients of variation (CV day shift < CV night shift).
There is less variation in the measurements produced during the day shift as compared to the night shift.
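The comparison of the two shifts can be checked with the sketch below. It assumes the population standard deviation (dividing by n, via `statistics.pstdev`), in line with the 1/n formulas used earlier in these notes; using the sample standard deviation would change the values slightly but not the conclusion. The helper name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, pstdev

day_shift = [7.9, 8.0, 8.2, 8.3, 7.8]
night_shift = [2, 4, 12, 14, 20]

def coefficient_of_variation(data):
    """CV (%) = standard deviation / mean * 100, with the population SD (1/n)."""
    return pstdev(data) / mean(data) * 100

cv_day = coefficient_of_variation(day_shift)
cv_night = coefficient_of_variation(night_shift)

print(f"day:   mean = {mean(day_shift):.2f}, CV = {cv_day:.2f}%")
print(f"night: mean = {mean(night_shift):.2f}, CV = {cv_night:.2f}%")
print("more consistent shift:", "day" if cv_day < cv_night else "night")
```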
7. The following data provide the runs scored by two batsmen in the last 10 matches.
Batsman A: 25, 20, 45, 93, 8, 14, 32, 87, 72, 4 Batsman B: 33, 50, 47, 38, 45, 40, 36, 48, 3