Module 1
Data visualization
1. Define and discuss ‘Statistics’ in the singular sense.
2. Define Statistics as per Horace Secrist. Explain in detail.
3. Discuss the relevance of Statistics for management education.
4. Discuss the scope of Statistics in detail.
5. List and explain the characteristics of Statistics.
6. Explain the limitations of Statistics in detail.
7. Bring out the salient differences between Data and Information. Discuss the different types of
data, their sources, advantages and disadvantages.
8. Discuss the advantages and disadvantages of Primary data. Mention its sources.
9. Discuss the advantages and disadvantages of secondary data. Mention its sources.
10. Elaborate on the bases of classification of data, citing an example for each.
11. Explain the need for Sturges’ rule in frequency distribution.
12. 30 observations are recorded below. Calculate the number of classes based on Sturges’ rule
and hence find the class width. Also construct the frequency distribution table.
94, 89, 88, 89, 80, 94, 92, 88, 87, 85, 88, 93, 94, 93, 94, 93, 92, 88, 94, 90, 93, 84, 93, 84, 91, 93,
85, 91, 89, 95
13. Prepare a bivariate frequency distribution for the following data of 20 students:
Marks in Ethics: 10, 11, 10, 11, 11, 14, 12, 12, 13, 10, 13, 12, 11, 12, 10, 14, 14, 12, 13, 10
Marks in Stats: 20, 21, 22, 21, 23, 23, 22, 21, 24, 23, 24, 23, 22, 23, 22, 22, 24, 20, 24, 23
Prepare a marginal frequency table for marks in Ethics and Stats.
Also prepare a conditional frequency distribution for marks in Ethics, when the marks in Stats are
more than 22.
14. Construct a bivariate frequency table classifying the income of the families (X) into
intervals 2000 – 3000 and so on and the percentage of expenditure (Y) into 10 – 15 and
so on. Also write the marginal distribution of X and Y. Further prepare the conditional
distribution of X when Y lies between 15 and 20.
X Y X Y X Y X Y X Y
5500 12 2250 25 6800 13 2020 29 6890 11
6230 14 3100 26 3000 25 2550 27 5230 12
3100 18 6400 20 4250 16 4920 18 3170 18
4200 16 5120 18 5550 15 5870 21 3840 17
6000 15 6900 12 3250 23 6430 19 4000 19
15. The following data give the points scored in a tennis match by two players X and Y at the end of
20 games:
(10, 12), (7, 11), (7, 9), (15, 19), (17, 21), (12, 8), (16, 10), (14, 14), (22, 18), (16, 7), (15, 16),
(22, 20), (19, 15), (7, 18), (11, 11), (12, 18), (10, 10), (5, 13), (11, 7), (10, 10)
Taking the class intervals as 5 – 9 and so on for both X and Y, construct the bivariate frequency table
and the conditional frequency distribution for Y given X > 15.
16. Write a short note on the objectives of tabulation and discuss various parts of the table.
17. In the year 2015, the total strengths of three colleges X, Y and Z in a city were in the ratio 4:2:5.
The strength of college Y was 2000. The proportion of girls and boys in all colleges was in the
ratio 2:3. The faculty-wise distribution of boys and girls in the faculties of Arts, Science and
Commerce was in the ratio 1:2:2 in all the three colleges. Suitably tabulate the above data.
Show all the relevant calculations in detail.
18. The total population of three cities X, Y and Z is 4.5 lakhs, distributed in the ratio 4:3:2. The
proportion of males and females in all three cities is in the ratio 2:3. On average, each
of these cities has its population from the lower middle class, middle class and rich class in the
ratio 1:4:3. Suitably tabulate the above data. Show all the relevant calculations in detail.
19. In a particular year of an MBA batch, the total strengths of three campuses Global, Knowledge and
Elite were in the ratio 5:2:3. The strength of the Knowledge campus was 2500. The proportion of
boys and girls in all campuses was in the ratio 3:2. The specialization-wise distribution of boys and
girls in HR, Finance and Marketing was in the ratio 1:2:2 in all the three campuses. Suitably
tabulate the above data by showing the relevant calculations.
20. Elucidate the importance of graphical representation of data. Discuss its advantages and
limitations.
21. The data on the production of oil seeds in a particular year is presented below. Draw a suitable
bar chart.
Oil seeds Yield (million tonnes)
Groundnut 5.80
Castor seed 3.30
Coconut 1.18
Cotton 2.20
Soya bean 1.00
22. The number of demonetized notes (Rs. 500/- and Rs. 1000/-) returned to SBI in 4
branches in the month of December 2016 is given below. Represent the same in the
form of sub-divided bar chart.
Branches No. of Rs.500/- notes No. of Rs.1000/- notes Total
Whitefield 1235 1867 3102
Redfield 1562 2096 3658
South end circle 1617 2212 3829
East end circle 1323 1914 3237
23. The data on fund flow (in crores of rupees) of an International Airport Authority during the
financial years 2012 – 13 to 2014 – 15 are given below. Represent this data as a multiple bar
chart.
2012 - 13 2013 - 14 2014 - 15
Non traffic revenue 40.00 50.75 70.25
Traffic revenue 70.25 80.75 110.00
Profit before tax 40.15 50.50 80.25
24. Bring out the differences between Bar chart and sector graph.
25. The following data relate to the area (in million square kilometres) of the oceans of the world. Draw a
pie diagram for the same, showing all relevant calculations.
Oceans Area (million sq km)
Pacific Ocean 70.8
Atlantic Ocean 41.2
Indian Ocean 28.5
Antarctic Ocean 7.6
Arctic Ocean 4.8
26. The following data represent the estimated gross area under different cereal crops
during a particular year. Draw a sector graph to represent the data. Show all the
calculations in detail.
Crop Gross area (thousands of hectares)
Paddy 34321
Wheat 18287
Jowar 22381
Bajra 15859
Ragi 2656
Maize 6749
Barley 4422
Small millets 6258
27. The following data represent the sales of car tyres of various brands by a retail showroom of
tyres during the year 2011 – 12. Draw a sector graph to represent the data. Show all the
calculations in detail.
Brand of Tyres Tyres sold
Dunlop 136
Modi 221
Firestone 138
Ceat 84
Goodyear 101
JK 120
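Worked sketch for Question 12 of this module. The snippet below applies Sturges' rule, k = 1 + 3.322 log10(N), to the 30 observations and builds a frequency table. Rounding k and the class width up to whole numbers, and starting the first class at the smallest observation, are assumptions made for illustration; other textbook conventions may give slightly different classes.

```python
import math

# The 30 observations from Question 12.
data = [94, 89, 88, 89, 80, 94, 92, 88, 87, 85, 88, 93, 94, 93, 94,
        93, 92, 88, 94, 90, 93, 84, 93, 84, 91, 93, 85, 91, 89, 95]

n = len(data)

# Sturges' rule; rounding up to a whole number is an assumed convention.
k = math.ceil(1 + 3.322 * math.log10(n))

# Class width = range / number of classes, rounded up (assumption).
width = math.ceil((max(data) - min(data)) / k)

# Inclusive classes starting at the smallest observation (assumption).
start = min(data)
table = []
for i in range(k):
    lo = start + i * width
    hi = lo + width - 1
    freq = sum(lo <= x <= hi for x in data)
    table.append((lo, hi, freq))

print(f"classes = {k}, width = {width}")
for lo, hi, freq in table:
    print(f"{lo} - {hi}: {freq}")
```

The frequencies should add up to the number of observations, which is a quick check on any table built by hand.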
Module 2
Measures of central tendency
1. Write a short note on the following:
a. Mean – Direct vs. Shortcut method
b. Median
c. Mode
2. The human resource manager at a city hospital began a study of the overtime hours of the
registered nurses. Twenty-five nurses were selected at random and their overtime hours during a
month were recorded:
13, 13, 12, 15, 7, 15, 5, 12, 6, 7, 12, 10, 9, 13, 12, 5, 9, 6, 10, 5, 6, 9, 6, 9, 12
Calculate the arithmetic mean of overtime hours by converting the above data into discrete
series.
3. A company is planning to improve plant safety. For this, accident data for the last 50 weeks were
compiled. These data are grouped into the frequency distribution shown below. Calculate the
arithmetic mean of the number of accidents per week by the direct method. Verify the solution
using the step deviation method.
No. of accidents 0 – 4 5 – 9 10 – 14 15 – 19 20 – 24
No. of weeks 5 22 13 8 2
4. 168 handloom factories have the following distribution of average number of workers in various
income groups. Find the mean salary paid to the workers.
Income groups No. of firms Average no. of workers
800 – 1000 40 8
1000 – 1200 32 12
1200 – 1400 26 8
1400 – 1600 28 8
1600 – 1800 42 4
5. Find the missing frequencies in the following frequency distribution, whose total frequency is 60. The
arithmetic mean is given as 11.09.
Class interval Frequency
9.3 – 9.7 2
9.8 – 10.2 5
10.3 – 10.7 A
10.8 – 11.2 B
11.3 – 11.7 14
11.8 – 12.2 6
12.3 – 12.7 3
12.8 – 13.2 1
6. The pass result of 50 students who took a class test is given below. If the average marks for all
the students were 51.6, find out the average marks of the students who failed.
Marks 40 50 60 70 80 90
No. of students 8 10 9 6 4 3
7. The average dividend declared by a group of 10 chemical companies was 18%. Later on, it was
discovered that one figure, 12, had been misread as 22. Find the correct average dividend.
8. The mean of 200 observations was 50. Later on, it was found that two observations were
misread as 92 and 8 instead of 192 and 88. Find the correct mean.
9. There are two units of an automobile company in two different cities employing 760 and 800
employees respectively. The arithmetic mean of monthly salaries paid to employees in these two
units is Rs.18750/- and Rs.16950/- respectively. Find the combined arithmetic mean of salaries
of the employees in both the units.
10. Find the median wage of a daily worker from the following data. Also calculate the third quartile,
fourth decile and 79th percentile. The total number of daily workers employed is 6000. Of
these, 5% earn less than Rs.350 per day, 1160 earn from Rs.351 to Rs.400 per day, 30% earn from
Rs.401 to Rs.450 per day, 1000 earn from Rs.451 to Rs.500 per day, 20% earn from Rs.501 to
Rs.550 per day and the rest earn Rs.551 or more per day.
11. A group of adventurous MBA students decided to take up a project in Oceanography. They
met the oceanographers, discussed the project in detail and decided to do a statistical survey in the
ocean. The students took turns diving into the ocean as scuba divers, found different
objects and recorded the number of objects found on each dive. The total
number of dives was 213. The number of objects found varied from zero to a hundred.
Calculate the modal value of the number of objects found by the scuba divers.
No. of objects frequency
0 – 10 1
10 – 20 6
20 – 30 24
30 – 40 37
40 – 50 49
50 – 60 38
60 – 70 23
70 – 80 18
80 – 90 12
90 – 100 5
12. The annual salaries (in thousands of rupees) of employees in an organization are given
below. The total salary of the 10 employees in the class 40 and above is Rs.900000. Compute
the mean salary. Every employee belonging to the top 25% of earners has to pay 5% of
their salary to the workers’ relief fund. Estimate the contribution to the relief fund.
Salary (Rs. in thousands) Number of Employees
Below 10 4
10 – 20 6
20 – 30 10
30 – 40 20
40 and above 10
13. The following is the age distribution of 1000 persons working in an organization.
Age group No. of persons
20 – 25 30
25 – 30 160
30 – 35 210
35 – 40 180
40 – 45 145
45 – 50 105
50 – 55 70
55 – 60 60
60 – 65 40
Due to continuous losses, it is desired to bring down the manpower strength to 30% of
the present number according to the following scheme:
Retrench the first 15% from the lower age group. Absorb the next 45% in other
branches. Make 10% from the highest age group retire permanently, if necessary.
Find the average age of the above three groups and the retained employees.
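Worked sketch for Question 11 of this module. The snippet below applies the usual grouped-data formula, Mode = L + (f1 - f0) / (2 f1 - f0 - f2) × h, where L is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the preceding and succeeding classes, and h the class width. The function name grouped_mode is hypothetical, and the sketch assumes equal class widths and a single modal class.

```python
# Class lower limits and frequencies from Question 11 (class width 10).
lower_limits = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
freqs = [1, 6, 24, 37, 49, 38, 23, 18, 12, 5]
width = 10


def grouped_mode(lower_limits, freqs, width):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class."""
    i = freqs.index(max(freqs))                      # modal class index
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                # preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # succeeding class
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


mode = grouped_mode(lower_limits, freqs, width)
print(f"total dives = {sum(freqs)}, mode = {mode:.2f}")
```

Here the modal class is 40 – 50, so the mode works out to 40 + (12 / 23) × 10, roughly 45.22 objects.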
Module 3
Measures of Dispersion
1. Why do we need measures of dispersion?
2. Is it possible to have a coefficient of range greater than 1? If yes, cite an example. If no, state the
reason.
3. What are the characteristics of a good measure of dispersion?
4. Find the range and coefficient of range for the following data.
Profits No. of shops
0 – 10 1
10 – 20 6
20 – 30 24
30 – 40 37
40 – 50 49
50 – 60 38
60 – 70 23
5. Find the mean absolute deviation (MAD) and coefficient of MAD for the following data.
Class interval Frequency
10 – 15 3
15 – 20 9
20 – 25 12
25 – 30 25
30 – 35 10
35 – 40 6
6. Find the interquartile range, quartile deviation and coefficient of quartile deviation for the
following data.
Profits No. of shops
0 – 10 1
10 – 20 6
20 – 30 24
30 – 40 37
40 – 50 49
50 – 60 38
60 – 70 23
7. Discuss the significance of standard deviation. What is a good standard deviation?
8. Find the standard deviation for the following data.
x f
14 1
15 3
18 6
20 7
21 4
25 2
9. A study of 100 engineering companies gives the following information. Calculate the standard
deviation of the profit earned and the coefficient of variation.
Profits (in crores) No. of companies
0 – 10 8
10 – 20 12
20 – 30 20
30 – 40 30
40 – 50 20
50 – 60 10
10. Two salesmen X and Y selling the same product show the following results over a long period of time.
Which salesman seems to be more consistent in the volume of sales?
Salesman X Salesman Y
Average sales INR 30000 INR 35000
Standard deviation INR 2500 INR 3600
11. There are a number of possible measures of sales performance, including how consistent a
salesperson is in meeting established sales goals.
The following data represent the percentage of goal met by each of three salespersons over the
last five years. Which salesperson is the most consistent?
Raman 88 68 89 92 103
Sindhu 76 88 90 86 79
Prasad 104 88 118 88 123
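Worked sketch for Question 11 of this module. Consistency is commonly compared through the coefficient of variation, CV = (standard deviation / mean) × 100, with the smallest CV indicating the most consistent performer. The snippet below assumes the population standard deviation (divisor n); using the sample standard deviation changes the values slightly but not the ranking for these data. The helper name coefficient_of_variation is hypothetical.

```python
import statistics

# Percentage of goal met over five years, from Question 11.
scores = {
    "Raman": [88, 68, 89, 92, 103],
    "Sindhu": [76, 88, 90, 86, 79],
    "Prasad": [104, 88, 118, 88, 123],
}


def coefficient_of_variation(values):
    # Population standard deviation is an assumed choice here.
    return statistics.pstdev(values) / statistics.mean(values) * 100


cv = {name: coefficient_of_variation(v) for name, v in scores.items()}
most_consistent = min(cv, key=cv.get)  # lowest CV = most consistent

for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}%")
print("Most consistent:", most_consistent)
```

The same comparison applies to Question 10, where the means and standard deviations are already given and only the ratio needs to be computed.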
Module 4
Measures of Association
1. Define correlation. Discuss its significance.
2. Give a skeletal explanation to interpret the correlation value.
3. Mention the various ways of classifying correlation. Explain each of them with an illustration.
4. Discuss positive and negative correlations with an interpretation.
5. Define linear and non-linear correlation.
6. Differentiate simple, partial and multiple correlations.
7. “Correlation is not causation”. Justify the statement with an illustration.
8. Calculate Karl Pearson’s correlation coefficient for the following data.
Index of production 50 52 53 55 54 56 57 51
No. of unemployed 8 6 7 6 7 6 9 10
9. Ten competitors in a beauty contest are ranked by two judges in the following order. Compute
the coefficient of rank correlation.
First 1 6 5 3 10 2 4 9 7 8
Second 6 4 9 8 1 2 3 10 5 7
10. Define Regression. Explain its need in management.
11. Discuss the different types of regression available to tackle management problems.
12. The following data relate to the scores obtained by 9 salesmen of a company in an
intelligence test and their weekly sales (in Rs. 1000s).
Salesmen A B C D E F G H I
Test scores 50 60 50 60 80 50 80 40 70
Weekly sales 30 60 40 50 60 30 70 50 60
i. Obtain the regression equation of sales on intelligence test scores of the
salesmen.
ii. If the intelligence test score of a salesman is 65, what would be his expected
weekly sales?
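Worked sketch for Question 12 of this module. The snippet below fits the least-squares regression line of weekly sales (Y) on test scores (X), using b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b x̄, and then evaluates the line at a score of 65. The function name fit_line is hypothetical.

```python
# Data for the 9 salesmen from Question 12.
test_scores = [50, 60, 50, 60, 80, 50, 80, 40, 70]
weekly_sales = [30, 60, 40, 50, 60, 30, 70, 50, 60]  # in Rs. 1000s


def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # regression coefficient of Y on X
    a = mean_y - b * mean_x  # intercept
    return a, b


a, b = fit_line(test_scores, weekly_sales)
predicted = a + b * 65

print(f"Y = {a} + {b} X")
print(f"Expected weekly sales at score 65: {predicted} (Rs. 1000s)")
```

With these data the means are 60 and 50, so the fitted line is Y = 5 + 0.75 X and the expected weekly sales at a score of 65 are 53.75, that is, Rs. 53,750.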
Module 5
Hypothesis testing
1. Define the following: Population, sample, hypothesis testing.
2. Differentiate between the following.
i. Parametric and non-parametric tests
ii. Z-test and t-test
iii. Type I and Type II errors
iv. One-tailed and two-tailed tests
v. Left-tailed and right-tailed tests
vi. Null and alternative hypotheses
vii. Level of significance and level of confidence
3. Discuss the general procedure for hypothesis testing.
4. What do you mean by critical values and critical region in hypothesis testing?
5. Individual income tax returns filed prior to 30 June had an average refund of Rs.1200.
Consider the population of ‘last minute’ filers who file their returns during the last week of June.
For a random sample of 400 individuals who filed a return between 25 and 30 June, the sample
mean refund was Rs.1054 with a standard deviation of Rs.1600. Using a 5% level of significance, test
the belief that individuals who wait until the last week of June to file their returns get a
higher refund than early filers.
6. A packaging device is set to fill detergent powder packets with a mean weight of 5 kg with a
standard deviation of 0.21 kg. The weight of packets can be assumed to be normally distributed.
The weight of packets is known to increase over a period of time due to machine fault, which is
not tolerable. A random sample of 100 packets is taken and weighed. This sample has a mean
weight of 5.03 kg. Can we conclude that the mean weight produced by the machine has
increased? Use a 5% level of significance.
7. The mean lifetime of a sample of 400 fluorescent light bulbs produced by a company is found to
be 1600 hours with a standard deviation of 150 hours. Test the hypothesis that the mean
lifetime of the bulbs produced in general is higher than 1570 hours at the α = 0.01 level
of significance.
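Worked sketch for Question 7 of this module. The snippet below runs a right-tailed Z-test of H0: μ = 1570 against H1: μ > 1570, using z = (x̄ - μ0) / (s / √n). Treating the sample standard deviation as a stand-in for the population value is an assumption justified here by the large sample (n = 400); variable names are illustrative.

```python
import math
from statistics import NormalDist

# Figures from Question 7.
sample_mean = 1600   # hours
mu0 = 1570           # hypothesised mean, hours
s = 150              # sample standard deviation, hours
n = 400
alpha = 0.01

# Test statistic for a large-sample Z-test.
z = (sample_mean - mu0) / (s / math.sqrt(n))

# Right-tailed critical value from the standard normal distribution.
critical = NormalDist().inv_cdf(1 - alpha)

reject_h0 = z > critical
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject_h0}")
```

Since the standard error is 150 / 20 = 7.5, the statistic is z = 30 / 7.5 = 4, which exceeds the critical value of about 2.33, so H0 is rejected at the 1% level. Questions 5 and 6 follow the same steps with their own figures and a 5% level of significance.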