Tabulation and Presentation of Data - Unit II
DATA:
COLLECTION OF DATA
The process of counting or enumerating items and recording the results systematically is called the
“Collection of Data”.
STATISTICAL DATA
1. Primary Data
2. Secondary Data
a. Qualitative
b. Quantitative
PRIMARY DATA
SECONDARY DATA
1. Published Sources
International publications
Official publications (Central Government, State Government)
Semi-official publications (IIM B)
Publications of commercial and financial institutions
Publications of research institutions.
Committee Reports
Newspapers and Journals
2. Unpublished Sources
Data that have not been published by any agency. They are not meant for the public and are used
for internal purposes instead.
Ex: records of private institutions (HR policy etc.).
SAMPLING DESIGN
Types of Population
Merits of Census:
Demerits of Census:
2. Sample Method: A sample that represents the whole population is selected at random and
studied in place of the whole population.
Methods of Sampling:
1. Probability Sampling
2. Non – Probability Sampling
1. Probability Sampling: Probability sampling is defined as a sampling technique in
which the researcher chooses samples from a larger population using a method based
on the theory of probability.
i. Simple Random Sampling: Units are selected purely at random; no other
criterion or technique is applied.
Lottery System: every unit is written on a slip and the slips are drawn at random.
Table of random numbers: the units are numbered, and the units whose numbers are
read off a table of random numbers are selected.
ii. Stratified Random Sampling: The population is divided into small homogeneous groups and
a sample is picked from each group. Each homogeneous group is called a stratum (plural: strata).
iii. Systematic Random Sampling: Some kind of system is adopted for selection.
A starting number is selected at random and the system is then followed. Eg: the 4th house in each of 10
streets.
2. Non-Probability Sampling: Non-probability sampling techniques are a more convenient and practical method for
researchers deploying surveys in the real world.
Limitations of non-probability sampling:
1. An unknown proportion of the entire population is not included in the sample group
i.e. lack of representation of the entire population
2. The lower level of generalization of research findings compared to probability
sampling
3. Difficulties in estimating sampling variability and identifying possible bias
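The three probability sampling methods described above can be sketched in Python as follows. This is only a minimal illustration: the population of 100 numbered houses, the male/female strata, the 10% sampling fraction and all function names are assumptions made for the example, not part of the notes.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Pick n units purely at random, without replacement (the lottery idea)."""
    rng = random.Random(seed)
    return rng.sample(units, n)


def stratified_random_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every stratum (homogeneous group)."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, k)
    return sample


def systematic_sample(units, interval, seed=None):
    """Pick a random start within the first interval, then every interval-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(interval)
    return units[start::interval]


# Hypothetical population: houses numbered 1 to 100
houses = list(range(1, 101))
# Hypothetical strata: houses 1-60 in one group, 61-100 in the other
strata = {"male": list(range(1, 61)), "female": list(range(61, 101))}

print(simple_random_sample(houses, 5, seed=1))
print(stratified_random_sample(strata, 0.1, seed=1))
print(systematic_sample(houses, 10, seed=1))
```

Passing a seed makes the draw repeatable, which helps when a sample has to be checked later; leaving it out gives a fresh random draw each time.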
CLASSIFICATION OF DATA
It is the process of arranging the available facts into homogeneous groups or classes according
to resemblance or similarities.
Definition: According to Secrist, “classification is the process of arranging data into sequences
and groups according to their common characteristics or separating them into different related
parts”.
Objectives of Classification:
1. To present the complex, scattered data in a concise, logical, and understandable form.
2. To make a comparative study possible.
3. To remove irrelevant details from the data and make tabulation and
further analysis of the data possible.
4. To make generalizations from the data possible.
5. To pinpoint the most significant features of the data at a glance.
6. To reveal similarities and dissimilarities.
7. To find out relationships.
8. To facilitate statistical treatment.
Characteristic/Essentials of classification
1. Exhaustive: the classification must be exhaustive so that every unit of the distribution
may find a place in one group or another.
2. Suitability: classification must conform to the objectives of the investigation.
3. Homogeneity: all the items constituting a group must be homogeneous.
4. Flexibility: classification should be flexible so that new facts and figures may be
easily adjusted.
5. Mutually exclusive: the classes must not overlap. Each item of the data must be found in
one and only one class.
Methods of Classification:
[Diagram: Population is divided into Male and Female; each is further divided into Literate and Illiterate.]
5. Quantitative Classification
When the data are classified based on a characteristic that can be measured, such
as age, income, height, production, etc., it is called quantitative classification.
Methods: there are two stages of quantitative classification.
a. Raw Data
When the investigator has collected the data and not systematically arranged
the same, it is called raw data or unorganized data.
Ex: an investigator has collected the data regarding the weight of 20 workers
in a factory and his findings are shown in the table below:
40 90 64 53
70 62 88 59
47 90 35 36
73 34 10 84
40 59 12 20
In its raw form the data are scattered, and even after careful study the
details are hard to understand. Data presented in raw form
convey hardly any useful information.
b. Statistical Series:
A statistical series refers to data that are presented in some order and sequence. It
is an arrangement of data in different classes according to a given order.
Types of Statistical Series:
1. Individual Series
2. Frequency Distribution Series
(a) Discrete Series
(b) Continuous Series
a. Individual Series: under this method the values of all the units are shown separately.
Ex: the data of workers' weights can be arranged in two forms:
According to the code number of workers.
According to the magnitude of the weights of workers (ascending or descending).
This presentation, though better than the raw data, does not reduce the volume of
the data.
b. Frequency Distribution:
It is a summary presentation of the values of the variable according to their
magnitude, individually or in groups.
Tally bars are small vertical bars scored parallel to each other and put opposite to a
particular value or group of values to facilitate the counting of the frequencies.
(a) Discrete Series is a statistical series in which all the observations are listed
out along with their corresponding frequency in the form of a table. All the
observations may not have the same frequency.
(b) Continuous Series: Continuous Series is a statistical series in which all the
class intervals along with their corresponding frequency are listed out in
the form of a table. All the class intervals may not have the same
frequency.
Important points:
Class Interval – each class or group into which the values of variables are
classified to condense the data. It begins with a lower limit and ends with an upper
limit.
Class limits – are two end values of a class interval. The smaller limit is called the
lower limit and the larger limit is called the upper limit.
Inclusive class interval – is a class in which both class limits are considered in the
process of frequency distribution while counting.
Exclusive Class interval – is a class in which the lower limit is considered and the
upper limit is excluded in the process of frequency distribution while counting.
The magnitude of class interval – is the difference between the lower limit and upper
limit of class interval.
Mid Value (MidPoint/Class Mark) – is the center point of the class interval which is
exactly at the middle of the two extreme limits or boundaries of class interval.
Class frequency – is the number of observations corresponding to a specific class; it is
the rate of occurrence of particular events or values relating to a particular class.
Cumulative frequency – is the running total of all the frequencies up to and including
the respective class interval when the class intervals are in ascending or descending
order of values.
Less than cumulative frequency – is the running total of the frequencies taken downward,
starting from the first frequency.
More than cumulative frequency – is the running total of the frequencies taken upward,
starting from the last frequency.
Empty class interval – the class that does not have any frequency.
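The terms defined above can be computed for a small continuous series as in the Python sketch below. The four class intervals and their frequencies are hypothetical figures chosen only to illustrate the definitions, including one empty class.

```python
from itertools import accumulate

# Hypothetical continuous series: (lower limit, upper limit) and class frequency
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [3, 0, 7, 5]

# Magnitude of class interval = upper limit - lower limit
magnitudes = [upper - lower for lower, upper in classes]

# Mid value (class mark) = (lower limit + upper limit) / 2
mid_values = [(lower + upper) / 2 for lower, upper in classes]

# Less than cumulative frequency: running total downward from the first class
less_than_cf = list(accumulate(frequencies))

# More than cumulative frequency: running total upward from the last class
more_than_cf = list(accumulate(frequencies[::-1]))[::-1]

# Empty class intervals: classes with no frequency
empty_classes = [c for c, f in zip(classes, frequencies) if f == 0]

print("Class    Mid   f   Less-than cf   More-than cf")
for (lower, upper), mid, f, lt, mt in zip(
        classes, mid_values, frequencies, less_than_cf, more_than_cf):
    print(f"{lower:>2}-{upper:<3} {mid:>5} {f:>3} {lt:>10} {mt:>14}")
print("Empty classes:", empty_classes)
```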
A frequency distribution table is a chart that summarizes values and their frequency. It's a
useful way to organize data if you have a list of numbers that represent the frequency of a
certain outcome in a sample. A frequency distribution table has two columns. The first
column lists all the various outcomes that occur in the data, and the second column lists the
frequency of each outcome. Putting this kind of data into a table helps make it simpler to
understand and analyze.
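As an illustration, the raw weights of the 20 workers given earlier can be turned into a two-column frequency distribution table (a discrete series) with tally bars. The Python sketch below is one possible way to do it; the variable names and the print layout are assumptions.

```python
from collections import Counter

# Raw data: weights of the 20 workers listed in the raw data example
weights = [40, 90, 64, 53, 70, 62, 88, 59, 47, 90,
           35, 36, 73, 34, 10, 84, 40, 59, 12, 20]

# Count how often each distinct value occurs
frequency = Counter(weights)

print(f"{'Weight':>6}  {'Tally':<6} {'Frequency':>9}")
for value in sorted(frequency):
    count = frequency[value]
    # One tally bar per observation of this value
    print(f"{value:>6}  {'|' * count:<6} {count:>9}")
print(f"{'Total':>6}  {'':<6} {sum(frequency.values()):>9}")
```

Because most weights occur only once, the discrete table is almost as long as the raw data, which is why such data are usually grouped into class intervals instead.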
Definition:
Tabulation involves the orderly and systematic presentation of numerical data in a form
devised to elucidate the problem under consideration.
Objectives of Tabulation
1. Simplification
2. Comparison
3. Provides a bird's-eye view of the data
4. Quick location of required data
5. Easy to analyse the data
Difference between Classification and Tabulation
1. Classification refers to the process of grouping the data, whereas tabulation refers to the
process of placing the classified data in columns and rows.
2. Classification deals with grouping the data into classes, whereas tabulation is carried out to
prepare the data for further statistical analysis.
Format of table
Title:
Table No: Head note:
Parts of a Table
1. Table Number: A table should be numbered for identification and for future reference
especially when there are a large number of tables in a study.
2. Title: every table must be given a suitable title which describes the contents of the table.
The title should be clear and brief; it should be carefully worded and capable of clear
interpretation.
3. Date: the date of preparing a table should be written so that the reader can identify the
chronology (order of tables according to time) of the tables prepared.
4. Stub: stubs are the designations of the rows, i.e. the row headings. They are at the extreme left of
the table and explain what the horizontal items represent.
5. Captions: these are the column headings. They explain what the column items represent;
under a caption there may be sub-captions.
6. Body of the table: the actual data are arranged in the body part of the table. It is the most
important part of the table.
7. Head note: it is a brief explanatory statement applying to all or a major part of the data in
the table. For eg: the unit of measurement, such as Rs in crores, is written as a head note.
It is presented at the top right corner of the table.
8. Source: a note at the bottom of the table indicating the sources from which the data
contained in the table were collected.
9. Foot note: if there are any irregularities in a table, if anything in it has
not been adequately explained, or if any abbreviations are used, an
explanatory note is given at the bottom of the table.
Types of tables
1. Simple table: also called a one-way table, it shows only one characteristic of the data.
2. Complex or Manifold Table: when two or more characteristics are shown simultaneously
in a table, it is called a complex table or manifold table.
3. General purpose table: provides information for general use or reference.
It usually contains detailed information.
4. Special purpose table: provides information for a particular purpose. It serves the
purpose of the particular group for which it has been prepared. Such tables are brief in
nature and are targeted towards a particular objective.
41 55 48 47 53 48 33 32 42 55
44 38 60 65 71 80 41 53 47 48
55 20 31 34 42 51 35 35 26 25
48 27 38 13 10 5 49 35 26 1
25 33 47 9 19 46 22 17 35 20
3 8 31 45 25 19 40 19 45 18
20 41 39 15 9 40 15 37 29 30
47 16 48 30 40 10 25 20 37 47
12 5 44 32 16 20 2 45 17 34
Ans:
30 45 48 55 39 32 31 22 21 18
54 59 61 33 34 44 10 38 19 62
74 43 73 41 46 43 51 37 85 85
71 29 22 62 29 58 55 63 64 44
43 27 32 43 52 31 47 64 18 51
Prepare a frequency distribution table and calculate the cumulative frequency.
4. Form a continuous frequency table from the following data, using class intervals of 40-50,
50-60, etc.
90 78 86 51 96 104 51 78 50 72
68 106 79 76 49 77 92 84 76 42
74 70 69 65 80 54 79 73 58 91
65 60 77 78 67 50 84 76 110 53
74 40 60 42 82 41 61 75 115 81
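One way to work problem 4 is sketched below in Python. It assumes exclusive class intervals of width 10 starting at 40 (so a value of 50 is counted in 50-60, not 40-50); the function name and the output layout are illustrative choices.

```python
# Data of problem 4, row by row
values = [
    90, 78, 86, 51, 96, 104, 51, 78, 50, 72,
    68, 106, 79, 76, 49, 77, 92, 84, 76, 42,
    74, 70, 69, 65, 80, 54, 79, 73, 58, 91,
    65, 60, 77, 78, 67, 50, 84, 76, 110, 53,
    74, 40, 60, 42, 82, 41, 61, 75, 115, 81,
]


def continuous_frequency_table(values, start, width):
    """Return rows of (lower, upper, frequency, cumulative frequency).

    Classes are exclusive: a value v belongs to lower-upper when
    lower <= v < upper. Assumes every value is >= start.
    """
    n_classes = (max(values) - start) // width + 1
    counts = [0] * n_classes
    for v in values:
        counts[(v - start) // width] += 1
    rows, running = [], 0
    for i, f in enumerate(counts):
        running += f
        lower = start + i * width
        rows.append((lower, lower + width, f, running))
    return rows


table = continuous_frequency_table(values, start=40, width=10)
print("Class      Frequency   Cumulative frequency")
for lower, upper, f, cf in table:
    print(f"{lower:>3}-{upper:<5} {f:>9} {cf:>22}")
```

For this data the class frequencies come out as 5, 7, 8, 16, 6, 4, 2 and 2 for the classes 40-50 up to 110-120, which add up to the 50 observations.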
5. The marks of students in two subjects are given below. Prepare univariate and bivariate frequency
distribution tables.
Marks in accountancy: 24, 22, 21, 25, 23, 26, 21, 22, 23, 24, 25, 22.
Marks in statistics: 12, 18, 17, 14, 11, 15, 13, 16, 12, 13, 16, 18.
Ans:
                      Marks in Statistics
Marks in
Accountancy   11   12   13   14   15   16   17   18   Total
21             -    -    1    -    -    -    1    -     2
22             -    -    -    -    -    1    -    2     3
23             1    1    -    -    -    -    -    -     2
24             -    1    1    -    -    -    -    -     2
25             -    -    -    1    -    1    -    -     2
26             -    -    -    -    1    -    -    -     1
Total          1    2    2    1    1    2    1    2    12
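The univariate and bivariate counts for problem 5 can be checked with a short Python sketch. It assumes the two lists are paired student by student in the order given; the variable names are illustrative.

```python
from collections import Counter

# Marks of the 12 students, paired in the order given in the problem
accountancy = [24, 22, 21, 25, 23, 26, 21, 22, 23, 24, 25, 22]
statistics_marks = [12, 18, 17, 14, 11, 15, 13, 16, 12, 13, 16, 18]

# Univariate frequency distributions: one variable at a time
accountancy_freq = Counter(accountancy)
statistics_freq = Counter(statistics_marks)

# Bivariate frequency distribution: count each (accountancy, statistics) pair
bivariate = Counter(zip(accountancy, statistics_marks))

stat_values = sorted(statistics_freq)
print("Acc/Stat " + " ".join(f"{s:>3}" for s in stat_values) + "  Total")
for a in sorted(accountancy_freq):
    # Show '-' where a pair does not occur
    cells = " ".join(f"{bivariate[(a, s)] or '-':>3}" for s in stat_values)
    print(f"{a:>8} {cells} {accountancy_freq[a]:>6}")
totals = " ".join(f"{statistics_freq[s]:>3}" for s in stat_values)
print(f"{'Total':>8} {totals} {len(accountancy):>6}")
```

The row totals are the univariate distribution of accountancy marks and the column totals are the univariate distribution of statistics marks.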
Problems on tabulation:
6. Draw a blank table to show the gender of candidates appearing for the first year, second year
and third year exams of a university in the faculties of Arts, Science and Commerce in a
certain year.
Ans:
a) Gender: Male, Female
b) Year: I, II and III
c) Faculty: Arts, Science and Commerce
                                   Gender
Faculty              Male                            Female                Total
          I year  II year  III year  Total   I year  II year  III year  Total
Arts
Science
Commerce
Total
7. In the Lok Sabha, 600 members were present during the discussion on a
resolution put to vote, and 400 voted in favour of the resolution. The government members in
the house numbered 380, and 65 members belonging to the opposition voted in favour of the
resolution. Every member belonged to one of the 2 groups and there were no
absentees. Tabulate the information.
Ans:
a. Members: Ruling Party and Opposition Party
b. Vote: in favour and against
Vote                      Members
            Ruling Party   Opposition Party   Total
In favour        335              65           400
Against           45             155           200
Total            380             220           600
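The figures in this table that are not given directly in the problem follow by subtraction. A minimal Python sketch of the arithmetic is shown below; the variable names are illustrative.

```python
# Figures given in the problem
total_members = 600
total_in_favour = 400
ruling_total = 380
opposition_in_favour = 65

# Figures obtained by subtraction
ruling_in_favour = total_in_favour - opposition_in_favour      # 400 - 65
ruling_against = ruling_total - ruling_in_favour               # 380 - 335
opposition_total = total_members - ruling_total                # 600 - 380
opposition_against = opposition_total - opposition_in_favour   # 220 - 65
total_against = total_members - total_in_favour                # 600 - 400

# Cross-check: the two "against" cells must add up to the "against" total
assert ruling_against + opposition_against == total_against

print("In favour:", ruling_in_favour, opposition_in_favour, total_in_favour)
print("Against:  ", ruling_against, opposition_against, total_against)
print("Total:    ", ruling_total, opposition_total, total_members)
```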
8. Present the following information in a suitable tabular form. In 2018, out of 2000 workers
in a factory 1550 were members of a trade union. The number of women workers
employed was 250, out of which 200 did not belong to any trade union. In 2019, the
number of union workers was 1725 of which 1600 were men. The number of non – union
workers was 380, among whom 155 were women.
Ans:
1. Trade Union: members and non – members
2. Year : 2018 and 2019
3. Gender: Male and Female
                                 Year
Trade Union             2018                      2019
               Male   Female   Total     Male   Female   Total
Members        1500       50    1550     1600      125    1725
Non-Members     250      200     450      225      155     380
Total          1750      250    2000     1825      280    2105
9. In a sample survey about coffee habits in two towns, the following information was received:
Town A: Females were 40%, total coffee drinkers were 45% and Males non-coffee
drinkers were 20%.
Town B: Males were 55%, Males non-coffee drinkers were 30% and female coffee
drinkers were 15%.
Ans:
a) Towns: Town A and Town B
b) Gender: Male and Female
c) Coffee Habits: Coffee Drinkers and Non – Coffee Drinkers
                                         (Figures in percentages)
Coffee Habits                Town A                    Town B
                      Male   Female   Total     Male   Female   Total
Coffee Drinkers         40        5      45       25       15      40
Non-Coffee Drinkers     20       35      55       30       30      60
Total                   60       40     100       55       45     100
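The missing percentages for problem 9 can be derived from the given figures by subtraction, since each town's total is 100%. The Python sketch below shows one way to do it; the helper function and variable names are assumptions made for the example.

```python
def complete_table(male_total, male_non_drinkers, female_drinkers):
    """Fill a 2x2 percentage table.

    Each tuple is (coffee drinkers, non-coffee drinkers, total) in percent.
    """
    female_total = 100 - male_total
    male_drinkers = male_total - male_non_drinkers
    female_non_drinkers = female_total - female_drinkers
    return {
        "male": (male_drinkers, male_non_drinkers, male_total),
        "female": (female_drinkers, female_non_drinkers, female_total),
        "total": (male_drinkers + female_drinkers,
                  male_non_drinkers + female_non_drinkers, 100),
    }


# Town A: females 40%, total coffee drinkers 45%, male non-coffee drinkers 20%
a_male_total = 100 - 40                    # males = 100 - females
a_male_drinkers = a_male_total - 20        # male drinkers = males - male non-drinkers
a_female_drinkers = 45 - a_male_drinkers   # female drinkers = all drinkers - male drinkers
town_a = complete_table(a_male_total, 20, a_female_drinkers)

# Town B: males 55%, male non-coffee drinkers 30%, female coffee drinkers 15%
town_b = complete_table(55, 30, 15)

for name, town in (("Town A", town_a), ("Town B", town_b)):
    print(name)
    for group in ("male", "female", "total"):
        print(f"  {group:<7}", town[group])
```

In Town A the female total must equal the given 40% and the male total 60%, which is a quick check on the completed table.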
10. Present the following information in a suitable form, supplying the figures not directly
given. In 2018, out of a total of 4000 workers in a factory, 3300 were members of a trade
union. The number of women workers employed was 500, out of which 400 did not belong
to any union.
In 2019 the number of workers in the union was 3450, of which 3200 were men. The
number of non-union workers was 760, of which 330 were women.
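The figures not directly given in problem 10 can be obtained by subtraction. The Python sketch below is one possible working, not a prescribed solution; the helper function and variable names are illustrative.

```python
def complete_year(men_members, women_members, men_non_members, women_non_members):
    """Return the rows of the table; each tuple is (male, female, total)."""
    return {
        "members": (men_members, women_members, men_members + women_members),
        "non_members": (men_non_members, women_non_members,
                        men_non_members + women_non_members),
        "total": (men_members + men_non_members,
                  women_members + women_non_members,
                  men_members + women_members
                  + men_non_members + women_non_members),
    }


# 2018: 4000 workers, 3300 members, 500 women of whom 400 are non-members
women_members_2018 = 500 - 400                   # women who are members
men_members_2018 = 3300 - women_members_2018     # members who are men
non_members_2018 = 4000 - 3300                   # all non-members
men_non_members_2018 = non_members_2018 - 400    # non-members who are men
table_2018 = complete_year(men_members_2018, women_members_2018,
                           men_non_members_2018, 400)

# 2019: 3450 members of whom 3200 are men, 760 non-members of whom 330 are women
women_members_2019 = 3450 - 3200                 # members who are women
men_non_members_2019 = 760 - 330                 # non-members who are men
table_2019 = complete_year(3200, women_members_2019,
                           men_non_members_2019, 330)

for year, table in ((2018, table_2018), (2019, table_2019)):
    print(year)
    for row in ("members", "non_members", "total"):
        print(f"  {row:<12}", table[row])
```

The results can be laid out in the same form as the table in problem 8, with Year as the main caption and Male, Female and Total as sub-captions.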