
Classification of Data

The document discusses classifying and tabulating data through frequency distributions. It covers concepts like variables, ordered arrays, data classification objectives, and creating frequency distribution tables with parts and types of tables.

Uploaded by

ayushmishra.222skp


LEARNING OBJECTIVES
• The trainees will be able to classify a large mass of data meaningfully and interpret the result.
• They will be able to construct a frequency distribution table and interpret it.
• They will be able to describe the different parts of a table and the types of tables.
CONTENT
• Concept of variable
• Ordered array
• What is data classification?
• Objectives of classification
• Frequency distributions
• Variables and attributes
• Tabulation of data
• Parts of a table
• Types of tables
Concept of Variable
• Variable: A characteristic that takes on different values in different persons, places or things.
  Example: diastolic/systolic blood pressure, heart rate, the heights of adult males, the weights of preschool children and the ages of patients seen in a dental clinic.
• Quantitative variable: One that can be measured and expressed numerically. The measurements convey information about amount.
  Example: diastolic/systolic blood pressure, heart rate, the heights of adult males, the weights of preschool children and the ages of patients seen in a dental clinic.
• Qualitative variable: A characteristic that cannot be measured quantitatively but can be categorized. The measurements convey information about an attribute. Measurement in the true sense is not possible, but the persons, places or things belonging to each category can be counted.
  Example: sex of a patient, colour and odour of stool and urine samples, etc.
Random Variable
• A variable whose values arise as a result of chance factors, so that they cannot be exactly predicted in advance.
  Example: heights of a group of randomly selected adults.
• Discrete random variable: Characterized by gaps or interruptions in the values that it can assume.
  • It assumes values with definite jumps.
  • It cannot take all possible values within a range.
  • It is observed through counting only.
  Example: number of daily admissions to a general hospital; number of decayed, missing or filled teeth per child in an elementary school.
• Continuous random variable:
  • It can take any value (positive or negative, integral or fractional) within a specified relevant interval.
  • It has no gaps or interruptions within that interval.
  • It is derived through measurement.
  Example: height, weight and skull circumference.
  Because of the limitations of available measuring instruments, however, observations on variables that are inherently continuous are recorded as if they were discrete.
The ordered array
• A first step in organizing data is the preparation of an ordered array.
• An ordered array is a listing of the values of a data series in order of magnitude, from the smallest to the largest.
• It enables one to determine quickly the smallest and largest values in the data set, and other facts about the arrayed data that may be needed in a hurry.
• See the unordered and ordered data in the file DataExample.xls.
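As a minimal sketch of an ordered array, the snippet below sorts the 35 family-size values from the village survey used elsewhere in this material. It does not read DataExample.xls, whose contents are not reproduced here; the variable names are illustrative only.

```python
# Number of children per family for 35 families (village survey data
# quoted in this material; not the contents of DataExample.xls).
children = [
    1, 0, 2, 3, 4, 5, 6,
    7, 2, 3, 4, 0, 2, 5,
    8, 4, 5, 9, 6, 3, 2,
    7, 6, 5, 3, 3, 7, 8,
    9, 7, 9, 4, 5, 4, 3,
]

# The ordered array: the same values listed from smallest to largest.
ordered = sorted(children)

# Once the data are ordered, the extremes are simply the two ends.
smallest, largest = ordered[0], ordered[-1]

print(ordered)
print("smallest:", smallest, "largest:", largest)
```

Sorting a copy with `sorted` leaves the original list in its collected order, which is handy when both the unordered and the ordered versions need to be shown side by side.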
• Data classification: The grouping of related facts or data into classes according to some common characteristic.
• Basis of data classification: broadly, there are four bases.
1. Geographical, i.e. area-wise
   • Total population of Orissa by district
   • Number of deaths due to malaria by district
   • Infant deaths in Orissa by district
2. Chronological or temporal, i.e. on the basis of time
Table 2: Deaths by lightning

Year     Number
1990     10
1991     5
1992     12
1993     6
1994     9
1995     3
1996     3
1997     5
1998     12
1999     12
2000     8
2001     7
2002     8
Total    100
3. Qualitative, i.e. on the basis of some attributes
Example: People by place of residence, sex and literacy

Place of residence:   Rural                                           Urban
Sex:                  Male                    Female                  Male                    Female
Literacy:             Literate   Illiterate   Literate   Illiterate   Literate   Illiterate   Literate   Illiterate
4. Quantitative, i.e. on the basis of quantitative class intervals
For example, the students of a college may be classified according to weight as follows.

Table 3: Weight of students of a college

Weight (lb)   No. of students
90-100        50
100-110       200
110-120       260
120-130       360
130-140       90
140-150       40
Total         1000
Classification of the ages of 600 persons in the social survey

Class interval   Frequency   Relative frequency (%)
15-24            56          9.3
25-34            153         25.5
35-44            149         24.8
45-54            75          12.5
55-64            61          10.2
65-74            70          11.7
75-84            28          4.7
85-94            8           1.3
Total            600         100.0
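The relative frequency column above is each class frequency divided by the total, expressed as a percentage. A minimal sketch of that arithmetic, using the class labels and frequencies from the age table (the variable names are illustrative):

```python
# Class intervals and frequencies taken from the age table above.
age_classes = ["15-24", "25-34", "35-44", "45-54",
               "55-64", "65-74", "75-84", "85-94"]
frequencies = [56, 153, 149, 75, 61, 70, 28, 8]

total = sum(frequencies)

# Relative frequency (%) = 100 * class frequency / total, to one decimal.
relative = [round(100 * f / total, 1) for f in frequencies]

for label, f, r in zip(age_classes, frequencies, relative):
    print(f"{label:<8}{f:>5}{r:>8.1f}")
print(f"{'Total':<8}{total:>5}")
```

Rounded percentages do not always add up to exactly 100; in this table they happen to.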
In a survey of 35 families in a village, the number of children per family was recorded. The following data were obtained.

1 0 2 3 4 5 6
7 2 3 4 0 2 5
8 4 5 9 6 3 2
7 6 5 3 3 7 8
9 7 9 4 5 4 3
OBJECTIVES OF CLASSIFICATION
• Helps in condensing the mass of data so that similarities and dissimilarities can be readily distinguished.

No. of children   No. of families (frequency)   Cum. freq. (less than)   Cum. freq. (greater than)
0-2               7                             7                        35
3-5               16                            23                       28
6 and above       12                            35                       12
Total             35

• Facilitates comparison.

No. of children   No. of families (frequency)   Cum. freq. (less than)   Cum. freq. (greater than)
0-2               7                             7 (20%)                  35 (100%)
3-5               16                            23 (65.7%)               28 (80%)
6 and above       12                            35 (100%)                12 (34%)
Total             35

• The most significant features of the data can be pinpointed at a glance.
• Enables statistical treatment of the collected data:
  • Averages can be computed.
  • Variations can be revealed.
  • Associations can be studied.
  • Models for prediction/forecasting can be built.
  • Hypotheses can be formulated and tested, etc.
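The condensed tables above can be reproduced from the raw survey values. The sketch below groups the 35 observations into the three classes shown and derives both cumulative columns; the class boundaries come from the table, while the variable names and the use of `None` for the open-ended upper limit are illustrative choices.

```python
from itertools import accumulate

# Number of children per family for the 35 surveyed families.
children = [
    1, 0, 2, 3, 4, 5, 6,
    7, 2, 3, 4, 0, 2, 5,
    8, 4, 5, 9, 6, 3, 2,
    7, 6, 5, 3, 3, 7, 8,
    9, 7, 9, 4, 5, 4, 3,
]

# (label, lower limit, upper limit); both limits are inclusive and
# None marks the open-ended last class.
classes = [("0-2", 0, 2), ("3-5", 3, 5), ("6 and above", 6, None)]

n = len(children)
freqs = [
    sum(1 for x in children if x >= lo and (hi is None or x <= hi))
    for _, lo, hi in classes
]

# "Less than" type: running total from the first class downwards.
cum_less = list(accumulate(freqs))
# "Greater than" type: observations in this class and all higher classes.
cum_greater = [n - c + f for c, f in zip(cum_less, freqs)]

for (label, _, _), f, cl, cg in zip(classes, freqs, cum_less, cum_greater):
    print(f"{label:<12}{f:>4}{cl:>4} ({100 * cl / n:.1f}%){cg:>4} ({100 * cg / n:.1f}%)")
```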
Principles of Classification
There are no hard and fast rules for deciding the class interval; the choice depends upon:
• Knowledge of the data
• The lowest and highest values of the set of observations
• The utility of the class intervals for meaningful comparison and interpretation

• The classes should be collectively exhaustive and non-overlapping, i.e. mutually exclusive.
• The number of classes should not be too large; otherwise the purpose of classification, i.e. summarization of the data, will not be served.
• The number of classes should not be too small either, for this may also obscure the true nature of the distribution.
• The classes should preferably be of equal width; otherwise the class frequencies will not be comparable, and the computation of statistical measures will be laborious.
• More specifically, Sturges' formula can be used to decide the number of class intervals:
  $k = 1 + 3.322 \log_{10} n$
  where k = number of classes and n = number of observations.
• The width of the class interval may be determined by dividing the range by k:
  $w = R / k$
  where R = difference between the highest and the lowest observation, and w = width of the class interval.
• When the nature of the data makes it appropriate, class interval widths of 5 units, 10 units and multiples of 10 tend to make the summarization more comprehensible.
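A minimal sketch of Sturges' rule and the resulting class width, applied to the 35 family-size values from the village survey. The function names are hypothetical, and rounding k up to a whole number of classes is an assumption made here for illustration; the formula itself only gives a guideline value.

```python
import math

def sturges_classes(n):
    """Suggested number of classes: k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

def class_width(values, k):
    """Class width w = R / k, where R is the range of the observations."""
    data_range = max(values) - min(values)
    return data_range / k

# Number of children per family for the 35 surveyed families.
children = [
    1, 0, 2, 3, 4, 5, 6,
    7, 2, 3, 4, 0, 2, 5,
    8, 4, 5, 9, 6, 3, 2,
    7, 6, 5, 3, 3, 7, 8,
    9, 7, 9, 4, 5, 4, 3,
]

k = sturges_classes(len(children))
k_whole = math.ceil(k)  # assumption: round up to a whole number of classes
w = class_width(children, k_whole)

print(f"k = {k:.2f}, rounded up to {k_whole}, width = {w:.2f}")
```

In practice the computed width is then adjusted to a convenient value, in line with the preference for widths such as 5 or 10 units.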
• Classification is called exclusive (continuous) when the class intervals are fixed so that the upper limit of one class is the lower limit of the next class, and the upper limit is not included in the class.
An example:

Income (Rs.)                          No. of families
1000-1100 (1000 but under 1100)       15
1100-1200 (1100 but under 1200)       25
1200-1300 (1200 but under 1300)       10
Total                                 50

• Classification is called inclusive (discontinuous) when both the lower and the upper limit of a class are included in that class.

Income (Rs.)                          No. of persons
1000-1099 (1000 up to and including 1099)   50
1100-1199 (1100 up to and including 1199)   100
1200-1299 (1200 up to and including 1299)   200
Total                                       300

• Discontinuous class intervals can be made continuous by applying the correction factor:
  $CF = \dfrac{\text{lower limit of the 2nd class} - \text{upper limit of the 1st class}}{2}$
  The correction factor is subtracted from each lower limit and added to each upper limit to make the class intervals continuous. For the inclusive table above, CF = (1100 - 1099) / 2 = 0.5, so the first class becomes 999.5-1099.5.
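The correction-factor adjustment can be sketched as a small function. It assumes the classes are given as (lower, upper) pairs in ascending order with a constant gap between consecutive classes; the function name is hypothetical.

```python
def make_continuous(classes):
    """Turn inclusive (discontinuous) class limits into continuous ones.

    classes: list of (lower, upper) limits in ascending order.
    Assumes the gap between consecutive classes is the same throughout.
    """
    # CF = (lower limit of 2nd class - upper limit of 1st class) / 2
    cf = (classes[1][0] - classes[0][1]) / 2
    # Subtract CF from each lower limit and add it to each upper limit.
    return [(lower - cf, upper + cf) for lower, upper in classes]

# Inclusive income classes (Rs.) from the example above.
inclusive = [(1000, 1099), (1100, 1199), (1200, 1299)]

for lower, upper in make_continuous(inclusive):
    print(f"{lower} - {upper}")
```

After the adjustment the upper limit of each class coincides with the lower limit of the next, which is what makes the intervals continuous.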
Frequency distributions
• Quantitative variables:
  • Discrete variable
  • Continuous variable
• Qualitative variable (attribute)
• The manner in which the total number of observations is distributed over the different classes is called a frequency distribution.
Frequency distribution of an attribute
• In 1993, 1674 inhabitants of Calcutta, Bombay and Madras were surveyed. Each was asked, among other questions, whether he/she knew about HIV/AIDS. The results are tabulated below.

Table 4: Results of survey on awareness of HIV/AIDS

State of knowledge   Number of people
Aware                620
Unaware              1054
Total                1674

Table 5: Proportion of people aware of HIV/AIDS

State of knowledge   Relative frequency
Aware                0.370
Unaware              0.630
Total                1.000
Frequency distribution of a discrete variable
• The data are grouped into classes, and the number of cases that fall in each class is recorded.
Example: In a survey of 35 families in a village, the number of children per family was recorded. The following data were obtained.

1 0 2 3 4 5 6
7 2 3 4 0 2 5
8 4 5 9 6 3 2
7 6 5 3 3 7 8
9 7 9 4 5 4 3
Steps for frequency distribution
• Find the largest and smallest values; here they are 9 and 0 respectively.
• Form a table with 10 classes for the 10 values 0, 1, 2, ..., 9.
• Go through the given values of the variable one by one, and for each value put a tally mark in the table against the appropriate class.
• To facilitate counting, the tally marks are arranged in blocks of five, every fifth stroke being drawn across the preceding four. This is done below.
Table 6: Frequency table

No. of children   Tallies     Frequency   Cumulative frequency    Cumulative frequency
                                          (less than type)        (more than type)
0                 ||          2           2                       35
1                 |           1           3                       33
2                 ||||        4           7                       32
3                 ||||/ |     6           13                      28
4                 ||||/       5           18                      22
5                 ||||/       5           23                      17
6                 |||         3           26                      12
7                 ||||        4           30                      9
8                 ||          2           32                      5
9                 |||         3           35                      3
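The tallying steps can be sketched in code: count each value, then build the two cumulative columns. In this sketch a completed block of five is drawn as `||||/`, a plain-text stand-in for four strokes with a fifth drawn across them; that notation and the helper name `tally` are illustrative choices.

```python
from collections import Counter
from itertools import accumulate

# Number of children per family for the 35 surveyed families.
children = [
    1, 0, 2, 3, 4, 5, 6,
    7, 2, 3, 4, 0, 2, 5,
    8, 4, 5, 9, 6, 3, 2,
    7, 6, 5, 3, 3, 7, 8,
    9, 7, 9, 4, 5, 4, 3,
]

def tally(f):
    """Tally marks for a frequency f, in blocks of five."""
    blocks = ["||||/"] * (f // 5)
    if f % 5:
        blocks.append("|" * (f % 5))
    return " ".join(blocks)

counts = Counter(children)
# One class per value from the smallest to the largest observation.
values = list(range(min(children), max(children) + 1))
freqs = [counts[v] for v in values]

# Less-than type: running total from the top of the table.
cum_less = list(accumulate(freqs))
# More-than type: running total from the bottom of the table.
cum_more = list(accumulate(reversed(freqs)))[::-1]

for v, f, cl, cm in zip(values, freqs, cum_less, cum_more):
    print(f"{v:<3}{tally(f):<10}{f:>3}{cl:>5}{cm:>5}")
```

The last less-than value and the first more-than value both equal the number of observations, which is a quick check that no value was missed while tallying.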
TABULATION OF DATA
• Tabulation compresses the data into rows and columns so that the relations among them can be understood.
• Tabulation simplifies complex data, facilitates comparison, gives identity to the data and reveals patterns.
Different parts of a table
• Table number
• Title of the table
• Caption: column heading
• Stub: row heading
• Body: contains the data
• Head notes: anything not explained in the title, captions or stubs can be explained in the head notes, placed at the top of the table below the title.
• Foot notes: the source of the data and any exceptions in the data can be given in the foot notes.
Tables can be classified in 3 ways

Type of table                          Characteristic feature
1. Simple table                        Only one characteristic is shown.
2. Complex table
   a. Two-way table                    Shows two characteristics; formed when either the stub or the caption is divided into two co-ordinate parts.
   b. Higher-order table               Three or more characteristics are represented in the same table.
3. General and special purpose table   Tables published by the Govt., such as in the Statistical Abstract of India or census reports, are general purpose tables.
Complex table
Average number of OPD patients in a PHC in a tribal area, by age group and sex

Age (years)   OPD patients
              Male   Female   Total
Below 25      25     5        30
25-35         30     4        34
35-45         25     5        30
45-55         22     3        25
Above 55      15     1        16
Total         117    18       135
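The row and column totals of a two-way table like the one above can be checked with a few lines of code. This is a sketch using the male and female counts from the table; the dictionary layout and variable names are illustrative.

```python
# Average OPD patients per age group as (male, female), from the table above.
opd = {
    "Below 25": (25, 5),
    "25-35": (30, 4),
    "35-45": (25, 5),
    "45-55": (22, 3),
    "Above 55": (15, 1),
}

# Row totals: male + female within each age group.
row_totals = {age: male + female for age, (male, female) in opd.items()}

# Column totals: sum down each sex, then the grand total.
male_total = sum(male for male, _ in opd.values())
female_total = sum(female for _, female in opd.values())
grand_total = male_total + female_total

for age, (male, female) in opd.items():
    print(f"{age:<10}{male:>5}{female:>7}{row_totals[age]:>6}")
print(f"{'Total':<10}{male_total:>5}{female_total:>7}{grand_total:>6}")
```

The grand total should be the same whether it is reached by adding the row totals or the column totals.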
Number of patients in OPDs of public sector hospitals by religion, age, rank and sex

Religion   Age (years)   Rank
                         Supervisor   Assistant   Clerks     Total
                         F  M  T      F  M  T     F  M  T    F  M  T
Hindu      Below 25
           25-35
           35-45
           45-55
           55 & above
Muslim     Below 25
           25-35
           35-45
           45-55
           55 & above
Total

• Exercise
  The ages of 169 subjects who participated in a study of sparteine and mephenytoin oxidation are given (data in the file DataExample.xls). Calculate the frequency, the cumulative frequency of both the less-than and the more-than type, the relative frequency and the cumulative relative frequency. From these, find the proportion of patients aged 60 and over.
• Next class:
  • Understanding data through presentations
