Scientific Data

The document discusses various types of scientific data, including qualitative and quantitative data, and outlines methods for data collection and representation. It details different measurement scales (nominal, ordinal, interval, and ratio), as well as sampling techniques and the importance of data analysis and interpretation. Additionally, it covers measures of central tendency, including mean, median, and mode, providing examples for better understanding.
TYPES OF SCIENTIFIC DATA, METHODS OF COLLECTION & REPRESENTATION

Dr Anand Prakash
Department of Zoology
SVP College, Bhabua
What is data
Data is a collection of measurements or observations in the form of numbers, text, sound, images, behaviors, tests, or any other format.
On the basis of its nature, data can be qualitative or quantitative.

Qualitative data refers to information about qualities, or information that cannot be measured. It’s usually descriptive and textual. Examples: eye color or the type of car. In surveys, it’s often used to categorize ‘yes’ or ‘no’ answers.
Quantitative data is numerical. It’s used to define information that can be counted or measured. Examples include distance, speed, height, length and weight.
On the basis of its source, data can be
Primary data: Data obtained directly from the event or experiment.
Secondary data: Data obtained from other sources such as newspapers, the internet, the meteorological department, the pollution department, etc.
What is Information
Data that is processed, organized, or structured so that it yields a meaningful, valuable, and useful conclusion is called information. Information gives knowledge, understanding, and insights that can be used for decision-making, problem-solving, communication, and various other purposes.
Levels of Measurement
• To measure or observe data, we need an instrument that can measure or quantify an event.
• This instrument or device is called a scale.
Four different scales are used to measure data:
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale

1. Nominal scale
• Nominal scales are also called “names” or labels.
• Nominal scale variables are used for labeling, classifying, categorizing, or discriminating, without any quantitative value.
• Data from nominal scales are mutually exclusive (no overlap) and non-numeric (qualitative); none of them has any numerical significance.
• There is no order or ranking between the categories.
Example: Please list the type of blood group: 1-A 2-AB 3-B
In this example, the numbers are simply used as tags and have no value.

Examples of Nominal Scale data
• Gender: Male or female
• Race/caste/religion: The race/caste/religion of a person
• Blood type: The blood type of a person
• Eye color: The color of a person's eyes
• Marital status: Single, married, or divorced
• Type of car: Sedan, SUV, or truck
• Mode of transportation: Bus, train, car, bike, or walking
• Continent: North America, South America, Asia, Europe, Africa, or Australia
• Behavioral pattern: Extroverted/Introverted
2. Ordinal Scale

• The ordinal scale is the 2nd level of measurement. It reports the ranking and ordering of the data without the distance between the values, so it cannot answer how much the two categories differ.
• It is used for qualitative data.
• Ordinal scale data include ratings of opinions or perceptions, or demographic factors that are categorized into levels or brackets.
• Ordinal variables are useful in social science research.
• Ordinal data are collected using closed-ended survey questions, opinion polls, and surveys that compare data between participants.

Example of Ordinal scale
• Out of the five laptop brands mentioned, rank them in order of preference: 1. HP 2. Apple 3. Lenovo 4. Dell 5. Acer
You cannot answer how much the two categories differ.

Other examples of ordinal scale data
• Olympic medal positions
• Level of pain on a pain scale
• Siblings' ages
• Ranking in a class/company
• Customer satisfaction
• Socio-economic background
• Frequency of occurrence
3. Interval Scale
• An interval scale is a type of scale where the distance between any two points can be measured. It requires an (arbitrary) zero point and a unit of measurement.
• An example is the Celsius temperature scale, which is referenced to the melting point of ice, with each temperature measurement located a specific number of degrees above or below this reference point.
• The interval scale is quantitative, as it can quantify the difference between values.
• It allows calculating the mean and median of the variables.
• To find the difference between two values, you can subtract one from the other.
• The interval scale is the preferred scale in statistics, as it helps to assign numerical values to arbitrary assessments such as feelings, calendar types, etc.

Examples of interval scale
Temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850).
4. Ratio Scale
• The ratio scale is a type of variable measurement scale which is quantitative in nature.
• It allows any researcher to compare the intervals or differences.
• The ratio scale is the 4th level of measurement and possesses a true zero point or character of origin. This is a unique feature of this scale.
• The ratio scale conveys the order, the interval between values, and the values themselves; the true zero characteristic is an essential factor in calculating ratios.
• Examples
  • Temperature in Kelvin
  • Height in feet and inches
  • Distance in miles or kilometers
  • Age in years
  • Price of goods in dollars
By contrast, when the temperature outside is 0 degrees Celsius, 0 does not mean an absence of temperature; it is simply a value on the scale. This is why Celsius is an interval scale and not a ratio scale.
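The difference between an interval scale and a ratio scale can be made concrete with a small sketch. The readings of 10 and 20 degrees Celsius are illustrative values, not from the slides; the only outside fact used is the standard 273.15 offset between Celsius and Kelvin.

```python
# Celsius is an interval scale: its zero is a reference point, not "no temperature".
# Kelvin is a ratio scale: 0 K is a true zero, so ratios are meaningful.

def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (standard offset of 273.15)."""
    return celsius + 273.15

# Illustrative readings (assumed, not taken from the slides).
t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on both scales and agree with each other.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# The Celsius ratio suggests "twice as hot", which has no physical meaning.
naive_ratio_c = t2_c / t1_c
# The Kelvin ratio is the meaningful one because Kelvin has a true zero.
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print("difference:", diff_c, "C,", round(diff_k, 2), "K")
print("ratio on Celsius:", naive_ratio_c, "| ratio on Kelvin:", round(ratio_k, 3))
```

The two differences match, but the ratios do not: only the Kelvin ratio (about 1.035) describes how much warmer the second reading actually is.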
Discrete and continuous data (Variables)
Discrete variation: The data or variables we observe take the form of countable numbers like 1, 2, 3, 4, 5, ...
Examples: number of houses in Bhabua, number of students in colleges, number of bulbs produced in a company.
Continuous variation: The data we observe can take any real value (including fractions). Infinitely many values are possible between two limits, such as 1.1, 1.21, 1.22, etc.
Examples: height of a human or plant, the sizes of planets and moons in our galaxy.

Binary data: Binary data takes on only two possible values. For example, lamp is on or lamp is off, answer is true or false, 0 or 1, yes or no, heads or tails, male or female birth, black or white, etc.
Sample and sampling
Sample
• Suppose you wish to conduct research on data with large numbers, such as the production of electric wires, scales, or thermometers, or on a large population (consider the population of Kaimur district).
• It is rarely possible to collect data from every item or person in that group.
• So we take a few items or individuals from the group, which is called a sample.
• The sample is the group of individuals who will actually participate in the research/study.

Imagine your doctor takes your blood sample to check your health status.

Sampling
• To draw valid conclusions from your results, you have to decide carefully how you will select a sample that is representative of the group as a whole. This selection process is called sampling.
Sampling Methods

1. Simple random sampling: In this method each item of the population has an equal chance of being included in the sample, without any deliberate discrimination.

2. Systematic sampling: Used when the population is large, scattered, and non-homogeneous. Samples are taken at regular intervals.

3. Stratified sampling: When the population is not homogeneous, it is first divided into homogeneous groups or classes called strata, and then the sample is drawn from each stratum at random in proportion to its size. Example: electricity distribution in urban and village households.

4. Cluster sampling: Whole groups are selected at random from natural groups such as hospital wards, slums of a town, or districts in a flood area. Cluster sampling is used to study or trial a scheme, vaccination, etc.
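The four sampling methods can be sketched in a few lines using only Python's standard `random` module. Everything about the population here is hypothetical: 100 households, 40 urban and 60 village, a sample size of 10, clusters of 10 consecutive households, and a random starting point for the systematic sample are all assumptions made for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 households, the first 40 urban and the rest village.
population = [{"id": i, "area": "urban" if i < 40 else "village"} for i in range(100)]
sample_size = 10

# 1. Simple random sampling: every item has an equal chance of selection.
simple_random = rng.sample(population, sample_size)

# 2. Systematic sampling: pick a start (random here, by assumption), then every k-th item.
step = len(population) // sample_size
start = rng.randrange(step)
systematic = population[start::step]

# 3. Stratified sampling: draw at random from each stratum in proportion to its size.
strata = {}
for household in population:
    strata.setdefault(household["area"], []).append(household)
stratified = []
for members in strata.values():
    share = round(sample_size * len(members) / len(population))
    stratified.extend(rng.sample(members, share))

# 4. Cluster sampling: split into groups, then select whole groups at random.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [household for cluster in chosen_clusters for household in cluster]

print(len(simple_random), len(systematic), len(stratified), len(cluster_sample))
```

With these assumed sizes the stratified sample contains 4 urban and 6 village households, mirroring the 40:60 split of the population.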
Collection – Presentation – Analysis – Interpretation

Statistics and numbers are useless unless you have good people to analyse and then interpret their meaning and importance.

Statistics and numbers lack value without skilled analysis.
Raw and Arrayed data

Raw data: Data expressed exactly as they were collected are called raw data.
Arrayed data: Data arranged in ascending or descending order are called arrayed data.
Simple frequency table
Example: Marks obtained by 50 students in a test of 100 marks.

Marks obtained by the 50 students:
80, 70, 0, 20, 20, 45, 50, 65, 30, 50, 70, 20, 4, 90, 49, 40, 45, 30, 30, 50,
20, 80, 39, 30, 50, 50, 70, 70, 20, 40, 90, 30, 40, 50, 65, 45, 70, 79, 20, 30,
4, 30, 50, 20, 45, 50, 45, 90, 30, 4, 50

Data arranged in ascending order:
4, 4, 4, 20, 20, 20, 20, 20, 20, 20, 30, 30, 30, 30, 30, 30, 30, 39, 40, 40,
40, 45, 45, 45, 45, 45, 49, 50, 50, 50, 50, 50, 50, 50, 50, 50, 65, 65, 70, 70,
70, 70, 70, 79, 80, 80, 90, 90, 90, 90.

Marks Obtained    Number of Students
4                 3
20                7
30                7
39                1
40                3
45                5
49                1
50                9
65                2
70                5
79                1
80                2
90                4
Total             50

The data displayed in the given table can be arranged in a shorter form by grouping the whole collection into classes; the result is called a distribution table.
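Building a simple frequency table amounts to counting how often each value occurs. A minimal sketch, assuming the arrayed (ascending) marks listed above as input and using `collections.Counter` from the standard library; the variable names are illustrative.

```python
from collections import Counter

# The 50 marks in ascending order, as listed in the arrayed data.
arrayed_marks = [
    4, 4, 4, 20, 20, 20, 20, 20, 20, 20,
    30, 30, 30, 30, 30, 30, 30, 39, 40, 40,
    40, 45, 45, 45, 45, 45, 49, 50, 50, 50,
    50, 50, 50, 50, 50, 50, 65, 65, 70, 70,
    70, 70, 70, 79, 80, 80, 90, 90, 90, 90,
]

# Count how many students obtained each mark.
frequency = Counter(arrayed_marks)

print(f"{'Marks':>5} {'Students':>8}")
for mark in sorted(frequency):
    print(f"{mark:>5} {frequency[mark]:>8}")
print(f"{'Total':>5} {sum(frequency.values()):>8}")
```

The printed counts can be checked row by row against the simple frequency table, and the total confirms that all 50 students are accounted for.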
Framing a continuous distribution table from arrayed data

In order to make a continuous distribution table from a simple frequency table, we have to decide:
1. Number of classes
2. Class interval

Calculate
1. Number of classes
Rule 1 - Identify the lowest value (L); here it is 4.
Rule 2 - Identify the highest value (H); here it is 90.
Difference (H-L): 90-4 = 86
Through Sturges' rule, the number of classes (k) can be found by the formula k = 1 + 3.322 log10 N
where N = total number of observations (here it is 50).
k = 1 + 3.322 x 1.69897 = 1 + 5.64398 = 6.64, or 7 classes
2. Class interval (i) = (H-L)/k

Frequency distribution table
Class interval    Upper class limit    Frequency
0-15              15                   3
15-30             30                   14
30-45             45                   9
45-60             60                   10
60-75             75                   7
75-90             90                   7
Total                                  50

[Bar chart: Frequency against Class Interval. 0-15: 3; 15-30: 14; 30-45: 9; 45-60: 10; 60-75: 7; 75-90: 7]
[Pie chart of the same class frequencies]
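The two calculations, the number of classes from k = 1 + 3.322 log10 N and the class interval from (H-L)/k, can be sketched as follows, assuming the 50 arrayed marks as input. Note that the frequency distribution table in the text uses six classes of width 15 (0-15 up to 75-90) rather than the computed values; the slides do not say why, so the class edges below are simply copied from that table. Treating each upper limit as inclusive is an inference from the tabulated frequencies.

```python
import math

# The 50 marks in ascending order, as listed in the arrayed data.
marks = [
    4, 4, 4, 20, 20, 20, 20, 20, 20, 20,
    30, 30, 30, 30, 30, 30, 30, 39, 40, 40,
    40, 45, 45, 45, 45, 45, 49, 50, 50, 50,
    50, 50, 50, 50, 50, 50, 65, 65, 70, 70,
    70, 70, 70, 79, 80, 80, 90, 90, 90, 90,
]

n = len(marks)
lowest, highest = min(marks), max(marks)

# Number of classes: k = 1 + 3.322 * log10(N), rounded up to a whole class.
k_exact = 1 + 3.322 * math.log10(n)
k = math.ceil(k_exact)

# Class interval: (H - L) / k.
width = (highest - lowest) / k
print(f"k = {k_exact:.2f} -> {k} classes, class interval = {width:.2f}")

# Class edges copied from the table in the text (width 15).
# A mark equal to an upper limit is counted in that class: lower < mark <= upper.
edges = list(range(0, 91, 15))
class_counts = {}
for lower, upper in zip(edges, edges[1:]):
    class_counts[f"{lower}-{upper}"] = sum(lower < m <= upper for m in marks)

for label, count in class_counts.items():
    print(f"{label:>6} {count:>3}")
```

With the inclusive upper limit, the counts reproduce the tabulated frequencies 3, 14, 9, 10, 7 and 7.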
Measures of central tendency
In statistics, the measures of central tendency are the mean, median and mode. The mean represents the average of the data set, the median represents the middle value of the data set, and the mode represents the most frequently repeated value in the data set.

1. Mean: It is the average of all the values given in a set of data.

Mean for ungrouped data = [sum of observations]/[total number of observations]
Mean for grouped data (for a continuous frequency distribution table) = Σ(Xi fi) / Σ fi

Frequency distribution table

Class interval (Ci)    Mid value (Xi)    Frequency (fi)    Xifi
0-15                   7.5               3                 22.5
15-30                  22.5              14                315
30-45                  37.5              9                 337.5
45-60                  52.5              10                525
60-75                  67.5              7                 472.5
75-90                  82.5              7                 577.5
Total                                    50                2250

Mean = 2250/50 = 45
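The grouped mean is the sum of (mid value x frequency) divided by the total frequency. A minimal sketch using the class limits and frequencies from the frequency distribution table; the tuple layout and variable names are illustrative choices.

```python
# (lower limit, upper limit, frequency) for each class in the table.
classes = [
    (0, 15, 3),
    (15, 30, 14),
    (30, 45, 9),
    (45, 60, 10),
    (60, 75, 7),
    (75, 90, 7),
]

# Total frequency: sum of fi.
total_frequency = sum(freq for _, _, freq in classes)

# Sum of Xi * fi, where Xi is the mid value of each class.
weighted_sum = sum(((lower + upper) / 2) * freq for lower, upper, freq in classes)

grouped_mean = weighted_sum / total_frequency
print(total_frequency, weighted_sum, grouped_mean)
```

This reproduces the totals in the table (50 and 2250) and the mean of 45. Because each class is represented by its mid value, the grouped mean is an approximation of the mean of the individual marks.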
Median

1) Find the median of the following set of data:
22, 25, 21, 24, 22, 32, 18
Step: Arranging the data in ascending order of magnitude, we have
18, 21, 22, 22, 24, 25, 32
There are 7 (an odd number of) observations; therefore, the median is the value of the ((7+1)/2)th = 4th observation.
Median = 22
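The median steps (sort, then take the middle observation) can be sketched as a small function. Averaging the two middle values when the count is even is the usual convention; the function name `median_of` is an illustrative choice, and the standard library's `statistics.median` is used only as a cross-check.

```python
import statistics

def median_of(values):
    """Return the median: the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the ((n + 1) / 2)th observation (index `middle` from zero).
        return ordered[middle]
    # Even count: mean of the (n / 2)th and (n / 2 + 1)th observations.
    return (ordered[middle - 1] + ordered[middle]) / 2

data = [22, 25, 21, 24, 22, 32, 18]
print(sorted(data))
print(median_of(data), statistics.median(data))
```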
Median of discrete data

Example: The number of goals scored per match by Kriti during a hockey season was recorded. What is the median number of goals scored by Kriti during a game?

Number of goals in a match    Frequency    Cumulative frequency
0                             1            1
1                             6            7
2                             7            14
3                             2            16
4                             3            19
5                             1            20

There are 20 observations (an even number). Therefore, the median is the arithmetic mean of the (20/2)th and {(20/2)+1}th observations, that is, the 10th and 11th observations.
According to the table, the values in the 10th and 11th positions are both 2.
Therefore, the median is (2+2)/2 = 4/2 = 2 goals.
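For a discrete frequency table, the cumulative frequency column tells you which value sits at a given position. A minimal sketch of that lookup using the goals table; the helper name `value_at` is hypothetical.

```python
from itertools import accumulate

goals = [0, 1, 2, 3, 4, 5]
frequency = [1, 6, 7, 2, 3, 1]

# Running total of the frequencies: the cumulative frequency column.
cumulative = list(accumulate(frequency))
n = cumulative[-1]

def value_at(position):
    """Return the value of the observation at a 1-based position."""
    for value, running_total in zip(goals, cumulative):
        if position <= running_total:
            return value
    raise ValueError("position is beyond the number of observations")

if n % 2 == 0:
    # Even count: mean of the (n / 2)th and (n / 2 + 1)th observations.
    median_goals = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
else:
    # Odd count: the ((n + 1) / 2)th observation.
    median_goals = value_at((n + 1) // 2)

print(cumulative, median_goals)
```

Both the 10th and the 11th observations fall in the row whose cumulative frequency is 14, so the median is 2 goals.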
Median of grouped data

Example: The marks obtained in an English test by 17 students were recorded. What is the median mark of the students?

Marks out of 50    Frequency    Cumulative frequency
0-10               2            2
10-20              4            6
20-30              5            11
30-40              4            15
40-50              2            17

Median = l + [(N/2 - F) / f] x h
where l = lower limit of the median class
f = frequency of the median class
F = cumulative frequency of the class preceding the median class
N = total number of observations
h = width of the median class

Steps:
1. Find the class whose cumulative frequency is just greater than N/2. This class is known as the median class. Here N = 17, so N/2 = 8.5. The class 20-30 has a cumulative frequency of 11, the first to exceed 8.5, so 20-30 is the median class.
2. From the table above, we have l = 20, N = 17, F = 6, f = 5 and h = 10.
So Median = 20 + [(8.5 - 6) / 5] x 10 = 20 + 5 = 25
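The grouped-median procedure can be sketched with the standard formula Median = l + [(N/2 - F) / f] x h, where l, F, f, N and h are the quantities defined above. The function name and the tuple layout are illustrative choices; the class limits and frequencies are those of the English test table.

```python
def grouped_median(classes):
    """Median of grouped data given (lower limit, upper limit, frequency) tuples."""
    total = sum(freq for _, _, freq in classes)  # N
    half = total / 2                             # N / 2
    cumulative_before = 0                        # F, updated class by class
    for lower, upper, freq in classes:
        # The median class is the first whose cumulative frequency exceeds N / 2.
        if cumulative_before + freq > half:
            width = upper - lower                # h
            return lower + ((half - cumulative_before) / freq) * width
        cumulative_before += freq
    raise ValueError("no median class found; check the frequencies")

marks_classes = [(0, 10, 2), (10, 20, 4), (20, 30, 5), (30, 40, 4), (40, 50, 2)]
print(grouped_median(marks_classes))
```

For this table the loop stops at the class 20-30 with F = 6 and f = 5, giving 20 + (2.5 / 5) x 10 = 25.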
Mode

Mode is defined as the value of a variable which occurs most frequently.
It is the value of the variable that corresponds to the maximum frequency of the distribution.

Example 1) The posted speed limit along a busy highway is 80 km/h. The following values represent the speeds (in km/h) of 10 cars that were stopped for violating the speed limit:
96, 101, 99, 100, 98, 103, 97, 99, 102, 95
What is the mode?
Arranged in ascending order: 95, 96, 97, 98, 99, 99, 100, 101, 102, 103
Mode = 99

Example 2) The following table represents the number of times that 100 randomly selected students ate at the school cafeteria during the first month of school:
What is the mode of the number of times a student ate at the cafeteria?

Number of times 2 3 4 5 6 7 8
Number of students 3 8 22 29 20 8 10

The mode is simply the value that has the highest frequency. The highest frequency is 29, which corresponds to 5 visits.
Therefore, the mode is 5.
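Both mode examples can be checked with a short sketch. For raw values, `statistics.mode` from the standard library returns the most frequent value; for a frequency table, the mode is the value paired with the largest frequency, not the frequency itself. Variable names are illustrative.

```python
import statistics

# Example 1: raw speeds in km/h.
speeds = [96, 101, 99, 100, 98, 103, 97, 99, 102, 95]
mode_speed = statistics.mode(speeds)

# Example 2: frequency table of cafeteria visits.
times_eaten = [2, 3, 4, 5, 6, 7, 8]
students = [3, 8, 22, 29, 20, 8, 10]

# Find the largest frequency, then report the value it belongs to.
highest_frequency = max(students)
mode_times = times_eaten[students.index(highest_frequency)]

print(mode_speed, highest_frequency, mode_times)
```

Here the highest frequency is 29 students, and the mode is the corresponding number of visits, 5.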
Finding the Mode of a grouped Frequency Distribution

The modal class is simply the class with the highest frequency.

Mode = l + h[(f1 - f0) / ((f1 - f0) + (f1 - f2))]

Where,
l = lower limit of the modal class
h = size of the class interval
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class

For example, suppose we have grouped data in which the modal class is 11-20, so that
l = 11
h = 9
f1 = 25
f0 = 8
f2 = 14

• Mode = 11 + 9[(25-8) / ( (25-8) + (25-14) )]
• Mode = 16.46
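The grouped-mode formula, Mode = l + h[(f1 - f0) / ((f1 - f0) + (f1 - f2))], translates directly into a small function. The function and parameter names are illustrative; the inputs are the values l = 11, h = 9, f1 = 25, f0 = 8 and f2 = 14 given in the example.

```python
def grouped_mode(lower_limit, class_size, f_modal, f_preceding, f_succeeding):
    """Mode of grouped data from the modal class and its two neighbours."""
    rise = f_modal - f_preceding   # f1 - f0
    fall = f_modal - f_succeeding  # f1 - f2
    return lower_limit + class_size * rise / (rise + fall)

mode_value = grouped_mode(lower_limit=11, class_size=9,
                          f_modal=25, f_preceding=8, f_succeeding=14)
print(round(mode_value, 2))
```

The fraction 17 / (17 + 11) places the mode a little past the middle of the modal class, because the class before it is emptier than the class after it.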
The shoe size of 155 people was recorded and
the raw data was presented in the form of the
following frequency table:
Types of vulnerable data
• Personal information, such as names, addresses, and social security numbers
• Financial information, such as bank details and credit cards
• Medical information, such as medical history and test results
• Corporate information, such as company secrets and customer lists

Variables, Traits, event, mutually exclusive event


