Biostat - Group 3

This document provides information on categorical data analysis techniques used in biostatistics and epidemiology. It discusses categorical variables and how they are summarized using probability tables. It describes different types of categorical data like nominal and ordinal data. Contingency tables are presented as a way to understand relationships between categorical variables. The Cochran-Mantel-Haenszel test, kappa statistics, and goodness of fit tests are introduced as statistical methods for analyzing categorical data while accounting for potential confounding variables.

Uploaded by Jasmin Jimenez

BIOSTATISTICS AND EPIDEMIOLOGY
Presented by GROUP 3
Categorical Data Analysis
Categorical data is statistical data made up of categorical variables, or of data that have been converted into categories.
Grouped data is one example. More specifically, categorical data can be derived from counts of qualitative observations or from quantitative data grouped into given intervals. These data are summarized in the form of a probability table.
Categorical Data Analysis
Categorical data consists of categorical variables, which represent characteristics such as a person’s gender or hometown. Categorical data can sometimes take numerical values, but those numbers have no mathematical meaning. Some examples of categorical data are:
Birthdate
Favourite sport
School postcode
Travel method to school
Categorical Data Analysis
Types of Categorical Data
Nominal data labels variables without assigning any numerical value or order. It is also known as the nominal scale.
Ordinal data follows a natural order. Its notable feature is that the differences between data values cannot be determined.
Contingency Table
It displays frequencies for combinations of two
categorical variables. It is also known as “Cross
tabulation” and “two-way tables.”
Classify outcomes for one variable in rows and for the other in columns.
Use contingency tables to understand the relationship
between categorical variables.
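Contingency Table
Example (counts for 203 people surveyed, by sex and smoking status):
          Smoker   Non-smoker   Total
Male        72         44        116
Female      34         53         87
Total      106         97        203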
Contingency Table
Relative contingency table:
FORMULA:
Relative frequency (%) = (Count value in cell / Total number surveyed) x 100
Cell 1: (72/203) x 100 = 35.47% (male smoker)
Cell 2: (44/203) x 100 = 21.67% (male non-smoker)
Cell 3: (34/203) x 100 = 16.75% (female smoker)
Cell 4: (53/203) x 100 = 26.11% (female non-smoker)
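The four cell calculations above can be reproduced with a short Python sketch. The dictionary layout and the function name `relative_table` are illustrative choices, not part of the original slides. Only the counts come from the example.

```python
# Counts taken from the four cells listed above (203 people surveyed in total).
counts = {
    ("male", "smoker"): 72,
    ("male", "non-smoker"): 44,
    ("female", "smoker"): 34,
    ("female", "non-smoker"): 53,
}


def relative_table(cell_counts):
    """Return each cell as a percentage of the grand total, rounded to 2 decimals."""
    total = sum(cell_counts.values())
    return {cell: round(100 * n / total, 2) for cell, n in cell_counts.items()}


percentages = relative_table(counts)
for (sex, status), pct in percentages.items():
    print(f"{sex:<6} {status:<10} {pct:.2f}%")
```

Dividing every cell by the same grand total is what makes this a relative table of the whole sample, rather than of a single row or column.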
Contingency Table
JOINT DISTRIBUTION

MARGINAL DISTRIBUTION

CONDITIONAL DISTRIBUTION
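As a rough sketch of what these three headings usually mean, the code below reuses the smoking-by-sex counts from the relative contingency table. It assumes the usual definitions: the joint distribution gives the proportion in each cell, the marginal distribution gives the proportion in each row (or column) total, and the conditional distribution gives the proportion in each cell within one row. The variable names are illustrative only.

```python
# Counts reused from the relative contingency table example (203 people surveyed).
counts = {
    ("male", "smoker"): 72,
    ("male", "non-smoker"): 44,
    ("female", "smoker"): 34,
    ("female", "non-smoker"): 53,
}
total = sum(counts.values())

# Joint distribution: proportion of the whole sample in each (sex, status) cell.
joint = {cell: n / total for cell, n in counts.items()}

# Marginal distribution of sex: add the joint proportions across smoking status.
marginal_sex = {}
for (sex, _status), p in joint.items():
    marginal_sex[sex] = marginal_sex.get(sex, 0.0) + p

# Conditional distribution of smoking status given sex:
# P(status | sex) = P(sex, status) / P(sex).
conditional = {
    (sex, status): p / marginal_sex[sex] for (sex, status), p in joint.items()
}

for (sex, status), p in conditional.items():
    print(f"P({status} | {sex}) = {p:.3f}")
```

Within each sex the conditional proportions add up to 1, which is a quick check that the table was split along the intended variable.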
Cochran-Mantel-Haenszel Test
It is a test used to examine matched or stratified categorical data. It enables an investigator to examine, while accounting for stratification, the relationship between a binary predictor (such as a treatment) and a binary outcome (such as case or control status).
It is often used in observational studies, where subjects cannot be randomly assigned to treatments but confounding covariates can be measured.
The Cochran-Mantel-Haenszel test is also known as the Mantel-Haenszel test or the Mantel-Haenszel chi-square test. Researchers use this statistical procedure to evaluate the association between two categorical variables while controlling for the effects of a third categorical variable.
Cochran-Mantel-Haenszel Test
The Cochran-Mantel-Haenszel test also produces an estimate of the common odds ratio, which summarizes how big the effect is when pooled across the different strata (or repeats of the experiment). This requires assuming that the odds ratio is the same in every stratum.
One of the test's main advantages is its capacity to account for the effects of confounding variables. Confounding happens when a third variable alters the apparent relationship between two variables. By stratifying the data by the confounding variable, the test can take its effects into account and give a more accurate estimate of the association between the two variables of interest.
Cochran-Mantel-Haenszel Test
We consider a binary outcome variable such as case status and a binary predictor such as treatment status. The observations are grouped in strata. The stratified data are summarized in a series of 2 × 2 contingency tables, one for each stratum.
Cochran-Mantel-Haenszel Test
Using the standard 2 × 2 table notation (a = exposed with the outcome, b = exposed without the outcome, c = unexposed with the outcome, d = unexposed without the outcome), estimates for a risk ratio or an odds ratio would be computed as follows:
RR = [a / (a + b)] / [c / (c + d)]
OR = (a / b) / (c / d) = (a x d) / (b x c)
To explore and adjust for confounding, we can use a stratified analysis in which we set up a series of two-by-two tables, one for each stratum (category) of the confounding variable. Having done that, we can compute a weighted average of the estimates of the risk ratios or odds ratios across the strata. The weighted average provides a measure of association that is adjusted for confounding. The weighted averages for risk ratios and odds ratios are computed as follows:
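Cochran-Mantel-Haenszel Test
RR(MH) = Σ [ai x (ci + di) / ni] / Σ [ci x (ai + bi) / ni]
OR(MH) = Σ (ai x di / ni) / Σ (bi x ci / ni)
Where ai, bi, ci and di are the numbers of participants in the cells of the two-by-two table in the ith stratum of the confounding variable (ai and bi in the exposed row, ci and di in the unexposed row, with ai and ci having the outcome). "ni" represents the total number of participants in the ith stratum, and each sum runs over all strata.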
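A minimal sketch of the weighted averages is shown below. It uses the standard Mantel-Haenszel estimators and assumes each stratum is laid out as (a, b, c, d) = (exposed with outcome, exposed without, unexposed with outcome, unexposed without). The function name and the counts in `example_strata` are hypothetical and chosen only for illustration.

```python
def mantel_haenszel(strata):
    """Pooled risk ratio and odds ratio across strata.

    strata: list of (a, b, c, d) tuples, one 2x2 table per stratum.
    Assumed layout: a = exposed with outcome, b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    rr_num = rr_den = or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d            # participants in this stratum
        rr_num += a * (c + d) / n
        rr_den += c * (a + b) / n
        or_num += a * d / n
        or_den += b * c / n
    return rr_num / rr_den, or_num / or_den


# Hypothetical counts for two strata of a confounding variable.
example_strata = [(10, 90, 5, 95), (20, 80, 10, 90)]
rr_mh, or_mh = mantel_haenszel(example_strata)
print(f"Pooled risk ratio: {rr_mh:.2f}")
print(f"Pooled odds ratio: {or_mh:.2f}")
```

With a single stratum the function returns the ordinary (crude) risk ratio and odds ratio, which is a convenient sanity check. For real analyses an established statistics package is the safer choice.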
Cochran-Mantel-Haenszel Test
The Cochran-Mantel-Haenszel test is a helpful statistical technique for assessing the relationship between two categorical variables while accounting for the impact of a third categorical variable. Because it does not rely on an assumption of normality, it is used in many domains. It can also handle multi-way contingency tables. Before using the test, it is crucial to confirm that the observations are independent.
KAPPA STATISTICS
Cohen’s kappa statistic measures interrater reliability (interobserver agreement).
Interrater reliability, or precision, is achieved when your data raters (or collectors) give the same score to the same data item.
This statistic should only be calculated when:
Two raters each rate one trial on each sample, or
One rater rates two trials on each sample.
The kappa statistic typically varies from 0 to 1 (negative values, which indicate agreement worse than chance, are possible), where:
0 = agreement equivalent to chance.
0.01 – 0.20 = slight agreement.
0.21 – 0.40 = fair agreement.
0.41 – 0.60 = moderate agreement.
0.61 – 0.80 = substantial agreement.
0.81 – 0.99 = near perfect agreement.
1 = perfect agreement.
Formula
k = (Po – Pe) / (1 – Pe)
Where:
Po = the relative observed agreement among raters.
Pe = the hypothetical probability of chance agreement.
Example
The following hypothetical data comes from a medical test where two radiographers rated 50 images for needing further study. The raters (A and B) either said Yes (further study needed) or No (no further study needed).
20 images were rated Yes by both.
15 images were rated No by both.
Overall, Rater A said Yes to 25 images and No to 25. On the other hand, Rater B said Yes to 30 images and No to 20.
Example
Step 1: Calculate Po (the observed proportional agreement): Po = number in agreement / total = (20 + 15) / 50 = 0.70.
Step 2: Rater A said Yes to 25/50 images, or 50% (0.5). Rater B said Yes to 30/50 images, or 60% (0.6). The probability of both raters saying Yes by chance is 0.5 x 0.6 = 0.30.
Step 3: Rater A said No to 25/50 images, or 50% (0.5). Rater B said No to 20/50 images, or 40% (0.4). The probability of both raters saying No by chance is 0.5 x 0.4 = 0.20.
Step 4: Calculate Pe. Pe = 0.30 + 0.20 = 0.50.
Step 5: k = (Po – Pe) / (1 – Pe) = (0.70 – 0.50) / (1 – 0.50) = 0.40.
k = 0.40, which indicates fair agreement on the scale above.
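The five steps can be sketched in Python as follows. The function name `cohen_kappa` is illustrative. The two disagreement counts are derived from the totals in the example rather than stated on the slide: Rater A said Yes 25 times, 20 of them with B, so 5 images were Yes for A only, and Rater B said Yes 30 times, so 10 images were Yes for B only.

```python
def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two raters and a Yes/No rating."""
    total = both_yes + a_yes_b_no + a_no_b_yes + both_no
    # Step 1: observed proportional agreement.
    p_o = (both_yes + both_no) / total
    # Steps 2 and 3: chance agreement on Yes and on No.
    a_yes = (both_yes + a_yes_b_no) / total
    b_yes = (both_yes + a_no_b_yes) / total
    p_yes = a_yes * b_yes
    p_no = (1 - a_yes) * (1 - b_yes)
    # Step 4: overall probability of chance agreement.
    p_e = p_yes + p_no
    # Step 5: kappa.
    return (p_o - p_e) / (1 - p_e)


# 20 both Yes, 5 A-only Yes, 10 B-only Yes, 15 both No (50 images in total).
kappa = cohen_kappa(20, 5, 10, 15)
print(f"kappa = {kappa:.2f}")
```

Taking the four cell counts as input, rather than Po and Pe directly, keeps the marginal totals consistent with each other by construction.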
Goodness of fit test
The goodness of fit test tells you whether your sample data represents the data you would expect to find in the actual population. More specifically, it is used to test whether sample data fits a distribution from a certain population (e.g. a population with a normal distribution or one with a Weibull distribution).
Principles
It is used to find out whether the observed values of a given phenomenon differ significantly from the expected values.
The statistical models that are analyzed by chi-square goodness of fit tests are distributions.
The chi-square goodness of fit test is a hypothesis test. It allows you to draw conclusions about the distribution of a population based on a sample.
Function of test
1. It is a statistical hypothesis test used to see how closely observed data mirrors expected data.
2. It can help determine if a sample follows a normal distribution, if categorical variables are related, or if random samples are from the same distribution.
Type of variable
A chi-square (Χ²) goodness of fit test is a goodness of fit test for a categorical variable.
Level of measurement
Categorical variables that have discrete categories or levels, such as nominal, dichotomous, or ordinal variables.
Type of study design

Qualitative study designs

Type of objective
A chi-square goodness-of-fit test can be conducted when
there is one categorical variable with more than two levels.
If there are exactly two categories, then a one proportion z
test may be conducted. The levels of that categorical
variable must be mutually exclusive. In other words, each
case must fit into one and only one category.
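Formula
Χ² = Σ (O – E)² / E
Where O is the observed count and E is the expected count in each category, and the sum runs over all categories.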
Example
We collect a random sample of ten bags of candies. Each bag has 100 pieces of candy and five flavors. Our hypothesis is that the proportions of the five flavors in each bag are the same.
Example
Let’s start by answering: Is the Chi-square goodness of fit test an appropriate method to evaluate the distribution of flavors in bags of candy?
We have a simple random sample of 10 bags of candy. We meet this requirement.
Our categorical variable is the flavor of candy. We have the count of each flavor in 10 bags of candy. We meet this requirement.
Each bag has 100 pieces of candy and five flavors. We expect equal numbers of each flavor, which means 100 / 5 = 20 pieces of each flavor per bag. For the 10 bags in our sample, we expect 10 x 20 = 200 pieces of each flavor. This exceeds the requirement of an expected count of at least five in each category. We meet this requirement.
Flavour | Number of Pieces of Candy (10 bags) | Expected Number of Pieces of Candy | Observed - Expected | Squared difference | Squared difference / Expected number
Apple | 180 | 200 | 180 - 200 = -20 | 400 | 400 / 200 = 2
Lime | 250 | 200 | 250 - 200 = 50 | 2500 | 2500 / 200 = 12.5
Cherry | 120 | 200 | 120 - 200 = -80 | 6400 | 6400 / 200 = 32
Grape | 225 | 200 | 225 - 200 = 25 | 625 | 625 / 200 = 3.125
Orange | 225 | 200 | 225 - 200 = 25 | 625 | 625 / 200 = 3.125

Finally, we add the numbers in the final column to calculate our test statistic:
2 + 12.5 + 32 + 3.125 + 3.125 = 52.75
To draw a conclusion, we compare the test statistic to a critical value from the Chi-square distribution. With five flavors there are 5 – 1 = 4 degrees of freedom, for which the critical value at a significance level of 0.05 is 9.488.
We compare the value of our test statistic (52.75) to the critical Chi-square value. Since 52.75 > 9.488, we reject the null hypothesis that the proportions of flavors of candy are equal.
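The candy calculation can be checked with a few lines of Python. This is only a sketch: the function name `chi_square_gof` is illustrative, equal expected counts are assumed as in the example, and the critical value is copied from the text rather than computed.

```python
# Observed counts per flavor across the 10 bags (1000 pieces in total).
observed = {"Apple": 180, "Lime": 250, "Cherry": 120, "Grape": 225, "Orange": 225}


def chi_square_gof(observed_counts):
    """Chi-square statistic against equal expected counts in every category."""
    total = sum(observed_counts.values())
    expected = total / len(observed_counts)  # 1000 / 5 = 200 per flavor
    return sum((o - expected) ** 2 / expected for o in observed_counts.values())


statistic = chi_square_gof(observed)
CRITICAL_VALUE = 9.488  # value quoted in the text for this example
reject_null = statistic > CRITICAL_VALUE

print(f"Chi-square statistic: {statistic}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

If the hypothesized proportions were not equal, the single `expected` value would need to become one expected count per flavor.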
McNemar Test
Principles
It is used to analyze pretest-posttest study designs and is also commonly employed in analyzing matched pairs and case-control studies.
McNemar's test has three assumptions that must be met:
1. You have one categorical dependent variable with two categories (i.e. a dichotomous variable) and one categorical independent variable with two related groups.
2. The two categories of your dependent variable must be mutually exclusive.
3. The cases (e.g. participants) are a random sample from the population of interest.
McNemar Test
Function of test
It is used to determine if there are differences on a dichotomous dependent variable between two related groups.
The McNemar test is a non-parametric test for paired nominal data. It is used when you are interested in finding a change in proportion for the paired data.
Types of variable
Nominal variable with two categories
One independent variable with two connected groups
McNemar Test
Level of measurement
It is used when one or both conditions being studied are measured on a nominal scale.
Type of study design
Matched-pairs designs, such as retrospective matched case-control studies and pretest-posttest studies.
McNemar Test
Type of objective
The McNemar test is used to determine if there are differences on a dichotomous dependent variable between two related groups.
Example
Evaluate the effect of a connective tissue graft (CTG) in comparison to a guided tissue regeneration (GTR) procedure in the treatment of gingival recession. This clinical trial uses matched pairs in which one member of each pair is randomly assigned to CTG and the other member to GTR. The patients are matched on age, sex, oral hygiene standards, gingival health, probing depth, and other prognostic attributes.
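The example above does not report any counts, so the sketch below uses hypothetical numbers purely to show how the test works on matched pairs. It assumes the usual McNemar chi-square statistic without continuity correction, (b - c)² / (b + c), where b and c are the numbers of discordant pairs (pairs in which only one of the two treatments succeeded). The function name is illustrative.

```python
def mcnemar_statistic(b, c):
    """McNemar chi-square statistic (no continuity correction).

    b: pairs where only the first treatment succeeded.
    c: pairs where only the second treatment succeeded.
    Pairs where both or neither succeeded do not enter the statistic.
    """
    if b + c == 0:
        raise ValueError("At least one discordant pair is required.")
    return (b - c) ** 2 / (b + c)


# Hypothetical counts: in 15 pairs only CTG succeeded, in 5 pairs only GTR succeeded.
stat = mcnemar_statistic(15, 5)
CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05
print(f"McNemar statistic: {stat}")
print("Significant difference" if stat > CRITICAL_VALUE_DF1 else "No significant difference")
```

With few discordant pairs, an exact binomial version of the test is generally preferred over this chi-square approximation.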
THANK YOU!
MEMBERS:
Bautista, Timhry Erin Dae B.
Estrada, Christalle Claire
Jimenez, Jasmin M.
Landrito, Riva Ysabella
Pascual, Maria Victoria P.
Pepito, Angeljoy Y.
Yatco, Berlyne
