Lecture 3-5 - Analyzing Contingency Tables: Azadeh Alimadad. DANA 4820 Jan 17 - 24, 2022
Lecture 3-5 - Analyzing Contingency Tables: Azadeh Alimadad. DANA 4820 Jan 17 - 24, 2022
Example 1: The Vanguard Project was a study of HIV incidence and risk
behaviours among young MSM (men who have sex with men). Briefly,
HIV-seronegative self-identified gay and bisexual men between 18 and
30 years of age who lived in Vancouver were recruited at community
events, in community clinics and through advertisements in local
newspapers. At baseline and annually thereafter, participants
underwent HIV serologic testing with pre- and post-test counselling and
completed a confidential self-administered questionnaire. They studied
the associations between questionnaire answers (E) and the later risk of
becoming HIV-seropositive (D).
1. This is an example of an ________ study.
A. Experimental
B. Observational
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
Observational Studies:
To identify cohort, case-control and cross-sectional studies, think about
their E (exposure, determinant, or predictor) and D (disease or
outcome) variables:
The TIMING of a study concerns when the outcome (D) occurs in time
relative to when the study begins. If the outcome (D) has already
occurred (and therefore all events being measured) when the study
begins, then the study is RETROSPECTIVE. If the outcome has not yet
occurred when the study begins, then it is PROSPECTIVE.
Let r denote the # of row and c denote the # of columns. This table has
cells that display the ………… possible combinations of outcomes.
1. Joint probabilities:
2. Marginal probabilities:
3. Conditional probabilities:
What is the probability of belief in an afterlife within males?
Note: For rare disease (Probability of having disease is low) even a test
with high sensitivity and specificity can have low PPV.
100
Yes No
1 99
Test + - test + -
1 0 12 87
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
Myocardial Infarction
Group Yes No Total
Placebo 189 10,845 11,034
Aspirin 104 10.933 11,037
Placebo:
Aspirin:
Difference:
0.010 - 0.001=0.009
RR in MI example is=……………..
𝑆𝑜𝑚𝑒𝑡ℎ𝑖𝑛𝑔 ℎ𝑎𝑝𝑝𝑒𝑛𝑖𝑛𝑔
Odds=
𝑠𝑜𝑚𝑒𝑡ℎ𝑖𝑛𝑔 𝑛𝑜𝑡 ℎ𝑎𝑝𝑝𝑒𝑛𝑖𝑛𝑔
Example: For the probability of success 𝜋=.75, what is the odds of success?
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
What if 𝜋=.5?
What if 𝜋=.25?
𝜋1
𝑜𝑑𝑑𝑠1 ⁄1−𝜋
1
Odds ratio 𝜃 = = 𝜋2 which can be any non-negative number.
𝑜𝑑𝑑𝑠2 ⁄1−𝜋
2
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
𝑜𝑑𝑑𝑠1
When X and Y are independent, then Odds ratio 𝜃 = =1
𝑜𝑑𝑑𝑠2
Example (2.4 book). The opening 2018 World Cup odds against being the wining
team specifies by espn.com were 9/2 for Germany, 5/1 for Brazil, 11/2 for France,
20/1 for England, and 7/1 for Spain. Find the corresponding prior probabilities of
winning for these five teams.
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
A. Calculate the odds of drinking among those with fatal injuries? Odds of
drinking among those with non-fatal injuries. Calculate odds ratio?
The odds ratio is also called the cross product ratio because it equals the ratio of
𝒏𝟏𝟏 𝒏𝟐𝟐
the products
𝒏𝟏𝟐 𝒏𝟐𝟏
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
Inference:
The sampling distribution of odds ratio is highly skewed (It is not Normal), so we
usually use its natural logarithm, 𝑙𝑛(𝜃) for making statistical inference.
(𝑙𝑛(𝜃), ℎ𝑎𝑠 𝑎𝑝𝑝𝑟𝑜𝑥𝑖𝑚𝑎𝑡𝑒𝑙𝑦 𝑛𝑜𝑟𝑚𝑎𝑙 𝑑𝑖𝑠𝑡𝑟𝑖𝑏𝑢𝑡𝑖𝑜𝑛)
𝟏 𝟏 𝟏 𝟏
SE=√ + + +
𝒏 𝒏
𝟏𝟏 𝒏 𝒏
𝟏𝟐 𝟐𝟏 𝟐𝟐
Association between lung cancer and smoking. 709 cases and 709 controls were
selected, and the outcome measured is whether the subject ever was a smoker.
Lung Cancer
Smoker Cases Controls
Yes 688 650
No 21 59
Total 709 709
install.packages("epitools")
library(epitools)
$measure
odds ratio with 95% C.I.
Predictor estimate lower upper
Exposed1 1.000000 NA NA
Exposed2 2.973773 1.786737 4.949427
Computations for the score confidence interval our beyond our scope, but it is
available in software.
data:
95 percent confidence interval:
1.794500 4.928018
When 𝜋̂1 𝑎𝑛𝑑 𝜋̂2 both are close to zero, then the odds ratio and relative risk take
similar values.
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
Example: A research study estimated that under a certain condition, the probability
a subject would be referred for heart catheterization was 0.906 for whites and
0.847 for blacks.
a. A press released about the study stated that the odds of referral for cardiac
catheterization for blacks are 60% of the odds for whites. Explain how they
obtained 60% (more accurately, 57%).
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
b. An associated Press story, that described the study stated “Doctors were only
60% as likely to order cardiac catheterization for blacks as for whites”. What is
wrong with this interpretation? Give the correct percentage for this interpretation.
Example: Consider the following two studies reported in the New York Times: a. A
British study reported that of smokers who get lung cancer, “women were 1.7 times
more vulnerable than men to get small-cell lung cancer”. Is 1.7 an odds ratio or a
relative risk?
b. A National Cancer Institute study about tamoxifen and breast cancer reported
that the women taking the drug were 45% less likely to experience invasive breast
cancer compared to the women taking placebo. Find the relative risk for
OR shows the relationship between our response & explanatory variable. Now we
need to check is this relationship statistically significant or not.
Testing Independence:
We assume H0 is true
2
( 𝑛𝑖𝑗 − 𝜇𝑖𝑗 )2
𝑋 =∑
𝜇𝑖𝑗
2. Likelihood-Ratio Statistic
𝑛𝑖𝑗
𝐺 2 = 2 ∑ 𝑛𝑖𝑗 log( )
𝜇𝑖𝑗
495 498
𝐺 2= 2 [495 ∗ 𝑙𝑛 ( ) + … … … + 498 ∗ ln (485.4) ] =12.6
456.9
Azadeh Alimadad. DANA 4820 Jan 17 – 24, 2022
Simpson’s paradox: