Essential Medical Stats - MSC Clin Res

The document provides an overview of essential medical statistics, covering topics such as the distinction between sample and population, types and distribution of data, statistical errors, and the significance of p-values. It emphasizes the importance of understanding data types for appropriate statistical analysis and the implications of errors in research findings. Additionally, it discusses sample size calculation and the selection of statistical tests based on data characteristics and research goals.

Uploaded by

Ajay Upadhayay
Essential Medical Statistics

Priya Ranganathan
Professor, Anaesthesiology
Tata Memorial Hospital

Overview

• Sample versus population
• Types and distribution of data
• Statistical errors
• “p” value
• Standard error and confidence intervals
• Principles of sample size calculation
• Statistical tests: Which test where?

Sample versus population

Average height of people aged 35 years

Population = all people in the world aged 35 years
Sample = people in this meeting aged 35 years

Sample versus population

[Figure: heights of 160 cm, 165 cm, 162 cm and 164 cm]

Take home message 1

• We do studies in samples
• Samples are representative of the population
• Results of sample may differ from population (“error”)
• Since we do not know true population results, we accept the possibility of error

Types and Distribution of Data

• Categorical or qualitative data
– What type?
• Eye colour: black, brown, blue
• Severity of pain: mild, moderate, severe

• Numerical or quantitative data
– How much?
• Height: 160 cm, 175 cm, 180 cm
• Severity of pain: 3, 5, 7

Categorical or qualitative data

• Nominal data: no order between classes
– Brown = black = blue
• Ordinal data: order between classes
– Mild < moderate < severe

Binary / Non-binary

• Binary: only 2 categories
– Dead / Alive
– Smoker / non-smoker
• Non-binary: more than 2 categories
– Dead / Alive with disease / Alive without disease
– Smoker / ex-smoker / non-smoker

Numerical or quantitative data

• Continuous: objects are measured on a continuous scale
– Blood sugar level
– Age
• Discrete: objects are counted in whole numbers
– No. of children
– GCS (Glasgow Coma Scale)

Summarising data

• Categorical data
– Proportions

• Numerical data
– Mean (and standard deviation)
– Median (and inter-quartile range)

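As a rough illustration of these summaries, the sketch below uses Python's standard `statistics` module on made-up data. The eye colours and heights are hypothetical values chosen only for the example, and the quartiles depend on the interpolation method the library uses by default.

```python
import statistics

# Hypothetical data, for illustration only.
eye_colour = ["brown", "black", "brown", "blue", "brown", "black"]
height_cm = [160, 175, 180, 162, 164, 165, 170]

# Categorical data: report proportions.
proportions = {c: eye_colour.count(c) / len(eye_colour) for c in sorted(set(eye_colour))}

# Numerical data: mean with standard deviation, or median with inter-quartile range.
mean = statistics.mean(height_cm)
sd = statistics.stdev(height_cm)                   # sample standard deviation
median = statistics.median(height_cm)
q1, _, q3 = statistics.quantiles(height_cm, n=4)   # quartile cut points

print(proportions)
print(f"mean {mean:.1f} (SD {sd:.1f}); median {median} (IQR {q1} to {q3})")
```

Mean and SD suit data that look roughly normal; median and inter-quartile range are the usual choice when the data are skewed.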
Normally distributed data (Gaussian distribution)

• Mean ± 2 SD ≈ 95% of observations
• Mean = median

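The two properties of normally distributed data can be checked on simulated values. This is a minimal sketch: the population mean of 165 cm and SD of 7 cm are assumptions made for the example, not figures from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable
# Simulated, normally distributed heights (hypothetical mean 165 cm, SD 7 cm).
heights = [random.gauss(165, 7) for _ in range(10_000)]

mean = statistics.mean(heights)
sd = statistics.stdev(heights)
median = statistics.median(heights)

# Fraction of observations lying within mean ± 2 SD.
within_2sd = sum(mean - 2 * sd <= h <= mean + 2 * sd for h in heights) / len(heights)

print(f"mean {mean:.1f}, median {median:.1f}, within 2 SD: {within_2sd:.1%}")
```

With skewed data the same check would typically show the mean and median drifting apart, which is one reason the median is preferred there.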
Distribution-free (Skewed) data

• Median value
• Range of values
• Inter-quartile range

Why are the type and distribution of data important?
• They determine
– how you report the results
– the choice of statistical test for analysis

Take home message 2

• Data
– Categorical (ordinal or nominal): summarise as proportions
– Numerical (discrete or continuous): summarise as mean or median

Take home message 2…

• Collect data as numerical values
– Can be re-coded as categories
– Converse not possible

• For example,
– Pain score of 5/10 can be classified as moderate
– But moderate pain can mean anything from 4 to 7

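The one-way nature of this re-coding can be sketched in a few lines. Only the moderate band (4 to 7) comes from the example above; the function name `pain_category` and the other cut points are assumptions made for illustration.

```python
def pain_category(score: int) -> str:
    """Recode a 0-10 pain score into a category.

    The moderate band (4 to 7) follows the text; the other cut points
    are assumptions made for illustration.
    """
    if not 0 <= score <= 10:
        raise ValueError("pain score must be between 0 and 10")
    if score == 0:
        return "none"
    if score <= 3:
        return "mild"
    if score <= 7:
        return "moderate"
    return "severe"


scores = [3, 5, 7]
categories = [pain_category(s) for s in scores]
print(categories)
```

Scores of 5 and 7 both map to the same category, so the original numbers cannot be recovered once only the category has been recorded.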
Quiz - 1

Sample: Students in a school

What type of data would you expect?


• Height
• Gender
• Weight of diabetic students
• Number of siblings


Statistical errors

Study question

• Is dexamethasone better than placebo for preventing hospitalization in infants with bronchiolitis?

• Population - Infants with bronchiolitis
• Intervention - Dexamethasone
• Comparator - Placebo
• Outcome - Hospitalization

Statistical errors

• Population truth: Dexa = Placebo
– Study finding "Dexa = Placebo": correct (√)
– Study finding "Dexa is different from Placebo": false positive error (Type 1 error)
• Population truth: Dexa is different from Placebo
– Study finding "Dexa = Placebo": false negative error (Type 2 error)
– Study finding "Dexa is different from Placebo": correct (√)

Type 1 error or α error

– Finding a difference where a difference does not actually exist
– Conventionally set at not more than 5%
– That is, if there is truly no difference, we accept at most a 5% probability that the study finds one by chance

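A small simulation can make the 5% figure concrete. The sketch below repeatedly compares two groups drawn from the same population, so any "significant" result is a false positive. The group size, number of simulated studies and the use of a z-test with known SD are all simplifying assumptions for the example.

```python
import random
from statistics import NormalDist, mean

random.seed(1)
ALPHA = 0.05
N_PER_GROUP = 30
N_STUDIES = 2_000


def z_test_p_value(a, b, sd=1.0):
    """Two-sided p value for a difference in means, SD assumed known."""
    se = sd * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


false_positives = 0
for _ in range(N_STUDIES):
    # Both groups come from the same population: no true difference exists.
    group_a = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    group_b = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    if z_test_p_value(group_a, group_b) < ALPHA:
        false_positives += 1

false_positive_rate = false_positives / N_STUDIES
print(f"false positive rate: {false_positive_rate:.3f}")
```

The observed rate should sit close to the chosen α, which is what "setting the type 1 error at 5%" means in practice.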
Type 2 error or β error

• Not finding a difference where a difference exists
• Power = 1 – β, i.e. the ability of the study to find a difference that truly exists
• β error can be set at 10% or 20%
• i.e. the power of the study can be 90% or 80%
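The link between β, power and the number of participants can be sketched with a normal approximation. The function `power_two_means`, the true difference of 0.5 SD and the group sizes are hypothetical choices for illustration; this is not a calculation taken from the lecture.

```python
from statistics import NormalDist


def power_two_means(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means."""
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of crossing the critical value in the direction of the true difference.
    return z.cdf(abs(diff) / se - z_crit)


powers = {n: power_two_means(diff=0.5, sd=1.0, n_per_group=n) for n in (30, 64, 86)}
for n, power in powers.items():
    print(f"n = {n} per group: power = {power:.2f}, beta = {1 - power:.2f}")
```

Under these assumptions, larger groups give higher power, i.e. a smaller β error, for the same true difference.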

“p” value

• α is what we set as the acceptable limit for finding a result by chance
– Usually set as 5% or 0.05

• P value tells us the probability of observing a result at least as extreme as the one seen, if there is truly no difference and chance alone is at work

• If α = 0.05, then any p value less than 0.05 is statistically significant

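To show how a p value is compared with α, the sketch below applies a z-test for two proportions to hypothetical counts. The function name and all counts are assumptions, and the normal approximation is rough for very small groups, where an exact test would normally be preferred.

```python
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value from a z-test for a difference between two proportions."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


ALPHA = 0.05
# The same 20% difference in event rates, with two hypothetical study sizes.
p_small = two_proportion_p_value(5, 10, 3, 10)        # 50% vs 30%, 10 per group
p_large = two_proportion_p_value(100, 200, 60, 200)   # 50% vs 30%, 200 per group

for label, p in (("small study", p_small), ("large study", p_large)):
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{label}: p = {p:.4f} ({verdict})")
```

The size of the observed difference alone does not decide significance; the same difference can be compatible with chance in a small study and not in a large one.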
Take home message 3
• Alpha error or Type 1 error
– False positive result

• Beta error or Type 2 error
– False negative result

• P value
– Probability of a result at least as extreme as the one observed arising by chance alone

Quiz – 2(a)

• In a study, the type 1 error is 1%

• This means that we want to
a) Restrict the possibility of a false positive result to 1%
b) Restrict the possibility of a false negative result to 1%

Quiz – 2(b)

A study compared rates of intubation after HFNC and NIV

HFNC – 50% intubation
NIV – 30% intubation
Difference – 20%
p value – 0.63

Standard Error and Confidence Intervals

Sample versus population

Sample statistic differs from the population statistic

• In reality, we have only one sample

• Based on that sample, we need to predict the population statistic

• For example, if sample mean height is 164 cm, what is the likely population mean height?

Standard error and Confidence intervals

• Based on the standard error, we can calculate confidence intervals

• Standard error tells us how much the sample result is likely to differ from the population result

• Confidence intervals tell us about the possible range of values in the population

• 95% confidence interval = sample mean or sample proportion ± 2 standard errors
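The calculation can be sketched for a sample mean. The ten heights below are hypothetical, and the multiplier 1.96 is the normal-distribution value that the rule of thumb "2 standard errors" rounds up.

```python
import statistics

# Hypothetical sample of heights in cm.
sample = [160, 165, 162, 164, 170, 158, 166, 163, 161, 171]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)
se = sd / n ** 0.5                     # standard error of the mean

# Approximate 95% confidence interval: mean ± 1.96 standard errors.
# For a sample this small, a t-distribution multiplier would give a wider interval.
lower, upper = mean - 1.96 * se, mean + 1.96 * se

print(f"mean {mean} cm, SE {se:.2f}, 95% CI {lower:.1f} to {upper:.1f} cm")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower confidence interval around the same mean.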

Interpretation of confidence intervals

• When the 95% confidence interval for a difference in means includes zero, the result is not statistically significant

• When the 95% confidence interval for a ratio of proportions (odds ratio or relative risk) includes 1, the result is not statistically significant

Take home message 4

• Standard error
– Difference between sample result and true population result

• Confidence intervals
– Range of values you could expect to see in the population

Sample size calculation



What you need to know

• Type 1 error
• Type 2 error (power)
• Expected values in the two groups (margin of difference)

Sample size calculation


• p1 = 40% = 0.4
• clinically meaningful reduction = 12%
• therefore, p2 = 28% = 0.28
• α = 0.05
• β = 0.02
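One common normal-approximation formula for comparing two proportions can be applied to these inputs. This is a sketch under stated assumptions: the function name is hypothetical, the formula may not be the one used in the original slides, and it ignores continuity correction and drop-outs. The calculation is run both with β as printed (0.02) and with β = 0.20, the value that corresponds to the 80% power mentioned earlier.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, beta=0.2):
    """Approximate number of participants needed per group."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided type 1 error
    z_beta = z.inv_cdf(1 - beta)         # power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# beta exactly as printed in the list above.
n_as_printed = sample_size_two_proportions(0.40, 0.28, alpha=0.05, beta=0.02)
# Assumption for comparison: beta = 0.20, i.e. 80% power.
n_power_80 = sample_size_two_proportions(0.40, 0.28, alpha=0.05, beta=0.20)

print(f"per group: {n_as_printed} (beta = 0.02), {n_power_80} (beta = 0.20)")
```

A smaller β (higher power) or a smaller margin of difference both push the required sample size up.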

Which test where?


Choice of statistical test
• Type of data
– Categorical or numerical
– Continuous or discrete
• Distribution of data
– Normal or skewed
• Goal of analysis
– Difference
– Association
– Agreement
• Number of groups
– Two versus more than two
• Paired or independent data

Are we comparing unpaired and independent groups?

Unpaired groups

• Numerical data, normal distribution
– 2 groups: Unpaired t-test
– > 2 groups: ANOVA
• Numerical data, distribution-free
– 2 groups: Mann-Whitney test
– > 2 groups: Kruskal-Wallis test
• Categorical data
– Chi-square test
– Fisher’s exact test
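The chart for unpaired groups amounts to a small decision rule, which can be written down as a lookup. The function name and argument names are hypothetical; the mapping itself simply follows the chart.

```python
def choose_unpaired_test(data_type: str, normal: bool = True, groups: int = 2) -> str:
    """Suggest a test for comparing unpaired groups, following the chart."""
    if groups < 2:
        raise ValueError("need at least two groups to compare")
    if data_type == "categorical":
        return "Chi-square test or Fisher's exact test"
    if data_type != "numerical":
        raise ValueError("data_type must be 'numerical' or 'categorical'")
    if normal:
        return "Unpaired t-test" if groups == 2 else "ANOVA"
    return "Mann-Whitney test" if groups == 2 else "Kruskal-Wallis test"


print(choose_unpaired_test("numerical", normal=True, groups=2))
print(choose_unpaired_test("numerical", normal=False, groups=3))
print(choose_unpaired_test("categorical"))
```

A lookup like this only encodes the chart; checking whether the data really are normally distributed still has to be done before calling it.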

Study of bone mineral density in resident doctors working at a teaching hospital.

J Postgrad Med. 2010; 56: 65-70.

• Males versus females
• Bone mineral density values
• Unpaired groups, continuous data, normally distributed
• Compared using unpaired t-test

A comparison of laparoscopic and open Nissen fundoplication and gastrostomy placement in the neonatal intensive care unit population.

J Pediatr Surg. 2010; 45: 346-9.

• Endpoint – Operative time in minutes
• Three independent groups
• Continuous data, normally distributed
• Compared using ANOVA

Comparison of ramosetron with ondansetron for prevention of postoperative nausea and vomiting in patients undergoing gynaecological surgery

Br J Anaesth. 2009; 103: 549-5

• Two independent groups
• End-point: Incidence of nausea and vomiting (proportion)
• Compared using the chi-square test
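For two independent groups and a yes/no outcome, the chi-square test works on a 2 x 2 table of counts. The sketch below uses the Pearson statistic without continuity correction; the counts are hypothetical and are not data from the cited trial.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Table layout: row 1 = (a, b), row 2 = (c, d).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts per group: (nausea or vomiting, no nausea or vomiting).
chi2, p_value = chi_square_2x2(20, 80, 35, 65)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

When expected counts in any cell are small, Fisher's exact test is the usual alternative to this approximation.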

Are we comparing paired groups – before and after?

Paired groups

• Numerical data, normal distribution
– 2 readings: Paired t-test
– > 2 readings: Repeated measures ANOVA
• Numerical data, distribution-free
– 2 readings: Wilcoxon signed-rank test
– > 2 readings: Friedman’s test
• Categorical data
– 2 readings: McNemar’s test
– > 2 readings: Cochran’s Q test

Simvastatin versus placebo for reduction in lipid levels
• Two groups – simvastatin, placebo
• Within each group
– Baseline lipid levels
– Levels after administration of drug

• Paired t-test used to compare change from baseline to post-drug
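A paired t-test is computed from the within-patient differences rather than from the two sets of readings separately. The sketch below derives the t statistic for hypothetical lipid values; the numbers are invented for illustration, and the p value would then be read from a t distribution with n − 1 degrees of freedom using tables or a statistics package.

```python
import statistics

# Hypothetical lipid levels (mg/dL) for the same six patients.
baseline = [240, 255, 230, 260, 245, 250]
post_drug = [210, 230, 215, 225, 220, 235]

# A paired analysis works on the within-patient differences.
differences = [after - before for before, after in zip(baseline, post_drug)]
n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / n ** 0.5

t_statistic = mean_diff / se_diff
degrees_of_freedom = n - 1

print(f"mean change {mean_diff:.1f} mg/dL, t = {t_statistic:.2f}, df = {degrees_of_freedom}")
```

Because each patient acts as their own control, variation between patients drops out of the comparison.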

Is there an association between two variables?

• Numerical data
– Both normally distributed: Pearson’s correlation coefficient
– Otherwise: Spearman’s or Kendall’s
• Categorical data
– Relative risk
– Odds ratio

Study of bone mineral density in resident doctors working at a teaching hospital.

• Is there an association between bone mineral density and weight?
• Both normally distributed data
• Pearson’s linear correlation coefficient
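The difference between Pearson's and Spearman's coefficients can be sketched directly: Spearman's is Pearson's coefficient applied to the ranks of the data. The weights and bone mineral density values below are hypothetical, not data from the cited study, and the helper names are chosen for the example.

```python
def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical weights (kg) and bone mineral density values (g/cm^2).
weight = [50, 55, 60, 65, 70, 80]
bmd = [0.90, 0.95, 0.93, 1.02, 1.05, 1.10]

r = pearson(weight, bmd)
rho = spearman(weight, bmd)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Working on ranks makes Spearman's coefficient less sensitive to skewed data and outliers, which is why it is listed for data that are not normally distributed.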

Is there agreement between assessments?



Screening tests
Diagnostic tests
Inter-rater validation

• Numerical data
– Intra-class correlation coefficient
– Bland-Altman plot
• Categorical data
– Cohen’s kappa
– Kendall’s coefficient of concordance

Comparison of blood sugar levels measured in the laboratory versus bedside glucometer

• Continuous data
• Agreement compared using a Bland-Altman plot
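The numbers behind a Bland-Altman plot are the mean difference between the two methods (the bias) and the 95% limits of agreement around it. The sketch below computes them for hypothetical paired readings; the values are invented for illustration and no plotting is attempted.

```python
import statistics

# Hypothetical paired blood sugar readings (mg/dL) from the same patients.
laboratory = [90, 110, 150, 200, 130, 95, 180]
glucometer = [94, 108, 158, 195, 135, 99, 186]

pairs = list(zip(glucometer, laboratory))
differences = [meter - lab for meter, lab in pairs]    # y-axis of the plot
averages = [(meter + lab) / 2 for meter, lab in pairs]  # x-axis of the plot

bias = statistics.mean(differences)              # mean difference between methods
sd_diff = statistics.stdev(differences)
lower_limit = bias - 1.96 * sd_diff              # 95% limits of agreement
upper_limit = bias + 1.96 * sd_diff

print(f"bias {bias:.1f} mg/dL, limits of agreement {lower_limit:.1f} to {upper_limit:.1f}")
```

Whether limits of this width are acceptable is a clinical judgement, not something the statistics decide.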

How reliable are CT scans for the evaluation of calcaneal fractures?
Arch Orthop Trauma Surg. 2011 May

• CT scans evaluated by 57 different evaluators
• Inter-rater agreement evaluated by Cohen’s kappa statistic

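Cohen's kappa compares the agreement actually observed with the agreement expected by chance. The sketch below handles the basic case of two raters only, with hypothetical ratings; it does not reproduce the cited study, which involved many evaluators.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Agreement expected if each rater assigned categories independently.
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of ten scans by two raters.
rater_1 = ["displaced"] * 5 + ["undisplaced"] * 5
rater_2 = ["displaced"] * 4 + ["undisplaced"] * 5 + ["displaced"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so raw percentage agreement alone tends to overstate reliability.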
Take home message 5

• Choose the statistical test depending on
– Type of data
– Distribution of data
– Goal of analysis
– Number of groups
– Paired or independent data

