Introduction Biostat

Uploaded by

lailykurnia

Biostatistics (BIOL130024.01)
Professor Zewei Luo
Dr Chenqi Lu

Telephone: 021-55665269
E-mail: [email protected]
[email protected]
Office: 2309 Guanghua east Building
Major Teaching Components
• Course lectures

• Exercises with computer software (Minitab)

• Inter-course tests

• Final examination
Reference
• Text: An Introduction to Biostatistics
Thomas Glover & Kevin Mitchell
Copyright © 2002 Waveland Press

• Reference Books
(1) Mather, K. (1973). Statistical Analysis in Biology
Chapman & Hall.
(2) Elandt-Johnson, R. C. (1971). Probability Models and
Statistical Genetics John Wiley & Sons.
Chapter 1. Introduction to Data Analysis

§1.1. A general concept of Scientific Methods

Design and Evaluation of Experiments

• Observation of a particular event

• Statement of the problem

• Formulation of a hypothesis

• Design of the experiment

• Making a Prediction
Statistics -- The subject which helps design and interpret
experiments properly

• collection
• manipulation

• summarization

• analysis of experimental data

• utilization of the data to test scientific hypotheses

Biostatistics (Biometry) – Statistics in Biosciences

§ 1.2. Basic concepts in Biostatistics
• Population vs. Sample
Descriptive measures: parameters (population) vs. statistics (sample)

• Variables & Data Types

1. Quantitative variables
(a). Continuous variables or interval data
(b). Discrete variables
2. Ranked (ordinal) variables
3. Categorical data
§ 1.3. Measure of Central Tendency
(1) Mean
• Population Mean: if a population contains N entities whose measures are $x_1, x_2, \ldots, x_N$, the arithmetic mean is given by

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

• Sample Mean: if a sample collected from a population contains n observations $x_1, x_2, \ldots, x_n$, the sample mean is given by

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

An example demonstrating $\mu$ and $\bar{X}$
The population measures: 1,6,4,5,6,3,8,7 with N = 8

If a sample with n = 3 is randomly collected from the population,

there are a total of 56 possible such samples. Of the samples,

Seven have a mean of 5, equal to the population mean

The rest have a mean differing from the population mean

The average over all the sample means is 5, the population mean.

The sample mean is an unbiased estimate of the mean of the population from which the sample is collected.
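The enumeration described in this example can be checked directly. The sketch below lists every sample of size 3 from the slide's population and averages the sample means. It assumes sampling without replacement and treats the two 6s as distinct entities; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 6, 4, 5, 6, 3, 8, 7]  # measures from the slide, N = 8
n = 3                                   # sample size from the slide


def mean(values):
    """Exact arithmetic mean as a Fraction (avoids rounding error)."""
    return Fraction(sum(values), len(values))


mu = mean(population)

# All samples of size n without replacement; positions are distinct,
# so the two 6s count as different entities and there are C(8, 3) samples.
sample_means = [mean(sample) for sample in combinations(population, n)]

# Average of the sample means, to compare with the population mean.
mean_of_means = sum(sample_means) / len(sample_means)

# Number of samples whose mean equals the population mean exactly.
hits = sum(1 for m in sample_means if m == mu)

print("population mean:", mu)
print("number of samples:", len(sample_means))
print("samples with mean equal to mu:", hits)
print("mean of sample means:", mean_of_means)
```

Because every entity appears in the same number of samples, the average of the sample means has to equal the population mean, which is what unbiasedness means here.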
(2) Median – the “middle” value of an ordered list of observations
Depth of an observation (d) – its position relative to the nearest extreme (end) when the data are listed in ascending order.

The population or sample median (M or $\tilde{X}$) is defined as the observation whose depth is d = (N+1)/2 or d = (n+1)/2. In the above example,

$$\tilde{X} = x_{d=(n+1)/2} = x_8 = 38 \text{ cm}$$

(3) Mode – the most frequently occurring observation in a data set
In the above example, the mode = 29 cm
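A minimal sketch of the depth rule for the median and of the mode. The data set the slide refers to is not reproduced in this text, so `lengths_cm` below is a hypothetical sample, and averaging the two middle values for even n is an assumed convention.

```python
from collections import Counter


def median_by_depth(data):
    """Median as the observation at depth d = (n + 1) / 2 in the sorted data.

    For even n the depth ends in .5, so the two middle values are averaged
    (an assumed convention, not stated on the slide).
    """
    ordered = sorted(data)
    n = len(ordered)
    if n % 2 == 1:
        depth = (n + 1) // 2          # 1-based position of the middle value
        return ordered[depth - 1]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def modes(data):
    """All values that occur with the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


lengths_cm = [31, 27, 35, 29, 40, 29, 33]  # hypothetical sample, n = 7

print("median:", median_by_depth(lengths_cm))
print("mode(s):", modes(lengths_cm))
```

Returning a list from `modes` is a deliberate choice: a data set can have more than one most frequent value.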
• Data Types and Measures of Central Tendency

Data type                         Measure of central tendency
1. Quantitative variables         1. Mean
   (a). Continuous variables
   (b). Discrete variables
2. Ranked (ordinal) variables     2. Median
3. Categorical data               3. Mode

[Slide annotation: "informative"]

§ 1.4. Measure of Dispersion and Variability
(1) Range – the difference between the largest and smallest observations in a group of data

(2) Variance – average of squared deviates of each observation from the mean of the observations in a group of data

Population variance:

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Sample variance:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

It is easy to show that $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 / n \right]$$

(3) Standard deviation (s.d.)
Population s.d. ($\sigma$) and Sample s.d. (s)
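The definitional and shortcut forms of the variance can be compared numerically. The sketch below reuses the eight measures from the § 1.3 example, once as a whole population (divisor N) and once as if they were a sample (divisor n - 1). Treating the same numbers both ways is only for illustration.

```python
import math


def population_variance(data):
    """Mean squared deviation from the mean, divisor N."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def sample_variance(data):
    """Definitional form: squared deviations from the sample mean, divisor n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def sample_variance_shortcut(data):
    """Computational form: [sum(x^2) - (sum(x))^2 / n] / (n - 1)."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)


data = [1, 6, 4, 5, 6, 3, 8, 7]  # measures from the Section 1.3 example
x_bar = sum(data) / len(data)

print("sum of deviations:", sum(x - x_bar for x in data))
print("population variance:", population_variance(data))
print("population s.d.:", math.sqrt(population_variance(data)))
print("sample variance (definition):", sample_variance(data))
print("sample variance (shortcut):", sample_variance_shortcut(data))
```

The shortcut form needs only the running totals of x and x², which is why it suits hand calculation; with floating-point data of large magnitude the definitional form is usually the numerically safer one.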

§ 1.5. Descriptive statistics for grouped data


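With c classes, where value $x_i$ occurs with frequency $f_i$ and $n = \sum_{i=1}^{c} f_i$:

$$\bar{X} = \frac{\sum_{i=1}^{c} f_i x_i}{\sum_{i=1}^{c} f_i} = \frac{857}{800} = 1.1 \text{ plants/quadrat}$$

$$s^2 = \frac{\sum_{i=1}^{c} f_i x_i^2 - \left( \sum_{i=1}^{c} f_i x_i \right)^2 / n}{n - 1} = \frac{1805 - 857^2 / 800}{799} = 1.11 \text{ (plants/quadrat)}^2$$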
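A short sketch of the grouped-data formulas. The frequency table behind the slide's totals is not reproduced in this text, so the first part only re-derives the mean and variance from the three totals quoted above (n = 800, Σfx = 857, Σfx² = 1805). The small frequency table in the second part is hypothetical.

```python
def grouped_stats_from_sums(n, sum_fx, sum_fx2):
    """Mean and sample variance from the totals n, sum(f*x) and sum(f*x^2)."""
    mean = sum_fx / n
    variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return mean, variance


def grouped_stats(values, freqs):
    """Mean and sample variance from a frequency table (value x_i, frequency f_i)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    return grouped_stats_from_sums(n, sum_fx, sum_fx2)


# Totals quoted on the slide.
slide_mean, slide_var = grouped_stats_from_sums(800, 857, 1805)
print("slide mean:", round(slide_mean, 1))
print("slide variance:", round(slide_var, 2))

# Hypothetical frequency table: number of plants per quadrat and quadrat counts.
values = [0, 1, 2, 3]
freqs = [2, 5, 2, 1]
toy_mean, toy_var = grouped_stats(values, freqs)
print("toy mean:", toy_mean)
print("toy variance:", toy_var)
```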
§ 1.6. Quartile and Box Plots
A group of n observations (data points) in ascending order:

$$x_1, x_2, \ldots, Q_1, \ldots, Q_2, \ldots, Q_3, \ldots, x_{n-1}, x_n$$

• $Q_1$: first quartile or 25th percentile
• $Q_2$ (the median $\tilde{X}$): second quartile or 50th percentile
• $Q_3$: third quartile or 75th percentile

Inter-quartile range: $IQR = Q_3 - Q_1$

The five-number summary: $x_1, Q_1, Q_2 \text{ (or } \tilde{X}), Q_3, x_n$
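Box Plot – a graphic presentation of the five-number descriptive summary

[Figure: box plot of the weights of 15 lake trout caught in Geneva’s Lake Trout Derby in 1994, marking $Q_1$, the median $\tilde{X}$, the fences and an outlier]

Upper fence: $f_3 = Q_3 + 1.5(IQR) = 6.545$ lb

Lower fence: $f_1 = Q_1 - 1.5(IQR) = 0.535$ lb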
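A sketch of the five-number summary and the fences used to flag outliers. Quartile conventions differ between textbooks and software; this version assumes a depth-based rule (quartile depth = (⌊median depth⌋ + 1) / 2), which may not match what Minitab reports. The trout weights are not reproduced in this text, so `weights` is a hypothetical data set.

```python
import math


def value_at_depth(ordered, depth, from_top=False):
    """Observation at a 1-based depth; a depth ending in .5 averages two neighbours."""
    seq = ordered[::-1] if from_top else ordered
    lower = seq[math.floor(depth) - 1]
    upper = seq[math.ceil(depth) - 1]
    return (lower + upper) / 2


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the depth-based rule."""
    ordered = sorted(data)
    n = len(ordered)
    median_depth = (n + 1) / 2
    quartile_depth = (math.floor(median_depth) + 1) / 2
    q1 = value_at_depth(ordered, quartile_depth)
    q2 = value_at_depth(ordered, median_depth)
    q3 = value_at_depth(ordered, quartile_depth, from_top=True)
    return ordered[0], q1, q2, q3, ordered[-1]


def fences(q1, q3):
    """Lower and upper fences, 1.5 IQR below Q1 and above Q3."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


weights = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical data

summary = five_number_summary(weights)
lower_fence, upper_fence = fences(summary[1], summary[3])
outliers = [w for w in weights if w < lower_fence or w > upper_fence]

print("five-number summary:", summary)
print("fences:", (lower_fence, upper_fence))
print("outliers:", outliers)
```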

Chapter 3. Probability Distribution
Random variable (r.v.) – a variable whose actual value is
determined by chance operations, which are
fully specified by a probability distribution.

Discrete R.V. – takes discrete values, e.g. X = -1, 0, 2, 5, …

Continuous R.V. – takes continuous values, e.g. $X \in \mathbb{R}$ etc

§ 3.1. Probability distributions of discrete r.v.

The probability distribution or probability density function, f( ), of a discrete r.v. X is a real function giving the probability that X takes a value of x, i.e.

$$f(x) = P(X = x)$$

• f is defined for all possible values of X
• $f(x) \ge 0$
• $\sum_{\text{all } x} f(x) = 1$

Uniformity in the probability across all possible values the r.v. X may take! This distribution is also referred to as a uniform distribution.

The uniformity in the above distribution no longer holds, graphically

If the number of rolls approaches infinity, for a r.v. $X \in \{x_1, x_2, \ldots\}$

$$\mu = E[X] = \sum_{\text{all } x} x f(x) \quad \text{(expected value of } X\text{)}$$

In example 3.1. $\mu = E[X] = \sum_{x=1}^{6} x f(x) = (1 + 2 + \cdots + 6)/6 = 3.5$

In example 3.2. $\mu = E[X] = \sum_{x=2}^{12} x f(x) = (1 \times 2 + 2 \times 3 + \cdots + 1 \times 12)/36 = 7$

In general, if H(X) is a function of a r.v. X with probability distribution $f(x)$, then

$$E[H(X)] = \sum_{\text{all } x} H(x) f(x)$$

With $X_1$ in example 3.1, $X_1 + X_2$ in example 3.2 ($X_1$ and $X_2$ i.i.d.) and $2X_1$ in example 3.3,

$$E(X_1 + X_2) = E(X_1) + E(X_2) = 2E(X_1) = 2 \times 3.5 = 7$$

$$E(2X_1) = 2E(X_1) = 2 \times 3.5 = 7$$
In the infinite die rolling experiment, we explore variation of the outcomes.

$$Var(X) = \sigma^2 = \sum (x_i - \mu)^2 \cdot \frac{1}{N} = \sum (x_i - \mu)^2 f(x_i) = E[(X - \mu)^2] = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2$$

In example 3.1. $E[X^2] = 15.167$ and $E[X] = 3.5$, i.e. $Var(X_1) = 2.917$

In example 3.2. $Var(X_1 + X_2) = 5.834 = 2Var(X_1)$

$$Var(X_1 + X_2) \ne Var(2X_1)$$

In example 3.3. $Var(2X_1) = 11.668 = 4Var(X_1)$
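These expectations and variances can be reproduced by enumerating the outcomes. The examples themselves are not reproduced in this text, so the sketch assumes that example 3.1 is one fair die, example 3.2 is the sum of two independent fair dice, and example 3.3 is twice the score of one die. Exact fractions are used, so the results differ from the slide's rounded decimals only in the last digit.

```python
from fractions import Fraction
from itertools import product


def distribution(outcomes):
    """Map each value to its probability, assuming equally likely outcomes."""
    p = Fraction(1, len(outcomes))
    dist = {}
    for x in outcomes:
        dist[x] = dist.get(x, 0) + p
    return dist


def expectation(dist, h=lambda x: x):
    """E[H(X)] = sum of H(x) f(x) over all x."""
    return sum(h(x) * p for x, p in dist.items())


def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    return expectation(dist, lambda x: x * x) - expectation(dist) ** 2


faces = list(range(1, 7))
one_die = distribution(faces)                                            # X1
two_dice = distribution([a + b for a, b in product(faces, repeat=2)])    # X1 + X2
doubled = distribution([2 * a for a in faces])                           # 2 X1

for name, dist in [("X1", one_die), ("X1 + X2", two_dice), ("2 X1", doubled)]:
    print(name, "E =", float(expectation(dist)), "Var =", float(variance(dist)))
```

The sum and the doubled score share the same mean but not the same variance: independence makes variances add, while scaling by 2 multiplies the variance by 4.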
§ 3.2. Binomial Distribution

A discrete r.v. $X = \sum_{i=1}^{n} X_i$, with the $X_i$ being discrete r.v.s that are independent of each other but have the same probability distribution (i.i.d.):

$$X_i = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1 - p \end{cases}$$

X follows a binomial distribution with parameters n and p. Its pdf has the form

$$f(x) = P(X = x) = P\left( \sum_{i=1}^{n} X_i = x \right) = \binom{n}{x} p^x (1 - p)^{n - x}$$

It is easy to demonstrate that

$$\mu = E(X) = E\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} p = np$$

$$\sigma^2 = Var(X) = Var\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} Var(X_i) = \sum_{i=1}^{n} p(1 - p) = np(1 - p)$$
It is important to notice that a binomial r.v. is the sum of the outcomes of a series of independent 0-1 trials.

Example 3.8. Number of males among 5 children: p = 0.5 and n = 5

$$\mu = np = 2.5 \text{ and } \sigma^2 = np(1 - p) = 1.25$$

This binomial distribution is symmetric with respect to the mean.

[Figures: binomial distributions B(10, 0.25) and B(20, 0.25)]
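A minimal sketch of the binomial pdf, using the values of Example 3.8 (n = 5, p = 0.5). It recomputes the mean and variance from the distribution itself rather than from the closed forms; the function name is illustrative.

```python
from math import comb, isclose


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 5, 0.5  # Example 3.8: number of males among 5 children
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

for x, f in enumerate(pmf):
    print(x, f)
print("mean:", mean, "variance:", var)
print("symmetric about the mean:", all(isclose(a, b) for a, b in zip(pmf, pmf[::-1])))
```

The symmetry holds because p = 0.5; for p = 0.25, as in B(10, 0.25) and B(20, 0.25), the distribution is skewed to the right.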
In some applications, we are interested in calculating

$$P(X \le x) = \sum_{y \le x} f(y) = F_X(x)$$

which is referred to as the cumulative distribution function (CDF) of the r.v. X.

• $F(x) \ge 0$ for any x
• $F(x)$ is monotonically increasing, and
• $F(\infty) = 1.0$
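For a discrete r.v. the CDF is a running sum of the pdf. The sketch below builds it for B(10, 0.25), one of the distributions named on the slides, and checks the three properties listed above numerically.

```python
from itertools import accumulate
from math import comb

n, p = 10, 0.25  # B(10, 0.25)

# pdf values f(0), f(1), ..., f(n)
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# F(x) = f(0) + f(1) + ... + f(x)
cdf = list(accumulate(pmf))

non_negative = all(value >= 0 for value in cdf)
non_decreasing = all(a <= b for a, b in zip(cdf, cdf[1:]))

for x, value in enumerate(cdf):
    print(x, round(value, 4))
print("non-negative:", non_negative)
print("non-decreasing:", non_decreasing)
print("F at the largest value:", cdf[-1])
```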
§ 3.3. Poisson Distribution
A discrete r.v. X = the number of occurrences of a rare event in an interval of time or space, where occurrences of the event are distributed independently across the time interval (or space location).

The pdf of X is specified by a parameter $\lambda$ and given by

$$f(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$

It is easy to show that $E(X) = \lambda$ and $Var(X) = \lambda$
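Both moments follow from the power series of the exponential function. A short derivation, writing $\lambda$ for the Poisson parameter (the symbol is an assumption here; any name for the parameter works the same way):

```latex
\begin{align*}
E(X) &= \sum_{x=0}^{\infty} x \, \frac{e^{-\lambda} \lambda^{x}}{x!}
      = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!}
      = \lambda e^{-\lambda} e^{\lambda} = \lambda \\
E[X(X-1)] &= \sum_{x=0}^{\infty} x(x-1) \, \frac{e^{-\lambda} \lambda^{x}}{x!}
      = \lambda^{2} e^{-\lambda} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!}
      = \lambda^{2} \\
Var(X) &= E[X(X-1)] + E(X) - [E(X)]^{2}
      = \lambda^{2} + \lambda - \lambda^{2} = \lambda
\end{align*}
```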

Examples of Poisson distribution

• Accidents on a pedestrian crossing per week
• Children with meningitis in a family
• Individuals of a rare species in a quadrat
• Bacterial cells in a very dilute liquid culture
• Chiasmata between two genes on a chromosome
Use of Poisson distribution
• As with binomial distribution, events are assumed
to be independent of each other

• So, the Poisson distribution can be used to test for independence of events

• For Example:
A rare species could be distributed at random over a site (independent) or clumped in certain areas (non-independent).

Possible distributions: random, clumped, uniform.
Effect of $\lambda$ on Poisson distribution

[Figure: three panels plotting Probability against x, for $\lambda$ = 0.5, $\lambda$ = 2.0 and $\lambda$ = 5.0]

As $\lambda$ increases the distribution tends towards a Normal distribution.

Poisson Approximation to Binomial Distribution

For X ~ B(n, p) with $n \ge 100$ and $np \le 10$, approximately X ~ P($\lambda$) with $\lambda = np$

[Figure: B(20, 0.05) compared with P(1.0)]
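The comparison of B(20, 0.05) with P(1.0) can be tabulated directly. This is a sketch with illustrative function names; the Poisson parameter is set to np, as the approximation prescribes.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson r.v. with parameter lam."""
    return exp(-lam) * lam ** x / factorial(x)


n, p = 20, 0.05   # the binomial distribution named on the slide
lam = n * p       # matching Poisson parameter

rows = [(x, binomial_pmf(x, n, p), poisson_pmf(x, lam)) for x in range(n + 1)]
max_gap = max(abs(b - q) for _, b, q in rows)

for x, b, q in rows[:6]:
    print(x, round(b, 4), round(q, 4))
print("largest absolute difference:", max_gap)
```

The two columns stay close even though n = 20 is small, because p is small. The agreement would be expected to tighten as n grows with np held fixed.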
§ 3.4. Normal Distribution
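A r.v. X follows a distribution with pdf defined as

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$$

then the distribution is referred to as a normal (Gaussian) distribution

[Figure: the normal density curve, f(x) plotted against x]

(a) $f(x) > 0$ for $x \in (-\infty, \infty)$
(b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(c) $P(X \le x) = F(x) = \int_{-\infty}^{x} f(y)\,dy$
(d) $E(X) = \mu$ and $Var(X) = \sigma^2$

The distribution is fully characterized by the two parameters $\mu$ and $\sigma^2$.

$$P(\mu - \sigma \le X \le \mu + \sigma) = \int_{\mu - \sigma}^{\mu + \sigma} f(x)\,dx \approx 68\%$$

$$P(\mu - 2\sigma \le X \le \mu + 2\sigma) = \int_{\mu - 2\sigma}^{\mu + 2\sigma} f(x)\,dx \approx 95\%$$

$$P(\mu - 3\sigma \le X \le \mu + 3\sigma) = \int_{\mu - 3\sigma}^{\mu + 3\sigma} f(x)\,dx \approx 99.7\%$$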

The normal distribution with $\mu = 0$ and $\sigma^2 = 1.0$ is referred to as the standard normal distribution N(0,1). If r.v. $X \sim N(\mu, \sigma^2)$, then

$$Z = \frac{X - \mu}{\sigma} \sim N(0,1)$$

This unifies evaluation of probabilities of any normal distribution (refer to Table C.3 or Minitab).
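Where a table or Minitab is not to hand, the standard normal CDF can be evaluated with the error function from Python's standard library. The sketch below standardizes X to Z and checks the probabilities within one, two and three standard deviations of the mean. The second pair of parameters (mean 170, s.d. 8) is hypothetical and only shows that the result does not depend on them.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), by standardizing to Z = (X - mu) / sigma."""
    return standard_normal_cdf((x - mu) / sigma)


def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k sigma <= X <= mu + k sigma)."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4), round(prob_within(k, mu=170.0, sigma=8.0), 4))
```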
Normal approximation to Binomial distribution
A r.v. X ~ B(n,p) has mean $\mu = np$ and variance $\sigma^2 = np(1-p)$. If np(1-p) > 3, then approximately X ~ N($\mu$, $\sigma^2$)

But it should be noted that

$$F_B(x) = P(X \le x) = \sum_{i=0}^{x} \binom{n}{i} p^i (1 - p)^{n - i} \approx F_N\left( \frac{x + 0.5 - np}{\sqrt{np(1 - p)}} \right)$$
Data type     pdf                        Interval                          cdf
Continuous    density (≠ probability)    $\Delta x$: $f(x)\Delta x$        $\int f(x)\,dx$
Discrete      = probability              1: $f(x)$                         $\sum f(x)$
[Figure: normal approximation to a binomial distribution, with the values 0.1201 and 0.1196 marked]
