

This document contains: 1. Proofs of the Chernoff bounds for sums of independent Poisson trials, obtained by applying Markov's inequality. The Chernoff bounds provide tighter concentration around the mean than Markov's inequality alone. 2. An example application of the Chernoff bounds to analyze the probability of deviations from the expected number of heads in coin flips. 3. An explanation of how the Chernoff bounds can be used to derive confidence intervals for estimating an unknown population parameter based on samples. The bounds provide a trade-off between sample size, interval size, and error probability. 4. Stronger bounds for symmetric random variables, an application to set balancing, and an introduction to balls-and-bins problems (the birthday paradox and the maximum load).

Uploaded by

Mirza Abdulla
Copyright
© All Rights Reserved

2/5/2015

• The first bound is the strongest; the other two bounds are often easier to state and compute

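Proof: Applying Markov's inequality, for any $t > 0$ we have
$$\Pr(X \ge (1+\delta)\mu) = \Pr\left(e^{tX} \ge e^{t(1+\delta)\mu}\right) \le \frac{E[e^{tX}]}{e^{t(1+\delta)\mu}} \le \frac{e^{(e^t-1)\mu}}{e^{t(1+\delta)\mu}}.$$
For any $\delta > 0$, we can set $t = \ln(1+\delta) > 0$ to get (4.4.1):
$$\Pr(X \ge (1+\delta)\mu) < \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.$$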
MAT-72306 RandAl, Spring 2015 5-Feb-15 170

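For (4.4.2) we need to show that, for $0 < \delta \le 1$,
$$\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \le e^{-\delta^2/3}.$$
Taking the logarithm of both sides, we obtain the equivalent condition
$$f(\delta) = \delta - (1+\delta)\ln(1+\delta) + \frac{\delta^2}{3} \le 0.$$
Computing the derivatives of $f(\delta)$, we have:
$$f'(\delta) = 1 - \frac{1+\delta}{1+\delta} - \ln(1+\delta) + \frac{2}{3}\delta = -\ln(1+\delta) + \frac{2}{3}\delta,$$
$$f''(\delta) = -\frac{1}{1+\delta} + \frac{2}{3}.$$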

• We see that $f''(\delta) < 0$ for $0 \le \delta < 1/2$ and $f''(\delta) > 0$ for $\delta > 1/2$
• Hence $f'(\delta)$ first decreases and then increases over the interval $[0,1]$
• Since $f'(0) = 0$ and $f'(1) < 0$, we can conclude that $f'(\delta) \le 0$ in the interval $[0,1]$
• Since $f(0) = 0$, it follows that $f(\delta) \le 0$ in that interval, proving (4.4.2).


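To prove (4.4.3), let $R = (1+\delta)\mu$. Then, for $R \ge 6\mu$, $\delta = R/\mu - 1 \ge 5$. Hence, using (4.4.1),
$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu} \le \left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu} \le \left(\frac{e}{6}\right)^{R} \le 2^{-R}.$$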
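To get a feel for how the three bounds of Theorem 4.4 compare, the following minimal Python sketch evaluates each of them where it applies. The function name `chernoff_upper_bounds` and the sample values of $\mu$ and $\delta$ are illustrative choices, not part of the lecture material.

```python
import math


def chernoff_upper_bounds(mu: float, delta: float) -> dict:
    """Evaluate the upper-tail bounds of Theorem 4.4 on Pr(X >= (1 + delta) * mu).

    Only the bounds whose preconditions hold are returned (delta must be > 0).
    """
    r = (1.0 + delta) * mu
    bounds = {
        # (4.4.1): (e^delta / (1 + delta)^(1 + delta))^mu, computed in log space
        "4.4.1": math.exp(mu * (delta - (1.0 + delta) * math.log1p(delta))),
    }
    if 0.0 < delta <= 1.0:
        # (4.4.2): exp(-mu * delta^2 / 3)
        bounds["4.4.2"] = math.exp(-mu * delta ** 2 / 3.0)
    if r >= 6.0 * mu:
        # (4.4.3): 2^(-R) with R = (1 + delta) * mu
        bounds["4.4.3"] = 2.0 ** (-r)
    return bounds


for mu, delta in [(10.0, 0.5), (10.0, 1.0), (10.0, 5.0)]:
    print(mu, delta, chernoff_upper_bounds(mu, delta))
```

In each case where two bounds apply, the value for (4.4.1) should come out no larger than the other one, in line with the remark that the first bound is the strongest.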


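• We obtain similar results bounding the deviation below the mean
Theorem 4.5: Let $X_1, \ldots, X_n$ be independent Poisson trials s.t. $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = E[X]$. Then for $0 < \delta < 1$:
$$\Pr(X \le (1-\delta)\mu) \le \left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu};$$
$$\Pr(X \le (1-\delta)\mu) \le e^{-\mu\delta^2/2}.$$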

• Again, the first bound is stronger, but the latter is generally easier to use and sufficient in most applications

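• Often the following form of the Chernoff bound is used
Corollary 4.6: Let $X_1, \ldots, X_n$ be independent Poisson trials s.t. $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = E[X]$. For $0 < \delta < 1$:
$$\Pr(|X - \mu| \ge \delta\mu) \le 2e^{-\mu\delta^2/3}.$$

• In practice we often do not have the exact value of $E[X]$
• Instead we can use $\mu \ge E[X]$ in Theorem 4.4 and $\mu \le E[X]$ in Theorem 4.5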

4.2.2. Example: Coin Flips

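• Let $X$ be the number of heads in a sequence of $n$ independent fair coin flips
• Applying the Chernoff bound (4.6), we have
$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{1}{2}\sqrt{6n\ln n}\right) \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{6\ln n}{n}\right) = \frac{2}{n}$$
• Thus, the concentration around the mean $n/2$ is very tight; most of the time, the deviations from the mean are of the order of $O(\sqrt{n\ln n})$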
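The $2/n$ bound can be compared with the exact binomial tail for moderate $n$. The sketch below does this with exact integer arithmetic; the helper names `exact_two_sided_tail` and `coin_flip_chernoff` are hypothetical, and the choice of $n$ is only for illustration.

```python
import math


def exact_two_sided_tail(n: int, dev: float) -> float:
    """Exact Pr(|X - n/2| >= dev) for X ~ Binomial(n, 1/2)."""
    total = sum(math.comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= dev)
    return total / 2 ** n


def coin_flip_chernoff(n: int) -> tuple:
    """Return (deviation, exact tail probability, Chernoff bound 2/n)."""
    dev = 0.5 * math.sqrt(6 * n * math.log(n))
    return dev, exact_two_sided_tail(n, dev), 2 / n


for n in (100, 1000):
    dev, exact, bound = coin_flip_chernoff(n)
    print(f"n={n}: deviation={dev:.1f}, exact={exact:.3e}, bound={bound:.3e}")
```

The exact probability is typically far below $2/n$, which reflects that the simplified form (4.6) trades tightness for convenience.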

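• Consider the Pr of $\le n/4$ or $\ge 3n/4$ heads in a sequence of $n$ independent fair coin flips
• Chebyshev's inequality showed that
$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right) \le \frac{4}{n}$$
• Already, this is worse than the Chernoff bound just calculated for a significantly larger event!
• Using the Chernoff bound, we find that
$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right) \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{1}{4}\right) = 2e^{-n/24}$$
• Thus, Chernoff's technique gives a bound that is exponentially smaller than that obtained using Chebyshev's inequality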
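A short numerical comparison of the two bounds, as a sketch with illustrative values of $n$ (the function names are not from the lecture). Note that the exponential bound only wins once $n$ is large enough; for small $n$ the value $2e^{-n/24}$ can exceed $4/n$.

```python
import math


def chebyshev_bound(n: int) -> float:
    """Chebyshev bound 4/n on Pr(|X - n/2| >= n/4)."""
    return 4 / n


def chernoff_bound(n: int) -> float:
    """Chernoff bound 2*exp(-n/24), from Corollary 4.6 with delta = 1/2."""
    return 2 * math.exp(-n / 24)


for n in (24, 100, 1000):
    print(n, chebyshev_bound(n), chernoff_bound(n))
```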

4.2.3. Application: Estimating a Parameter

• Evaluate the probability that a particular gene mutation occurs in the population
• An expensive lab test determines if a DNA
sample carries the mutation
• We would like to obtain a relatively reliable
estimate from a small number of samples
• Let $p$ be the unknown value that we are trying to estimate
• Assume that we have $n$ samples and that $X = \tilde{p}n$ of these samples have the mutation

• Given a sufficiently large number of samples, we expect $p$ to be close to the sampled value $\tilde{p}$
Definition 4.2: A $1-\gamma$ confidence interval for a parameter $p$ is an interval $[\tilde{p}-\delta, \tilde{p}+\delta]$, s.t.
$$\Pr(p \in [\tilde{p}-\delta, \tilde{p}+\delta]) \ge 1-\gamma$$
• Instead of predicting a single value for the parameter, we give an interval that is likely to contain the parameter
• If $p$ can take on any real value, it may not make sense to try to pin down its exact value from a finite sample, but it does make sense to estimate it within some small range

• We want both the interval size $2\delta$ and the error probability $\gamma$ to be as small as possible
• We derive a trade-off between these two parameters and the number of samples $n$
• In particular, given that among $n$ samples (chosen uniformly at random from the entire population) we find the mutation in exactly $X = \tilde{p}n$ samples, we need to find values of $\delta$ and $\gamma$ for which
$$\Pr(p \in [\tilde{p}-\delta, \tilde{p}+\delta]) = \Pr(np \in [n(\tilde{p}-\delta), n(\tilde{p}+\delta)]) \ge 1-\gamma$$


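• Now $X = n\tilde{p}$ has a binomial distribution $B(n, p)$, so $E[X] = np$
• If $p \notin [\tilde{p}-\delta, \tilde{p}+\delta]$ then we have one of the following two events:
1. if $p < \tilde{p}-\delta$, then $X = n\tilde{p} > n(p+\delta) = E[X](1+\delta/p)$;
2. if $p > \tilde{p}+\delta$, then $X = n\tilde{p} < n(p-\delta) = E[X](1-\delta/p)$
• We can apply the Chernoff bounds of Thms 4.4 and 4.5 to compute
$$\Pr(p \notin [\tilde{p}-\delta, \tilde{p}+\delta]) = \Pr\left(X < np\left(1-\frac{\delta}{p}\right)\right) + \Pr\left(X > np\left(1+\frac{\delta}{p}\right)\right) < e^{-np(\delta/p)^2/2} + e^{-np(\delta/p)^2/3} = e^{-n\delta^2/2p} + e^{-n\delta^2/3p}$$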


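• This bound is not useful because the value of $p$ is unknown
• A simple solution is to use the fact that $p \le 1$, yielding
$$\Pr(p \notin [\tilde{p}-\delta, \tilde{p}+\delta]) < e^{-n\delta^2/2} + e^{-n\delta^2/3}$$
• Setting $\gamma = e^{-n\delta^2/2} + e^{-n\delta^2/3}$, we obtain a trade-off between $\delta$, $n$, and the error probability $\gamma$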
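The trade-off can be explored numerically. The following sketch evaluates $\gamma$ for given $n$ and $\delta$, and searches for the smallest $n$ that achieves a target $\gamma$; the function names and the example values $\delta = 0.1$, $\gamma = 0.05$ are assumptions made for illustration only.

```python
import math


def error_probability(n: int, delta: float) -> float:
    """gamma = exp(-n delta^2 / 2) + exp(-n delta^2 / 3), the bound obtained with p <= 1."""
    return math.exp(-n * delta ** 2 / 2) + math.exp(-n * delta ** 2 / 3)


def samples_needed(delta: float, gamma: float) -> int:
    """Smallest n whose error bound is at most gamma (requires delta > 0, gamma > 0).

    The bound is decreasing in n, so doubling followed by bisection finds it.
    """
    hi = 1
    while error_probability(hi, delta) > gamma:
        hi *= 2
    lo = hi // 2  # the bound still exceeds gamma at lo, or lo == 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if error_probability(mid, delta) <= gamma:
            hi = mid
        else:
            lo = mid
    return hi


n = samples_needed(delta=0.1, gamma=0.05)
print(n, error_probability(n, 0.1))
```

Because the bound depends on $n\delta^2$, halving the interval half-width $\delta$ at the same $\gamma$ requires roughly four times as many samples.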


4.3. Better Bounds for Some Special Cases

• We can obtain stronger bounds using a simpler proof technique for some special cases of symmetric RVs

Theorem 4.7: Let $X_1, \ldots, X_n$ be independent RVs with
$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}.$$
Let $X = \sum_{i=1}^{n} X_i$. For any $a > 0$,
$$\Pr(X \ge a) \le e^{-a^2/2n}.$$

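Proof: For any $t > 0$,
$$E[e^{tX_i}] = \frac{1}{2}e^{t} + \frac{1}{2}e^{-t}.$$
To estimate $E[e^{tX_i}]$, we observe that
$$e^{t} = 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^i}{i!} + \cdots$$
and
$$e^{-t} = 1 - t + \frac{t^2}{2!} + \cdots + (-1)^i\frac{t^i}{i!} + \cdots$$
using the Taylor series expansion for $e^{t}$.
Thus,
$$E[e^{tX_i}] = \sum_{i \ge 0} \frac{t^{2i}}{(2i)!} \le \sum_{i \ge 0} \frac{(t^2/2)^i}{i!} = e^{t^2/2}.$$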

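Using this estimate yields
$$E[e^{tX}] = \prod_{i=1}^{n} E[e^{tX_i}] \le e^{t^2 n/2}$$
and
$$\Pr(X \ge a) = \Pr(e^{tX} \ge e^{ta}) \le \frac{E[e^{tX}]}{e^{ta}} \le e^{t^2 n/2 - ta}.$$
Setting $t = a/n$, we obtain
$$\Pr(X \ge a) \le e^{-a^2/2n}.$$

By symmetry we also have
$$\Pr(X \le -a) \le e^{-a^2/2n}.$$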
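Both ingredients of this proof are easy to check numerically: the estimate $\frac{1}{2}e^{t} + \frac{1}{2}e^{-t} = \cosh t \le e^{t^2/2}$ and the final tail bound. The sketch below compares the exact tail of a sum of $n$ uniform $\pm 1$ variables with $e^{-a^2/2n}$; the helper names and sample values are illustrative assumptions.

```python
import math


def exact_upper_tail(n: int, a: float) -> float:
    """Exact Pr(X >= a) where X is a sum of n independent uniform +/-1 variables."""
    # If k of the variables equal +1 then X = 2k - n.
    favourable = sum(math.comb(n, k) for k in range(n + 1) if 2 * k - n >= a)
    return favourable / 2 ** n


def symmetric_bound(n: int, a: float) -> float:
    """The bound exp(-a^2 / 2n) of Theorem 4.7."""
    return math.exp(-a * a / (2 * n))


for t in (0.1, 1.0, 2.0):
    print(t, math.cosh(t), math.exp(t * t / 2))  # cosh(t) versus e^{t^2/2}

for n, a in [(10, 4), (100, 10), (100, 30)]:
    print(n, a, exact_upper_tail(n, a), symmetric_bound(n, a))
```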

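Corollary 4.8: Let $X_1, \ldots, X_n$ be independent RVs,
$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}.$$
Let $X = \sum_{i=1}^{n} X_i$. For any $a > 0$,
$$\Pr(|X| \ge a) \le 2e^{-a^2/2n}.$$
• Apply transformation $Y_i = (X_i + 1)/2$ to prove
Corollary 4.9: Let $Y_1, \ldots, Y_n$ be independent RVs,
$$\Pr(Y_i = 1) = \Pr(Y_i = 0) = \frac{1}{2}.$$
Let $Y = \sum_{i=1}^{n} Y_i$ and $\mu = E[Y] = n/2$.
1. For any $a > 0$, $\Pr(Y \ge \mu + a) \le e^{-2a^2/n}$.
2. For any $\delta > 0$, $\Pr(Y \ge (1+\delta)\mu) \le e^{-\delta^2\mu}$.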
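The transformation step can be spelled out as follows. This is a sketch of the omitted calculation, writing $X_i = 2Y_i - 1$ and $X = \sum_{i} X_i$ and using the one-sided bound of Theorem 4.7:

```latex
\begin{align*}
Y &= \sum_{i=1}^{n} Y_i = \frac{1}{2}\sum_{i=1}^{n} X_i + \frac{n}{2} = \frac{X}{2} + \mu, \\
\Pr(Y \ge \mu + a) &= \Pr(X \ge 2a) \le e^{-(2a)^2/2n} = e^{-2a^2/n}, \\
\Pr(Y \ge (1+\delta)\mu) &= \Pr(X \ge 2\delta\mu) \le e^{-(2\delta\mu)^2/2n}
  = e^{-2\delta^2\mu^2/n} = e^{-\delta^2\mu}.
\end{align*}
```

The last equality uses $\mu = n/2$, so that $2\mu^2/n = \mu$.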

4.4. Application: Set Balancing

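• Given an $n \times m$ matrix $A$ with entries in $\{0,1\}$, let $A\bar{b} = \bar{c}$, where $\bar{b} = (b_1, \ldots, b_m)$ and $\bar{c} = (c_1, \ldots, c_n)$

• We are looking for a vector $\bar{b}$ with entries in $\{-1,1\}$ that minimizes
$$\|A\bar{b}\|_\infty = \max_{i=1,\ldots,n} |c_i|$$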


• This problem arises in designing statistical experiments
– Each column of the matrix $A$ represents a subject in the experiment and each row represents a feature
– The vector $\bar{b}$ partitions the subjects into two disjoint groups, so that each feature is roughly as balanced as possible between the two groups
– One of the groups serves as a control group for an experiment that is run on the other group

• We randomly choose the entries of $\bar{b}$, with
$$\Pr(b_i = 1) = \Pr(b_i = -1) = \frac{1}{2}$$
• The choices for different entries are independent
• Surprisingly, although this algorithm ignores the entries of the matrix $A$, $\|A\bar{b}\|_\infty$ is likely to be only $O(\sqrt{m\ln n})$
• This bound is fairly tight: When $m = n$, there exists a matrix $A$ for which $\|A\bar{b}\|_\infty$ is $\Omega(\sqrt{n})$ for any choice of $\bar{b}$


Theorem 4.11: For a random vector $\bar{b}$ with entries chosen independently and with equal probability from the set $\{-1,1\}$,
$$\Pr\left(\|A\bar{b}\|_\infty \ge \sqrt{4m\ln n}\right) \le \frac{2}{n}.$$
Proof: Consider the $i$th row $\bar{a}_i = a_{i,1}, \ldots, a_{i,m}$, and let $k$ be the number of 1s in that row.
If $k \le \sqrt{4m\ln n}$, then clearly
$$|\bar{a}_i \cdot \bar{b}| = |c_i| \le \sqrt{4m\ln n}.$$

On the other hand, if $k > \sqrt{4m\ln n}$ then we note that the $k$ nonzero terms in the sum
$$Z_i = \sum_{j=1}^{m} a_{i,j} b_j$$
are independent random variables, each with probability 1/2 of being either $+1$ or $-1$.
Now using the Chernoff bound of Corollary 4.8 and the fact that $m \ge k$,
$$\Pr\left(|Z_i| > \sqrt{4m\ln n}\right) \le 2e^{-4m\ln n/2k} \le \frac{2}{n^2}.$$
By the union bound, the probability that the bound fails for any row is at most $2/n$.
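The randomized algorithm is short enough to simulate directly. The sketch below draws random sign vectors for one fixed 0/1 matrix and reports how often $\|A\bar{b}\|_\infty$ reaches the threshold $\sqrt{4m\ln n}$. The matrix is itself random here purely for illustration, and the function names, seed, and sizes are assumptions rather than part of the lecture.

```python
import math
import random


def inf_norm(A, b):
    """Return ||A b||_inf = max_i |sum_j A[i][j] * b[j]|."""
    return max(abs(sum(a * x for a, x in zip(row, b))) for row in A)


def random_sign_vector(m, rng):
    """Each entry is +1 or -1 with probability 1/2, independently."""
    return [rng.choice((-1, 1)) for _ in range(m)]


def failure_rate(A, trials, rng):
    """Fraction of random vectors b with ||A b||_inf >= sqrt(4 m ln n)."""
    n, m = len(A), len(A[0])
    threshold = math.sqrt(4 * m * math.log(n))
    fails = sum(inf_norm(A, random_sign_vector(m, rng)) >= threshold
                for _ in range(trials))
    return fails / trials


rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
n = m = 50
A = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n)]
print("empirical failure rate:", failure_rate(A, 200, rng), "bound:", 2 / n)
```

Theorem 4.11 bounds the true failure probability by $2/n$ for any matrix $A$; an empirical rate from a few hundred trials is only a rough check of that statement.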


5 Balls, Bins, and Random Graphs

• Let us throw $m$ balls randomly into $n$ bins; each ball lands in a bin chosen independently and uniformly at random (I+U@R)
• We use the techniques we have developed previously to analyze this process and develop a new approach based on what is known as the Poisson approximation


5.1. Example: The Birthday Paradox

• Is it more likely that some two people in the room share the same birthday or that no two people in the room share the same birthday?
• We assume that the birthday of each person is a
random day from a 365-day year, each chosen
I+U@R
• We assume that a person's birthday is equally
likely to be any day of the year, we avoid leap
years, and we ignore the possibility of twins

• Let there be 30 people
• Thirty days must be chosen from the 365; there are $\binom{365}{30}$ ways to do this
• These 30 days can be assigned to the people in any of the 30! possible orders
• Hence there are $\binom{365}{30}\,30!$ configurations where no two people share the same birthday, out of the $365^{30}$ ways the birthdays could occur
• Thus, the probability is
$$\frac{\binom{365}{30}\,30!}{365^{30}}$$

• We can also consider one person at a time
– The first person has a birthday
– The probability that the second person has a different birthday is $(1 - 1/365)$
– The probability that the third person then has a birthday different from the first two, given that they have different birthdays, is $(1 - 2/365)$
– Continuing on, the probability that the $k$th person has a different birthday than the first $k-1$, assuming that they have different birthdays, is $(1 - (k-1)/365)$

• So the probability that 30 people all have different birthdays is the product of these terms:
$$\left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{29}{365}\right)$$

• This product is 0.2937, so when 30 people are in the room there is more than a 70% chance that two share the same birthday
• A similar calculation shows that only 23 people need to be in the room before it is more likely than not that two people share a birthday
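The product and the threshold of 23 can be reproduced with a few lines of Python. This is a minimal sketch under the same modelling assumptions as the slides (365 equally likely days, independent birthdays); the function names are illustrative.

```python
def prob_all_distinct(people: int, days: int = 365) -> float:
    """Probability that `people` independent uniform birthdays are all distinct."""
    prob = 1.0
    for j in range(1, people):
        prob *= 1 - j / days
    return prob


def smallest_group_with_likely_match(days: int = 365) -> int:
    """Smallest group size for which a shared birthday is more likely than not."""
    people = 1
    while prob_all_distinct(people, days) >= 0.5:
        people += 1
    return people


print(round(prob_all_distinct(30), 4))      # 0.2937, the value quoted above
print(smallest_group_with_likely_match())   # 23
```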


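• More generally, if there are $m$ people and $n$ possible birthdays then the probability that all $m$ have different birthdays is
$$\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{m-1}{n}\right) = \prod_{j=1}^{m-1}\left(1 - \frac{j}{n}\right)$$

• Using that $1 - k/n \approx e^{-k/n}$ when $k$ is small compared to $n$, we see that if $m$ is small compared to $n$ then
$$\prod_{j=1}^{m-1}\left(1 - \frac{j}{n}\right) \approx \prod_{j=1}^{m-1} e^{-j/n} = \exp\left(-\sum_{j=1}^{m-1}\frac{j}{n}\right) = e^{-m(m-1)/2n} \approx e^{-m^2/2n}$$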


• Hence the value for $m$ at which the probability that $m$ people all have different birthdays is 1/2 is approximately given by the equation
$$\frac{m^2}{2n} = \ln 2,$$
or $m = \sqrt{2n\ln 2}$
• For $n = 365$, this approximation gives $m = 22.49$, matching the exact calculation quite well
• Mars has $n = 687$ days, need $m = 30.86$ aliens
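The two numerical values follow directly from $m = \sqrt{2n\ln 2}$. A small sketch that evaluates the formula and the underlying approximation $e^{-m^2/2n}$ (function names are illustrative):

```python
import math


def approx_threshold(days: int) -> float:
    """Approximate group size sqrt(2 n ln 2) at which Pr(all distinct) is about 1/2."""
    return math.sqrt(2 * days * math.log(2))


def approx_prob_all_distinct(people: int, days: int) -> float:
    """Approximation exp(-m^2 / 2n) of the probability of all-distinct birthdays."""
    return math.exp(-people ** 2 / (2 * days))


for days in (365, 687):
    print(days, round(approx_threshold(days), 2))  # 22.49 and 30.86

print(approx_prob_all_distinct(23, 365))
```

Since the formula rests on two approximations, it is best read as an estimate of where the 1/2 crossing lies rather than as the exact integer threshold.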

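• The following simple arguments give loose bounds and good intuition
• Let us consider each person one at a time, and let $E_k$ be the event that the $k$th person's birthday does not match any of the birthdays of the first $k-1$ people
• Then the probability that the first $k$ people fail to have distinct birthdays is
$$\Pr(\bar{E}_1 \cup \bar{E}_2 \cup \cdots \cup \bar{E}_k) \le \sum_{i=1}^{k}\Pr(\bar{E}_i) \le \sum_{i=1}^{k}\frac{i-1}{n} = \frac{k(k-1)}{2n}$$

• If $k \le \sqrt{n}$ this Pr is $< 1/2$, so with $\lfloor\sqrt{n}\rfloor$ people the Pr is $\ge 1/2$ that all birthdays will be distinct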

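• Now assume that the first $\lceil\sqrt{n}\rceil$ people all have distinct birthdays
• Each person after that has probability at least $\sqrt{n}/n = 1/\sqrt{n}$ of having the same birthday as one of these first $\lceil\sqrt{n}\rceil$ people
• Hence the Pr that the next $\lceil\sqrt{n}\rceil$ people all have different birthdays than the first $\lceil\sqrt{n}\rceil$ is at most
$$\left(1 - \frac{1}{\sqrt{n}}\right)^{\lceil\sqrt{n}\rceil} < \frac{1}{e} < \frac{1}{2}$$
• Hence, once there are $2\lceil\sqrt{n}\rceil$ people, the Pr is at most $1/e$ that all birthdays will be distinct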
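These two loose bounds can be checked against the exact probability for concrete $n$. The sketch below does so for a few sample values; the helper names are hypothetical and the values of $n$ are arbitrary illustrations.

```python
import math


def prob_all_distinct(people: int, days: int) -> float:
    """Exact probability that `people` uniform birthdays over `days` days are distinct."""
    prob = 1.0
    for j in range(1, people):
        prob *= 1 - j / days
    return prob


def loose_bound_check(days: int) -> tuple:
    """Return Pr(all distinct) for floor(sqrt(n)) people and for 2*ceil(sqrt(n)) people."""
    root_floor = math.isqrt(days)
    root_ceil = root_floor + (root_floor * root_floor < days)
    return (prob_all_distinct(root_floor, days),
            prob_all_distinct(2 * root_ceil, days))


for days in (365, 687):
    few, many = loose_bound_check(days)
    print(days, round(few, 3), round(many, 3))
```

The first value should be at least 1/2 and the second at most $1/e$, as the two arguments predict.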

5.2. Balls into Bins

• $m$ balls are thrown into $n$ bins, with the location of each ball chosen I+U@R from the $n$ possibilities
• The question behind the birthday paradox is
whether or not there is a bin with two balls
• How many of the bins are empty?
• How many balls are in the fullest bin?
• Many of these questions have applications to the
design and analysis of algorithms

• Birthday paradox: if we place $m$ balls randomly into $n$ bins then, for some $m = \Omega(\sqrt{n})$, at least one of the bins is likely to have more than one ball in it
• Another interesting question concerns the max number of balls in a bin, or the maximum load
• Let us consider the case where $m = n$, so that the number of balls equals the number of bins and the average load is 1
• Of course the maximum possible load is $n$, but it is very unlikely that all $n$ balls land in the same bin
• We seek an upper bound that holds with probability tending to 1 as $n$ grows large

• We can show that the maximum load is more than $3\ln n/\ln\ln n$ with probability at most $1/n$ for sufficiently large $n$ via a direct calculation and a union bound
• This is a very loose bound; although the maximum load is in fact $\Theta(\ln n/\ln\ln n)$ with probability close to 1, the constant factor 3 is chosen to simplify the argument and could be reduced with more care
Lemma 5.1: When $n$ balls are thrown I+U@R into $n$ bins, the probability that the maximum load is more than $3\ln n/\ln\ln n$ is at most $1/n$ for $n$ sufficiently large
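A quick simulation gives a sense of how loose the constant 3 is. The sketch below throws $n$ balls into $n$ bins several times and compares the largest observed load with $3\ln n/\ln\ln n$; the function names, seed, and trial counts are illustrative assumptions.

```python
import math
import random
from collections import Counter


def max_load(n_balls: int, n_bins: int, rng: random.Random) -> int:
    """Throw balls into bins independently and uniformly; return the fullest bin's load."""
    counts = Counter(rng.randrange(n_bins) for _ in range(n_balls))
    return max(counts.values())


def lemma_threshold(n: int) -> float:
    """The value 3 ln n / ln ln n from Lemma 5.1 (needs n >= 3 so that ln ln n > 0)."""
    return 3 * math.log(n) / math.log(math.log(n))


rng = random.Random(1)  # arbitrary seed, for repeatability only
for n in (100, 10_000):
    loads = [max_load(n, n, rng) for _ in range(20)]
    print(n, max(loads), round(lemma_threshold(n), 2))
```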

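Proof: The probability that bin 1 receives at least $M$ balls is at most
$$\binom{n}{M}\left(\frac{1}{n}\right)^{M}.$$

This follows from a union bound; there are $\binom{n}{M}$ distinct sets of $M$ balls, and for any set of $M$ balls the probability that all land in bin 1 is $(1/n)^M$.
We now use the inequalities
$$\binom{n}{M}\left(\frac{1}{n}\right)^{M} \le \frac{1}{M!} \le \left(\frac{e}{M}\right)^{M}.$$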


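• The second inequality is a consequence of the following general bound on factorials: since
$$\frac{k^k}{k!} < \sum_{i=0}^{\infty}\frac{k^i}{i!} = e^k,$$
we have
$$k! > \left(\frac{k}{e}\right)^{k}$$
• Applying a union bound again allows us to find that, for $M \ge 3\ln n/\ln\ln n$, the probability that any bin receives at least $M$ balls is bounded above by
$$n\left(\frac{e}{M}\right)^{M} \le n\left(\frac{e\ln\ln n}{3\ln n}\right)^{3\ln n/\ln\ln n} \le n\left(\frac{\ln\ln n}{\ln n}\right)^{3\ln n/\ln\ln n} = e^{\ln n}\left(e^{\ln\ln\ln n - \ln\ln n}\right)^{3\ln n/\ln\ln n} = e^{-2\ln n + 3(\ln n)(\ln\ln\ln n)/\ln\ln n} \le \frac{1}{n}$$
for $n$ sufficiently large.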
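The first expression in this chain, $n(e/M)^M$, can be evaluated directly to see how it compares with $1/n$. The sketch below works with logarithms to avoid underflow and also spot-checks the factorial bound $k! > (k/e)^k$; the function name and the sample values of $n$ are illustrative, and $M$ is treated as a real number rather than rounded up to an integer.

```python
import math


def log_union_bound(n: int, M: float) -> float:
    """Natural log of n * (e / M)^M, the bound on Pr(some bin receives >= M balls)."""
    return math.log(n) + M * (1 - math.log(M))


# Spot-check of the factorial bound k! > (k / e)^k used in the proof
print(all(math.factorial(k) > (k / math.e) ** k for k in range(1, 30)))

for n in (100, 1000, 10 ** 6):
    M = 3 * math.log(n) / math.log(math.log(n))
    # Compare ln(bound) with ln(1/n) = -ln n
    print(n, round(M, 2), log_union_bound(n, M) <= -math.log(n))
```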
