Today, Lecture 3
1. Organizational matters:
• today stop 10 minutes early
• next week in lecture room C2
2. Basic Probability Theory
3. Kelly Betting
4. Revisiting OS/Counterfactuals in terms of Kelly Betting
5. (Basic Probability Theory Continued; Basic Bayesian Statistics)
Probability
• Ω: sample space
• either for just one outcome or for a fixed number, say 𝑛, of outcomes
• 𝑃: probability distribution on Ω, identified by its mass function 𝑝 (in case Ω is countable) or density function 𝑓 (in case $\Omega = \mathbb{R}^n$)
Example:
• $\Omega = \{1, 2, \dots, 6\}$; $P(A) = \sum_{a \in A} p(a)$
• $P(\{2,4,6\}) = P(\mathrm{even})$ makes sense, $p(\{2,4,6\})$ does not
Random Variables
• A random variable $X$ is a function from the sample space $\Omega$ to a set of values $\mathcal{X}$
Conditioning
With events:
• $P(A \mid B) := \frac{P(A \cap B)}{P(B)}$ if $P(B) > 0$,
e.g. $P(4 \mid \mathrm{even}) = P(4 \mid \{2,4,6\}) = \frac{1/6}{1/2} = \frac{1}{3}$
With random variables:
• For all $x \in \mathcal{X}$, $y \in \mathcal{Y}$ with $P(Y = y) > 0$:
$$P(X = x \mid Y = y) := \frac{P(X = x, Y = y)}{P(Y = y)}$$
• Abbreviated to $p(X \mid Y) := \frac{p(X, Y)}{p(Y)}$, written with capitals to denote that the statement really holds for all ‘instantiations’ of $X$ and $Y$.
Joint, Marginal, Independence
• …proof:

“Chain” or “Product” or “Telescoping” Rule (Extremely Important!)
• …proof:
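A minimal statement of the rule, assuming the standard form (which matches the telescoping product used later for the capital process; the exact notation on the slide may differ):
$$p(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} p(X_i \mid X_1, \dots, X_{i-1}) \;=\; p(X_1)\, p(X_2 \mid X_1) \cdots p(X_n \mid X^{n-1})$$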
Today, Lecture 3
1. Organizational matters:
• today stop 10 minutes early
• next week in lecture room C2
2. Basic Probability Theory
3. Kelly Betting
4. Revisiting OS/Counterfactuals in terms of Kelly Betting
5. (Basic Probability Theory Continued; Basic Bayesian Statistics)
"̅! # "
Interpreting in terms of betting
"# (# " )
E-Processes and Betting!
Kelly (1956)
• Example: $p_0(\mathrm{red}) = \frac{1}{2}$, $p_1(\mathrm{red}) = 1$; $\mathbf{E}_{p_0}[M] = 1$, $\mathbf{E}_{\bar{p}_1}[M] = \,?$
E-Processes and Betting
$$\mathbf{E}_{p_0}\!\left[\frac{\bar{p}_1(X_t)}{p_0(X_t)}\right] = 1$$
• “In a real casino, it does not matter what strategy you use at time $t$, you do not expect to gain any money”
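As a quick sanity check, here is a minimal simulation sketch (my own illustration, using the idealized red/black casino from the example above, with $p_1(\mathrm{red}) = 1$): the single-round betting payoff $\bar{p}_1(X)/p_0(X)$ has expectation 1 under the null.

import random

# Idealized casino: under the null, red and black each have probability 1/2.
# The bettor plays p1 with p1(red) = 1, so the single-round payoff
# multiplier p1(x)/p0(x) is 2 on red and 0 on black.
def payoff(x: str) -> float:
    p0 = {"red": 0.5, "black": 0.5}
    p1 = {"red": 1.0, "black": 0.0}
    return p1[x] / p0[x]

rng = random.Random(0)
n = 100_000
mean = sum(payoff(rng.choice(["red", "black"])) for _ in range(n)) / n
print(mean)  # close to 1.0: no expected gain under the null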
E-Processes and Betting
$$\mathbf{E}_{p_0}\!\left[\frac{\bar{p}_1(X^{(\tau)})}{p_0(X^{(\tau)})}\right] = 1$$
• “In a real casino, it does not matter what strategy you use and what rule you use for deciding when to stop, you do not expect to gain any money”
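A simulation sketch of the optional-stopping claim (again my own illustration, not from the slides): bet everything on red each round at fair payoff odds, and stop once the capital doubles, at ruin, or after 50 rounds. The expected capital at the stopping time is still 1.

import random

# Capital process in the idealized casino: start with 1 unit; each round
# the capital is multiplied by 2 on red and by 0 on black.
# Stopping rule: stop once capital >= 2, at ruin, or after 50 rounds.
def capital_at_stop(rng: random.Random) -> float:
    capital = 1.0
    for _ in range(50):
        capital *= 2.0 if rng.choice(["red", "black"]) == "red" else 0.0
        if capital >= 2.0 or capital == 0.0:
            break
    return capital

rng = random.Random(1)
runs = 100_000
print(sum(capital_at_stop(rng) for _ in range(runs)) / runs)  # close to 1.0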
E-Processes and Betting
• If the null is true, you do not expect to gain any money, under any stopping time, no matter what strategy $\bar{p}_1$ you use
• So which one should you use?
• Intuitively, this should be a strategy that lets you get rich as fast as possible if the
alternative is true.
Composite $H_1$
• If the null is true, you do not expect to gain any money, under any stopping time, no matter what strategy $\bar{p}_1$ you use
• If you think $H_0$ is wrong, but you do not know which alternative is true, then… you can try to learn $p_1$
• Use a $\bar{p}_1$ that better and better mimics the true, or just “best”, fixed $p_1$
Example: $H_0 : X_i \sim \mathrm{Ber}(\frac{1}{2})$; set
$$\bar{p}_1(X_{n+1} = 1 \mid x^n) := \frac{n_1 + 1}{n + 2},$$
where $n_1$ is the number of 1s in $x^n$.
…we use notation for conditional probabilities, but we should really think of $\bar{p}_1$ as a sequential betting strategy, with the “conditional probabilities” indicating how to bet/invest in the next round, given the past data.
…still, formally, using telescoping-in-reverse, we find that $\bar{p}_1$ also uniquely defines a marginal probability distribution for $X^n$, for each $n$, and our accumulated capital at time $n$ is again given by the likelihood ratio:
$$\frac{\bar{p}_1(X^n)}{p_0(X^n)} = \prod_{i=1,\dots,n} \frac{\bar{p}_1(X_i \mid X^{i-1})}{p_0(X_i \mid X^{i-1})}$$
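To see this betting strategy in action, here is a minimal simulation sketch (my own illustration; only the Laplace-style rule above comes from the slide, the rest of the setup is assumed): the capital process grows exponentially when $H_0$ is false (data drawn from Ber(0.8)) and stays of order one when $H_0$ is true.

import random

def capital_process(xs: list[int]) -> float:
    """Accumulated capital at time n: the product of per-round likelihood
    ratios between the learned strategy bar-p1 and the Ber(1/2) null."""
    capital = 1.0
    n1 = 0  # number of 1s seen so far
    for n, x in enumerate(xs):
        p_one = (n1 + 1) / (n + 2)      # bar-p1(X_{n+1} = 1 | x^n)
        p_bet = p_one if x == 1 else 1.0 - p_one
        capital *= p_bet / 0.5          # p0 puts probability 1/2 on either outcome
        n1 += x
    return capital

rng = random.Random(42)
n = 500
under_alt = [int(rng.random() < 0.8) for _ in range(n)]   # H0 false: Ber(0.8)
under_null = [int(rng.random() < 0.5) for _ in range(n)]  # H0 true: Ber(1/2)
print(capital_process(under_alt))   # typically astronomically large
print(capital_process(under_null))  # typically of order 1 or smaller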
Extensions
• For the betting analogy to hold, the bets must not be strictly favourable to you if the null hypothesis is true. So the capital process in a real casino (which includes a 0 outcome, so that the actual probability is $P_0(\mathrm{red}) = P_0(\mathrm{black}) = 18/37$) is also an e-process.
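For instance (a step worked out here for concreteness, assuming an all-in even-money bet on red): under the true casino probabilities,
$$\mathbf{E}_{P_0}[\text{payoff}] = 2 \cdot \tfrac{18}{37} = \tfrac{36}{37} < 1,$$
so the capital process is a nonnegative supermartingale, which is enough for the e-process property.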
p-value problem 1 goes away
• Suppose I plan to test a new medication on exactly 100 patients. I do this and obtain a (just) significant result (p = 0.03 based on fixed n = 100). But just to make sure, I ask a statistician whether I did everything right.
• The statistician asks: what would you have done if your result had been ‘almost-but-not-quite’ significant?
• I say: “Well, I never thought about that. Well, perhaps, but I’m not sure, I would have asked my boss for money to test another 50 patients.”
• Now the statistician says: that means your result is invalid!
• Suppose I plan to test a new medication on exactly 100 patients. I do this and obtain a (just) significant result ($S = 21$ based on fixed n = 100). But just to make sure, I ask a statistician whether I did everything right.
• The statistician asks: what would you have done if your result had been ‘almost-but-not-quite’ significant?
• I say: “Well, I never thought about that. Well, perhaps, but I’m not sure, I would have asked my boss for money to test another 50 patients.”
• Now the statistician says: this is completely fine, since the validity of your conclusion does not depend on the actual stopping rule you have used.
OR:
This is completely fine, since evidence is measured in terms of the money you gained, and that amount does not depend on what you would have done in situations that never occurred.
Today, Lecture 3
1. Organizational matters:
• today stop 10 minutes early
• next week in lecture room C2
2. Basic Probability Theory
3. Kelly Betting
4. Revisiting OS/Counterfactual Results in terms of Kelly Betting
5. (Basic Probability Theory Continued; Basic Bayesian Statistics)
Bayes’ Theorem
• We know
$$P(H \mid D) = \frac{P(D \cap H)}{P(D)}$$
and $P(D \cap H) = P(H) \cdot P(D \mid H)$
…so combining both conditional probability statements we get
$$P(H \mid D) = \frac{P(D \mid H) \cdot P(H)}{P(D)}$$
with $P(D \mid H)$ the likelihood, $P(H)$ the prior probability of $H$, and $P(H \mid D)$ the posterior probability of $H$.
Two Fundamentally Different Uses of Bayes’ Theorem
1. A priori probabilities can be meaningfully estimated (medical testing, for example!)
…just an application of a mathematical theorem
2. A priori probabilities are a mere guess (and conceivably do not even exist); we will now see an example of this:
“Bayesian learning/Bayesian statistics”
• Every medical test has a certain sensitivity and specificity. The sensitivity
is the probability of a positive result, given that you have the disease.
The specificity is the probability of a negative result, given that you do
not have the disease.
• If you know the probability that an average person has the disease (i.e. the frequency in the population), you can take this as your ‘prior’ and then calculate the ‘posterior’ probability that you have the disease given a positive test result via Bayes’ theorem ⇒ homework
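A minimal sketch of this calculation in code (the function and the numbers below are my own illustration; they are not the homework values, which are not given here):

def posterior_disease(prior: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) via Bayes' theorem:
    P(D | +) = P(+ | D) P(D) / [ P(+ | D) P(D) + P(+ | not D) P(not D) ],
    where P(+ | D) = sensitivity and P(+ | not D) = 1 - specificity."""
    p_pos_given_d = sensitivity
    p_pos_given_not_d = 1.0 - specificity
    p_pos = p_pos_given_d * prior + p_pos_given_not_d * (1.0 - prior)
    return p_pos_given_d * prior / p_pos

# Illustrative numbers only: 1% prevalence, 90% sensitivity, 95% specificity.
print(posterior_disease(prior=0.01, sensitivity=0.90, specificity=0.95))
# ≈ 0.154: even after a positive result, having the disease stays fairly unlikely.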
Bayesian Inference, Toy Example
You get kidnapped, sedated, and wake up in a foreign country. You only know that:
• You are either in Sweden or France
• Two thirds of all Swedes are blond
• One third of all French are blond
• You see three blond persons and one non-blond person
• Statistical model
• prior probability
• likelihood
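In symbols (my sketch of the setup; the slide’s exact notation may differ, and the symmetric prior is an assumption): writing $\theta$ for the probability of blond,
$$\text{model: } X_j \sim \mathrm{Ber}(\theta), \quad \theta_{\mathrm{Sweden}} = \tfrac{2}{3}, \;\; \theta_{\mathrm{France}} = \tfrac{1}{3}; \qquad \text{prior: } \pi(\mathrm{Sweden}) = \pi(\mathrm{France}) = \tfrac{1}{2};$$
$$\text{likelihood of three blond, one non-blond: } p_\theta(x^4) = \theta^3 (1 - \theta).$$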
• Before you see anybody:
• 2nd is not-blond:
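Working these out (my numbers, under the assumed symmetric prior): before you see anybody, $\pi(\mathrm{Sweden}) = \frac{1}{2}$. Each blond person multiplies the odds for Sweden by $\frac{2/3}{1/3} = 2$; a non-blond person (like the 2nd) multiplies them by $\frac{1/3}{2/3} = \frac{1}{2}$. After three blond and one non-blond person:
$$\frac{\pi(\mathrm{Sweden} \mid x^4)}{\pi(\mathrm{France} \mid x^4)} = \frac{(2/3)^3 (1/3)}{(1/3)^3 (2/3)} = 4, \qquad \text{so} \quad \pi(\mathrm{Sweden} \mid x^4) = \tfrac{4}{5}.$$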
• And what happens if you are really in Germany, where 50% of the
people are blond?
Models
• $p_\theta(X^n)$ as a function of $\theta$
• NOTE: not a probability distribution over $\theta$
The Bayesian Posterior
• Posterior is…
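The formula itself is presumably the standard one (a reconstruction, combining the model, prior, and Bayes’ theorem from the earlier slides; the slide’s own derivation is not shown here):
$$\pi(\theta \mid X^n) = \frac{\pi(\theta)\, p_\theta(X^n)}{\sum_{\theta'} \pi(\theta')\, p_{\theta'}(X^n)} \;\propto\; \pi(\theta)\, p_\theta(X^n).$$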