Hypothesis Testing


Class 4

• Data must be interpreted in order to add meaning.


• We can interpret data by assuming a specific structure for our outcome and
using statistical methods to confirm or reject the assumption.
• The assumption is called a hypothesis and the statistical tests used for this
purpose are called statistical hypothesis tests.
• In statistics, a hypothesis test calculates some quantity under a given
assumption. The result of the test allows us to interpret whether the
assumption holds or whether the assumption has been violated.
• The assumption of a statistical test is called the null hypothesis, or
hypothesis 0 (H0 for short). It is often called the default assumption, or the
assumption that nothing has changed.
• A violation of the test’s assumption is often called the first hypothesis, the
alternative hypothesis, hypothesis 1 or H1 for short. H1 is really shorthand
for “some other hypothesis,” as all we know is that the evidence suggests
that H0 can be rejected.
• Hypothesis 0 (H0): The assumption of the test holds and fails to be rejected
at some level of significance.
• Hypothesis 1 (H1): The assumption of the test does not hold and is rejected
at some level of significance.
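The H0/H1 framing can be made concrete with a small sketch (the coin data below are made up for illustration). H0 is that a coin is fair; H1 is that it is not:

```python
# Hypothetical example: is a coin fair?
# H0: the coin is fair (probability of heads = 0.5).
# H1: the coin is not fair.
from scipy.stats import binomtest

# Made-up observation: 58 heads in 100 flips.
result = binomtest(58, n=100, p=0.5)
print(result.pvalue)
```

The test quantifies how surprising 58 heads would be if H0 were true; how to turn that number into a decision is the subject of the next sections.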
Statistical Test Interpretation
• The results of a statistical hypothesis test must be interpreted for us to start
making claims.
• There are two common forms that a result from a statistical hypothesis test
may take, and they must be interpreted in different ways. They are the p-
value and critical values.
Interpret the p-value
• We describe a finding as statistically significant by interpreting the p-value.
• A statistical hypothesis test may return a value called p or the p-value. This is
a quantity that we can use to interpret or quantify the result of the test and
either reject or fail to reject the null hypothesis. This is done by comparing the
p-value to a threshold value chosen beforehand called the significance level.
• The significance level is often referred to by the Greek lower case letter
alpha.
• A common value used for alpha is 5% or 0.05. A smaller alpha value, such
as 1% or 0.1%, demands stronger evidence before the null hypothesis is
rejected.
• The p-value is compared to the pre-chosen alpha value. A result is
statistically significant when the p-value is less than alpha. This signifies
a change was detected: that the default hypothesis can be rejected.
• If p-value > alpha: Fail to reject the null hypothesis (i.e. not significant
result).
• If p-value <= alpha: Reject the null hypothesis (i.e. significant result).
For example, if we were testing whether a data sample was normal and we
calculated a p-value of 0.07, we could state something like:
The test found that the data sample was normal, failing to reject the null
hypothesis at a 5% significance level.
• The significance level can be inverted by subtracting it from 1 to give a
confidence level of the hypothesis given the observed sample data.
confidence level = 1 - significance level
The test found that the data was normal, failing to reject the null
hypothesis at a 95% confidence level.
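A minimal sketch of this p-value interpretation, using SciPy's Shapiro-Wilk normality test on made-up Gaussian data (the sample and seed are assumptions for illustration):

```python
import numpy as np
from scipy.stats import shapiro

# Made-up sample drawn from a Gaussian distribution.
rng = np.random.default_rng(1)
data = rng.normal(loc=50, scale=5, size=100)

stat, p = shapiro(data)       # test statistic and p-value
alpha = 0.05                  # pre-chosen significance level

if p > alpha:
    print("Fail to reject H0: sample looks Gaussian (p=%.3f)" % p)
else:
    print("Reject H0: sample does not look Gaussian (p=%.3f)" % p)
```

Note that alpha is fixed before the test is run; choosing it after seeing the p-value defeats the purpose of the threshold.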
“Reject” vs “Failure to Reject”
• The p-value is probabilistic.
• This means that when we interpret the result of a statistical test, we
do not know what is true or false, only what is likely.
• Rejecting the null hypothesis means that there is sufficient statistical
evidence that the null hypothesis does not look likely. Otherwise, it
means that there is not sufficient statistical evidence to reject the null
hypothesis.
• Rather than saying we “accept” the null hypothesis, it is safer to say that
we “fail to reject” it, as in, there is insufficient statistical evidence to
reject it.
• Rather than saying the null hypothesis is “disproven,” it is safer to say
that we “reject” it, as in, there is sufficient statistical evidence to reject it.
Common p-value Misinterpretations
True or False Null Hypothesis
• The interpretation of the p-value does not mean that the null
hypothesis is true or false.
• It does mean that we have chosen to reject or fail to reject the null
hypothesis at a specific statistical significance level based on empirical
evidence and the chosen statistical test.
• You are limited to making probabilistic claims, not crisp binary or
true/false claims about the result.
p-value as Probability
• A common misunderstanding is that the p-value is a probability of the
null hypothesis being true or false given the data.
• Instead, the p-value can be thought of as the probability of the data
given the pre-specified assumption embedded in the statistical test.
• Again, using probability notation, this would be written as:
Pr(data | hypothesis)
• It allows us to reason about whether or not the data fits the
hypothesis.
Interpret Critical Values
• Some tests do not return a p-value.
• Instead, they might return a list of critical values and their associated
significance levels, as well as a test statistic.
• The results are interpreted in a similar way. Instead of comparing a
single p-value to a pre-specified significance level, the test statistic is
compared to the critical value at a chosen significance level.
• If test statistic < critical value: Fail to reject the null hypothesis.
• If test statistic >= critical value: Reject the null hypothesis.
• Again, the meaning of the result is similar: the decision at the chosen
significance level is a probabilistic decision to reject or fail to reject
the base assumption of the test given the data.
• Results are presented in the same way as with a p-value, as either
significance level or confidence level.
• For example, if a normality test was calculated and the test statistic
was compared to the critical value at the 5% significance level, results
could be stated as:
The test found that the data sample was normal, failing to reject the
null hypothesis at a 5% significance level.
The test found that the data was normal, failing to reject the null
hypothesis at a 95% confidence level.
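A sketch of interpreting critical values instead of a p-value, using SciPy's Anderson-Darling normality test, which returns a test statistic plus critical values at several significance levels (the data below are made up):

```python
import numpy as np
from scipy.stats import anderson

# Made-up sample drawn from a Gaussian distribution.
rng = np.random.default_rng(1)
data = rng.normal(loc=50, scale=5, size=100)

result = anderson(data)  # defaults to testing against a normal distribution
print("statistic: %.3f" % result.statistic)

# Compare the statistic to the critical value at each significance level.
for cv, sl in zip(result.critical_values, result.significance_level):
    if result.statistic < cv:
        print("%.1f%% level: fail to reject H0 (statistic < %.3f)" % (sl, cv))
    else:
        print("%.1f%% level: reject H0 (statistic >= %.3f)" % (sl, cv))
```

This mirrors the decision rule above: a test statistic below the critical value at a given level fails to reject the null hypothesis at that level.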
Errors in Statistical Tests
• The interpretation of a statistical hypothesis test is probabilistic.
• That means that the evidence of the test may suggest an outcome
and be mistaken.
• Given a small p-value (reject the null hypothesis), either the null
hypothesis is false (we got it right) or it is true and some rare and
unlikely event has been observed (we made a mistake). If this type of
error is made, it is called a false positive. We falsely believe that the
null hypothesis should be rejected.
• Alternatively, given a large p-value (fail to reject the null hypothesis), either
the null hypothesis is true (we got it right) or the null hypothesis is false and
some unlikely event occurred (we made a mistake). If this type of error is
made, it is called a false negative. We falsely believe the null hypothesis, or
assumption of the statistical test, to be true.
Each of these two types of error has a specific name.
• Type I Error: The incorrect rejection of a true null hypothesis or a false
positive.
• Type II Error: The incorrect failure to reject a false null hypothesis, or a
false negative.

Type-1 error: Type 1 error is the case when we reject the null hypothesis but
in actual fact it was true. The probability of making a Type-1 error is the
significance level alpha (α).
Type-2 error: Type 2 error is the case when we fail to reject the null
hypothesis but in actual fact it is false. The probability of making a Type-2
error is called beta (β).
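The link between alpha and the Type-1 error rate can be seen in a small simulation (all data made up): if H0 is true and we test at alpha = 0.05, we should reject it, wrongly, about 5% of the time.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
alpha = 0.05
trials = 1000
false_positives = 0

for _ in range(trials):
    # Both samples come from the same distribution, so H0 is true.
    a = rng.normal(size=30)
    b = rng.normal(size=30)
    _, p = ttest_ind(a, b)
    if p <= alpha:
        false_positives += 1  # a Type-1 error: rejecting a true H0

print("Type-1 error rate: %.3f" % (false_positives / trials))
```

The observed rate hovers around 0.05, matching the chosen significance level.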
Examples of Hypothesis Tests
• There are many types of statistical hypothesis tests.
• This section lists some common examples of statistical hypothesis
tests and the types of problems that they are used to address:
Variable Distribution Type Tests (Gaussian)
• Shapiro-Wilk Test
• D’Agostino’s K^2 Test
• Anderson-Darling Test
Variable Relationship Tests (correlation)
• Pearson’s Correlation Coefficient
• Spearman’s Rank Correlation
• Kendall’s Rank Correlation
• Chi-Squared Test
Compare Sample Means (parametric)
• Student’s t-test
• Paired Student’s t-test
• Analysis of Variance Test (ANOVA)
• Repeated Measures ANOVA Test
Compare Sample Means (nonparametric)
• Mann-Whitney U Test
• Wilcoxon Signed-Rank Test
• Kruskal-Wallis H Test
• Friedman Test
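Most of the tests listed above are available in SciPy's `scipy.stats` module. A sketch calling two of them on made-up data (the sample sizes, seed, and noise scale are arbitrary choices for illustration):

```python
import numpy as np
from scipy.stats import pearsonr, mannwhitneyu

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = x + rng.normal(scale=0.5, size=50)  # correlated with x by construction

# Variable relationship test: Pearson's correlation coefficient.
r, p_corr = pearsonr(x, y)

# Nonparametric comparison of two samples: Mann-Whitney U test.
u_stat, p_mw = mannwhitneyu(x, y)

print("Pearson r=%.2f (p=%.3f)" % (r, p_corr))
print("Mann-Whitney U=%.1f (p=%.3f)" % (u_stat, p_mw))
```

Each function returns a test statistic and a p-value, which are interpreted against a pre-chosen significance level exactly as described earlier.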
