Hypothesis Testing Class
STAT 1
SUMMARY STATISTICS
SAMPLE VS POPULATION
RANDOM VARIABLE
PROBABILITY DISTRIBUTION
SUMMARY STATISTICS
Why is it important for us to learn about summary statistics?
Description of a large number of data points
Generate inferences from the summary statistics
A very simple example
You work for a credit card company, and you have data on credit
card applications. You divide the data into applications from
customers who have a great payment record, and applications from
customers who have been late with payments at least 3 times in
the last year
Average salary for Group 1: $37,000. Standard Deviation: $5,000
Average Salary for Group 2: $26,000. Standard Deviation: $9,000
SUMMARY STATISTICS
Going beyond summary statistics
With large volumes of data:
Many trends
Multiple patterns
Some contradictory findings
How do we establish
Meaningful trends vs random noise
Relative importance or impact?
Inferential Statistics
Allows us to find statistically significant relationships
Remove noise
CASE STUDY
You receive customer complaints about a
courier vendor you use for delivery of
products purchased: they do not seem to be
delivering on time.
You confront the vendor and they claim that
they meet industry standards, and that the
complaints are a result of isolated, one-off situations
CASE STUDY
Is 6.18 different from 5.28?
Is it a big difference?
What would you conclude?
CASE STUDY 2
A/B TESTING
You are testing banner ads on your ecommerce website, with changes in copy.
You run one type of banner for 10 days, and check key KPIs (say purchase rate).
You then run the second banner for the next 10 days and check the purchase rate.
This is the data, on a daily basis.
SAMPLE VS POPULATION
In the previous slide, we have introduced the concept of sample and
population
Examples of Populations
All applications received for credit cards from Bank XYZ
All consumers of Product Y
What others can you think of?
Examples of Samples:
All applications received in the last 3 months
Women consumers over the age of 45 who have bought Product Y
in the last 6 months
What else?
SAMPLE VS POPULATION
Why do we need to separate the two?
Population (or the Universe) tends to be very large, making it difficult
(or impossible) to collect and analyze data on the population
It is easier to take a subset of the population, analyze the subset, and
then make inferences about the population
SAMPLE VS POPULATION
Let's say we have a population of 10000 respondents to a survey,
and we want to take a sample of 500. How many samples are
possible?
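The count asked for here is the number of ways to choose 500 respondents out of 10000 when order does not matter, that is, the binomial coefficient C(10000, 500). A minimal Python sketch, assuming sampling without replacement (the slide does not say which sampling scheme is meant):

```python
import math

population_size = 10_000
sample_size = 500

# Number of distinct samples when order does not matter and no
# respondent is picked twice (an assumption of this sketch).
n_samples = math.comb(population_size, sample_size)

# The exact integer is enormous, so report only how many digits it has.
n_digits = len(str(n_samples))
print(f"C({population_size}, {sample_size}) has {n_digits} digits")
```

The point of the exercise is the scale: any single sample we analyze is just one of a very large number of samples we could have drawn.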
CASE STUDY
You work for Airline X as GM Sales. For any flight, let's assume for simplicity
that there are always 100 seats. When selling flight tickets, you could always
sell exactly 100 tickets. However, you know that on most flights, some people
will not show up, so you could increase sales by reselling those seats.
How do you decide how many tickets you should oversell for each flight?
Remember, if you oversell by more than the number of people who don't show up, you will lose
money by putting those people on other flights and paying some compensation
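One way to frame the overselling question is to treat the number of passengers who show up as a random variable and pick the ticket count with the highest expected profit. The sketch below is illustrative only: the show-up probability, ticket price, and bumping cost are hypothetical values that do not come from the case, and passengers are assumed to show up independently of one another.

```python
import math

SEATS = 100
P_SHOW = 0.95     # hypothetical probability that a ticket holder shows up
PRICE = 1.0       # hypothetical revenue per ticket, in arbitrary units
BUMP_COST = 3.0   # hypothetical cost per passenger denied boarding

def binom_pmf(k, n, p):
    """P(exactly k of n independent ticket holders show up)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def expected_profit(tickets_sold):
    """Ticket revenue minus the expected cost of bumped passengers."""
    expected_bumped = sum(
        (shows - SEATS) * binom_pmf(shows, tickets_sold, P_SHOW)
        for shows in range(SEATS + 1, tickets_sold + 1)
    )
    return tickets_sold * PRICE - BUMP_COST * expected_bumped

candidates = range(SEATS, SEATS + 16)
best_tickets = max(candidates, key=expected_profit)
print("tickets to sell:", best_tickets, "oversell by:", best_tickets - SEATS)
```

Changing the three assumed constants changes the answer, which is the point of the case: the decision depends on the probability distribution of no-shows and on the relative cost of an empty seat versus a bumped passenger.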
RANDOM VARIABLES
AND PROBABILITY
We introduce the concept of random variables and how they apply to statistical
theory
A random variable is a variable whose numerical value is
determined by the outcome of an experiment.
ANOTHER EXAMPLE
Fast Courier Delivery Times:
Random variable?
Remember: it's not random in the sense that you
have zero knowledge
Range of possible outcomes?
Probability of each possible outcome?
PROBABILITY DISTRIBUTION
Fast Courier:
In a sample of 39 observations
What is probability of a random delivery taking 8 days?
8 days or more?
PROBABILITY DISTRIBUTION
A probability distribution therefore lists all possible outcomes of an
experiment together with the probability of each outcome
A discrete probability distribution covers a countable list of outcomes,
and a continuous distribution covers an experiment whose possible outcomes
fall on a continuous range
1. There are only two possible outcomes: Win or Lose, 1 or 0, Male or Female
2. There are no external factors influencing the probability of each outcome over time
3. The chances of each outcome are independent of previous results
BINOMIAL DISTRIBUTION
n is fixed
Each observation represents one of the two outcomes
P is the same for each observation
Each observation is independent
2 Late customers:
P = 0.3456
3 late customers:
P = 0.2304
4 late customers:
P = 0.0768
5 late customers:
P = 0.0102
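The listed probabilities for 2, 3, and 4 late customers are consistent with a binomial distribution with n = 5 customers and a probability of 0.4 that any one customer is late. Those two parameters are inferred from the numbers and are not stated in the surviving slide text, so treat them as an assumption. A short Python sketch of the binomial formula under that assumption:

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

N_CUSTOMERS = 5   # inferred from the listed probabilities, not stated
P_LATE = 0.4      # inferred from the listed probabilities, not stated

late_probs = {k: binom_pmf(k, N_CUSTOMERS, P_LATE)
              for k in range(N_CUSTOMERS + 1)}
for k, prob in late_probs.items():
    print(f"{k} late customers: P = {prob:.4f}")
```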
STAT 2
OUTLINE
POISSON PROBABILITY DISTRIBUTION
NORMAL PROBABILITY DISTRIBUTION
HYPOTHESIS TESTING
POISSON DISTRIBUTION
Another discrete probability distribution, used to model the number of events
occurring in a fixed time frame
Examples include:
1. Number of insurance claims in a month
2. Disease spread in a day
3. Number of telephone calls in an hour
4. Number of patients needing emergency services in a day
POISSON DISTRIBUTION
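Poisson probabilities are calculated as:
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
where $\lambda$ is the average number of events per time frame and $k$ is the number of events whose probability we want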
POISSON DISTRIBUTION
GROUP CASE STUDY
CAPACITY PLANNING:
You need 4 resources to process orders via your website daily. Currently you
receive an average of 392 transactions a day. Maximum transactions per
resource per day is 101. Your website cannot handle > 450 transactions in
one day.
Avg profit per transaction is Rs 120. Should you invest in an additional resource at a cost of Rs
300 per day?
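One possible way to approach the capacity question is sketched below in Python. It assumes that daily transaction volume follows a Poisson distribution with mean 392, that transactions beyond current capacity (4 x 101 = 404) are lost rather than carried over to the next day, and that the Rs 120 profit applies per transaction. None of these assumptions is confirmed by the case text.

```python
import math

LAMBDA = 392             # average transactions per day (from the case)
CAPACITY_NOW = 4 * 101   # 404 transactions with 4 resources
SITE_LIMIT = 450         # the website cannot handle more than this per day
PROFIT_PER_TXN = 120     # Rs, assumed to be per transaction
EXTRA_COST = 300         # Rs per day for a fifth resource

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable, computed in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

# Probability that demand exceeds what 4 resources can process.
p_over_capacity = 1 - sum(poisson_pmf(k, LAMBDA) for k in range(CAPACITY_NOW + 1))

# Expected extra transactions a fifth resource would capture. Demand above
# the site limit is lost either way, so the daily gain is capped at 450 - 404.
expected_extra = sum(
    (min(k, SITE_LIMIT) - CAPACITY_NOW) * poisson_pmf(k, LAMBDA)
    for k in range(CAPACITY_NOW + 1, 1000)
)
expected_gain = expected_extra * PROFIT_PER_TXN

print(f"P(demand > {CAPACITY_NOW}) = {p_over_capacity:.3f}")
print(f"expected gain = Rs {expected_gain:.0f} per day vs cost Rs {EXTRA_COST}")
```

The decision then comes down to comparing the expected daily gain with the Rs 300 daily cost, under the assumptions stated above.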
NORMAL DISTRIBUTION
Area under the curve:
The total area under a normal probability curve is always 1. This property
allows us to think of the area as probability, and therefore we can compute
the probability of an outcome falling between two values on the curve
NORMAL DISTRIBUTION
NORMAL DISTRIBUTION
We can calculate probabilities of any X given mean and std deviation
If total delivery time is normally distributed with a mean of 6 days and a std
deviation of two days, what is the probability that a random delivery takes less
than 9 days?
Use Excel:
Formula: NORM.DIST(Outcome x, Mean, Std Dev, cumulative)=
NORM.DIST(9,6,2,?)
Is this the right answer?
NORMAL DISTRIBUTION
Delivery will take more than 4 days?
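The two delivery-time questions can be checked outside Excel as well. The Python sketch below uses the standard library's `NormalDist`; its `cdf` method returns the cumulative probability, which corresponds to calling NORM.DIST with the cumulative argument set to TRUE.

```python
from statistics import NormalDist

delivery = NormalDist(mu=6, sigma=2)   # mean and std deviation in days

p_less_than_9 = delivery.cdf(9)        # cumulative area to the left of 9
p_more_than_4 = 1 - delivery.cdf(4)    # upper tail, so subtract the CDF from 1

print(f"P(X < 9) = {p_less_than_9:.4f}")
print(f"P(X > 4) = {p_more_than_4:.4f}")
```

Note that a "more than" question needs one minus the cumulative value, because the cumulative function always measures the area to the left.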
NORMAL DISTRIBUTION
HYPOTHESIS TESTING
A statistical hypothesis is an assumption about a population parameter
Hypothesis testing is a procedure for deciding, based on sample data, whether to reject or fail to reject that assumption
HYPOTHESIS TESTING
Inventory Optimization
You want to optimize inventory costs, and you are reviewing mobile electronic
equipment. You have been using a daily sales average for this category as 315,
with a std deviation of 85 (based on past data a year ago). You take a current
sample of the last 45 days to validate, and find that avg daily sales are 338. Should
you increase inventory levels?
SIMPLE ANSWERS?
Yes, the data clearly shows an increase, so raise inventory levels.
No, we need a larger sample before we decide
HYPOTHESIS TESTING
One way to solve this is to run a hypothesis test.
1. Set up NULL HYPOTHESIS
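Central Limit Theorem: the mean of a sample of size n is approximately normally distributed, with mean $\mu$ and std deviation $\sigma/\sqrt{n}$, where $\mu$ and $\sigma$ are the population mean and std deviation
Implications: If sample size is sufficiently large (> 30), you can generally use a
normal distribution as your test distribution without worrying about the true
population distribution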
HYPOTHESIS TESTING
Probability of Observed Outcome?
Use Central Limit Theorem.
This sample is one of multiple possible samples from the population of customers,
and therefore means of all possible samples will follow a normal distribution
HYPOTHESIS TESTING
= NORM.DIST(338,315,85/(45^0.5),TRUE)
This is the probability of seeing a mean less than or equal to 338.
We need the probability of outcomes as extreme or more extreme, assuming that the null
is true
That is:
= 1 - NORM.DIST(338,315,85/(45^0.5),TRUE)
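The same calculation can be reproduced in Python as a cross-check on the Excel formula. This is a sketch using the standard library's `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

pop_mean, pop_sd = 315, 85     # historical daily sales figures
n, sample_mean = 45, 338       # current 45-day sample

std_error = pop_sd / n ** 0.5  # std deviation of the sample mean under the CLT
sampling_dist = NormalDist(mu=pop_mean, sigma=std_error)

p_at_most = sampling_dist.cdf(sample_mean)   # P(mean <= 338) if the null is true
p_value = 1 - p_at_most                      # P(mean >= 338): as or more extreme

print(f"standard error = {std_error:.2f}, p-value = {p_value:.3f}")
```

Dividing the std deviation by the square root of the sample size is the step that turns the distribution of daily sales into the distribution of 45-day sample means.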
HYPOTHESIS TESTING
What if the observed probability was low?
Low? Or High?
20%
HYPOTHESIS TESTING
Cut-off: SIGNIFICANCE LEVEL (ALPHA, $\alpha$)
Calculated Probability: P-VALUE
IF:
P-VALUE < SIGNIFICANCE LEVEL
REJECT THE NULL HYPOTHESIS
P-VALUE > SIGNIFICANCE LEVEL
FAIL TO REJECT THE NULL HYPOTHESIS
HYPOTHESIS TESTING
Significance Level usually 5%
In the inventory example:
Null Hypothesis: Daily Sales rate has not changed
HYPOTHESIS TESTING
You believe that the average discount on books is maintained at around 24%. You take a
random sample of 100 books and find that the average discount is 26%, with a std
deviation of 6%. Is this a substantial difference? Are your books discounted more than
expected?
NULL HYPOTHESIS:
Discounts are still averaging 24%
ALTERNATE HYPOTHESIS:
Discounts are steeper than 24%
SIGNIFICANCE LEVEL:
10%
P VALUE:
= 1- norm.dist(26,24, 6/10, true) = 0.0004
CONCLUSION?
Note: the sample std deviation is used here instead of the population std deviation
HYPOTHESIS TESTING
Types of Hypothesis Tests:
1. One Tail and Two Tail Tests
2. Large Sample (Z Tests) and Small Sample Tests (T Tests)
3. One sample and Two Sample Tests
Single Sample Z/T tests, Independent Sample T tests, Paired Sample T tests
4. Multiple sample tests
ANOVA
5. Non-parametric Tests
Chi Square Tests
HYPOTHESIS TESTING
If Alternate Hypothesis is set up as:
Sample mean not equal to population mean:
Two Tail Test
The one-tail P-VALUE should be compared to SIGNIFICANCE LEVEL/2
HYPOTHESIS TESTING
You want to optimize inventory costs, and you are reviewing mobile electronic
equipment. You have been using a daily sales average for this category as 315,
with a std deviation of 85 (based on past data a year ago). You take a current
sample of the last 45 days to validate, and find that avg daily sales are 338.
Should you increase inventory levels?
Null: Stays the Same: Average daily sales are same as population i.e. 315
Alternate1: One Tail Hypothesis: Average daily sales are GREATER than 315
Alternate2: Two Tail Hypothesis: Average daily sales are NOT EQUAL to 315
Alpha: 5% (Same)
P-value : 1- norm.dist(338,315,85/45^0.5, true) = 0.035 (Calculated the same way)
If One Tail: 0.035 < 0.05 Reject Null
If Two Tail: 0.035 > 0.025 Fail to Reject Null
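The one-tail versus two-tail comparison for the inventory example can be sketched as a small Python function. The function name and signature are illustrative, not part of the course material. For the two-tail case it doubles the tail area and compares the result to alpha, which is equivalent to comparing the one-tail p-value to alpha/2.

```python
from statistics import NormalDist

def z_test(sample_mean, null_mean, sd, n, alpha=0.05, two_tail=False):
    """Return (p_value, reject_null) for a large-sample test of a mean."""
    z = (sample_mean - null_mean) / (sd / n ** 0.5)
    upper_tail = 1 - NormalDist().cdf(z)              # H1: mean > null_mean
    both_tails = 2 * (1 - NormalDist().cdf(abs(z)))   # H1: mean != null_mean
    p_value = both_tails if two_tail else upper_tail
    return p_value, p_value < alpha

one_tail = z_test(338, 315, 85, 45)
two_tail = z_test(338, 315, 85, 45, two_tail=True)
print("one tail:", one_tail)
print("two tail:", two_tail)
```

The same data lead to different conclusions under the two alternatives, which is why the choice of tail has to be made before looking at the p-value.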
Jigsaw Academy Education Pvt. Ltd.
HYPOTHESIS TESTING
Choosing one tail vs two tail:
Depends on the business context
For example:
1. You work for a manufacturer of cereals, and are receiving consumer complaints of
package weight being less than printed weight on your product. You take a random
sample from your production process and want to run a check. What alternate
hypothesis will you choose?
2. You are working for a parts manufacturer for an airline company, and are creating
components for engine parts. You have received specifications from the airline
company around the weight of each valve. When you run a quality check on the
finished product, what alternate hypothesis will you choose?
REMEMBER: Decide which tail test you will run before calculating p-value
STATS 3
OUTLINE
HYPOTHESIS TESTING
Large Sample and Small Sample Tests
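Is sample size > 30? Use CLT and Normal Distribution
Is sample size < 30? Use T Distribution
Suppose we have a simple random sample of size n
drawn from a Normal population with mean $\mu$ and
standard deviation $\sigma$. Then $t = (\bar{x} - \mu)/(s/\sqrt{n})$, where $\bar{x}$ is the sample mean and $s$ is the sample std deviation, follows a t distribution with n - 1 degrees of freedom.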
HYPOTHESIS TESTING
T - TESTS
If sample size is < 30, the sample mean may not be well approximated by a normal distribution, especially when the std deviation is estimated from the sample.
In order to compute the probability of an observed outcome when sample size < 30, use a
t-distribution or a t-test
Suppose we have a simple random sample of size n drawn from a Normal population with
mean $\mu$ and standard deviation $\sigma$.
HYPOTHESIS TESTING
T - TESTS
You believe books are your most popular category of sales, with a 35% share of
transactions. A random sample of 10 days shows books' share of total daily sales
averaging 32%, with a std deviation of 3%. Has the share of books changed? Use a 5% level of
significance
NULL HYPOTHESIS
Share of books category = 35%
ALTERNATE HYPOTHESIS
Share of books category not equal to 35% (TWO TAIL TEST)
SIGNIFICANCE LEVEL: 5%
HYPOTHESIS TESTING
T - TESTS
2 parts to the T-Test in Excel:
1. Calculate the test statistic (a distance measure)
2. Include the test statistic in the T.DIST function
You don't need to do that for the Normal Distribution because the shape of a normal distribution
is the same irrespective of sample size
But p-values of a T-Dist depend on sample size (through the degrees of freedom).
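The two steps can be sketched in Python for the books share example (null share 35%, sample mean 32%, sample std deviation 3%, n = 10). A statistics library would normally supply the t distribution. To stay dependency-free, this sketch integrates the t density numerically with Simpson's rule, so the helper functions are illustrative stand-ins for Excel's T.DIST family rather than the course's method.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

null_share, sample_share, sample_sd, n = 35.0, 32.0, 3.0, 10

# Step 1: test statistic, the distance from the null in standard errors.
t_stat = (sample_share - null_share) / (sample_sd / math.sqrt(n))
# Step 2: two-tail p-value with n - 1 degrees of freedom.
p_value = 2 * t_upper_tail(abs(t_stat), n - 1)

print(f"t = {t_stat:.3f}, two-tail p-value = {p_value:.4f}")
```

Because the alternate hypothesis is two-tailed, the tail area is doubled before it is compared with the 5% significance level.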
CASE STUDY 2
A/B TESTING
You are testing banner ads on your ecommerce website, with changes in copy.
You run one type of banner for 10 days, and check key KPIs (say purchase rate).
You then run the second banner for the next 10 days and check the purchase rate.
This is the data, on a daily basis.
HYPOTHESIS TESTING
T - TESTS
Often, we want to test difference in sample means between two samples (not sample
vs population)
example: A/B Testing
HYPOTHESIS TESTING
T - TESTS
Independent Sample T Tests
Excel: Data \Data Analysis\ T-Test: Two Sample
TWO-SAMPLE T-TESTS
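The test statistic is:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$
where $\bar{x}_i$, $s_i$ and $n_i$ are the mean, std deviation and size of sample $i$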
HYPOTHESIS TESTING
MULTIPLE SAMPLE TESTS
What if you have multiple samples?
Suppose you are comparing click rates across 4 different layouts
This is the data:
HYPOTHESIS TESTING
T - TESTS
TWO-SAMPLE T-TESTS
In the previous example, we compared two samples that had equal
numbers of observations
1. If there are unequal numbers of observations, we can still use the t-test, but the
degrees of freedom used should be 1 less than the smaller sample size
2. We can also assume similar variance across the two samples, in which case
the t-stat will be simplified, but if variance is not similar, use the test for
unequal variance in Excel
Respondent   Weight Pre   Weight Post
1            162          168
2            170          158
3            184          186
4            164          155
5            172          143
6            176          161
7            159          160
8            170          135
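Since each respondent has both a pre and a post weight, this data fits a paired sample t-test, which works on the per-respondent differences. The slide does not state which test was intended, so the choice of a paired two-tail test is an assumption here, and the numerical integration helper is a dependency-free stand-in for a library or Excel function.

```python
import math
from statistics import mean, stdev

pre = [162, 170, 184, 164, 172, 176, 159, 170]
post = [168, 158, 186, 155, 143, 161, 160, 135]

# A paired test reduces the two columns to one column of differences.
diffs = [after - before for before, after in zip(pre, post)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))   # df = n - 1

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = 2 * t_upper_tail(abs(t_stat), n - 1)   # two-tail
print(f"mean change = {mean(diffs):.3f}, t = {t_stat:.3f}, p-value = {p_value:.3f}")
```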
HYPOTHESIS TESTING
Different types (or cases) of hypothesis testing:
1.
2.
3.
4. Want to test a directional hypothesis: that is, sample mean > pop mean, for
example, as opposed to sample mean not equal to pop mean?
5.
HYPOTHESIS TESTING
ERRORS
Error Types
1. Reject the null hypothesis when in fact it is true - Type I Error
2. Fail to reject the null hypothesis when in fact it is not true - Type II Error
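A small simulation can make the Type I error concrete. The Python sketch below repeatedly draws samples from a population where the null hypothesis is true by construction and counts how often a one-tail test at the 5% level rejects it anyway. It reuses the figures from the inventory example and assumes normally distributed daily sales, which is an assumption of the sketch rather than something the course states.

```python
import random
from statistics import NormalDist, mean

random.seed(0)
ALPHA = 0.05
POP_MEAN, POP_SD, N = 315, 85, 45   # figures from the inventory example
TRIALS = 5000

std_error = POP_SD / N ** 0.5
false_positives = 0
for _ in range(TRIALS):
    # The null is true by construction: every sample comes from mean 315.
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    z = (mean(sample) - POP_MEAN) / std_error
    p_value = 1 - NormalDist().cdf(z)    # one-tail test, H1: mean > 315
    if p_value < ALPHA:
        false_positives += 1             # rejected a true null: Type I error

type_i_rate = false_positives / TRIALS
print(f"Type I error rate ~ {type_i_rate:.3f} (alpha = {ALPHA})")
```

The observed rejection rate should land close to the significance level, which is what it means for alpha to be the size of the Type I error.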
HYPOTHESIS TESTING
ERRORS
Type I Error (False Positive): Concluding that a difference exists when it does not
Type II Error (False Negative): Concluding that a difference does not exist when it does
HYPOTHESIS TESTING
ERRORS
Outcome    Not Guilty          Guilty
Convict    Type I Error        Correct Decision
Free       Correct Decision    Type II Error
HYPOTHESIS TESTING
ERRORS
Outcome    Part Ok             Part Not Ok
Throw      Type I Error        Correct Decision
Keep       Correct Decision    Type II Error
HYPOTHESIS TESTING
ERRORS
Size of Error?
Type I : Significance Level
Type II: Related to Power (Power = 1 - probability of a Type II error). Usually set to 20%. It is a function of sample size, sample
variance, and
What does 20% in Type II imply? You are willing to take a risk of not capturing an
impact that exists 20% of the time
Outcome    Part Ok    Part Not Ok
Throw
Keep