Applied Statics - Merged
1. Statistical Distributions
The distributions in the theory of statistics are classified mainly as discrete and continuous
distributions.
Discrete        Continuous
Uniform         Uniform
Binomial        t-Distribution
Poisson         Chi-square
Geometric       Exponential
                Weibull
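If an experiment has two possible outcomes, “success” and “failure”, with probabilities 𝜃 and (1 − 𝜃) respectively, then the number of successes (𝑋) has a Bernoulli distribution
with pmf $f(x) = \theta^{x}(1-\theta)^{1-x}$; $x = 0$ or $1$.
For a binomial distribution with $n$ trials and success probability $p$, the mean is $\mu = np$ and the variance is $\sigma^{2} = npq$, where $q = 1 - p$.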
Poisson Distribution
In a Poisson experiment, the random occurrence of number of events over an interval (usually a time
interval) is observed. In the same experiment if the time between two events is observed, the
variable will theoretically follow a continuous distribution which will be discussed later.
$f(x; \lambda) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$; $x = 0, 1, 2, \dots$
Eg: Consider the statistical experiment of flipping a coin repeatedly and counting the number of times
the coin lands on heads. Continue flipping the coin until heads has appeared 5 times. Then the number
of trials needed to obtain 5 heads (X) follows a negative binomial distribution.
X : 5 6 7 8 9 10 ……
P(X) : ? ? ? ? ? ? ……
nCr: the number of combinations of n things, taken r at a time.
b*(x; r, P): the probability that an x-trial negative binomial experiment results in the rth success
on the xth trial, when the probability of success on an individual trial is P.
$b^{*}(x; r, P) = {}^{x-1}C_{r-1}\, P^{r}\,(1-P)^{x-r}$
NB: When dealing with negative binomial distribution, check on how the negative binomial random
variable is defined.
The negative binomial random variable is R, the number of successes before the binomial
experiment results in k failures. The mean of R is μR = kP/Q.
The negative binomial random variable is K, the number of failures before the binomial
experiment results in r successes. The mean of K is μK = rQ/P.
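The coin example above can be evaluated directly from the formula for b*(x; r, P). The following is a minimal sketch, assuming the first definition (X = number of trials needed for the rth success); the function name `neg_binomial_pmf` is illustrative only and not part of the notes.

```python
from math import comb

def neg_binomial_pmf(x: int, r: int, p: float) -> float:
    """b*(x; r, P): probability that the r-th success occurs on trial x."""
    if x < r:
        return 0.0  # at least r trials are needed for r successes
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

# Coin example: r = 5 heads, P = 0.5, number of trials X = 5, 6, ..., 10
coin_probs = {x: neg_binomial_pmf(x, 5, 0.5) for x in range(5, 11)}
for x, prob in coin_probs.items():
    print(x, round(prob, 4))
```

The printed values fill in the "?" entries of the P(X) row for X = 5 to 10.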
The geometric distribution is a special case of the negative binomial distribution, where the variable of interest is the number
of trials required for the first success. Thus, the geometric distribution is a negative
binomial distribution with the number of successes (r) equal to 1.
An example of a geometric distribution would be asking for the probability that the first head occurs on
the third flip. That probability is referred to as a geometric probability and is denoted by g(x; p). The
formula for geometric probability is
$g(x; p) = p\,q^{x-1}$, where $q = 1 - p$.
Hypergeometric Experiment
Eg: Consider the statistical experiment of randomly selecting 2 marbles without replacement from an
urn of 10 marbles, 5 red and 5 green. The variable of interest is the number of red marbles selected.
This is a hypergeometric experiment.
Note: As a binomial experiment requires that the probability of success be constant on every trial, the
above is not a binomial experiment: the probability of a success changes on
every trial. Furthermore, if the marbles were selected with replacement, the probability of success would
not change. It would be 5/10 on every trial, and this would then be a binomial experiment.
Hypergeometric probability
h(x; N, n, k): the probability that an n-trial hypergeometric experiment results in exactly x successes,
when the population consists of N items, k of which are classified as successes.
$\mu_x = \dfrac{nk}{N}$
$V_x = \dfrac{nk\,(N-k)(N-n)}{N^{2}(N-1)}$
[Figure: Normal pdf]
Exponential Distribution
The exponential distribution is often used to model the length of time until an event occurs.
The exponential distribution can be thought of as the continuous analogue of the geometric
distribution.
The parameter 𝝀 represents the mean number of events per unit time, e.g. the rate of
arrivals or the rate of failures, as in the Poisson distribution.
[Figure: exponential pdf for 𝜆 = 0.5 (solid), 𝜆 = 1 (dotted), 𝜆 = 2 (dashed)]
Applications
Model inter arrival times (time between arrivals) when arrivals are completely random;
𝝀 = arrivals / hour
Model service times; 𝝀 = services / minute
Model the lifetime of a component that fails catastrophically (e.g. a light bulb);
𝝀 = failure rate
1. It is closely related to the Poisson distribution: if X describes the time between two failures,
then the number of failures per unit time has the Poisson distribution with the same parameter 𝝀.
2. The cdf is $F_X(x) = \lambda \int_{0}^{x} e^{-\lambda y}\,dy = 1 - e^{-\lambda x}$
3. The 100(1 − α)% percentile is $x_{\alpha} = -\dfrac{1}{\lambda}\ln\alpha$
4. Mean 𝝁𝒙 = 𝟏⁄𝝀
5. Variance 𝑽𝒙 = 𝟏⁄𝝀𝟐
7. “Memoryless” property
For all s>= 0 and t >=0
P(X > s + t | X > s) = P(X > t)
Instance 2:This means that the distribution of the waiting time to the next event
remains the same regardless of how long we have already been waiting. This only
happens when events occur (or not) totally at random, i.e., independent of past
history
Exercise: Suppose the life of an industrial lamp, measured in thousands of hours, is exponentially distributed with failure rate
𝝀 = 1/3 (one failure every 3000 hours on average). Determine the probability that
a) the lamp will last no longer than its mean life time. (constant for any 𝝀)
b) the lamp will last longer than its mean life time
c) the industrial lamp will last between 2000 and 3000 hours.
d) the lamp will last for another 1000 hours given that it is operating after 2500 hours.
Answer:
a) 𝑃(𝑋 ≤ 3)=
b) 𝑃(𝑋 > 3)=
c) 𝑃(2 ≤ 𝑋 ≤ 3)=
d) 𝑃(𝑋 > 3.5|𝑋 > 2.5)=𝑃(𝑋 > 2.5 + 1|𝑋 > 2.5)=𝑃(𝑋 > 1)
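The four probabilities follow from the cdf $1 - e^{-\lambda x}$ and the memoryless property. A minimal sketch, assuming time is measured in thousands of hours so that 𝝀 = 1/3; the helper name `exp_cdf` is illustrative only.

```python
from math import exp

def exp_cdf(x: float, lam: float) -> float:
    """cdf of the exponential distribution: F(x) = 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0

lam = 1 / 3  # failures per 1000 hours (mean life = 3 thousand hours)

p_a = exp_cdf(3, lam)                    # a) P(X <= 3): does not depend on lam
p_b = 1 - p_a                            # b) P(X > 3)
p_c = exp_cdf(3, lam) - exp_cdf(2, lam)  # c) P(2 <= X <= 3)
p_d = 1 - exp_cdf(1, lam)                # d) P(X > 3.5 | X > 2.5) = P(X > 1)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(p_d, 4))
```

Part a) equals $1 - e^{-1}$ for any 𝝀, because the mean life 1/𝝀 cancels the rate in the exponent.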
Gamma distribution
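The Gamma distribution is more suitable for describing some real-world applications that
follow exponential-type patterns. The general form of such a probability density is given by
$f(x) = k\,x^{\alpha-1} e^{-x/\beta}$ for $x > 0$, and $0$ elsewhere.
In evaluating k, using calculus, the Gamma function, which depends only on 𝛼, is derived:
$\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y}\,dy$
Thus $\int_{0}^{\infty} k\,x^{\alpha-1} e^{-x/\beta}\,dx = k\,\beta^{\alpha}\,\Gamma(\alpha) = 1$
A random variable X has a Gamma distribution if its probability density function is
$g(x; \alpha, \beta) = \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}$ for $x > 0$, and $0$ elsewhere,
where 𝛼 > 0 and 𝛽 > 0.
The mean is $\mu = \alpha\beta$ and the variance is $V(X) = \alpha\beta^{2}$.
Observe the graphs of Gamma densities for different pairs of values of 𝛼 and 𝛽.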
Exercise: In a certain city, the daily consumption of electric power in millions of kilowatt-hours
can be treated as a random variable having a Gamma distribution with 𝛼 = 3 and 𝛽 = 2.
(i) What is the average consumption of electric power per day by the city?
(ii) If the power plant of this city has a daily capacity of 12 million kilowatt-hours, what
is the probability that this power supply will be inadequate on any given day?
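Answer:
(i) Average = 𝛼𝛽 = 3 × 2 = 6 million kilowatt-hours
(ii) $P(\text{daily consumption of electric power} \ge 12) = \int_{12}^{\infty} \dfrac{1}{2^{3}\,\Gamma(3)}\, x^{3-1} e^{-x/2}\,dx = 1 - \int_{0}^{12} \dfrac{1}{2^{3}\,\Gamma(3)}\, x^{2} e^{-x/2}\,dx$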
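The integral in (ii) can be evaluated in closed form because 𝛼 = 3 is an integer. The sketch below uses the standard identity $P(X > x) = e^{-x/\beta}\sum_{k=0}^{\alpha-1} (x/\beta)^{k}/k!$ for integer 𝛼, which is not stated in these notes and is used here as an assumption of convenience; the function name is illustrative.

```python
from math import exp, factorial

def gamma_sf_integer_shape(x: float, alpha: int, beta: float) -> float:
    """P(X > x) for X ~ Gamma(alpha, beta) when alpha is a positive integer."""
    z = x / beta
    return exp(-z) * sum(z**k / factorial(k) for k in range(alpha))

alpha, beta = 3, 2
mean_consumption = alpha * beta  # (i) average daily consumption
p_inadequate = gamma_sf_integer_shape(12, alpha, beta)  # (ii) P(X >= 12)

print(mean_consumption, round(p_inadequate, 4))
```

For non-integer 𝛼 this identity does not apply and the integral would have to be evaluated numerically or from tables.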
Sampling distributions
Let’s draw all possible samples of size n from a given population of size 𝑁. Then consider
computing a statistic; the mean or a proportion or the standard deviation for each sample.
The variability of a sampling distribution is measured by its variance (or by its std. deviation).
Note: If 𝑁 is much larger than 𝑛, then 𝑛/𝑁 is fairly small and the sampling distribution has
roughly the same sampling error, irrespective of whether sampling is done with or
without replacement.
If sampling is done without replacement and the sample represents a significant fraction
(say, 1/10) of the population size, the sampling error will be clearly smaller.
Thus, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Therefore, we can specify the sampling distribution of the mean as $\bar{x} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^{2})$.
Let the probability of a success be P and the probability of a failure be Q in a population.
From this population of size 𝑁, suppose that we draw all possible samples of size n and,
within each sample, determine the proportion of successes p and failures q. In
this way, we create a sampling distribution of the proportion.
Suppose there are 𝑚 such samples drawn from this large population.
Thus, $\sigma_{p} = \sqrt{\dfrac{PQ}{n}}$
$p \sim N\!\left(P, \dfrac{PQ}{n}\right)$, whenever the sample size is sufficiently large and the population probability of
success (P) is known.
Example:
1. Suppose that a biased coin has probability p = 0.4 of heads. In 1000 tosses, what is the
probability that the number of heads exceeds 410?
2. Find the probability that, of the next 120 births, no more than 40% will be boys. Assume
equal probabilities for the births of boys and girls. Assume also that the number of births in
the population (N) is very large, essentially infinite.
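Both examples can be worked with the normal approximation described above. A minimal sketch, assuming the approximation is adequate for these sample sizes; `normal_cdf` is an illustrative helper built on the error function, and the continuity-corrected variant is shown only for comparison.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Example 1: number of heads in 1000 tosses with p = 0.4
n, p = 1000, 0.4
mu, sigma = n * p, sqrt(n * p * (1 - p))
p_heads_over_410 = 1 - normal_cdf((410 - mu) / sigma)       # plain approximation
p_heads_over_410_cc = 1 - normal_cdf((410.5 - mu) / sigma)  # with continuity correction

# Example 2: proportion of boys in 120 births with P = 0.5
n_births, P = 120, 0.5
sigma_p = sqrt(P * (1 - P) / n_births)
p_boys_at_most_40pct = normal_cdf((0.40 - P) / sigma_p)

print(round(p_heads_over_410, 4), round(p_heads_over_410_cc, 4),
      round(p_boys_at_most_40pct, 4))
```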
Exercises:
1. A true-false examination has 48 questions. Jane has probability 3/4 of answering a question
correctly. Ama just guesses on each question. A passing score is 30 or more correct answers.
Compare the probability that Jane passes the exam with the probability that Ama passes it.
Jane’s score has distribution B(48, 0.75), so the probability that Jane’s score is 30 or more is
1 − P(X ≤ 29) ≈ 0.9816. If your calculator cannot evaluate this, use a
normal approximation to the Binomial distribution (based on the Central Limit Theorem).
2. A restaurant feeds 400 customers per day. On the average 20 percent of the customers order
apple pie.
(a) Give a range for the number of pieces of apple pie ordered on a given day such that you
can be 95 percent sure that the actual number will fall in this range.
(b) How many customers must the restaurant have, on the average, to be at least 95 percent
sure that the number of customers ordering pie on that day falls in the 19 to 21 percent
range?
3. A rookie is brought to a baseball club on the assumption that he will have a 0.3 batting
average. (Batting average is the ratio of the number of hits to the number of times at bat.) In
the first year, he comes to bat 300 times and his batting average is 0.267. Assume that his at
bats can be considered Bernoulli trials with probability 0.3 for success. Could such a low
average be considered just bad luck or should he be sent back to the minor leagues?
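For Exercise 1, the exact binomial tail probabilities can be summed directly instead of using the normal approximation. A minimal sketch, assuming Jane's score is B(48, 0.75) and Ama's is B(48, 0.5); `binom_tail` is an illustrative name.

```python
from math import comb

def binom_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ B(n, p), summed exactly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_jane = binom_tail(48, 0.75, 30)  # Jane answers each question correctly w.p. 3/4
p_ama = binom_tail(48, 0.50, 30)   # Ama guesses, so p = 1/2

print(round(p_jane, 4), round(p_ama, 4))
```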
3) Student’s t Distribution
A particular form of the t distribution is determined by its degrees of freedom. The “degrees of
freedom” refers to the number of independent observations in a set of data.
Suppose we have a simple random sample of size n drawn from a Normal population with mean
𝜇 and standard deviation 𝜎. Let 𝑥̅ denote the sample mean and s, the sample standard deviation.
The t score is given by $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, which follows a t distribution with $n - 1$ degrees of freedom.
The 𝑡 score produced by this transformation can be associated with a unique cumulative
probability. This cumulative probability represents the likelihood of finding a sample mean less
than or equal to 𝑥̅, given a random sample of size n.
The notation $t_{\alpha}$ represents the t-score that has a cumulative probability of (1 − α).
The t distribution can be used with any statistic having a bell-shaped (i.e.,
approximately normal) distribution. In particular, it can be applied when the population is large, the sample size is small and
the standard deviation of the population is unknown.
The t distribution should not be used with small samples from populations that are not
approximately normal.
Example:
1. A random sample of 12 observations from a normal population with mean 48 produced the
following estimates: 𝑥̅ = 47.1 and 𝑠 = 4.7. Find the probability of getting a sample of the same size
with its mean less than or equal to the population mean.
2. The MD of Orrange light bulb manufacturers claims that their light bulbs last
300 days on average. An investigator randomly selects 15 bulbs for testing and those bulbs last an
average of 290 days, with a standard deviation of 50 days. Assuming the MD’s claim is true,
determine the probability that 15 randomly selected bulbs would have an average life of no
more than 290 days.
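Example 2 can be worked numerically without tables. The sketch below computes the t score and then integrates the t density with Simpson's rule; the numerical integration is an assumption of convenience (a t table or statistical package would normally be used), and all function names are illustrative.

```python
from math import gamma, pi, sqrt

def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_score(xbar: float, mu: float, s: float, n: int) -> float:
    """t = (xbar - mu) / (s / sqrt(n))."""
    return (xbar - mu) / (s / sqrt(n))

t_bulbs = t_score(290, 300, 50, 15)   # light bulb example
p_bulbs = t_cdf(t_bulbs, df=15 - 1)   # P(sample mean <= 290 days)

print(round(t_bulbs, 4), round(p_bulbs, 3))
```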
4. Chi-square Distribution
The chi-square statistic can be calculated from a sample of size 𝑛 drawn from a normal population,
using the following equation:
$\chi^{2} = \dfrac{(n-1)s^{2}}{\sigma^{2}}$
When sampling is done for an infinite number of times, and by calculating the chi-square statistic
for each sample, the sampling distribution for the chi-square statistic can be obtained. It is then
called the chi-square distribution.
The chi-square distribution is constructed so that the total area under the curve is equal to 1. The
probability that the value of a chi-square statistic will fall between 0 and A, $P(\chi^{2} \le A)$, is
illustrated by the following diagram.
Using the following Chi-Square Distribution table, one can find the critical $\chi^{2}$ value, when the
probability of exceeding the critical value is given.
Example: My Cell company has developed a new cell phone battery. On average, the battery
lasts 60 minutes on a single charge. The standard deviation is 5 minutes. Suppose the
manufacturing department runs a quality control test. They randomly select 10 batteries. The
standard deviation of the selected batteries is 6 minutes.
b) What is the probability that the standard deviation of any sample of size 10 would be greater
than 6 minutes?
5. F Distribution
The distribution of all possible values of the f statistic is called an F distribution, with v1 = n1 -
1 and v2 = n2 - 1 degrees of freedom. The f statistic, also known as an f value, is a random
variable that has an F distribution.
Select a random sample of size n1 from a normal population, having a standard deviation
equal to σ1.
Select an independent random sample of size n2 from a normal population, having a
standard deviation equal to σ2.
The f statistic is the ratio of $s_1^{2}/\sigma_1^{2}$ to $s_2^{2}/\sigma_2^{2}$: $f = \dfrac{s_1^{2}/\sigma_1^{2}}{s_2^{2}/\sigma_2^{2}}$
The curve of the F distribution depends on the degrees of freedom, v1 and v2.
This cumulative probability represents the likelihood that the f statistic is less than or equal to a
specified value.
The F-distribution table can be used to find the value of an f statistic having a cumulative probability
of (1 − α), represented by $f_{\alpha}$.
Thus, $f_{0.05}(v_1, v_2)$ refers to the value of the f statistic having a cumulative probability of (1 − 0.05) =
0.95, with $v_1$ and $v_2$ degrees of freedom.
Example:
Suppose a sample of 11 cows was selected at random from a population in which the
standard deviation of weight is 5 kg, and the estimated sample sd is 4.5 kg.
Another sample of 7 bulls was taken in a similar way from a population with sd 3.5 kg,
and the sample sd is 4 kg.
a) Compute an f-statistic.
Confidence Intervals
A confidence interval gives a range of values for a population parameter, constructed so that
100(1 − α)% of such intervals, over repeated sampling, contain the parameter.
The following table includes the standard errors of some statistics to help you in finding the confidence
intervals for the respective population parameters.
$\bar{x} \pm Z_{\alpha/2}\,(SE)$, where $Z_{\alpha/2}\,(SE)$ is the margin of error
$p \pm Z_{\alpha/2}\,(SE)$
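Chi-square statistic $= \dfrac{(n-1)s^{2}}{\sigma^{2}}$ and $\dfrac{(n-1)s^{2}}{\sigma^{2}} \sim \chi^{2}_{(n-1)}$
Thus, $\chi^{2}_{1-\alpha/2,\,(n-1)} \le \dfrac{(n-1)s^{2}}{\sigma^{2}} \le \chi^{2}_{\alpha/2,\,(n-1)}$
$\Rightarrow \dfrac{\chi^{2}_{1-\alpha/2,\,(n-1)}}{(n-1)s^{2}} \le \dfrac{1}{\sigma^{2}} \le \dfrac{\chi^{2}_{\alpha/2,\,(n-1)}}{(n-1)s^{2}}$
$\Rightarrow \dfrac{(n-1)s^{2}}{\chi^{2}_{\alpha/2,\,(n-1)}} \le \sigma^{2} \le \dfrac{(n-1)s^{2}}{\chi^{2}_{1-\alpha/2,\,(n-1)}}$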
Computational Exercise
Breakdown voltage is a characteristic of an insulator that defines the maximum voltage difference that
can be applied across the material before the insulator collapses and conducts. In solid insulating
materials, this usually creates a weakened path within the material by creating permanent molecular or
physical changes by the sudden current. Within rarefied gases found in certain types of lamps,
breakdown voltage is also sometimes called the "striking voltage". [Wikipedia]
The breakdown voltage of a material is observed on 17 experimental units; it is not a definite value
because it is a form of failure. Thus we have n = 17 and $s^{2} = 137324.3$. Find the 95% confidence interval
for $\sigma^{2}$, the population variance of the breakdown voltage of the
material.
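The interval follows from $(n-1)s^{2}/\sigma^{2} \sim \chi^{2}_{(n-1)}$. A minimal sketch, assuming the chi-square table values for 16 degrees of freedom (28.845 for an upper-tail area of 0.025 and 6.908 for an upper-tail area of 0.975); these constants are taken from standard tables, not from the notes.

```python
n = 17
s2 = 137324.3  # sample variance of the breakdown voltage

# Chi-square critical values for df = n - 1 = 16 (table values, assumed)
CHI2_UPPER_0025 = 28.845  # upper-tail area 0.025
CHI2_UPPER_0975 = 6.908   # upper-tail area 0.975

lower = (n - 1) * s2 / CHI2_UPPER_0025
upper = (n - 1) * s2 / CHI2_UPPER_0975

print(f"95% CI for the variance: ({lower:.1f}, {upper:.1f})")
```

Taking square roots of both limits gives the corresponding interval for the standard deviation.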
Hypothesis Testing
A statistical hypothesis is an educated guess or assumption about a population parameter,
which may or may not be true. There are two forms of statistical hypotheses.
Null hypothesis: Denoted by H0, this is usually the hypothesis that sample observations result
purely from chance.
Alternative hypothesis: Denoted by H1 or Ha, this is the hypothesis that sample observations
are influenced by some non-random cause.
Decision Errors
P-value: The strength of evidence in support of a null hypothesis is measured by the P-value.
Null Hypothesis | Alternative Hypothesis | Test Statistic, the type of test & rejection criterion
Population has mean “A”: $H_0: \mu \ge A$ | $H_1: \mu < A$ | $T = (\bar{X} - \mu)/s_{\bar{X}}$ or $T = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$; $\sigma$ unknown
Examples:
1. Bon Air Elementary School has 300 students. The principal of the school thinks that the average IQ
of students at Bon Air is at least 110. To prove her point, she administers an IQ test to 20 randomly
selected students. Among the sampled students, the average IQ is 108 with a standard deviation of
10. Based on these results, should the principal accept or reject her original hypothesis? Assume a
significance level of 0.01.
2. An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine
will run continuously for 5 hours (300 minutes) on a single gallon of regular gasoline. Suppose a
simple random sample of 50 engines is tested. The engines run for an average of 295 minutes, with
a standard deviation of 20 minutes. Test the null hypothesis that the mean run time is 300 minutes
against the alternative hypothesis that the mean run time is not 300 minutes. Use a 0.05 level of
significance. (Assume that run times for the population of engines are normally distributed.)
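When the two population variances are known and not equal
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}}$
$H_0: \mu_1 = \mu_2$
When the two population variances are unknown and not equal
$df = \dfrac{\left(s_1^{2}/n_1 + s_2^{2}/n_2\right)^{2}}{\dfrac{(s_1^{2}/n_1)^{2}}{n_1 - 1} + \dfrac{(s_2^{2}/n_2)^{2}}{n_2 - 1}}$
or the smaller of $n_1 - 1$ and $n_2 - 1$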
Null Alternative Test Statistic, the type of test & rejection criterion
Hypothesis Hypothesis
Example:
The weights of 9 obese women before and after 12 weeks on a very low calorie diet were as follows:
Variance Tests
Acknowledged http://www.xycoon.com/
Example:
1. The data 159.9, 187.2, 180.1, 158.1, 225.5, 163.7, and 217.3 consists of the weights, in pounds, of a
random sample of seven individuals taken from a population that is normally distributed. The
variance of this sample is given as 753.04.
Test the null hypothesis $H_0: \sigma^{2} = 750.0$ against the alternative hypothesis $H_1: \sigma^{2} \ne 750.0$ at a level of
significance of 0.3.
2. Students have collected the data 27, 29, 22, 21, 26, 28, 24, and 29 from one population and the data
19, 18, 24, 18, 22, and 15 from another. The variance of the first sample is 9.64286 and the variance
of the second sample is 10.2667. The ratio of the first variance to the second is 0.939239. Test the
null hypothesis that the two variances are equal, $H_0: \sigma_1^{2}/\sigma_2^{2} = 1$, against the alternative hypothesis
that the two variances are not equal, $H_1: \sigma_1^{2}/\sigma_2^{2} \ne 1$, where $\sigma_1^{2}$ is the variance of the first
population and $\sigma_2^{2}$ is the variance of the second population, at a significance level of 0.10.
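The sample variances quoted in both examples, and the resulting test statistics, can be reproduced from the raw data. A minimal sketch using the standard library's `statistics.variance` (which uses the n − 1 divisor); comparing the statistics with chi-square and F critical values is left to the tables.

```python
from statistics import variance

# Example 1: one-sample variance test, H0: sigma^2 = 750
weights = [159.9, 187.2, 180.1, 158.1, 225.5, 163.7, 217.3]
s2 = variance(weights)
chi2_stat = (len(weights) - 1) * s2 / 750.0  # (n - 1) s^2 / sigma0^2, df = 6

# Example 2: two-sample variance ratio, H0: sigma1^2 / sigma2^2 = 1
sample1 = [27, 29, 22, 21, 26, 28, 24, 29]
sample2 = [19, 18, 24, 18, 22, 15]
f_stat = variance(sample1) / variance(sample2)  # df = (7, 5)

print(round(s2, 2), round(chi2_stat, 4), round(f_stat, 6))
```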
1. An insurance company is reviewing its current policy rates. When originally setting the rates
they believed that the average claim amount was $1,800. They are concerned that the true
mean is actually higher than this, because they could potentially lose a lot of money. They
randomly select 40 claims, and calculated a sample mean of $1,950. Assuming that the standard
deviation of claims is $500, and set 𝛼= 0.05, test to see if the insurance company should be
concerned.
2. Trying to encourage people to stop driving to campus, the university claims that on average it
takes people 30 minutes to find a parking space on campus. I do not think it takes so long to find
a spot. In fact I have a sample of the last five times I drove to campus, and I calculated 𝑥̅ = 20.
Assuming that the time it takes to find a parking spot is normal, and that s² = 6 minutes, then
perform a hypothesis test with level 𝛼 = 0.10 to see if my claim is correct.
3. A sample of 40 sales receipts from a grocery store has 𝑥̅ = $137 and s² = $30.2. Use these values
to test whether or not the mean value of a receipt at the grocery store is different from $150.
4. The actual proportion of families in a certain city who own, rather than rent their home is 0.70.
If 84 families in this city are interviewed at random and their response to the question of
whether they own their home, are recorded. 61 of them have responded saying that they own
the home. Using a suitable test statistic test the claim that the population proportion of owning
a home is 0.7.
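Exercises 1 and 4 both reduce to a z statistic. A minimal sketch: the critical values 1.645 and 1.96 are standard normal table values, and the 5% level in Exercise 4 is an assumption, since that exercise does not state a significance level.

```python
from math import sqrt

# Exercise 1: H0: mu = 1800 vs H1: mu > 1800, sigma = 500 known, n = 40
z_claims = (1950 - 1800) / (500 / sqrt(40))
Z_CRIT_ONE_SIDED_5PCT = 1.645
reject_claims = z_claims > Z_CRIT_ONE_SIDED_5PCT

# Exercise 4: H0: P = 0.70 vs H1: P != 0.70, 61 owners out of 84 families
p_hat = 61 / 84
z_home = (p_hat - 0.70) / sqrt(0.70 * 0.30 / 84)
Z_CRIT_TWO_SIDED_5PCT = 1.96  # assumes alpha = 0.05 (not given in the exercise)
reject_home = abs(z_home) > Z_CRIT_TWO_SIDED_5PCT

print(round(z_claims, 3), reject_claims, round(z_home, 3), reject_home)
```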
Chi-square tests
1. Chi-square test of Association
2× 𝟐 Contingency Table
Total 62 38
The two variables here are “Gender of Candidate” and “Results of the Candidate”.
When you have two categorical variables from the same population, you may test whether there
is a significant association between the two variables using the Chi-square Test.
Hypothesis
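Test Statistic
$\chi^{2}_{cal} = \sum \dfrac{(O_i - E_i)^{2}}{E_i} \sim \chi^{2}_{df}$; $df = (r-1)(c-1)$
Under $H_0$,
Expected frequency = (row total × column total) / grand total
$\chi^{2}_{cal} =$
$\chi^{2}_{1,\,5\%} = 3.84$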
Decision
𝒉 × 𝒌 Contingency Table
Example: A survey of 200 families, known to be regular television viewers, was undertaken.
They were asked which of the TV channels they watched most during a common week, and the
observations are as follows.
TV channel          Region
watched most        North   East   South   West
1                   19      16     42      23
2                   6       11     26      7
3                   15      3      12      10
Test the hypothesis that there is no association between the TV channel watched most and the
Region, using the Chi-square test.
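The test of association for this table can be sketched as follows, using the counts exactly as printed (they total 190). The 5% critical value 12.592 for 6 degrees of freedom is a table value and the 5% level is an assumption, since the example does not state one.

```python
# Observed counts: rows = TV channel 1..3, columns = North, East, South, West
observed = [
    [19, 16, 42, 23],
    [6, 11, 26, 7],
    [15, 3, 12, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under H0 = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2_cal = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CHI2_CRIT_5PCT_DF6 = 12.592  # table value, alpha = 0.05 assumed
reject_h0 = chi2_cal > CHI2_CRIT_5PCT_DF6

print(round(chi2_cal, 3), df, reject_h0)
```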
Example 1
From a list of 500 digits the occurrence of each distinct digit is observed. Test at 5% significance
level, whether the sequence is a random sample from the Uniform distribution.
Digit 0 1 2 3 4 5 6 7 8 9
Frequency 40 58 49 53 38 56 61 53 60 32
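For this goodness-of-fit example, each digit has expected frequency 500/10 = 50 under the Uniform distribution. A minimal sketch; the critical value 16.919 (5% level, 9 degrees of freedom) is a table value.

```python
observed = [40, 58, 49, 53, 38, 56, 61, 53, 60, 32]  # frequencies of digits 0..9

n = sum(observed)
expected = n / len(observed)  # uniform: every digit equally likely

chi2_cal = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

CHI2_CRIT_5PCT_DF9 = 16.919  # table value
reject_uniform = chi2_cal > CHI2_CRIT_5PCT_DF9

print(round(chi2_cal, 2), df, reject_uniform)
```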
Example 2
The table below gives the number of heavy rainstorms reported by 330 weather stations over a
one year period.
a) Find the expected frequencies of rainstorms given by the Poisson distribution having the
same mean and the total as the observed distribution.
b) Use the Chi-square test to check the adequacy of the Poisson distribution to model these
data.
# rainstorms: 0, 1, 2, 3, 4, 5, More than 5
The rth moment of a random variable 𝑋, denoted by $\mu'_r$, is the expected value of the random
variable’s rth power; i.e. $E(X^{r})$,
for r = 1, 2, 3, …
The rth moment about the mean of a random variable 𝑋, denoted by $\mu_r$, is the expected value of
$(X - \mu)^{r}$; i.e. $E[(X - \mu)^{r}]$,
for r = 1, 2, 3, …
Theorem:
Proof:
The moments of most distributions can be determined directly by evaluating the respective
integrals or sums.
MGF is an alternative procedure, which sometimes provides considerable simplifications to
find the moments.
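MGF can be used to find the expected value of a r.v. and its variance.
Definition
$M_X(t) = E(e^{tX})$
$M_X(t)$ is the value which the function $M_X$ assumes for the real variable $t$,
and $M_X^{(r)}(t) = \dfrac{d^{r} M_X(t)}{dt^{r}}$
1. $M'_X(t)\big|_{t=0} = E(X) = \mu$
2. $M''_X(t)\big|_{t=0} = E(X^{2})$
3. $V(X) = M''_X(t)\big|_{t=0} - \left(M'_X(t)\big|_{t=0}\right)^{2}$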
Example 1
Find the MGF, E(X) and V(X) for the r.v. X whose pdf is given by
$f(x) = e^{-x}$ for $x > 0$, and $0$ otherwise.
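For this pdf the MGF is $M_X(t) = \int_0^{\infty} e^{tx} e^{-x}\,dx = \dfrac{1}{1-t}$ for $t < 1$. The sketch below checks the resulting moments numerically by differentiating the MGF at t = 0 with central finite differences; the step size `h` and the function names are illustrative assumptions, and an analytic derivative would normally be used instead.

```python
def mgf(t: float) -> float:
    """MGF of the pdf f(x) = exp(-x), x > 0: M(t) = 1 / (1 - t), valid for t < 1."""
    return 1.0 / (1.0 - t)

h = 1e-3  # finite-difference step (assumed; small relative to the region t < 1)

first_derivative = (mgf(h) - mgf(-h)) / (2 * h)             # approximates M'(0) = E(X)
second_derivative = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2  # approximates M''(0) = E(X^2)

mean = first_derivative
var = second_derivative - first_derivative**2

print(round(mean, 4), round(var, 4))
```

Analytically, $M'_X(0) = 1$ and $M''_X(0) = 2$, so E(X) = 1 and V(X) = 2 − 1 = 1.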
Example 2
Suppose Y ~ Bin(n, p). Find E(Y) and V(Y) using its MGF.
Exercises
1. Let Y be a continuous r.v. with pdf $f(y) = 2e^{-3y}$; $y \ge 0$. Find the mean and the variance of Y.
2. Given that the probability distribution of a r.v. is 1/8 𝐶 for r=1,2,3. Find the MGF, mean
and variance for this random variable.
Linear Regression
Linear regression can be considered the simplest statistical technique for predicting the values of a
random variable.
Indications
1. Scatter Diagram
2. Correlation Coefficient
[Figure: scatter diagram of Y against X]
In simple linear regression, we allow only one independent variable to predict the dependent
variable. Under multiple linear regression there can be many independent variables predicting a
single dependent variable.
[Figure: scatter diagram of Y against X with fitted regression line, R² = 0.5231]
General Model
$Y = \alpha + \beta X$
OR
$y_i = \alpha + \beta x_i + \varepsilon_i$; $\varepsilon_i$ = error in prediction
First Step :Plot the scatters and look for any evidence for a linear relationship.
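Assumption:
Error terms are independently and identically distributed as Normal with mean zero and
variance $\sigma^{2}$,
i.e. $\varepsilon_i \sim N(0, \sigma^{2})$
$ESS = \sum_{i=1}^{n} \varepsilon_i^{2} = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^{2}$
ESS is at a minimum when $\dfrac{\partial(ESS)}{\partial\alpha} = 0$ and $\dfrac{\partial(ESS)}{\partial\beta} = 0$
$\dfrac{\partial(ESS)}{\partial\beta} = \dfrac{\partial\left(\sum (y_i - \alpha - \beta x_i)^{2}\right)}{\partial\beta} = -2\sum x_i (y_i - \alpha - \beta x_i) = 0$
$\sum x_i y_i - \alpha\sum x_i - \beta\sum x_i^{2} = 0$
$\sum x_i y_i = \alpha\sum x_i + \beta\sum x_i^{2}$ -------- (1)
$\dfrac{\partial(ESS)}{\partial\alpha} = \dfrac{\partial\left(\sum (y_i - \alpha - \beta x_i)^{2}\right)}{\partial\alpha} = -2\sum (y_i - \alpha - \beta x_i) = 0$
$\sum y_i - n\alpha - \beta\sum x_i = 0$
$\sum y_i = n\alpha + \beta\sum x_i$ -------- (2)
From (2)
$\alpha = \dfrac{\sum y_i}{n} - \beta\,\dfrac{\sum x_i}{n}$
$\alpha = \bar{y} - \beta\bar{x}$
Substituting into (1)
$\sum x_i y_i = \left(\dfrac{\sum y_i}{n} - \beta\,\dfrac{\sum x_i}{n}\right)\sum x_i + \beta\sum x_i^{2}$
$\sum x_i y_i = \dfrac{\sum x_i \sum y_i}{n} - \beta\,\dfrac{\left(\sum x_i\right)^{2}}{n} + \beta\sum x_i^{2}$
$\beta = \dfrac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^{2} - \left(\sum x_i\right)^{2}}$
$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$ and $\hat{\beta} = \dfrac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^{2} - n\bar{x}^{2}}$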
Example:
X y
26.8 26.5
28.9 24.2
23.6 27.1
28.1 22.5
22.6 25.8
27.7 23.6
24.7 26.3
25.6 24.9
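The least-squares formulas derived above can be applied directly to this example. A minimal sketch using only the sums that appear in the normal equations; variable names are illustrative.

```python
x = [26.8, 28.9, 23.6, 28.1, 22.6, 27.7, 24.7, 25.6]
y = [26.5, 24.2, 27.1, 22.5, 25.8, 23.6, 26.3, 24.9]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_xx = sum(a * a for a in x)
sum_yy = sum(b * b for b in y)

# beta = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2), alpha = ybar - beta * xbar
beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x**2)
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Coefficient of determination for simple linear regression
r_squared = (n * sum_xy - sum_x * sum_y) ** 2 / (
    (n * sum_xx - sum_x**2) * (n * sum_yy - sum_y**2)
)

print(round(alpha_hat, 3), round(beta_hat, 4), round(r_squared, 4))
```

The fitted slope is negative, and the value of R² agrees with the 0.5231 reported on the scatter diagram.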
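$P\left(-t_{\alpha/2,\,n-2} \le \dfrac{\hat{\beta} - \beta}{SE(\hat{\beta})} \le t_{\alpha/2,\,n-2}\right) = (1 - \alpha)$
$H_0: \beta = 0$
$\dfrac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t_{n-2}$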
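SOV | SS | DF | MS | F-ratio | P-value
REGRESSION (estimation via reg. line) | $RSS = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^{2}$ | 1 | RSS/1 | $F_{cal} = \dfrac{RSS/1}{ESS/(n-2)} \sim F_{1,\,n-2}$ | $P(F_{1,\,n-2} \ge F_{cal})$
ERROR (Residual) (error in estimation) | $ESS = \sum_{i=1}^{n}(y_i - \hat{y}_i)^{2}$ | n-2 | ESS/(n-2) | |
TOTAL (estimation + error) | $\sum_{i=1}^{n}(y_i - \bar{y})^{2}$ | n-1 | | |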
Example :
Description: These data are on the production of power from wind mills. Direct Current (DC)
output was measured against wind speed (in miles per hour).
Number of observations: 25
Variable Description
output Current output produced by the wind mill
speed Windspeed (in miles per hour)
Source:Joglekar, G., Schuenemeyer, J.H. and LaRiccia, V. (1989) Lack-of-fit testing when
replicates are not available, American Statistician, 43, pp. 135-143.
(speed,output)≡ (𝑥, 𝑦)
0.123,2.45 1.582,5.00 2.166,8.15
0.500,2.70 1.501,5.45 2.112,8.80
0.653,2.90 1.737,5.80 2.303,9.10
0.558,3.05 1.822,6.00 2.294,9.55
1.057,3.40 1.866,6.20 2.386,9.70
1.137,3.60 1.930,6.35 2.236,10.00
1.144,3.95 1.800,7.00 2.310,10.20
1.194,4.10 2.088,7.40
1.562,4.60 2.179,7.85
ANOVAa
Total 153.554 24
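$Y = X\beta + \varepsilon$
where
$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$, $\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}$, $\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$, $X = \begin{pmatrix} 1 & X_{11} & \cdots & X_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & X_{n1} & \cdots & X_{np} \end{pmatrix}$
$\sum (y_i - \bar{y})^{2} = \sum (\hat{y}_i - \bar{y})^{2} + \sum (y_i - \hat{y}_i)^{2}$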
ANOVA table
SOV SS DF MS F-ratio
Partial F-test
This test assesses whether the addition of any specific independent variable, given others
already in the model significantly contributes to the prediction of 𝑌.
Suppose that we wish to test whether adding a variable 𝑋 to the model significantly improves the
prediction of the response 𝑌, given that variables 𝑋 , 𝑋 , … , 𝑋 are already in the model.
Test Statistics
$F = \dfrac{[ESS(RM) - ESS(FM)]/1}{ESS(FM)/(n-p-1)} \sim F_{1,\,n-p-1}$, where RM and FM denote the reduced and full models
Effect of multi-collinearity
Design of Experiment
The statistical Design of Experiment is an efficient procedure for planning experiments
so that the data obtained can be analyzed to yield valid and objective conclusions.
Treatments: These are the different procedures whose effects are to be measured and compared.
Experimental Units: The group of material to which a treatment is applied in a single trial of
the experiment.
Replication: Each treatment appearing more than once, i.e. applied to more than one unit, in the
experiment.
Example:
1) An experiment was conducted to determine how five different kinds of work tasks affect a
worker’s pulse rate. In this experiment, 60 male workers were assigned at random to five
different groups so that there were 12 workers in each group. The five tasks were
randomly assigned to the groups so that each group gets one and only one task. The pulse
rates of workers were measured after each task.
a. Identify the experimental unit here.
b. How many treatments are there, and what are they?
c. How many replications are there in the experiment?
d. What is the response variable that should be analyzed to compare the effects of
the treatments?
2) A plant breeder wants to study the effect of three levels of nitrogen and three levels of
potassium on his new variety of paddy. He wants to try all possible nitrogen potassium
combinations in his experiment. He has four blocks of lands, each of which is divided
into 9 parts. Each combination is assigned at random to one part in each block.
a. Identify the experimental unit here.
b. How many treatments or combinations are there, and what are they?
c. How many replications are there?
Structures of the Experimental Design
Treatment Structure: Consists of the set of treatments, treatment combinations or population, that
the experimenter has selected for comparison.
Design Structure: Consists of the grouping of the units into homogeneous groups or blocks.
There are 4 types of design structures which are commonly used.
Testing ANOVA
In ANOVA we are interested in testing the equality of the 𝑡 treatment means. Therefore, the
appropriate hypothesis of interest is,
$H_0: \mu_1 = \mu_2 = \dots = \mu_t$
Example:
1) An engineer is interested in determining if the RF power setting affects the etch rate, and
she runs a completely randomized experiment with four levels of power and five
replicates. Data are given below. Construct one way ANOVA table stating hypothesis,
model clearly. What is the conclusion from the hypothesis test?
2) A manufacturer suspects that the batches of raw material furnished by her supplier differ
significantly in calcium content. There is a large number of batches currently in the
warehouse. Five of these are randomly selected for study. A chemist makes five
determinations on each batch and obtains the following data.
Usually blocking is done in such a way that experimental units within a block are as
homogeneous as possible and experimental units between blocks are not homogeneous.
Testing ANOVA
The model for an RCBD with 𝑡 treatments and 𝑏 blocks is given by,
Total | $bt - 1$ | $SST = \sum_{i}\sum_{j} y_{ij}^{2} - \dfrac{y_{..}^{2}}{bt}$
In an RCBD one is usually not interested in comparisons between blocks, as the experimental
design is based on the assumption that block effects are different.
Example:
1) In a study to reduce stress on air traffic controllers, 3 alternative systems A, B and C have
been proposed. Six controllers were chosen for the study and each controller had to use
all 3 systems but in random order. The following data provide a measure of the stress for
each controller on each system.
            Controller
            1    2    3    4    5    6
System A    15   14   10   13   16   13
System B    15   14   11   12   13   13
System C    18   14   15   17   16   13
Analyze the data and test if the 3 treatments are different in their level of stress.
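The RCBD sums of squares for this example can be computed with the usual correction-factor formulas, treating systems as treatments and controllers as blocks. A minimal sketch; the 5% critical value F(2, 10) = 4.10 is a table value and the 5% level is an assumption, since the example does not state one.

```python
# rows = systems (treatments), columns = controllers (blocks)
stress = {
    "A": [15, 14, 10, 13, 16, 13],
    "B": [15, 14, 11, 12, 13, 13],
    "C": [18, 14, 15, 17, 16, 13],
}

t = len(stress)  # number of treatments
b = 6            # number of blocks
values = [v for row in stress.values() for v in row]

cf = sum(values) ** 2 / (b * t)  # correction factor y..^2 / (bt)
sst = sum(v * v for v in values) - cf
ss_treat = sum(sum(row) ** 2 for row in stress.values()) / b - cf
block_totals = [sum(col) for col in zip(*stress.values())]
ss_block = sum(total**2 for total in block_totals) / t - cf
sse = sst - ss_treat - ss_block

f_treat = (ss_treat / (t - 1)) / (sse / ((t - 1) * (b - 1)))

F_CRIT_5PCT_2_10 = 4.10  # table value, alpha = 0.05 assumed
reject_equal_means = f_treat > F_CRIT_5PCT_2_10

print(sst, ss_treat, ss_block, sse, round(f_treat, 3), reject_equal_means)
```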
2) Three laboratories, A, B, and C are used by food manufacturing companies for making
nutrition analyses of their products. The following data are the fat contents (in grams) of
the same weight of three similar types of peanut butter.
Example: Suppose five gasolines A, B, C, D, and E have to be tested for the miles they give per
gallon in cars. Suppose five different brands of cars are available and each can only test one
gasoline per day. The day-to-day weather difference is also likely to influence the mileage of
the cars. (Here there are two sources of variation, other than gasoline differences, that one needs to
remove.) Show how a Latin Square design can be used effectively for this experiment.
Assuming that interactions do not exist between any pair of factors, the statistical model of the LS
can be given as,
$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{(k)} + \varepsilon_{ijk}$
The ANOVA table, which is an extension of the ANOVA for an RCBD, is as follows,
Total | $t^{2} - 1$ | $SST = \sum_{i}\sum_{j} y_{ij}^{2} - \dfrac{y_{..}^{2}}{t^{2}}$
Example:
1) The following data resulted from an experiment to compare three burners B1, B2, and
B3. A Latin square design was used, as the tests were made on three engines
and were spread over three days.
         Engine 1   Engine 2   Engine 3
Day 1    B1 16      B2 17      B3 20
Day 2    B2 16      B3 21      B1 15
Day 3    B3 15      B1 12      B2 13
Test the hypothesis that there is no difference between the burners at the 5% level of
significance.
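The Latin square analysis for the burner data can be sketched as below, with days as rows, engines as columns and burners as treatments. The 5% critical value F(2, 2) = 19.00 is a table value; variable names are illustrative.

```python
# rows = days, columns = engines; each cell = (burner, response)
square = [
    [("B1", 16), ("B2", 17), ("B3", 20)],
    [("B2", 16), ("B3", 21), ("B1", 15)],
    [("B3", 15), ("B1", 12), ("B2", 13)],
]

t = len(square)
values = [y for row in square for _, y in row]

cf = sum(values) ** 2 / t**2  # correction factor y..^2 / t^2
sst = sum(y * y for y in values) - cf
ss_days = sum(sum(y for _, y in row) ** 2 for row in square) / t - cf
ss_engines = sum(
    sum(square[i][j][1] for i in range(t)) ** 2 for j in range(t)
) / t - cf

burner_totals = {}
for row in square:
    for burner, y in row:
        burner_totals[burner] = burner_totals.get(burner, 0) + y
ss_burners = sum(total**2 for total in burner_totals.values()) / t - cf

sse = sst - ss_days - ss_engines - ss_burners
f_burners = (ss_burners / (t - 1)) / (sse / ((t - 1) * (t - 2)))

F_CRIT_5PCT_2_2 = 19.00  # table value
reject_no_difference = f_burners > F_CRIT_5PCT_2_2

print(round(ss_burners, 3), round(sse, 3), round(f_burners, 3), reject_no_difference)
```

With only 2 error degrees of freedom the test has little power, so the computed F lies close to the critical value.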
2) In an experiment to compare the effects of three treatments A, B, and C on milk yield,
3 dairy cows and 3 successive periods during lactation were used as columns and rows
respectively in a LS. Total nutrient consumptions are given as follows,
Compute the ANOVA and test whether the treatments are significantly different at 5%
level of significance.