
1B40 Practical Skills

Weighted mean
The normal (Gaussian) distribution with a true mean $\mu$ and standard deviation $\sigma$ is
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right].$$
The probability of occurrence of a value $x_1$ is $p(x_1)$. Hence the probability $P$ of obtaining the
values $x_1, x_2, x_3, \ldots, x_n$ is
$$P(x_1, x_2, \ldots, x_n) = p(x_1)\, p(x_2)\, p(x_3) \cdots p(x_n).$$
{Strictly there should be a factor $1/n!$ on the R.H.S as the order is irrelevant, but it can be
omitted as we are only interested in the variation of $P$ with its parameters.}
Thus explicitly
$$P = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left[-\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}\right].$$
It is reasonable to assume that this should be a maximum (Principle of Maximum
Likelihood). This probability is a maximum when $\sum_{i=1}^{n}(x_i-\mu)^2$ is a minimum. This idea leads to
the Principle of Least Squares which may be expressed as follows:


• the most probable value of any observed quantity is such that the sum of the squares of
the deviations of the observations from this value is the least.
The quantity $\sum_{i=1}^{n}(x_i-\lambda)^2$ has a minimum value when $\lambda = \frac{1}{n}\sum_{i=1}^{n} x_i$, i.e. the mean. This follows
from
$$\sum_{i=1}^{n}(x_i-\lambda)^2 = \sum_{i=1}^{n} x_i^2 - 2\lambda\sum_{i=1}^{n} x_i + n\lambda^2,$$
$$\frac{d}{d\lambda}\left[\sum_{i=1}^{n}(x_i-\lambda)^2\right] = -2\sum_{i=1}^{n} x_i + 2n\lambda = 0.$$
Hence these principles lead to the often quoted result that the best estimate for $\mu$ is the
arithmetic mean.
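As a quick numerical illustration of this result, the short Python sketch below scans trial values of $\lambda$ and confirms that the sum of squared deviations is smallest at the arithmetic mean. The measurement values are invented for illustration, not data from these notes.

```python
import numpy as np

# Illustrative check: sum_i (x_i - lam)**2 is smallest when lam equals the
# arithmetic mean of the x_i. The data values are made up.
x = np.array([9.8, 10.1, 10.3, 9.9, 10.0])

lam = np.linspace(9.0, 11.0, 2001)                    # trial values of lambda
s = ((x[:, None] - lam[None, :]) ** 2).sum(axis=0)    # S(lambda) for each trial

print(lam[np.argmin(s)])   # trial value minimising S (~10.02)
print(x.mean())            # arithmetic mean (10.02)
```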

It may be, of course, that the $x_1, x_2, x_3, \ldots, x_n$ belong to different Gaussian distributions with
different standard deviations. The total probability would then be
$$P = \left(\frac{1}{\sqrt{2\pi}}\right)^{n}\frac{1}{\sigma_1}\,\frac{1}{\sigma_2}\cdots\frac{1}{\sigma_n}\,\exp\left[-\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma_i^2}\right].$$
This will be greatest when $\sum_{i=1}^{n}\dfrac{(x_i-\mu)^2}{2\sigma_i^2}$ is a minimum. This occurs when $\mu$ is given by the
weighted mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2}.$$

In general if a measurement $x_i$ has weight $w_i$ then the weighted mean is
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}.$$

The standard deviation of the weighted mean is
$$\sigma^2 = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})^2}{\sum_{i=1}^{n} w_i}.$$
These expressions reduce to those given earlier for the unweighted quantities if we put $w_i = 1$
for all measurements.
For the case of only two quantities,
$$\bar{x} = \frac{\dfrac{x_1}{\sigma_1^2} + \dfrac{x_2}{\sigma_2^2}}{\dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2}},$$
and using the formula for the propagation of errors on a sum of two quantities gives
$$\frac{1}{\sigma^2} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}.$$
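A minimal Python sketch of the weighted mean with weights $w_i = 1/\sigma_i^2$, as above; the measurement values and errors below are invented purely for illustration.

```python
import numpy as np

# Weighted mean of measurements with individual errors sigma_i,
# using weights w_i = 1/sigma_i**2. Values are placeholders.
x     = np.array([10.2, 9.8, 10.5])
sigma = np.array([0.2, 0.1, 0.4])

w = 1.0 / sigma**2
x_bar = np.sum(w * x) / np.sum(w)       # weighted mean
sigma_bar = np.sqrt(1.0 / np.sum(w))    # error on the weighted mean,
                                        # from 1/sigma^2 = sum_i 1/sigma_i^2
print(x_bar, sigma_bar)
```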

Curve fitting
We can apply the principle of least squares to the problem of fitting a theoretical formula to a set
of experimental points. The simplest case is that of a straight line.

The Straight Line


In many experiments it is convenient to express the relationship between the variables in the
form of the equation of a straight line i.e.
y = mx + c,
where m, the gradient of the line, and c, the intercept at x = 0, are treated as unknown
parameters.

As an example, consider the compound pendulum experiment where the relationship between
the period T and the adjustable parameter h is given by
$$T = 2\pi\sqrt{\frac{h}{g} + \frac{k^2}{gh}},$$
and the quantities h and k are defined in the script for the experiment.
Plotting T against h would yield a complicated curve which is difficult to analyse. However the
relationship can be expressed as
$$T^2 h = \frac{4\pi^2}{g}\,h^2 + \frac{4\pi^2 k^2}{g}.$$
If $T^2 h$ is plotted against $h^2$ a straight line is expected (a short numerical sketch of this transformation follows the list below). The benefits of this are
1. Results expressed as a linear graph have a satisfying immediacy of impact.
2. It is very easy to see if a set of results is progressively deviating from linearity - much
easier than detecting deviation from, say, a parabola.
3. The line of best fit to a set of points with error bars is easy to estimate approximately by
eye, and to insert with the aid of a ruler as a quick check.
4. A mathematical method exists to calculate the line of best fit, which relates simply to the
statistical consideration of errors.
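The sketch below illustrates the linearisation numerically. It generates ideal compound-pendulum periods from assumed values of $g$ and $k$ (placeholders, not values from the experiment script), forms $T^2 h$ and $h^2$, fits an unweighted straight line, and recovers $g$ and $k$ from the slope and intercept.

```python
import numpy as np

# Plotting T^2*h against h^2 should give a straight line with
# slope 4*pi^2/g and intercept 4*pi^2*k^2/g, as derived above.
g, k = 9.81, 0.12                                   # assumed values (m/s^2, m)
h = np.linspace(0.05, 0.45, 9)                      # pivot distances (m)
T = 2.0 * np.pi * np.sqrt(h / g + k**2 / (g * h))   # ideal periods (s)

X = h**2        # abscissa of the linearised plot
Y = T**2 * h    # ordinate of the linearised plot

slope, intercept = np.polyfit(X, Y, 1)              # unweighted straight-line fit
print(4 * np.pi**2 / slope)                         # recovers g (~9.81)
print(np.sqrt(intercept * g / (4 * np.pi**2)))      # recovers k (~0.12)
```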

Best straight line fit to a linear curve using the method of least
squares (Legendre 1806)
We consider fitting a function of the form
$$y = mx + c$$
to a set of data points. The example is simple because it is linear in m and c, not because y is
linear in x. The data consist of the points $(x_i, y_i \pm \sigma_i)$. That is, we assume that the x-
coordinates are known exactly, as they usually correspond to the independent variable (the one
under the experimenter's control), but there is an uncertainty $\sigma_i$ in the y-coordinate
corresponding to the dependent variable that is "measured". The deviation, $d_i$, of each point
from the straight line is taken only in the y-coordinate, $d_i = y_i - (mx_i + c)$.

[Figure: scatter plot of data points with the fitted straight line y = -0.477x + 6.4796, y plotted against x.]

Method I – Points with associated error bars


According to the principle of least squares we have to minimise
$$S = \sum_{i=1}^{n}\frac{d_i^2}{\sigma_i^2} = \sum_{i=1}^{n}\left(\frac{y_i - mx_i - c}{\sigma_i}\right)^2. \qquad (1.1)$$
On differentiating with respect to m and c in turn we get
$$-2\sum_{i=1}^{n}\frac{(y_i - mx_i - c)\,x_i}{\sigma_i^2} = 0,$$
$$-2\sum_{i=1}^{n}\frac{(y_i - mx_i - c)}{\sigma_i^2} = 0. \qquad (1.2)$$
Expanding these we have
$$\sum_{i=1}^{n}\frac{x_i y_i}{\sigma_i^2} = m\sum_{i=1}^{n}\frac{x_i^2}{\sigma_i^2} + c\sum_{i=1}^{n}\frac{x_i}{\sigma_i^2},$$
$$\sum_{i=1}^{n}\frac{y_i}{\sigma_i^2} = m\sum_{i=1}^{n}\frac{x_i}{\sigma_i^2} + c\sum_{i=1}^{n}\frac{1}{\sigma_i^2}. \qquad (1.3)$$
The last one, on dividing through by $\sum_{i=1}^{n}\dfrac{1}{\sigma_i^2}$, becomes

$$\frac{\sum_{i=1}^{n} y_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2} = m\,\frac{\sum_{i=1}^{n} x_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2} + c,$$
i.e.
$$\bar{y} = m\bar{x} + c.$$
This shows that the best fit line passes through the weighted mean point $(\bar{x}, \bar{y})$ even if this does
not correspond to an actual measured point.
The eqns (1.3) are two simultaneous equations for the two unknowns, m and c. Their solution is
$$m = \frac{[1][xy] - [x][y]}{[1][xx] - [x][x]} = \frac{[1][xy] - [x][y]}{D},$$
$$c = \frac{[y][xx] - [x][xy]}{[1][xx] - [x][x]} = \frac{[y][xx] - [x][xy]}{D}, \qquad (1.4)$$
where
$$D = [1][xx] - [x][x], \qquad (1.5)$$
and the quantities in the square brackets [ ] are defined by
$$[1] = \sum_{i=1}^{n}\frac{1}{\sigma_i^2};\quad [x] = \sum_{i=1}^{n}\frac{x_i}{\sigma_i^2};\quad [y] = \sum_{i=1}^{n}\frac{y_i}{\sigma_i^2};\quad [xy] = \sum_{i=1}^{n}\frac{x_i y_i}{\sigma_i^2};\quad [xx] = \sum_{i=1}^{n}\frac{x_i^2}{\sigma_i^2}. \qquad (1.6)$$

The calculation of the errors on the fitted parameters, m and c, is intricate and is done best by
techniques that are beyond this introductory course (matrix methods). We simply quote the
results,
$$\sigma_m^2 = (\delta m)^2 = \frac{[1]}{[1][xx] - [x][x]} = \frac{[1]}{D},$$
$$\sigma_c^2 = (\delta c)^2 = \frac{[xx]}{[1][xx] - [x][x]} = \frac{[xx]}{D}. \qquad (1.7)$$
These expressions may look complicated but they are easily evaluated in a computer
programme or a spreadsheet. They only involve sums of terms.
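As a concrete illustration, here is a minimal Python sketch of Method I built directly from the bracketed sums (1.6) and the results (1.4), (1.5) and (1.7); the data arrays are invented placeholders. It also checks that the fitted line passes through the weighted mean point $(\bar{x}, \bar{y})$, as shown above.

```python
import numpy as np

# Method I: weighted straight-line fit from the bracketed sums.
def weighted_line_fit(x, y, sigma):
    w = 1.0 / sigma**2
    S1, Sx, Sy = np.sum(w), np.sum(w * x), np.sum(w * y)    # [1], [x], [y]
    Sxy, Sxx = np.sum(w * x * y), np.sum(w * x * x)         # [xy], [xx]

    D = S1 * Sxx - Sx * Sx                  # eqn (1.5)
    m = (S1 * Sxy - Sx * Sy) / D            # eqn (1.4)
    c = (Sy * Sxx - Sx * Sxy) / D
    sigma_m = np.sqrt(S1 / D)               # eqn (1.7)
    sigma_c = np.sqrt(Sxx / D)

    # The fitted line passes through the weighted mean point (x_bar, y_bar).
    x_bar, y_bar = Sx / S1, Sy / S1
    assert np.isclose(y_bar, m * x_bar + c)

    return m, c, sigma_m, sigma_c

# Placeholder data:
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sigma = np.array([0.2, 0.2, 0.3, 0.2, 0.4])
print(weighted_line_fit(x, y, sigma))
```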

Method II – no estimate of error on y


If the errors on the data points are not known we can only minimise
$$S = \sum_{i=1}^{n}\left(y_i - mx_i - c\right)^2. \qquad (1.8)$$
This is equivalent to the previous case if we set all the errors $\sigma_i = 1$ (so that $[1] = n$ and the
bracketed quantities become simple unweighted sums). Thus we can get the results immediately:
$$m = \frac{n[xy] - [x][y]}{n[xx] - [x][x]},$$
$$c = \frac{[y][xx] - [x][xy]}{n[xx] - [x][x]}. \qquad (1.9)$$
In this case the only way to estimate the uncertainties in m and c is to use the scatter of the
points about the fitted line. The mean square error $\sigma^2$ in the residuals $d_i = y_i - (mx_i + c)$ is
given by
$$\sigma^2 = \frac{\sum_{i=1}^{n} d_i^2}{n-2} = \frac{S_{\min}}{n-2}. \qquad (1.10)$$
{The $n-2$ occurs because we have only $n-2$ independent points, two being needed to find the
slope and intercept of the line}. We then estimate the errors
$$\sigma_m^2 = (\delta m)^2 = \frac{n}{n[xx] - [x][x]}\,\sigma^2 = \frac{n}{D}\,\sigma^2,$$
$$\sigma_c^2 = (\delta c)^2 = \frac{[xx]}{n[xx] - [x][x]}\,\sigma^2 = \frac{[xx]}{D}\,\sigma^2, \qquad (1.11)$$
where
$$D = n[xx] - [x][x]. \qquad (1.12)$$
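A matching Python sketch of Method II, using eqns (1.9)-(1.12) with the uncertainties taken from the scatter of the residuals; again the data are invented placeholders.

```python
import numpy as np

# Method II: unweighted fit with errors estimated from the residual scatter.
def unweighted_line_fit(x, y):
    n = len(x)
    Sx, Sy = np.sum(x), np.sum(y)               # [x], [y] with all sigma_i = 1
    Sxy, Sxx = np.sum(x * y), np.sum(x * x)     # [xy], [xx]

    D = n * Sxx - Sx * Sx                       # eqn (1.12)
    m = (n * Sxy - Sx * Sy) / D                 # eqn (1.9)
    c = (Sy * Sxx - Sx * Sxy) / D

    residuals = y - (m * x + c)
    var = np.sum(residuals**2) / (n - 2)        # eqn (1.10): S_min / (n - 2)
    sigma_m = np.sqrt(n * var / D)              # eqn (1.11)
    sigma_c = np.sqrt(Sxx * var / D)
    return m, c, sigma_m, sigma_c

# Placeholder data:
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([6.0, 5.6, 5.1, 4.4, 4.1, 3.5])
print(unweighted_line_fit(x, y))
```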

Correlation in least squares fits


The original measurements $x_i$ and $y_i$ may be uncorrelated, but the values of m and c found by
both methods are correlated since they depend on the same data - the $(x_i, y_i)$ values. The
best fit line passes through the fixed point $(\bar{x}, \bar{y})$. If $\bar{x} > 0$ and the gradient is increased by its
error, the intercept decreases as the line pivots about the point $(\bar{x}, \bar{y})$, and vice versa. A
formula for the covariance can be derived. For the weighted fits,
$$\mathrm{cov}(m, c) = \sigma_{mc}^2 = -\frac{[x]}{D},$$
and for the unweighted ones
$$\mathrm{cov}(m, c) = \sigma_{mc}^2 = -\frac{[x]}{D}\,\sigma^2.$$
As an illustration, a fit to some data gave the following weighted fit parameters:
$m = -0.433$, $c = 6.189$, $\sigma_m = 0.057$, $\sigma_c = 0.297$, $\mathrm{cov}(m, c) = -0.014$. The table
shows the values of $y_i$ predicted and the estimated uncertainty for chosen $x_i$.

   x      predicted y    y error with correlation    y error without correlation
  -10        10.5                  0.8                           0.6
   40       -11.1                  2.0                           2.3

The errors with correlation included may be smaller or larger than those calculated without it!
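The table values are consistent with the usual propagation formula for a predicted $y = mx + c$, namely $\sigma_y^2 = x^2\sigma_m^2 + \sigma_c^2 + 2x\,\mathrm{cov}(m,c)$, with the covariance term dropped for the "without correlation" column. The short sketch below reproduces the quoted numbers under that assumption.

```python
import numpy as np

# Error on a predicted y = m*x + c, with and without the m-c covariance term.
m, c = -0.433, 6.189
sigma_m, sigma_c, cov_mc = 0.057, 0.297, -0.014

for x in (-10.0, 40.0):
    y_pred = m * x + c
    err_with    = np.sqrt(x**2 * sigma_m**2 + sigma_c**2 + 2 * x * cov_mc)
    err_without = np.sqrt(x**2 * sigma_m**2 + sigma_c**2)
    print(x, round(y_pred, 1), round(err_with, 1), round(err_without, 1))
# -> (-10, 10.5, 0.8, 0.6) and (40, -11.1, 2.0, 2.3), matching the table
```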

The comparison below summarises the advantages and disadvantages of Method I and Method II.

Data points with big errors:
  Method I - are essentially ignored in the fit.
  Method II - are treated like those points with small errors.
Errors on m and c:
  Method I - are realistic in terms of the statistics of the data, $\sigma_i$ and n.
  Method II - can be (unfortunately) small if points happen to lie well on a straight line.
If the data don't really lie on a straight line:
  Method I - the errors on m and c may be ridiculously small if statistics are large.
  Method II - errors will be larger.
Number of data points needed to estimate m, c and errors:
  Method I - 2.
  Method II - 3.
Can goodness of fit be tested?
  Method I - Yes.
  Method II - No.
Can the method be used if the $\sigma_i$ are unknown?
  Method I - No.
  Method II - Yes.

Excel implementation of unweighted fit


Method II is implemented in the Excel spreadsheet function LINEST. If the array formula (see
Excel for ways to enter an array formula)
=LINEST(range of known y’s, range of known x’s, , true)
is entered into an array of 5 rows and 2 columns, then it returns the following results,
(The words have been added to explain the quantities computed).
LINEST
m                            -0.3961     5.8665    c
error on m                    0.0465     0.3151    error on c
r^2                           0.8898     0.4873    standard error on y
F                            72.6618     9         number of degrees of freedom
regression sum of squares    17.2568     2.1374    sum of squares of residuals

Thus this example describes the line
$$y = (-0.40 \pm 0.05)\,x + (5.9 \pm 0.3).$$
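For comparison, the quantities that LINEST returns can be reproduced from the Method II formulas. The sketch below does this in Python for invented data (the data behind the numbers quoted above are not given, so the printed values will differ).

```python
import numpy as np

# Reproduce the LINEST-style statistics from an unweighted straight-line fit.
def linest_like(x, y):
    n = len(x)
    m, c = np.polyfit(x, y, 1)                 # unweighted least-squares line

    res = y - (m * x + c)
    ss_res = np.sum(res**2)                    # sum of squares of residuals
    ss_tot = np.sum((y - y.mean())**2)
    ss_reg = ss_tot - ss_res                   # regression sum of squares

    dof = n - 2                                # number of degrees of freedom
    s_y = np.sqrt(ss_res / dof)                # standard error on y
    D = n * np.sum(x * x) - np.sum(x)**2
    se_m = np.sqrt(n / D) * s_y                # error on m, eqn (1.11)
    se_c = np.sqrt(np.sum(x * x) / D) * s_y    # error on c
    r2 = ss_reg / ss_tot                       # r squared
    F = ss_reg / (ss_res / dof)                # F statistic

    return m, c, se_m, se_c, r2, s_y, F, dof, ss_reg, ss_res

# Placeholder data: 11 points, as the 9 degrees of freedom above suggest.
x = np.arange(1.0, 12.0)
y = 5.9 - 0.4 * x + np.random.default_rng(0).normal(0.0, 0.5, size=11)
print(linest_like(x, y))
```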
