CHE2015
General Analytical and
Inorganic Chemistry
CHE2015 2020 M. Kalulu
Data Handling and
Spreadsheets in
Analytical
Chemistry
CHE2015/2219 2020 M. Kalulu
Why do we need statistics in
analytical chemistry?
• Scientists need a standard format to
communicate the significance of experimental
numerical data.
• Objective mathematical methods of data
analysis are needed to get the most information
from finite data sets.
• Statistics provides a basis for optimal
experimental design.
What Does Statistics Involve?
• Defining properties of probability
distributions for infinite populations
• Application of these properties to
treatment of finite (real-world) data sets
• Probabilistic approaches to:
– Reporting data
– Data treatment
– Finite sampling
– Experimental design
Some Useful Statistics Terms
• Mean – Average of a set of values
• Median – Mid-point of a set of values.
• Population – A collection of an infinite number of
measurements ($N \to \infty$).
• Sample – A finite set of measurements which
represents the population.
• True value (true mean) – $\mu$, the mean value for the
population.
• Observed mean – $\bar{x}$, the mean value of the sample set.
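As a small illustration of the mean and median defined above, here is a Python sketch using the standard statistics module. The four replicate values are made up for illustration and do not come from the slides.

```python
import statistics

# Hypothetical replicate measurements: a finite sample, not the population.
sample = [29.8, 30.2, 28.6, 29.7]

sample_mean = statistics.mean(sample)      # observed mean (x-bar)
sample_median = statistics.median(sample)  # mid-point of the sorted values

print(f"mean = {sample_mean:.3f}, median = {sample_median:.3f}")
```

With an even number of values the median is the average of the two middle values.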
Accuracy and Precision:
Is There a Difference?
• Accuracy: degree of agreement between
measured value and the true value.
• Absolute true value is seldom known
• Realistic Definition: degree of agreement
between measured value and accepted true
value.
Precision
• Precision: degree of agreement between
replicate measurements of same quantity.
• Repeatability of a result
• Standard Deviation
• Coefficient of Variation
• Range of Data
• Confidence Interval about Mean Value
You can’t have accuracy without good precision.
But a precise result can have a determinate or systematic error.
Fig. 3.1. Accuracy and precision.
©Gary Christian, Analytical Chemistry, 6th Ed. (Wiley)
Determinate Errors
Are They Systematic?
• Determinate Errors:
• Determinable and either avoided or
corrected.
• Constant errors
• Uncalibrated weights
• Burets- volume readings can be corrected
• Concentration variation with temperature
Indeterminate Errors
Are They Random?
• Indeterminate Errors-
– accidental or random errors
• Represent the experimental uncertainty that
occurs in any measurement.
– Small difference on successive measurements
• Random Distribution
• Mathematical Laws of Probability
• Normal distribution or Gaussian Curve
Random errors follow a Gaussian or normal distribution.
We are 95% certain that the true value falls within 2σ (infinite population),
IF there is no systematic error.
Fig. 3.2. Normal error curve.
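The "95% within 2σ" statement for a normal distribution can be checked numerically. The sketch below uses the error function from Python's math module. The helper name fraction_within is made up for this example, and the result applies only to an infinite population with no systematic error.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal population within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 2.5, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.2f}%")
```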
A Review of Significant Figures
How many significant figures are in the following
examples?
• 0.216, 90.7, 800.0, 0.0670, 500
• ((35.63 * 0.5481 * 0.05300)/1.1689)*100%
• = 88.5470578% (calculator result)
• = 88.55% (rounded to four significant figures)
• (((97.7/32.42)*100.0)+36.04)/687
• = 0.4911
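The rounding step in the examples above can be sketched in Python. The helper round_sig is a hypothetical name introduced here. It only does the rounding: deciding how many significant figures are justified still depends on the least precise number in the calculation.

```python
import math

def round_sig(value: float, n_sig: int) -> float:
    """Round value to n_sig significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, n_sig - 1 - exponent)

# Calculator result from the first example, rounded to four significant figures.
first = round_sig(88.5470578, 4)

# Second example, evaluated in full precision and then rounded.
second_raw = (((97.7 / 32.42) * 100.0) + 36.04) / 687
second = round_sig(second_raw, 4)

print(first, second)
```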
Ways of Expressing Accuracy
• Absolute Error: difference between the measured
value and the true value
• Mean Error: difference between the mean
value and the true value
• Relative Error: absolute or mean error
expressed as a percentage of the true value
$\dfrac{\bar{x} - \mu}{\mu} \times 100 = \%$ Relative Error (negative when the result is low)
• Relative Accuracy: measured or mean
value expressed as a percentage of the true
value
$\dfrac{\bar{x}}{\mu} \times 100 = \%$ Relative Accuracy
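A minimal Python sketch of these accuracy measures follows. It uses the common sign convention of measured minus true, so a low result gives a negative error. The function names and the two sample values are illustrative assumptions.

```python
def absolute_error(measured: float, true_value: float) -> float:
    """Signed absolute error: measured (or mean) value minus true value."""
    return measured - true_value

def relative_error_percent(measured: float, true_value: float) -> float:
    """Error expressed as a percentage of the true value."""
    return (measured - true_value) / true_value * 100.0

def relative_accuracy_percent(measured: float, true_value: float) -> float:
    """Measured value expressed as a percentage of the true value."""
    return measured / true_value * 100.0

# Hypothetical example: accepted true value 2.62 g, measured value 2.52 g.
true_value = 2.62
measured = 2.52

abs_err = absolute_error(measured, true_value)
rel_err = relative_error_percent(measured, true_value)
rel_acc = relative_accuracy_percent(measured, true_value)

print(f"absolute error = {abs_err:.2f} g")
print(f"relative error = {rel_err:.1f}%")
print(f"relative accuracy = {rel_acc:.1f}%")
```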
Standard Deviation
The Most Important Statistic
• Standard deviation $\sigma$ of an infinite set of
experimental data is theoretically given by
$\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$
• $x_i$ = individual measurement
• $\mu$ = mean of an infinite number of
measurements (true value)
• $N$ = number of measurements
Standard Deviation of a Finite Set
of Experimental Data
• Estimated standard deviation, $s$ ($N < 30$)
• $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{N - 1}}$
• For finite sets the precision is represented
by $s$.
• Standard deviation of the mean, $s_{\text{mean}}$
• $s_{\text{mean}} = s/\sqrt{N}$
• Relative standard deviation (rsd), or
coefficient of variation
• $(s/\bar{x}) \times 100 = \%$ rsd
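These finite-sample statistics can be sketched in Python as follows. The replicate values are hypothetical. statistics.stdev uses the N - 1 denominator, which matches the estimated standard deviation s for a finite set.

```python
import math
import statistics

# Hypothetical replicate results for one sample.
values = [29.8, 30.2, 28.6, 29.7]

n = len(values)
mean = statistics.mean(values)
s = statistics.stdev(values)   # sample standard deviation (N - 1 denominator)
s_mean = s / math.sqrt(n)      # standard deviation of the mean
rsd_percent = s / mean * 100   # relative standard deviation (coefficient of variation)

print(f"mean = {mean:.3f}, s = {s:.3f}, s_mean = {s_mean:.3f}, rsd = {rsd_percent:.2f}%")
```

For the infinite-population form with N in the denominator, statistics.pstdev would be the corresponding call.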
Enter text, numbers, or formulas in specific cells.
Fig. 3.3. Spreadsheet cells.
The formula in cell B6 subtracts the weight of the flask from the weight with water.
You can copy the formula to the right by highlighting the cell and dragging it from the
lower right corner to the right.
Fig. 3.4. Filling cell contents.
We often use relative cell references in formulas.
If a number from a given cell is to be a constant in the formula, place a dollar sign ($) in
front of both the column letter and the row number of that cell, for example $B$2.
Fig. 3.5. Relative and absolute cell references.
Excel has a number of mathematical and statistical functions.
Click on fx on the toolbar to open the Paste Function window.
The cell B4 formula calculates the standard deviation of cells B1 to B3.
Standard deviation calculation.
Propagation of Errors
Not Just Additive
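Computation, with determinate and indeterminate (random) error:
• Add/Subtract: $R = A + B - C$
– Determinate: $E_R = E_A + E_B - E_C$
– Indeterminate: $s_R^2 = s_A^2 + s_B^2 + s_C^2$, so $s_R = \sqrt{s_A^2 + s_B^2 + s_C^2}$
• Multiply/Divide: $R = AB/C$
– Determinate: $\dfrac{E_R}{R} = \dfrac{E_A}{A} + \dfrac{E_B}{B} - \dfrac{E_C}{C}$
– Indeterminate: $\left(\dfrac{s_R}{R}\right)^2 = \left(\dfrac{s_A}{A}\right)^2 + \left(\dfrac{s_B}{B}\right)^2 + \left(\dfrac{s_C}{C}\right)^2$
• General: $R = f(A, B, C, \ldots)$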
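The random-error rules for sums and for products or quotients can be sketched in Python. Absolute variances add for addition and subtraction. Relative variances add for multiplication and division. The function names and all numbers are assumptions made for this illustration.

```python
import math

def sd_add_sub(*sds: float) -> float:
    """Standard deviation of R = A + B - C: absolute variances add."""
    return math.sqrt(sum(s ** 2 for s in sds))

def sd_mul_div(result: float, terms: list[tuple[float, float]]) -> float:
    """Standard deviation of R = AB/C: relative variances add.

    terms is a list of (value, standard deviation) pairs for A, B, C.
    """
    rel_var = sum((s / v) ** 2 for v, s in terms)
    return abs(result) * math.sqrt(rel_var)

# Hypothetical values: R = A + B - C with s_A = 0.3, s_B = 0.4, s_C = 1.2.
s_sum = sd_add_sub(0.3, 0.4, 1.2)

# Hypothetical values: R = AB/C with A = 10 +/- 0.1, B = 20 +/- 0.2, C = 5 +/- 0.05.
A, B, C = 10.0, 20.0, 5.0
R = A * B / C
s_prod = sd_mul_div(R, [(A, 0.1), (B, 0.2), (C, 0.05)])

print(f"s_R (add/subtract) = {s_sum:.2f}")
print(f"R = {R:.1f}, s_R (multiply/divide) = {s_prod:.3f}")
```

Note that the sign of each term does not matter for random errors: the variances add whether the quantity is added or subtracted, multiplied or divided.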
Control Charts
• Quality control chart: time plot of a
measured quantity assumed to be constant.
• Inner and Outer control limits
• Inner control limit: ±2s (about a 1 in 20 chance of
being exceeded by random error alone)
• Outer control limit: ±2.5s (about 1 in 100) or
±3s (about 1 in 500)
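A rough Python sketch of how control limits might be applied: limits are set from earlier results on a control sample, and new results are then checked against the inner (2s) and outer (2.5s) limits. The data and the helper control_limits are hypothetical.

```python
import statistics

def control_limits(values, k):
    """Return (lower, upper) limits at mean +/- k standard deviations."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return mean - k * s, mean + k * s

# Hypothetical earlier results for repeated analysis of the same control sample.
baseline = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
inner = control_limits(baseline, 2.0)
outer = control_limits(baseline, 2.5)

# Hypothetical new results to be plotted on the chart.
new_results = [10.05, 10.25, 9.6]
flags = []
for x in new_results:
    if not (outer[0] <= x <= outer[1]):
        flags.append("outside outer limit")
    elif not (inner[0] <= x <= inner[1]):
        flags.append("outside inner limit")
    else:
        flags.append("in control")

for x, flag in zip(new_results, flags):
    print(x, flag)
```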
This is a time plot for analysis of the same sample, assumed to have only
random distribution, to check for errors in a method.
At 2s, there is a 1 in 20 chance a value will exceed this only by chance.
At 2.5s, it is 1 in 100.
Fig. 3.6. Typical quality control chart.
Confidence Limit
How sure are you?
• Confidence Limit = $\bar{x} \pm \dfrac{ts}{\sqrt{N}}$
• $t$ = statistical factor that depends on the number
of degrees of freedom and the confidence level
• Degrees of freedom = $N - 1$
• Values of $t$ at different confidence levels and
degrees of freedom are located in Table 3.1
Select a confidence level (95% is good) for the number of samples analyzed
(= degrees of freedom +1).
Confidence limit = x ± ts/√N.
It depends on the precision, s, and the confidence level you select.
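A short Python sketch of the confidence limit calculation follows. The three replicate values are hypothetical. The t value of 4.303 is the standard two-sided 95% value for 2 degrees of freedom and should be replaced by the table entry that matches your own N and confidence level.

```python
import math
import statistics

def confidence_limit(values, t):
    """Return (mean, half_width) so that the limit is mean +/- half_width."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return mean, t * s / math.sqrt(n)

# Hypothetical replicate results (N = 3, so 2 degrees of freedom).
replicates = [93.50, 93.58, 93.43]
T_95_DF2 = 4.303  # t at the 95% confidence level for 2 degrees of freedom

mean, half_width = confidence_limit(replicates, T_95_DF2)
print(f"confidence limit = {mean:.2f} +/- {half_width:.2f}")
```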
Tests of Significance
Is there a difference?
• The F Test
• Designed to indicate whether there is a significant
difference in precision between two methods.
• $F = s_1^2/s_2^2$, with degrees of freedom $\nu_1$ and $\nu_2$;
the larger variance goes in the numerator so that $F \geq 1$.
• If the calculated F value exceeds the tabulated F
value at the selected confidence level, then
there is a significant difference between the
variances of the two methods.
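The F test can be sketched in Python as below. The two data sets are invented for illustration. The tabulated value 6.39 is the 95% F value for 4 and 4 degrees of freedom. For other sample sizes the value must be looked up in an F table.

```python
import statistics

def f_test(data_a, data_b, f_table):
    """Return (F_calc, significant); the larger variance goes in the numerator."""
    var_a = statistics.variance(data_a)
    var_b = statistics.variance(data_b)
    f_calc = max(var_a, var_b) / min(var_a, var_b)
    return f_calc, f_calc > f_table

# Hypothetical replicate results from two methods (5 each, so 4 degrees of freedom each).
method_1 = [10.1, 10.4, 9.8, 10.3, 9.9]
method_2 = [10.0, 10.1, 10.0, 9.9, 10.0]
F_TABLE_95_4_4 = 6.39  # tabulated F at 95% for 4 and 4 degrees of freedom

f_calc, significant = f_test(method_1, method_2, F_TABLE_95_4_4)
print(f"F = {f_calc:.2f}, significant difference in variance: {significant}")
```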
$F = s_1^2/s_2^2$.
You compare the variances of two different methods to see if there is a
significant difference in the methods, at the 95% confidence level.
Student t Test
Are there Differences in the Methods?
1. t Test When an Accepted Value is Known
$\mu = \bar{x} \pm \dfrac{ts}{\sqrt{N}}$
It follows that
$\pm t = (\bar{x} - \mu)\dfrac{\sqrt{N}}{s}$
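A minimal Python sketch of the t test against an accepted value is shown here. The replicate results and the accepted value are hypothetical. The tabulated t of 4.303 is the 95% value for 2 degrees of freedom.

```python
import math
import statistics

def t_vs_accepted(values, accepted):
    """Calculated |t| for the difference between the sample mean and an accepted value."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return abs(mean - accepted) * math.sqrt(n) / s

# Hypothetical replicate results and accepted value.
replicates = [93.50, 93.58, 93.43]
accepted_value = 93.40
T_TABLE_95_DF2 = 4.303  # tabulated t at 95% for N - 1 = 2 degrees of freedom

t_calc = t_vs_accepted(replicates, accepted_value)
significant = t_calc > T_TABLE_95_DF2
print(f"t = {t_calc:.2f}, significant difference: {significant}")
```

Here the calculated t is below the tabulated value, so the difference from the accepted value is not significant at the 95% level.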
Tests of Significance
Is there a difference?
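• Comparison of the Means of Two Samples
• $\pm t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p}\sqrt{\dfrac{N_1 N_2}{N_1 + N_2}}$
• Pooled standard deviation: $s_p$
• $s_p = \sqrt{\dfrac{\sum (x_{i1} - \bar{x}_1)^2 + \sum (x_{i2} - \bar{x}_2)^2 + \cdots + \sum (x_{ik} - \bar{x}_k)^2}{N - k}}$,
where $N$ is the total number of measurements and $k$ is the number of sets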
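The pooled standard deviation and the two-sample t value can be sketched in Python as follows. Both data sets are hypothetical. The tabulated t of 2.306 is the 95% value for N1 + N2 - 2 = 8 degrees of freedom.

```python
import math
import statistics

def pooled_std(*groups):
    """Pooled standard deviation over k groups: sqrt(sum of squared deviations / (N - k))."""
    sum_sq = 0.0
    for group in groups:
        group_mean = statistics.mean(group)
        sum_sq += sum((x - group_mean) ** 2 for x in group)
    n_total = sum(len(group) for group in groups)
    return math.sqrt(sum_sq / (n_total - len(groups)))

def t_two_means(sample_1, sample_2):
    """Calculated t for the difference between the means of two samples."""
    n1, n2 = len(sample_1), len(sample_2)
    sp = pooled_std(sample_1, sample_2)
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    return diff / sp * math.sqrt(n1 * n2 / (n1 + n2))

# Hypothetical results from two samples of five measurements each.
sample_1 = [10.1, 10.4, 9.8, 10.3, 9.9]
sample_2 = [10.5, 10.7, 10.4, 10.6, 10.8]
T_TABLE_95_DF8 = 2.306  # tabulated t at 95% for 8 degrees of freedom

sp = pooled_std(sample_1, sample_2)
t_calc = t_two_means(sample_1, sample_2)
significant = abs(t_calc) > T_TABLE_95_DF8
print(f"sp = {sp:.3f}, t = {t_calc:.2f}, significant: {significant}")
```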
Rejection of a Result:
The Q Test
• The Q test is used to determine if an
“outlier” is due to a determinate error. If it
is not, then it falls within the expected
random error and should be retained.
• $Q = a/w$
• $a$ = difference between the “outlier” and its nearest
neighbor when the results are sorted
• $w$ = range of results (largest minus smallest)
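A small Python sketch of the Q test follows. The four results are hypothetical. The tabulated Q of 0.829 is the value commonly listed for 4 observations at the 95% confidence level, but it is passed in as an assumption and should be checked against the Q table you are using.

```python
def q_test(values, q_table):
    """Return (Q_calc, reject) for the most extreme value in a small data set."""
    if len(values) < 3:
        raise ValueError("Q test needs at least three results")
    data = sorted(values)
    w = data[-1] - data[0]            # range of results
    if w == 0:
        raise ValueError("all results are identical")
    gap_low = data[1] - data[0]       # gap if the lowest value is the suspect
    gap_high = data[-1] - data[-2]    # gap if the highest value is the suspect
    a = max(gap_low, gap_high)
    q_calc = a / w
    return q_calc, q_calc > q_table

# Hypothetical results; 114 looks suspect.
results = [103, 106, 107, 114]
Q_TABLE_95_N4 = 0.829  # assumed table value for N = 4 at 95% confidence

q_calc, reject = q_test(results, Q_TABLE_95_N4)
print(f"Q = {q_calc:.3f}, reject outlier: {reject}")
```

In this example Q is smaller than the tabulated value, so the suspect result is retained.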
QCalc = outlier difference/range.
If QCalc > QTable, then reject the outlier as due to a systematic error.
Confidence Limits Using Range
• Confidence Limit = $\bar{x} \pm R\,t_r$, where $R$ is the range and $t_r$ is a tabulated factor
The median may be a better indicator of the true value than the mean for
small numbers of observations.
And the range times a factor (K) may be a better measure of spread than the
standard deviation ($s_r = R K_R$).
A least-squares plot gives the best straight line through experimental points.
Excel will do this for you.
Fig. 3.7. Straight-line plot.
This Excel plot gives the same results for slope and intercept as calculated in the example.
Fig. 3.8. Least-squares plot of data from Example 3.21.
Chart Wizard is on your tool bar (the icon with vertical bars).
Select XY (Scatter) for making line plots.
You may insert the graph within the data sheet (Sheet 1), or a new Sheet 2.
Fig. 3.9. Calibration graph inserted in spreadsheet (Sheet 1).
Select LINEST from the statistical function list (in the Paste Function window
– click on fx in the tool bar to open).
LINEST calculates key statistical functions for a graph or set of data.
Fig. 3.10. Using LINEST for statistics.
Calibration Data for a Chromatographic Method
for the Determination of Isooctane in a Hydrocarbon Mixture

Mole % Isooctane    Peak Area
0.352               1.09
0.803               1.78
1.08                2.60
1.38                3.03
1.75                4.01

Statistics (LINEST output)
Slope               2.09      Intercept               0.26
Std Dev (slope)     0.13      Std Dev (intercept)     0.16
R^2                 0.99      Std Error of Estimate   0.14
F                   241.15    Degrees of Freedom      3.00
Sum Sq Regression   5.02      Sum Sq Residuals        0.06
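The slope, intercept, and R^2 in the table can be reproduced with a plain least-squares calculation. The Python sketch below uses only the calibration data listed above and no spreadsheet functions. The variable names are chosen for this example.

```python
# Calibration data from the table: mole % isooctane (x) and peak area (y).
x = [0.352, 0.803, 1.08, 1.38, 1.75]
y = [1.09, 1.78, 2.60, 3.03, 4.01]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of squares and cross products about the means.
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_yy = sum((yi - y_mean) ** 2 for yi in y)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = y_mean - slope * x_mean
r_squared = s_xy ** 2 / (s_xx * s_yy)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, R^2 = {r_squared:.4f}")
```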
Figure: Peak Area vs Mole % Isooctane (y-axis: Peak Area; x-axis: Mole % Isooctane).
Fitted line: PA = 2.0925 Mole% + 0.2567, $R^2$ = 0.9877.
Detection Limits
There Is No Such Thing as Zero
• All instrumental methods have a degree of noise
associated with the measurement that limits the
amount of analyte that can be detected.
• Detection Limit is the lowest concentration level
that can be determined to be statistically different
from an analyte blank.
• Detection Limit is the concentration that gives a
signal three times the standard deviation of the
background signal.
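The "three times the standard deviation of the background" criterion can be sketched in Python as below. The blank readings are hypothetical. Converting the signal to a concentration by dividing by a calibration slope assumes a linear calibration, and the slope value used here is only an example.

```python
import statistics

def detection_limit(blank_signals, slope, k=3.0):
    """Concentration whose signal is k times the standard deviation of the blank.

    Assumes a linear calibration: signal = slope * concentration + blank.
    """
    s_blank = statistics.stdev(blank_signals)
    return k * s_blank / slope

# Hypothetical replicate blank (background) readings in signal units.
blanks = [0.010, 0.012, 0.008, 0.011, 0.009]
calibration_slope = 2.0925  # example slope in signal units per concentration unit

dl = detection_limit(blanks, calibration_slope)
print(f"detection limit = {dl:.4f} concentration units")
```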
A “detectable” analyte signal would be 12 divisions above a line
drawn through the average of the baseline fluctuations.
Fig. 3.11. Peak-to-peak noise level as a basis for detection limit.