What Every Engineering Manager Wants You To Know (2017)
John Paff
Engineering Technology Manager, Spectra-Mat, Inc.
“Finally, a book that cultivates the rich landscape between human creativity
and ingenuity, which motivates the scientist and engineer, and the rigors of
applied experimental practice. Looking back over many years of engineering
research, development, and manufacturing activities, I am ever surprised
at how infrequently common problem-solving skills and experimental
methodologies are cultivated alongside the prodigious evolution of technical
knowledge and our means to generate data and simulate results. A thought-
ful and approachable problem-solving primer has long been needed for new
engineers, which combines core experimental principles used in engineer-
ing, science, and applied statistics. In academic settings, such subjects are
still taught as parts of course work across disparate disciplines. But in con-
temporary industry, their combination becomes a mandatory core skill set
and is key to success in the technical quality and communication of any
engineer’s creative endeavor.
In Buie’s book, we have a contemporary amalgamation of applied experi-
mental principles and methods presented in an approachable and motivat-
ing format. Dr. Buie draws from history, case studies, and real examples that
breathe life into what might otherwise become a dry subject. Her passion
for experimental investigation and its teaching is strongly evident as she
traverses a subject matter that might take years of academic and industrial
practice for an engineer to integrate and master.”
“Problem Solving for New Engineers offers a way to shape learning gained
in school and bridge the gap to becoming a savvy, strategic problem solver,
reducing the “groping-in-the-dark” phase of mastering a discipline. This
book enables the wisdom of mastery by providing key understandings
and methods that are at the heart of an experimental discovery mindset.
Approaches to moving fascination and wonder into realized outcomes are
based in a context of inquiry, exploration, and discovery that refine disci-
plined problem-solving by happily traveling the unknown—one experiment
at a time.”
Diana Hagerty
Project Manager at General Atomics Aeronautical Systems
“Melisa Buie is not only creative in her approach but also utterly aware of
the challenges we face as engineers and scientists in practice. As I was going
through the pages, I realized that the book mirrors my own experience. I
wish something like this had been available when I was starting out.”
“Problem Solving for New Engineers, written by Dr. Melisa Buie, serves
fresh new engineers with plenty of the methods required for successful
experimentation and process development in modern companies, with a
focus on, but not limited to, the natural sciences.
The problem I observe so frequently with new engineers coming from the
university—moving from knowledge of how others performed experiments
to an efficient setup of our own experiments—is discussed at different
levels, and guidance is provided every step of the way, from collection of
the requirements to evaluation and qualification of the new process.
Personally, I most appreciate the balance between the thorough, largely
equation-free overview of methods, which will not let you skip the rest
of any chapter, and a fair comparison of the one-factor-at-a-time
experimentation that all of us learned at university with statistical design
of experiments.
The text invites you to experiment on your own and radiates the pleasure
of investigation and development itself. The author’s knowledge of the
history of science turns the scientific topic into an easy-to-read lecture,
one you will also enjoy as a bedtime story.”
Noël Kreidler
Owner, Kreidler Solutions, Talent Acquisition and Human Resources
Melisa Buie
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize
to copyright holders if permission to publish in this form has not been obtained. If any copyright material
has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, trans-
mitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright
.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the
CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Giants may be a myth to some, but in my eyes and my heart, these three
people are giants. I have learned so much from them, more than I write.
Contents
Foreword..............................................................................................xvii
Author................................................................................................ xxiii
5.6.2 Paradigms..........................................................113
5.6.3 Bias and Priming..............................................114
5.7 Key Takeaways...............................................................115
References..................................................................................116
In Gratitude......................................................................................... 253
Index..................................................................................................... 255
Foreword
As engineering students transition into engineers in industry, many learn
that their new skills are inadequate to answer a variety of the design deci-
sions they face. The world is more complicated and system behavior is
more subtle than can be worked out with basic engineering calculations.
Two of the greatest skills needed in industry are how to make trial and
error more efficient and effective and how to cope with variation. Making
trial and error more efficient and effective is the domain of experimen-
tal design; coping with variation is the domain of statistical methods. By
combining the two, a model of system behavior is built. Yet most engi-
neering students have not had a course in experimental design and, typi-
cally, just a very introductory course in statistical methods, one that does
not cover complex model fitting.
Trial and error (or hypothesize and test) is the scientific method. For
a complex process that depends on a number of factors, the only way to
understand and model the process behavior is with a multifactor experi-
ment. The field of experimental design demonstrates how to learn system
behavior in the most efficient way: a way that holds outside factors con-
stant, that helps you understand interactions between factors, and that
allows you to learn many things at once rather than just one factor at a
time.
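As a concrete illustration, a minimal Python sketch of a two-level full-factorial design follows; the factor names are hypothetical placeholders, and a real study would use a dedicated design-of-experiments tool.

```python
from itertools import product

factors = ["temperature", "pressure", "flow"]  # hypothetical factors

# Full two-level factorial: every combination of low (-1) and high (+1).
design = list(product([-1, +1], repeat=len(factors)))

for run, levels in enumerate(design, start=1):
    print(f"run {run}: {dict(zip(factors, levels))}")

# The 2^3 = 8 runs vary all factors together, so main effects and
# interactions can be estimated, unlike one-factor-at-a-time trials.
```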
With statistical methods, process variation becomes clear. The data
coming from monitoring a process need to be studied statistically to ade-
quately judge when the system behavior is changing, rather than simply
exhibiting natural variation. We live in an age of omnipresent data; statis-
tical methods provide the tools to understand what the data are revealing.
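As one illustration of judging change against natural variation, here is a minimal sketch using Shewhart-style three-sigma control limits; all numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
baseline = rng.normal(10.0, 0.2, size=50)  # in-control history (simulated)

mean, sigma = baseline.mean(), baseline.std(ddof=1)
ucl, lcl = mean + 3 * sigma, mean - 3 * sigma  # three-sigma control limits

for x in [10.1, 9.9, 11.2]:  # fresh measurements from the monitored process
    if lcl <= x <= ucl:
        print(f"{x}: consistent with natural variation")
    else:
        print(f"{x}: signal; system behavior may be changing")
```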
But here is the disconnect. Despite the overwhelming value of experi-
mental design and statistical methods, they are not being sufficiently
taught in most engineering curricula.
In the following pages, let a skilled master show you how to apply key
statistical concepts so that you can experience firsthand the rewards of
discovery and creative problem solving. Enjoy!
John Sall
Co-founder of SAS and Chief Architect of JMP
Author
Melisa Buie, PhD, makes lasers and solves problems. In her role as direc-
tor of operations, she works on both engineering and business problems.
She joined Coherent and began lecturing at San Jose State University in
2007. She has also worked as a research scientist for Science Applications
International Corporation, working at the Naval Research Laboratory in
Washington, DC, where she made theoretical lasers. Melisa was a member
of the technical staff and engineering manager at Applied Materials, Inc.,
prior to joining Coherent.
Melisa has coauthored more than 40 publications and holds five pat-
ents. Melisa’s degrees include a PhD in nuclear engineering/plasma phys-
ics from the University of Michigan and an MS in physics from Auburn
University. She has a Six Sigma Black Belt from the American Society for
Quality. In 2017, she completed a certificate in innovation leadership from
Stanford University Graduate School of Business. She lives in Palo Alto,
California.
1
The Great Universal Cook-Off
Henry Petroski
All science begins with problems, and we all use essentially the same
method to solve problems. We try things out, we experiment. We put
things to the test. Our schools and universities give us the basic knowl-
edge in the fields of science and engineering. We read about others’ experi-
ments. We learn the results of their tests and trials. But when do we have
the opportunity to discover? Our lab classes are intended to open our eyes
and have us see what those who’ve come before us saw. Yet, they often
fall short. Our lab classes have us follow detailed instructions with a well-
characterized, very limited problem statement. Unfortunately, this is not
how problems and experiments occur in real life. The aim of this book is
to provide a strategy and the tools needed for new engineers and scientists
to become apprentice experimenters armed only with a problem to solve
and some knowledge of their subject matter.
For each of us, there is a particular problem that motivates us, that provides the
context in which we desire to grow. One of my best friends in college
studied plasma physics because she wanted to help give the world fusion
energy, the ultimate safe, clean energy source. For another friend, her life
changed when she found astronomy. She is now designing and building
some of the most sophisticated detection equipment for astronomical
exploration. John Steininger, founder of tech start-up Sonopro, wanted
to “light up Africa.” Through John’s work, microfinanced solar powered
lamps provide lights for students in Uganda. Thane Kreiner, executive
director and professor of Science and Technology for Social Benefit at
Santa Clara University, uses his neuroscience and business background to
solve world problems like bringing fresh drinking water to remote locations.
My first visit to Ann Arbor, Michigan, was Earth Day 1990. There was a
campus-wide festival. A positive upbeat atmosphere abounded. My initial
meeting was with Professor Ronald Fleming, who ran the Ford Nuclear
Reactor on the university’s North Campus. As Professor Fleming walked
me through the reactor, he told me about his visits to India, where he saw
people starving a short distance away from food. The problem seemed to
be how to transport the food from its source to these people while keeping
it fresh. Professor Fleming was passionate about developing a technique
using irradiation as a solution to preserving the food until it could reach
the people in need. Irène Joliot-Curie, Nobel Prize-winning chemist and
daughter of Marie and Pierre Curie, felt strongly that “nuclear energy has
only one objective, the improvement of the economy of our daily lives”
(Goldsmith 2005).
I have always loved solving problems and talking about science with
others who are passionate about the world. Discovering our context for
experimentation, whatever gives us that spark or drives us to discover, is an
important part of the process. In the words of Claude Lévi-Strauss (1983),
“The scientific mind does not so much provide the right answers as ask the
right questions.” Author Ian Leslie writes, “Questions weaponize our curi-
osity, turning it into a tool” (Leslie 2014). When we begin to ask the right
questions out of intellectual curiosity or out of a passion to solve a local or
global problem, then we have taken the first step toward a scientific mind.
Discovery is defined as “the action or an act of finding or becoming
aware of for the first time, especially the first bringing to light of a scien-
tific phenomenon” (Brown 1993). The word experiment has multiple defi-
nitions: (a) the action of trying something or putting it to the test; a test,
a trial; (b) an action or procedure undertaken to make a discovery, test
a hypothesis, or demonstrate a known fact.
“This acquisition serves the creative process as the basis for continuity
in thinking, raising the question if, in order to begin to do innovative
work in a domain, one must know what came before” (Weisberg 1993).
Knowledge of subject matter is the key to getting started. Developing a
knowledge of strategy is essential to efficient, consistent experimentation
and problem solving.
We need a certain amount of information before we can begin experi-
mentation. First, we want to know what others know. We want to know
the terms, jargon, tools, history, and anecdotes in our field of study. When
new content is introduced, it should fill in the gaps and add to our exist-
ing knowledge and mental models. We examine the soundness of what
we already know. We must constantly ask ourselves, “How confident am
I in this information?” We need to understand how that knowledge con-
strains, shapes, and distorts us. There are times when “what we think we
know” keeps us from asking the necessary questions. However, it is critical
we understand what others have done and what the experts know about
the area we are experimenting in. We can use this learning to critically
reflect on the physical world. This knowledge of terms, jargon, tools, his-
tory, and assumptions in our field of study will provide us with openings
for action (questions and curiosities) that were previously unavailable to
us within the constraints of our existing model of the world. We move
back and forth between gathering information about our subject and
venturing into the unknown, all the while increasing what we know.
Knowledge of strategy in these gaming examples parallels the stra-
tegic knowledge that we need in scientific investigations. It is impor-
tant to understand which strategic tools to use in the experiment.
Dr. Khorasani divides this strategy into data analysis and statistical
knowledge and thinking. By combining these pieces of the strategy with
our knowledge of subject matter, we can begin to explain the results of
our investigations. A scientist or engineer can conduct an investigation
without statistics, but it is impossible for the experiment to be performed
objectively and efficiently without an understanding of statistics (Bode
et al. 1986, Boring 1919, Box et al. 1978, Khorasani 2016). A good scien-
tist or engineer becomes much better with the knowledge of strategy.
This is particularly true in fields with large data sets such as medicine
where conclusions have public health implications. As medical doctor
Vladica Velickovic wrote, “Involvement of biostatisticians and mathe-
maticians in a research team is no longer an advantage but a necessity”
(Velickovic 2015).
1.4.1 Understanding Variation
The first source of difficulty is understanding variation. Variation is a double-
edged sword. Variation happens no matter what we do. We’ll spend three
chapters with the quantification of variation—variation that adds uncer-
tainty to our data. In our experiments, we only want variation due to the
changes we are making to our experimental variables, i.e., the variation we
can control. We can’t always control variation; therefore, it is critical that
we understand the potential sources of variation and plan our experiments
with variation in mind. In this book, we will examine data and learn how
to minimize and quantify uncertainty due to various types of variation—
random fluctuations, mistakes, and systematic bias. Once we’ve accounted
for random and systematic variation (bias) and minimized the potential
[Figure: the number of patents granted per year (left axis, 0 to 350,000) and U.S. life
expectancy (right axis, 68 to 70.5 years) plotted against year, 2006 to 2015.]
FIGURE 1.1
Graphic that illustrates the correlation between the number of patents granted in the
United States and life expectancy in the United States. (From United States Trademark and
Patent Office, U.S. Patent Activity Calendar Years 1790 to the Present: Table of Annual U.S.
Patent Activity Since 1790, https://www.uspto.gov/web/offices/ac/ido/oeip/taf/h_counts.htm,
2016; World Happiness Report: Overview, http://worldhappiness.report/overview/.)
[Figure: scatter plot of the number of patents granted (150,000 to 300,000) versus US life
expectancy at birth (68.5 to 70.5) with a linear fit.]
FIGURE 1.2
Bivariate fit of the number of patents granted in the United States and life expectancy in
the United States. (From World Happiness Report: Overview, http://worldhappiness.report/overview/;
United States Trademark and Patent Office, U.S. Patent Activity Calendar Years 1790 to
the Present: Table of Annual U.S. Patent Activity since 1790, https://www.uspto.gov/web/offices/ac/ido/oeip/taf/h_counts.htm, 2016.)
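To see why a strong correlation such as the one in Figure 1.2 does not establish causation, consider a minimal sketch; the numbers below are invented to mimic the figure’s upward trends and are not the USPTO or life-expectancy data.

```python
import numpy as np

# Invented yearly values (NOT the real data): both series simply rise.
patents = np.array([173, 185, 190, 208, 244, 248, 277, 303, 326, 325]) * 1000
life_expectancy = np.array([68.5, 68.7, 68.8, 69.0, 69.2, 69.4,
                            69.6, 69.8, 70.0, 70.2])

r = np.corrcoef(patents, life_expectancy)[0, 1]
print(f"Pearson r = {r:.2f}")  # near 1, yet patents do not cause longevity;
# both simply trend upward with time, a classic lurking variable.
```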
1.5 BOOK ORGANIZATION
This book is organized into a series of lessons or essential core concepts
that fit together to solve a big-picture problem. Problem solving and
experimentation are key elements in the development of new products,
new technologies, and new ideas. Our foundation is a solid grasp of engi-
neering and physics principles (knowledge of subject matter). Adding
strategic experimental design thinking to this foundation, we can build
a solid, repeatable experiment with full awareness of the limitations and
strengths of our experimental findings. With the big picture in our sights,
each chapter explores critical concepts related to variation that build upon
one another. The lessons in this book are organized around variation,
which is introduced in Chapter 4.
We’ll walk through how to encapsulate all these pieces in a coherent strat-
egy and solid experimental plan.
1.6 KEY TAKEAWAYS
Today, we can stand on the shoulders of many giants who have come
before us. Duplicating and mimicking the results of others are great ways
to begin experiencing experimentation. Using recipes created by the sci-
entists and engineers who preceded us allows us to learn about the sub-
ject. We will learn by doing these experiments for ourselves and perhaps
repeating some of them. “Graduate students could, in addition to learn-
ing the guidelines, train by replicating published studies” (Fanelli 2013).
Experimentation builds experiential muscles that no amount of reading
what others have done can give us and that no one can take from us.
As scientists or engineers with a solid foundation in physics and engi-
neering principles and a few statistical tools, we should be able to begin
experimental exploration and discovery for ourselves. My goal in this
book is that we come away knowing how to begin to discover for ourselves
through experimental investigation. Reading and working through this
book, we will become fully equipped with the tools, skills, and fearless-
ness required to discover for ourselves those things that may be known
to others or may not be known at all. With a lot of patience, knowledge, strat-
egy, and a bit of luck, we may discover something previously unknown to
anyone.
P.S. Take some time to explore the ideas in this chapter. Talk to sci-
entists and engineers about their own journey. Ask them open-ended
questions about the path that led them to where they are. This can be an
REFERENCES
Begley, C. G. and L. M. Ellis. 2012. Raise Standards for Preclinical Cancer Research.
Nature 483:531–533.
Berger, W. 2014. A More Beautiful Question: The Power of Inquiry to Spark Breakthrough
Ideas. New York: Bloomsbury.
Bode, H., F. Mosteller, J. W. Tukey, and C. Winsor. 1986. The Education of a Scientific
Generalist. The Collected Works of John W. Tukey, Volume III: Philosophy and Principles
of Data Analysis: 1949–1964. ed. L. V. Jones. Pacific Grove, CA: Wadsworth. (The origi-
nal paper was published in 1949.)
Boring, E. G. 1919. Mathematical vs. Scientific Significance. Psychological Bulletin
16:335–339.
Box, G. E. P., W. G. Hunter, and J. S. Hunter. 1978. Statistics for Experimenters: An
Introduction to Design, Data Analysis and Model Building. New York: John
Wiley & Sons.
Brown, L., Editor, 1993. The New Shorter Oxford English Dictionary on Historical Principles.
4th Ed. Oxford: Clarendon Press.
Child, J. 1961. Mastering the Art of French Cooking. New York: Alfred A. Knopf.
Easton, V. J. and J. H. McColl. 2016. The Statistics Glossary, v 1.1. http://www.stats.gla.ac.uk
/steps/glossary/.
Economist. 2013. Unreliable Research: Trouble at the Lab. Economist October 19.
Ephron, N. 2009. Julie and Julia. http://www.sonypictures.com/movies/juliejulia/.
Falin, L. 2013. Correlation vs. Causation: Everyday Einstein: Quick and Dirty Tips for Making
Sense of Science. Scientific American, October 2. https://www.scientificamerican.com
/article/correlation-vs-causation/.
Fanelli, D. 2013. Redefine Misconduct as Distorted Reporting. Nature 494(7436):149.
2
Eureka! And Other Myths of Discovery
The most exciting phrase to hear in science, the one that heralds new dis-
coveries, is not “Eureka!” but “That’s funny…”
Isaac Asimov
In every field, there are myths, and science is no exception. Before we delve
any further into problem solving, I hope to dispel several myths about
scientific discovery: the tidy fairy-tale experiment, the lightning bolt of
inspiration, and the lone genius.
There are other myths that we could discuss, but I find that these are
some of the more common and dangerous ones, if not physically then to
our psyche. The biggest problem with these myths is that they get in the
way of many new scientists and engineers and stop others. All the hard
work, dedication, dead ends, and failures of real problem solving and
experimentation are rarely mentioned.
2.1 FAIRY TALES
In retrospect, a published experiment may look like a perfect story, with
the beginning leading inexorably to the ending as a “fairy tale” (see Figure
2.1). When we read article after article in professional journals describ-
ing nice, tidy experiments with perfect endings, we tend to assume those
[Figure: a single arrow running from “In the beginning.....” straight to “Results.”]
FIGURE 2.1
Fairy tale experimentation.
Introduction
Electrons in germanium are confined to well-defined energy bands
that are separated by “forbidden regions” of zero charge-carrier
density. You can read about it yourself if you want to, although I don’t
recommend it. You’ll have to wade through an obtuse, convoluted
discussion about considering an arbitrary number of non-coupled
harmonic-oscillator potentials and taking limits and so on. The upshot
is that if you heat up a sample of germanium, electrons will jump from
a non-conductive energy band to a conductive one, thereby creating
a measurable change in resistivity. This relation between temperature
and resistivity can be shown to be exponential in certain temperature
regimes by waving your hands and chanting “to first order.”
Experiment procedure
I sifted through the box of germanium crystals and chose the one that
appeared to be the least cracked. Then, I soldered wires onto the crys-
tal in the spots shown in Figure 2.2b of Lab Handout 32. Do you have
any idea how hard it is to solder wires to germanium? I’ll tell you: real
goddamn hard. The solder simply won’t stick, and you can forget about
getting any of the grad students in the solid state labs to help you out.
Once the wires were in place, I attached them as appropriate to the
second-rate equipment I scavenged from the back of the lab, none of
which worked properly. I soon wised up and swiped replacements
from the well-stocked research labs. This is how they treat under-
grads around here: they give you broken tools and then don’t under-
stand why you don’t get any results.
In order to control the temperature of the germanium, I attached
the crystal to a copper rod, the upper end of which was attached to a
heating coil and the lower end of which was dipped in a thermos of
liquid nitrogen. Midway through the project, the thermos began leak-
ing. That’s right: I pay a cool ten grand a quarter to come here, and yet
they can’t spare the five bucks to ensure that I have a working thermos.
Result
Check this shit out (Fig. 1). That’s bonafide, 100%-real data, my friends.
I took it myself over the course of two weeks. And this was not a lei-
surely two weeks, either; I busted my ass day and night in order to pro-
vide you with nothing but the best data possible. Now, let’s look a bit
more closely at this data, remembering that it is absolutely first-rate. Do
you see the exponential dependence? I sure don’t. I see a bunch of crap.
[Plot: “Resistivity vs. temperature,” showing R/R_o from 0 to 1 against T (K) from 100 to 350.]
Conclusion
Going into physics was the biggest mistake of my life. I should’ve
declared CS (computer science). I still wouldn’t have any women, but
at least I’d be rolling in cash.

FIGURE 2.2
Lab report written by physics student Lucas Kovar.
write up a report and, in the case of Lucas Kovar, hope that the professor
has a sense of humor. This feels like the reality of some of our early experi-
mentation. However, when we read professional science and engineering
journal articles, they tell a completely different story. Interestingly enough,
there are several more parts to this story. Lack of repeatability and reluc-
tance to publish negative results are parts of the story that often go untold.
A 2013 article published by The Economist entitled “Unreliable
Research: Trouble at the Lab” broaches the topic of unrepeatable scien-
tific results (Economist 2013). Prior to this, in 2005, Stanford professor
of epidemiology and head of METRICS (Meta Research and Innovation
Center at Stanford) John Ioannidis presented a paper to the International
Congress on Peer Review and Biomedical Publication (Ioannidis 2005).
In this work, he reported, “most published research findings are probably
false.” Additionally, he showed that, statistically, most claimed research
findings are false. At the time, only 1 out of every 20 papers reported false-
positive results (Ioannidis 2005). Don’t get me wrong here: I’m not saying
we should question the published results for the band gap of germanium.
What I am saying is that it’s okay to bring a dose of suspicion to newly
published research findings. Skepticism is healthy.
There is a lot of pressure in academia to publish. Negative results are not
considered interesting by journals. In each publication, authors want to
expound on their positive results. Between 1990 and 2007, the publication
of negative results across the sciences actually dropped from 30% to 14%,
according to Daniele Fanelli while at the University of Edinburgh (Fanelli
2013). Very little information gets published in the sciences related to null
or negative results. We don’t see these as successes, as opportunities for
further learning. We see these as dead ends, as failures.
any progress and then we move to another area. Solving problems in sci-
ence is similar. The path to get to the solution is not known, and in all like-
lihood, there are multiple ways to reach a discovery. As Walter Isaacson
writes the story of the history of Silicon Valley, ideas come from many
sources, converging and diverging at the present moment (Isaacson 2014).
Ideally, scientific research is a process of guided learning. According
to Dr. Kevin Ashton in his book How to Fly a Horse, “Imagination needs
iteration. New things do not flow finished into the world. Ideas that seem
powerful in the privacy of our head teeter weakly when we set them on
our desk. But every beginning is beautiful. The virtue of the first sketch is
that it breaks the blank page. It is the spark of life in the swamp. Its quality
is not important. The only bad draft is the one we do not write” (Ashton
2015). The object of the methods and tools presented herein is to make
the process of discovery as efficient as possible. However, we may feel as
if we are caught in the scary maze of experimentation, as in Figure 2.3.
We may learn in an iterative manner, but the act of creation is more like
wading through a maze where, at each step, we stand on the shoulders of
someone who came before us (Ashton 2015). Astrophysicist Mario Livio
describes the evolution of scientific progress in a description of the theory
FIGURE 2.3
Experimentation may feel like a very scary maze at times.
2.2 LIGHTNING BOLTS
Dr. Ashton opens his book How to Fly a Horse with a reprint from 1815
General Music Journal that was rumored to have been written by Mozart
about his creative process. “When I proceed to write down my ideas the
committing to paper is done quickly enough, for everything is, as I said
before, already finished; and it rarely differs on paper from what it was in
my imagination” (Ashton 2015). The evidence cited by many for Mozart’s
effortless compositional creation is the many perfect manuscripts, with
no corrected mistakes. Although it has continued to be referenced by many
authors, this document is a forgery. Mozart’s widow kept his manuscripts
but stated in a letter that she had discarded the “unusable autographs”
before selling the rest (Weisberg 1993). This creation myth is not even close
to resembling the real creative struggle that Mozart went through. Yes, he
was gifted in music, but his work was not magical. There was no dream or
lightning bolt that struck him and delivered complete symphonies. What
was his secret? It was work. He wrote and rewrote scores.
We often describe a discovery as a light bulb coming on. I love the fol-
lowing bit of trivia on the history of the link between the light bulb and
a bright idea. In 1919, before audio was integrated into films, there was a
cartoon character named Felix. Felix the cat was the brainchild of artist
Otto Messmer and producer Pat Sullivan. Felix used the appearance of
symbols and numbers in the film as objects of opportunity. In Felix’s films,
light bulbs would appear above his head when he had an idea. This symbol
has long outlived its originator, yet when we become aware of something
new, this light bulb comes on for us. We could think of everything that we
don’t know as dark space. As we experiment and try new things, we work
for light bulbs to illuminate these great unknowns.
Progress has an iterative nature. “Make small changes, small changes
right where you are. Large changes occur in tiny increments. Small
actions lead to larger increments in our creative lives. Take one small
daily action instead of indulging in the big questions. Creativity requires
action … We prefer the low-grade pain of occasional heart stopping to the
drudgery of small and simple daily steps in the right direction” (Cameron
1992). Double Nobel Prize winner Marie Curie wrote, “A great discovery
does not issue from a scientist’s brain ready-made, like Minerva spring-
ing fully armed from Jupiter’s head; it is the fruit of an accumulation of
preliminary work” (Goldsmith 2005). Science progresses incrementally.
Professor Randall, in Knocking on Heaven’s Door, wrote, “That’s how sci-
ence works. People have ideas, work them out roughly, and then they or
others go back and check the details. The fact that the initial idea had
to be modified after further scrutiny is not a mark of ineptitude—it’s a
sign that science is difficult and progress is often incremental” (Randall
2011). A friend of mine was famous for showing up at engineering review
meetings and asking us “How do you move a ten ton weight down the
street?” The answer is “one inch at a time.” This is the really tricky part:
keeping our eyes on the goal but moving inches forward every day until
we arrive at our destination.
Even incorrect theories and explanations of results can potentially be viewed
as progress; one inch down an incorrect path, we learn there is no need to
2.3 GENIUSES
Griffins, ghouls, gnomes, giants, and geniuses are all mythical creatures.
I apologize for the alliteration, but this particular myth has stopped so
many people from pursuing science and engineering. I wanted the allit-
eration to provide something that would be easy to remember. There are
several variations of this particular myth that we say to ourselves which
stops us: either “I’m not a genius” or “I’m not creative” or “I’m not natu-
rally talented.” These statements serve to keep many people from ever get-
ting started with anything, especially the sciences.
Over the years, there have been many scientists who have attempted
to prove that certain types (races, genders, ethnicities, etc.) of people are
geniuses or inherently have more aptitude for learning than others. Repeatedly,
these experiments have failed and in many cases missed children who
have gone on to win Nobel Prizes. One such example was Stanford
University Professor Lewis Terman, developer of the Stanford-Binet IQ
test, who identified 1500 children of exceptional abilities (who were called
“Termites” and whose IQs averaged 151) and rejected
168,000 others as ordinary. The Termites were studied for more than
35 years. Some of them did great things but others went on to live ordi-
nary lives. (His work is actually very controversial and flawed, I’m only
referencing it here as an example of one experiment that dispels the myth
of genius.) The real story is the rejected majority (Ashton 2015). Included
in this rejected majority were Physics Nobel Laureates William Shockley
and Luis Alvarez. Shockley’s Nobel Prize was for the coinvention of the
transistor and Alvarez won for proposing that an asteroid may have
crashed to Earth and killed the dinosaurs. How could they have been
missed by a genius test?
Unlike when Terman developed the IQ test, we now know that IQ
can change, grow, and develop. With increased exposure, new neuronal
2.4 KEY TAKEAWAYS
Every scientist or engineer who has worked in a lab or performed an
experiment has been affected to some extent by one or all of the myths
discussed in this chapter. Even today, we continue to perpetuate these
myths. Read almost any technical journal and it appears that the experi-
ments were magical. They read like the fairy tales of our childhoods. The
true story, all the dead ends, all the stumbles and falls, are omitted. It can
be frustrating, but don’t be discouraged. We don’t know exactly where
ideas come from. We know that “lightning strikes” of ideas are rare, if
they have ever really occurred. What appears to have been an “Aha” or
“Eureka” moment was really the result of a lot of hard work, stewing over
ideas and concepts and taking lots of erroneous dead ends. Similarly with
geniuses and “naturals,” there are those who succeed, but those successes
are a result of many hours of practice and hard work. As we learn more
about human behavior, we realize that passion and perseverance are really
the keys to success. The results from our experimentation WILL BE con-
sistent with our experimental setup. Therefore, we need to make sure
that our experimental setup is as good as we can make it. In the upcom-
ing chapters, we will begin to examine sources of variation that, if left
unchecked, can affect our experiment. This will allow us to truly explore
the effect(s) of interest. First, however, let’s cover an important and criti-
cal topic: communication.
P.S. In the face of discouragement, when we aren’t the top in the class or
bad things happen, that is when our grittiness needs to kick in. Think of
an experiment that failed or didn’t go the way it should have. Do a post-
mortem on this failure. Can you identify why it didn’t work? Try to come
up with a short list of maybe 10 things that could possibly have changed
the outcome of the experiment.
REFERENCES
Ariely, D. 2009. Predictably Irrational: The Hidden Forces That Shape Our Decisions. New York:
HarperCollins.
Ashton, K. 2015. How to Fly a Horse: The Secret History of Creation, Invention, and Discovery. New
York: Doubleday/Random House.
Berkun, S. 2010. The Myths of Innovation. Sebastopol, CA: O’Reilly Media.
Box, G. E. P., W. G. Hunter and J. S. Hunter. 1978. Statistics for Experimenters: An Introduction
to Design, Data Analysis and Model Building. New York: John Wiley & Sons.
Cameron, J. 1992. The Artist’s Way: A Spiritual Path to Higher Creativity. New York: Jeremy P.
Tarcher/Putnam.
Duckworth, A. 2016. Grit: The Power of Passion and Perseverance. New York: Scribner/Simon
& Schuster.
Economist. 2013. Unreliable Research: Trouble at the Lab. The Economist October 19.
Fanelli, D. 2013. Redefine Misconduct as Distorted Reporting. Nature 494(7436):149.
Feynman, R. P. 1965. The development of the space-time view of quantum electrodynamics.
Nobel Prize Lecture. http://www.nobelprize.org/nobel_prizes/physics/laureates/1965
/feynman-lecture.html.
Goldsmith, B. 2005. Obsessive Genius: The Inner World of Marie Curie. New York: W. W. Norton
& Company.
Ioannidis, J. P. A. 2005. Why Most Published Research Findings Are False. PLoS Medicine
2(8):e124.
Isaacson, W. 2014. The Innovators: How a Group of Hackers, Geniuses, and Geeks Created the
Digital Revolution. New York: Simon & Schuster.
Kovar, L. 2001. Germanium Band Gap, My Ass. Annals of Improbable Research 7(3). www
.improbable.com/magazine. The complete paper can also be found at http://pages.cs
.wisc.edu/~kovar/hall.html.
Leslie, I. 2014. Curious: The Desire to Know and Why Your Future Depends on It. New York: Basic
Books.
Livio, M. 2013. Brilliant Blunders: From Darwin to Einstein—Colossal Mistakes by Great Scientists
That Changed Our Understanding of Life and the Universe. New York: Simon & Schuster.
Merton, R. K. 1965. On the Shoulders of Giants: A Shandean Postscript. New York: The Free
Press.
Nobel. 2016. http://www.nobelmuseum.se.
Oakley, B. 2014. A Mind for Numbers: How to Excel at Math and Science (Even if You Flunked
Algebra). New York: Jeremy P. Tarcher/Penguin.
Randall, L. 2011. Knocking on Heaven’s Door: How Physics and Scientific Thinking Illuminate the
Universe and the Modern World. New York: HarperCollins.
Weisberg, R. W. 1993. Creativity: Beyond the Myth of Genius. New York: W. H. Freeman and
Company.
3
Experimenting with Storytelling
Carmine Gallo
In written form, scientists must use words, graphics, tables, and statistical
summaries to effectively communicate their experiments and the find-
ings. As engineers and scientists (whether professional or amateur), we
need to be fluent in all of these types of communication.
audience take us seriously. I want to stress the critical nature of this part
of experimentation by ensuring that communication is at the forefront of
discussion.
Developing dexterity with all forms of communication gives our
intended audience confidence not only in the data but also in us as engi-
neers and/or scientists. The form of communication may vary depend-
ing on our situation but, “the ability to communicate is a very important
skill.” Statistician Dr. Fred Khorasani labels communication dexterity as
one of the basic skills in engineering. The other basic skills required to
do our experiments may change over time. For example, many years ago
(before my time), using a slide rule was a basic skill for engineers and sci-
entists, but today navigating specialized software packages is a basic skill.
Dr. Khorasani writes, “People with basic skills are much more effective in
investigation or in problem solving” (Khorasani 2016). The stronger our
basic skill set is, the more we will be able to do and the more effective we
will be in the long run.
In this chapter, we will cover the language of science as well as when
and how to use graphics and tables. The other communication tool,
statistical summaries, will not be specifically addressed in this chap-
ter. Experimentation, measurement, and statistics form a holy trinity.
The measurements involved in the early sciences—first astronomy, then
experimental physics—put increased pressure on mathematicians to
understand and quantify random error. These needs drove the develop-
ment of statistics. Statistics “provides a set of tools for the interpretation
of data that arise from observation and experimentation. … But statistics
also provides tools to address real-world issues, such as the effectiveness of
drugs or the popularity of politicians, so a proper understanding of statis-
tical reasoning is as useful in everyday life as it is in science” (Mlodinow
2008). We will cover statistical summaries throughout the remainder of
the book as this may be the least familiar communication tool.
TABLE 3.1
Common Venues for Sharing Engineering and Scientific Results

Lab report, dissertation, internal memo. Format: written. Audience: professors, teaching assistants, small team. Audience size: small group. Features: less formal, more data heavy.

Journal article. Format: written. Audience: other professionals, some very familiar with topic. Audience size: larger audience, dependent on journal circulation. Features: formal; language should be concise and clean.

Poster. Format: oral with aids. Audience: may be familiar with topic. Audience size: small groups at a single time but large groups overall. Features: less formal but not casual; free-form Q&A with audience.

Talk/speech. Format: oral, may include aids but not essential. Audience: typically familiar with topic. Audience size: tends to be larger audience. Features: formal, more structured Q&A.
It appears to be a source of pride for scientists and engineers that our writ-
ing be dull. “Most academics find getting the initial ideas the most enjoy-
able part of research and conducting actual research is almost as much
fun. But few enjoy the writing, and it shows. To call academic writing dull
is giving it too much credit. Yet to many, dull writing is a badge of honor.
To write with flair signals that you don’t take your work seriously and
readers shouldn’t either,” writes University of Chicago Business School
Professor Richard Thaler, the father of the field of behavioral economics
(Thaler 2015). We can fill our papers or talks with lots of “technomumble
jumble”; however, the risk we run with this “showboating” is that we lose
our audience (Williams 2013). We fail in our attempts to have them see our
discovery, our learnings. What if we could grasp something from every
paper we read? What if we enjoyed journals written by our peers? What
if our writing inspired and captivated others? Let’s look at one way that
this may be possible. Physics Nobel Prize Laureate and former California
Institute of Technology Physics Professor Richard Feynman is a classic
exception to the dull, dry, boring academic writing. A reader need only
read one chapter of his books to understand the absolute pleasure Professor
Feynman had with discovery. Professor Feynman was a storyteller.
Storytelling may seem like an odd topic for a book on experimentation,
even if the chapter topic is communication. Just so there is no confusion,
I’m not talking about fictional storytelling. However, I am choosing story-
telling very deliberately and intentionally. As human beings, we love sto-
ries. It is one of the few truly universal traits that we share across cultures
and throughout history (Hsu 2008).
with how best to solve our mystery. We consider different options and
approaches. We have to make choices and decisions that may impact the
results. We perform the experiment and then it is time for interpretation
of a set of data. What do the results mean? What impact will these find-
ings have? What are the broader implications? With the new experimental
results in hand, we build “a face of reality” (Hoffman 2014). The world is
then seen in a different light by those who read or learn of this work.
When communicating scientific work, we are of two minds. We are the
author of the story and one of the main characters. “The protagonists are the
investigators of nature,” advises Professor Hoffman. We must be in the story
as observer and interpreter. We must grapple with performing our observa-
tions and measurements. At the same time, we are asked to interpret these
findings. We are asked to frame the story for the reader or listener. “Carefully
done measurements of observables are an essential ingredient of science,
against which theories must be measured. They constitute facts, some will
say. Well, facts are mute. One needs to situate the facts, or interpret them.
To weave them into nothing else but a narrative” (Hoffman 2014).
Good stories teach us. We learn from stories. There is a wonderful rich
research area in neuroscience and psychology studying the effects of sto-
ries on our brains. Researchers are looking at which areas of our brains
are activated while reading and listening. They are looking at how our
brains couple to and mirror as we listen. We are discovering how impor-
tant stories are to learning and developing relationships within a social
world (Hasson et al. 2012). Stories “cross the barriers of time, past, present
and future, and allow us to experience the similarities between ourselves
and through others, real and imagined” (Stanton 2012).
We’ve seen that Galileo and Newton dismantled Aristotle’s armchair sci-
entific “sit back and think about it” philosophy. However, in his role as a phi-
losopher, and given what it takes to effectively communicate, Aristotle was on
to something. He proposed that persuasion (effective communication) had
three components: ethos, logos, and pathos. These are Greek words mean-
ing character, logic, and experience. We can think of ethos or character as
our expertise or reputation. This makes us credible subject matter experts.
Logos is all the data, statistics, and logical arguments we use to back-up
our claims. Finally, there is pathos, the experience of our invention or dis-
covery. The nonprofit Technology, Entertainment and Design (TED) Ideas
Worth Sharing has set a new presentation bar (Anderson 2016). TED brings
together people from all walks of life with the goal of changing the world
through the sharing of ideas. Carmine Gallo, author and communication
coach, has analyzed TED presentations from some of the greatest speakers
in the world. Gallo found that ethos and logos only account for less than
half of the presentation. More than half of the presentation was pathos
(Gallo 2014).
When we present the results of our experiments, we should strive to
connect with our audience using the story of our investigation. Professors
Feynman and Hoffman perfected the art of storytelling. Their books and
lectures are filled with stories that allow them to connect with everyone
in the audience. University of Houston Graduate College of Social Work
Professor and #1 New York Times best-selling author, Brené Brown, suggests
that storytelling is a means of accomplishing pathos. Although she is a world-
renowned expert in social work, Professor Brown gladly accepts the title
“Storyteller” when she presents. She tells stories with her data. “Maybe stories
are just data with soul,” she suggests, “… we’re all storytellers” (Gallo 2014).
Gallo writes, “Researchers have discovered that our brains are more active
when we hear stories. A wordy PowerPoint slide with bullet points activates
the language-processing center of the brain, where we turn words into mean-
ing. Stories do much more, using the whole brain and activating language,
sensory, visual, and motor areas” (Gallo 2014). Dissertations and journal
articles, regardless of the field, are still for the most part expositions, straight-
forward explanations with lists of facts and figures. Although most academic
journals have a formal outline that must be followed for an article to be pub-
lished, as engineers and scientists, we have the option to communicate our
work in such a way that both our story as well as nature’s story comes through.
tool for presenting large data sets concisely and in a coherent manner. In
typical scientific reports (journal articles, lab reports, etc.), visual displays
account for up to 50% of real estate on the page and occasionally more.
The graphics we use should enrich and supplement the text, equations,
tables, and statistical summaries. An excellent graphic will summarize
and display complex ideas in a clear, precise, and efficient manner.
When choosing how best to display data, it is important that there be
a clear purpose for the graphic. The purpose could be description, explo-
ration, comparison, tabulation, or decoration. The intent of the graph
should be clear. Good graphical displays show data in a completely self-
explanatory manner. “The greatest value of a picture is when it forces us to
notice what we never expected to see” (Tukey 1977). The focus of the graph
should be on the data, not on the “methodology, graphic design, the tech-
nology of graphic production, or something else” and “not how perfectly
stylish the pages look” (Tufte 2001, 2006). Good graphics reveal data and
don’t distract from or distort the results.
Graphics can be more precise and revealing than regular statistical
computations. As Professor Tufte wrote, “the essential test of text/image
relations is how well they assist in understanding of the content.” He con-
tinues, “Evidence is evidence, whether words, numbers, images, diagrams,
still or moving. It is all information after all. For readers and viewers, the
intellectual task remains constant regardless of the particular mode of
evidence: to understand and to reason about the materials at hand and to
appraise their quality, relevance, and integrity” (Tufte 2006).
There are many reasons we might want to include visual displays in our
writing; however, the two most common purposes in journal articles, internal
memos, and reports are (1) to easily communicate the experimental setup or
(2) to easily communicate the results of the experiment. Although there are
many visual display tools that can be used to accomplish these purposes, let’s
just look at a few of the more common and effective visual tools.
3.4.1 Experimental Sketch
We want to get in the habit of thinking about, considering, and distinguishing
everything that might have an impact on the experiment that we are performing.
Accompanying any experimental report or paper, there is typically a sketch
of the experiment. Occasionally, a photograph of the experimental setup is
used. Photographs can be distracting and actually take away from what we
want to communicate. A sketch or even a block diagram such as a process
46 • Problem Solving for New Engineers
flow diagram, can show very specific views. Labeling can provide the neces-
sary amount of detail. The remainder of the essential details can be reserved
for the text. If there are many parts, a legend can be used. Compare Figures
3.1 and 3.2 of a belt furnace used to braze and anneal metal parts. The simple
sketch allows us to identify the essential elements in the furnace. The arrows
can indicate the direction of motion. This is more difficult to show with the
photograph than a simple sketch. The belt conveys the parts through the fur-
nace at a set speed. The parts to go through a rapid thermal process are loaded
on one end and unloaded on the other end after going through the hydrogen
furnace. The parts are subject to a temperature profile determined by the
belt speed and the set point temperature inside the furnace. The contrast
in the amount of information communicated by these simple figures is
striking. We can learn so much more from the sketch than the photograph.

[Figure: sketch of a belt furnace showing the hydrogen atmosphere heat zone, N2 curtains,
temperature ramp up/ramp down, and parts carried on a belt from the load end to the unload end.]
FIGURE 3.1
Sketch of a belt furnace.

FIGURE 3.2
Photograph of a belt furnace manufactured by C. I. Hayes. (Courtesy of Coherent, Inc.)
FIGURE 3.3
Common symbols used in process flow charts.
3.4.3 Input–Process–Output Diagram
Good science involves understanding all the factors that might enter into
a measurement.
Lisa Randall (2011)
[Flow chart: add ~50 mL of DI water to the beaker; add 5 drops of indicator to the beaker
and swirl; add titrant to the unknown in the beaker slowly; if the color of the indicator has
changed (e.g., from red to blue), record the number of milliliters to the end point in the
notebook (to the nearest 0.01 mL), otherwise continue adding titrant; dispose of solutions
according to instructions.]
FIGURE 3.4
Example of a chemistry experiment flow chart. (Courtesy of Professor D. Nivens, 2016,
http://www.chemistry.armstrong.edu/nivens/Chem2300/flowchart.pdf.)
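Seen one way, a flow chart like Figure 3.4 is just a loop wrapped around a decision. A minimal Python sketch of that logic follows; the end-point volume and the increment are hypothetical stand-ins for the manual steps.

```python
volume_added = 0.0   # mL of titrant dispensed so far
END_POINT = 12.34    # hypothetical volume at which the indicator changes color

def color_changed(volume_ml):
    """Stand-in for watching the indicator (e.g., red to blue)."""
    return volume_ml >= END_POINT

# Add titrant slowly until the indicator changes, then record the reading.
while not color_changed(volume_added):
    volume_added += 0.05  # one small addition of titrant

print(f"End point reached at {volume_added:.2f} mL")  # nearest 0.01 mL
```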
[Figure: six input categories (manpower, materials, methods, machines, mother nature,
measurement) feeding into an experimental process, with an arrow out to outputs.]
FIGURE 3.5
Generic Input–Process–Output diagram.
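The same diagram can also be captured as a simple data structure when planning an experiment. A minimal sketch follows, with placeholder entries under each of the input categories of Figure 3.5.

```python
# Placeholder Input-Process-Output record; every entry is illustrative.
ipo = {
    "inputs": {
        "manpower": ["operator training", "procedures followed?"],
        "materials": ["raw material lots"],
        "methods": ["recipe and settings"],
        "machines": ["equipment identity and state"],
        "mother nature": ["ambient temperature, humidity"],
        "measurement": ["gauges and sensors used"],
    },
    "process": "experimental process",
    "outputs": ["measured responses"],
}

for category, items in ipo["inputs"].items():
    print(f"{category}: {', '.join(items)}")
```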
3.4.4 Infographics
An ill-specified or preposterous model or a puny data set cannot be rescued
by a graphic or calculation, no matter how clever or fancy.
Tufte (2001)
[Figure: an Input–Process–Output checklist for the belt furnace. Inputs include Manpower
(are procedures followed/referenced?); Materials, such as (a) copper (oxygen-free Cu,
cleaned), (b) NiCuSil braze alloy (correct Ni, Cu, and silver compositions?), and (c) nickel
(Ni 200?, cleaned); and Measurements, such as (a) thermocouple (control), (b) thermocouple
(overtemp), (c) flowmeter (gas), (d) water flow switches (safety), (e) gas flow switches
(safety), and (f) gas burnoff ignitor sensor (closed circuit interlock).]
FIGURE 3.6
Deposition chamber
Main reactive
gas stream
Forced convection of reactants Forced convection of byproducts
to the deposition region away from the deposition region
1 7
Substrate
FIGURE 3.7
Infographic showing the processes involved during chemical vapor deposition. (From
Plummer, J.D., Deal, M.D., Griffin, P.B., Silicon VLSI Technology: Fundamentals, Practices
and Modeling, Prentice Hall, Upper Saddle River, NJ, 2000.)
The lower the pressure inside the chamber, the longer the mean free path,
which means that the electrons, ions, or molecules can travel further
inside the chamber before reacting. The mean free path will affect each of
the seven processes illustrated in the graphic.
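For a rough sense of that pressure dependence, the kinetic-theory estimate of the mean free path is lambda = kT/(sqrt(2)*pi*d^2*P). A minimal sketch follows; the molecular diameter is a typical order-of-magnitude value, not one tied to any particular process gas.

```python
import math

def mean_free_path(pressure_pa, temperature_k=293.0, diameter_m=3.7e-10):
    """Kinetic-theory mean free path: kT / (sqrt(2) * pi * d^2 * P).

    The default diameter is roughly that of an N2 molecule (illustrative).
    """
    k_b = 1.380649e-23  # Boltzmann constant, J/K
    return k_b * temperature_k / (
        math.sqrt(2) * math.pi * diameter_m ** 2 * pressure_pa
    )

print(mean_free_path(101_325))  # ~7e-8 m at atmospheric pressure
print(mean_free_path(1.0))      # ~7e-3 m at 1 Pa: lower pressure, longer path
```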
3.5.1 Components of Graphs
While lying in bed one morning in 1636, the mathematician and philoso-
pher René Descartes watched a fly crawling on the wall. As he watched,
it dawned on him that the path the fly was making on the wall could be
captured numerically. He noticed that the fly was initially 10 inches above
the floor and 8 inches from the left edge of the wall. A moment later the fly
was at 11 inches above the floor and 9 inches from the left edge. Descartes
drew two lines at right angles, a horizontal line to represent the floor and a
vertical line to represent the left edge of the wall, running from floor to ceiling. The two lines intersected in the corner, where the wall met the floor. As
long as the fly was walking on the wall, its path could be traced precisely—
a certain number of inches from the floor and a certain number of inches
from the left wall. Descartes translated the idea of latitude and longitude
for identifying a location on earth relative to the poles. The notion of lati-
tude and longitude to identify a global position had been around since the 3rd century BC. However, Descartes’ realization was that two quantities
could be used to represent a relative position. Descartes was able to con-
struct a grid relative to the two lines. The idea of the graph as a “sophis-
ticated abstraction” created a “conceptual revolution.” Descartes “showed
that algebra and geometry were two languages that described a shared
reality” (Dolnick 2011). The coordinate plane (grid) that is created with
the two lines Descartes envisioned was named in his honor, the Cartesian
coordinate system. Each location on the grid is defined with two identi-
fiers, which provide the location of the data point relative to the two lines
known as the x axis and the y axis (Johnson and Moncrief 2002).
Just as Descartes was able to map out the fly’s path with a graph, we
can use graphs to reveal our data at multiple levels of detail. Graphs are
a wonderful way of understanding and sharing information. University of
Colorado Physics Professor John Taylor wrote “… drawing graphs helps you
understand the experiment and the physical laws involved” (Taylor 1982).
Every professor or manager who asks us to create a graphic will have
his or her own guidelines for what is important and essential to include in
a graph. However, there are four basic elements to any graphic: labeling,
scaling, the data itself, and possibly the trend line. Before proceeding with
these essential elements, let me make a quick comment about gridlines,
shading, 3D bars, or other effects: just don’t. The purpose of a graphic is
to communicate information; be as spartan as possible with all the extra
stuff. Three-dimensional effects tend to distort the data, misleading the
reader (Klass 2012, Tufte 2006). The data, not the methodology of plotting
data, are the crux of the graph. Let’s look at each one of these elements.
From my experience teaching and managing young engineers and scien-
tists, improper and/or incomplete labeling tends to be the most overlooked
or ignored part of a graph. Proper labeling is critical in communicating
exactly what we are plotting. Although there will be specific guidelines about what to include and what not to include in each specific situation, my advice is to overlabel just to be on the safe side. It is better to label too much than not enough; however, balance this against letting the labeling become a distraction from the communication.
Labeling includes the title, the axis labels, the legend, and, when the
graphic is incorporated into a report or paper, the figure caption (see
Figure 3.8). In journal articles and/or lab reports, the title can be forsaken
for the figure caption to avoid redundancy and to use the real estate on the page more efficiently.
[Annotated graph showing a title, y1 axis label (units), legend labels for series y1 and series y2, and the scales.]
FIGURE 3.8
Example of all the labels that may need to be included in the graph. The legend is crucial
when plotting more than one variable.
[Scatter plots of particulate matter emissions (g/hr) vs. power (hp), 0 to 3500 hp: panel (a) labels every data point, from (69, 47.94) to (2661, 636.6); panel (b) labels only one.]
FIGURE 3.9
(a) Example graph where all data points are labeled. Without additional information, this
is ineffective and distracting. (b) Example of graph where one data point is labeled that
provides specific details of particulate matter emissions during the engine idle condition.
(From Filippone, C., Diesel—Electric Locomotive Energy Recovery and Conversion: Final
Report for Transit IDEA Project 67, 2014, http://onlinepubs.trb.org/onlinepubs/IDEA
/FinalReports/Transit/Transit67.pdf.)
[Plots of mass flow rate (kg/sec) vs. exhaust gas temperature (K) for two series, fuel consumption and exhaust gases: panel (a) on a single scale, panel (b) with a second vertical scale (0 to 0.1) for fuel consumption.]
FIGURE 3.10
(a) Graph of mass flow rate from locomotive fuel consumption and exhaust on the same
graph and scale as a function of locomotive exhaust gas temperature. (b) Graph of mass
flow rate from locomotive fuel consumption and exhaust on the same graph but dif-
ferent scales as a function of locomotive exhaust gas temperature. (From Filippone, C.,
Diesel—Electric Locomotive Energy Recovery and Conversion: Final Report for Transit
IDEA Project 67, 2014, http://onlinepubs.trb.org/onlinepubs/IDEA/FinalReports/Transit
/Transit67.pdf.)
what’s really going on. This same data is replotted in Figure 3.10b, and we
can see that the mass flow rate as a function of temperature is the same for
both fuel consumption and exhaust. We see a steep increase around 800 K.
There are other cases where we may want to show two different dependent
variables on the same graph. Scaling is important in these cases to effec-
tively communicate the data. Figure 3.11a and b demonstrates the same
concepts with two different variables. The scale we use can highlight the
information we wish to communicate or obscure vital information.
How we go about plotting our data should be determined by the data.
The majority of scientific data will be plotted in scatter plots. Scatter plots
easily show the actual data points and allow for comparison of more
than one set of data. Scatter plots are the most common types of graphics used in journals, followed by contour plots. (A contour plot represents a three-dimensional surface by plotting constant-z slices, called contours, in a two-dimensional format; the (x, y) coordinates are connected where that z value occurs.) Scatter plots are nice because we can actually see the effects of the two variables plotted, while contour plots, as well as 3D plots, give us a “feel” for the data but tend to be less specific.
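To make these plotting and labeling guidelines concrete, here is a minimal sketch in Python with matplotlib (my illustration, not from the text) of a scatter plot that carries all four basic elements: title, labeled axes with units, the data, and a legend. The data pairs are the values labeled on the points in Figure 3.9a.

import matplotlib.pyplot as plt

# (power, emissions) pairs read off the data labels in Figure 3.9a.
power = [69, 105, 395, 686, 1034, 1461, 1971, 2661]                  # hp
emissions = [47.94, 35.7, 134.3, 226.4, 258.5, 336.0, 552.9, 636.6]  # g/hr

fig, ax = plt.subplots()
ax.scatter(power, emissions, label="Particulate matter")       # the data
ax.set_title("Particulate Matter Emissions vs. Engine Power")  # title
ax.set_xlabel("Power (hp)")                                    # x axis label with units
ax.set_ylabel("Particulate matter emissions (g/hr)")           # y axis label with units
ax.legend()  # the legend is crucial when plotting more than one series
plt.show()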
3.5.2.1 Pie Charts
Pie charts are rarely used in hard science publications. Even business lead-
ers are recommending that they not be used (Knaflic 2015). I don’t see much
use for these in engineering or science. The data in a complex pie chart can
typically be expressed with a different type of graph while a simple pie chart
can be expressed in a table. There is nothing wrong with using pie charts,
[Plots of exhaust velocity (m/s) and exhaust temperature (K) vs. power (hp), 0 to 3500 hp: panel (a) on one shared scale (0 to 600), panel (b) with separate scales (0 to 25 and 0 to 600).]
FIGURE 3.11
(a) Graph showing exhaust temperature and velocity measurements on the same graph
and scale as a function of engine power. (b) Graph showing exhaust temperature and velocity measurements on the same graph but different scales as a function of engine power.
(From Filippone, C., Diesel—Electric Locomotive Energy Recovery and Conversion: Final
Report for Transit IDEA Project 67, 2014, http://onlinepubs.trb.org/onlinepubs/IDEA
/FinalReports/Transit/Transit67.pdf.)
TABLE 3.2
Comparison of Various Graphical Techniques and When Their Use Might Be Appropriate

Pie chart. Goal: composition. Comments: emphasize relationship to the whole. Examples: don't use, but if you must, express a qualitative relationship.
Scatter plots. Goal: trends, comparisons. Comments: emphasize relationships. Examples: trends in space or time, relationships.
Photographs/micrographs. Goal: distributions, trends, composition. Comments: only include the object of interest. Examples: process results, e.g., etch depth or profile, cross sections.
Schematic or process flow diagram. Goal: processes, locations. Comments: simplify complex processes. Examples: complex experimental setup or process flow.
2D or 3D contours. Goal: trends, locations, compositions, distributions. Comments: contrast two results. Examples: surface roughness, topographical maps.
Tables. Goal: comparisons. Comments: compare categories of results. Examples: comparison between two experiments.
Histogram or bar chart. Goal: distribution, composition, trends. Comments: emphasize categories/groups in data. Examples: summarize a large data set.
should we feel that is the best way to express our data. Pie charts might
be useful where we have a few (typically less than six) categories of data
and the relative size of these categories as a part of the whole is important.
With similar percentages in the categories, a bar chart might be a better
tool for effectively communicating the results. I went through almost a
complete year of issues of the journal Applied Physics Letters, counting
and categorizing all the figures for the year. I didn’t find a single example of a pie chart in the data. My findings are summarized in Figure 3.12.
Distributions or trends can be seen easily with histograms or frequency
diagrams and scatter plots rather than pie charts. Scatter plots are great
for displaying data collected over time or over a distance, allowing the
reader to visualize the distribution. Before selecting a pie chart to commu-
nicate data, we should consider other more effective charts (Knaflic 2015).
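As a sketch of that advice, the snippet below plots composition data as a bar chart instead of a pie chart; the category names and percentages are hypothetical, chosen only to show how similar shares become easy to compare as bars.

import matplotlib.pyplot as plt

# Hypothetical composition data with similar percentages.
categories = ["Lot A", "Lot B", "Lot C", "Lot D"]
share = [28, 26, 24, 22]  # % of the whole; hard to rank as pie slices

fig, ax = plt.subplots()
ax.bar(categories, share)            # bars make small differences visible
ax.set_ylabel("Share of total (%)")
ax.set_title("Composition as a Bar Chart")
plt.show()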
3.5.2.2 Histogram
A histogram depicts frequencies of numeric data; its purpose is to provide a pictorial summary of a data set.
[Bar chart, 0% to 40%: relative frequency of figure types (scatter plots, photos/micrographs, contours, histograms, sketches, tables, and others) counted in Applied Physics Letters.]
FIGURE 3.12
Bar chart showing the most common types of figures in Applied Physics Letters in 2000.
TABLE 3.3
List of 30 Hardness Measurements of 30 Different 304 Stainless Steel Discs
89.2 85.6 85.5 83.6 84.5 86.4
85.3 83.2 87.9 85.1 85.1 87.5
85.3 85.5 86.4 83.8 84.5 87.2
86.7 85.5 87.3 80.7 82.1 86.2
83.8 84.1 88.6 82.7 86.2 87.5
[Histograms: count vs. hardness measurements (Rockwell B), 80 to 90, with bin size 2 in panel (a) and bin size 1 in panel (b).]
FIGURE 3.13
Histogram of the hardness measurements on 304 stainless steel samples with bin sizes of
(a) 2 and (b) 1.
Let’s look at a data set of 30 numbers, listed in Table 3.3 and displayed in Figure 3.13. Table 3.3 has 30 measurements of the hardness of 30 different 304 stainless steel discs made with a Rockwell B Hardness Tester. What can we see just from looking at the raw numbers? We can see whether the data vary wildly, and we might look through the data set and pick out the minimum and maximum values. In spite of this, we don’t really get a feel for the data or for what might be going on with the experiment; no matter how large the data set is, these few values don’t summarize it well. A histogram gives a better synopsis of the data. We see how the values are distributed, we see the center of the data set (roughly, anyway), and we see how the data vary and any symmetry, or lack thereof, about the center.
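A minimal Python sketch (mine, not the book’s) reproduces Figure 3.13 from the Table 3.3 data, using the two bin sizes discussed:

import matplotlib.pyplot as plt

# The 30 hardness measurements from Table 3.3 (Rockwell B).
hardness = [89.2, 85.6, 85.5, 83.6, 84.5, 86.4,
            85.3, 83.2, 87.9, 85.1, 85.1, 87.5,
            85.3, 85.5, 86.4, 83.8, 84.5, 87.2,
            86.7, 85.5, 87.3, 80.7, 82.1, 86.2,
            83.8, 84.1, 88.6, 82.7, 86.2, 87.5]

fig, (ax_a, ax_b) = plt.subplots(2, 1, sharex=True)
ax_a.hist(hardness, bins=range(80, 92, 2))  # (a) bin size of 2
ax_b.hist(hardness, bins=range(80, 91, 1))  # (b) bin size of 1
ax_b.set_xlabel("Hardness measurements (Rockwell B)")
for ax in (ax_a, ax_b):
    ax.set_ylabel("Count")
plt.show()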
[Scatter plot: % of patents granted, 0% to 80%, vs. calendar year, 1825 to 2025.]
FIGURE 3.14
Scatter plot of the % of patents granted by years in the United States. Roughly 58% of
the patent applications received are issued patents. (From United States Trademark and
Patent Office, U.S. Patent Activity Calendar Years 1790 to the Present: Table of Annual
U.S. Patent Activity Since 1790, https://www.uspto.gov/web/offices/ac/ido/oeip/taf/h
_counts.htm, 2016.)
[Time series: # of applications/patents, 0 to 600,000, vs. calendar year, 1775 to 2025, for total applications and total patents granted.]
FIGURE 3.15
Time series data showing the dramatic increase in patent applications and patents
granted in recent history. (From United States Trademark and Patent Office, U.S. Patent
Activity Calendar Years 1790 to the Present: Table of Annual U.S. Patent Activity Since
1790, https://www.uspto.gov/web/offices/ac/ido/oeip/taf/h_counts.htm, 2016.)
Tables are great tools for summarizing, comparing, and presenting data in order to communicate results. Notice how Tables 3.1 and 3.2 summarize information. Tables are an effective tool for presenting different experimental conditions, results, and/or the reasons testing was performed. Use tables to elaborate on the information contained in the text. There are times when it is easier to present a summary of some data set in a table rather than a graph. Compare Tables 3.3 and 3.4 in this chapter. Both tables contain exactly the same data, but Table 3.4 gives us additional information about the samples. If we knew that the sample groupings were from six different material lots (A, B, C, D, E, and F), would this make the information valuable enough to consume real estate in the main body of our report?
Much of the same rationale behind graphic displays also applies to tables.
All columns and rows should be clearly labeled and easy to read. The labels
should provide information about exactly what is contained in the table and
any associated units. Tables should have captions in the text, just as with
figures. Be aware and conscious of the number of significant digits used in
tables. Just because Excel or the calculator will give 10 digits when two numbers are divided doesn’t mean the data are accurate to 10 digits, unless, of course, they are. I can only think of a few situations where this degree of accuracy would even be important to present. For example, if we worked for NASA or a national lab, there might be occasions where writing gravity out to 10 or more decimal places would make a difference. I stress this here and in the next chapter because it happens far too often. We get caught up in the calculations and the excitement of our findings, and the details of digit accuracy get glossed over.
TABLE 3.4
30 Hardness Measurements of 30 Different 304 Stainless Steel Discs Emphasizing Groupings
Hardness Hardness Hardness
Sample (Rockwell B) Sample (Rockwell B) Sample (Rockwell B)
A1 89.2 B1 85.6 C1 85.5
A2 85.3 B2 83.2 C2 87.9
A3 85.3 B3 85.5 C3 86.4
A4 86.7 B4 85.5 C4 87.3
A5 83.8 B5 84.1 C5 88.6
D1 83.6 E1 84.5 F1 86.4
D2 85.1 E2 85.1 F2 87.5
D3 83.8 E3 84.5 F3 87.2
D4 80.7 E4 82.1 F4 86.2
D5 82.7 E5 86.2 F5 87.5
3.6 IMPORTANCE OF CONCLUSIONS
University of Colorado Physics Professor John Taylor wrote, “Performing
an experiment without drawing some sort of conclusion has little merit”
(Taylor 1982). In my experience both in industry and academia, I con-
tinue to see many reports that omit any discussion of the findings and/
or fail to draw conclusions, especially from new scientists and engineers.
Drawing conclusions seems to work one of two ways: either there are no
conclusions and the presenters stop midsentence or the conclusions are
so all-encompassing that the presenters throw out all they know of sta-
tistically significant results. As scientists and engineers, especially in the
laboratory or early career, when data become available, it “must be inter-
preted through the construction of a theory that can explain” the results
(Weisberg 1993). The conclusions and/or discussion section is an integral
part of work. This section brings the experiment into perspective for all
readers. In reality, the conclusion/discussion section is the most important
section of any write-up/presentation, not just an afterthought.
There are a few basic guidelines for writing or presenting a good experi-
mental conclusion.
• Begin and end on a positive note. Even if the experiment wasn’t suc-
cessful or completely successful, we can still highlight what we did
learn from doing the work.
• Compare results to literature or tribal knowledge. We should relate
our work to what others have done or what is thought.
• Likewise, compare results to initial hypothesis or problem statement.
Everyone reading will want to know if the results matched our initial
hypothesis or answered the question.
• Quantify/qualify results by highlighting sources of error. We should
never hide from this. Openly identifying errors will allow others to
have confidence in our work.
• Describe additional experiments that would improve results.
3.7 KEY TAKEAWAYS
The objective of communication in a lab or work setting is to convey ideas
about our work, actions we have taken, and conclusions we have drawn. We
can achieve this objective with text, tables, graphic displays, or a combina-
tion of all three. Therefore, the objective of any text, table, or graphic is to
communicate. We want to create as simple and clear a message as we can with our data. The less our audience has to struggle to understand our work, the more confidence they can have in our abilities as experimenters. A chart (or graph) that is confusing or in any way unclear will not be effective in getting our message across and could erode confidence in us and/or our experiment. We have a choice about how to communicate our work, and we must decide in each presentation setting which method(s) will be most effective and engaging. We want our work to inspire not only interest but also confidence in our experimental abilities.
P.S. Take some time to watch scientists and engineers giving TED (Technology, Entertainment, and Design) talks. I personally find that no matter the topic of their work, I am engaged and interested. Our technical presentations may need to be more detail oriented, but that doesn’t mean we can’t learn from this style of presentation.
REFERENCES
Anderson, C. 2016. TED Talks: The Official TED Guide to Public Speaking. New York:
Houghton Mifflin Harcourt.
Callister, W. D. and D. G. Rethwisch. 2008. Fundamentals of Materials Science and
Engineering: An Integrated Approach. 3rd Ed. New York: John Wiley & Sons.
Cleveland, W. S. 1994. The Elements of Graphing Data. Summit, NJ: Hobart Press.
Deming, W. E. 1982. Out of the Crisis. Cambridge, MA: Massachusetts Institute of
Technology, Center for Advanced Engineering Study.
Dolnick, E. 2011. Clockwork Universe: Isaac Newton, the Royal Society and the Birth of
the Modern World. New York: HarperCollins.
Gallo, C. 2014. Talk Like TED: The 9 Public Speaking Secrets of the World’s Top Minds.
New York: St. Martin’s Press.
Charles Wheelan
Making measurements and collecting data are not the goals of engineers
and scientists. Making measurements and collecting data are merely a means to an end. The purpose of measurements and data collection is to
help us (our lab partners or team, company, or manager) make informed
decisions—for example, decisions about whether a product is shipped may
depend on whether a process or tool is working or needs improvement.
The job of an engineer is to make decisions or recommendations about
decisions—not just to collect data. The confidence others have in our
experimental or problem solving abilities is a direct result of the choices
we make and the data we collect.
Understanding the measurements and the data we collect is a critical first step in experimentation. We need to be able to effectively communicate the data we collect, but in order to do this, we must have agreements and understandings about the quality, quantity, type, and confidence of those data. For this reason, we need to discuss data and measurements early in our conversation about experimentation and problem solving. In this chapter, we will start from the beginning with data basics as a refresher and as a means of establishing a common language for our results as data. Once we establish a conventional way of talking about our data, we can then examine measurements more closely. We know that all measurements will (ideally) contain some part signal and some part uncertainty (noise). Our confidence in the data, and therefore in our experiment, is often a measure of the ratio of the effect (signal) to the uncertainty (noise).
4.1 DATA CHAOS
One of the biggest problems we face as we try to solve the big problems
of the twenty-first century is not the lack of data but data chaos. Over
the years, our governments, health care organizations, and industry have
collected heaps of data. The data might be expressed in English or metric
units. Not only are the data expressed differently, but also these moun-
tains of data are in file cabinets, basement boxes, and computers. The data
may be gathered from many different specialists, different labs with differ-
ent standards, using different protocols. The data can be handwritten or
digital. A very small portion of that digital data is kept in well-organized
databases. Much of the digital data appears in an unstructured format, while many of the paper reports are handwritten, scanned, and low resolution.
In the case of medical data, let’s not forget about all those archived paper
or audio files.
Miguel Helft provides a great example in an article he wrote for
Fortune magazine (Helft 2014). Medical cancer data aren’t collected sys-
tematically, and there are no standards for reporting the data. For exam-
ple, data for albumin, a protein marker routinely measured in cancer
patients, can be expressed in over 30 different ways. Albumin is just one
marker. The real problem is that oncologists collect thousands of data
points about each patient: from different blood markers, biopsies, genetic
tests, magnetic resonance images, x-rays, etc. With each care facility and
lab reporting data in different formats using different forms and storing
the data differently, it will take dedicated efforts to make sense of all this.
Innovation, from our modern conveniences to life-saving medical treat-
ments, would only be science fiction without the ability to measure and
control critical data in our experiments and research.
TABLE 4.1
Comparison of Quantitative Data Types

Measurement. Data examples: length, height, weight, volume, wavelength, power, time, temperature. Questions answered: How long? How much? Numerical expression: any real number with units, typically 1.54353 meters, 7.954 W, 35.9 Joules, 10.5 m/sec, 0°C.
Nominal. Data examples: frequency of occurrence. Questions answered: How many? Numerical expression: integers, e.g., 3, 1001, 21.
Ordinal. Data examples: ranking, ordered data. Questions answered: What order were the students’ grades in the class? Numerical expression: integers, e.g., 1, 2, 3, 4.
Locational. Data examples: location. Questions answered: Where were cancer deaths by county in Idaho? Numerical expression: real numbers and direction, e.g., 19.59852°N, 155.5186°E.
why we are collecting the data. In engineering and science, we deal with
all of these types of data, but measurement data are the most widely used.
For the most part, measurement data are quantitative (measured, numeric
data) as opposed to qualitative (attribute, characteristic data), which may
be numeric, as in the case of ranking (ordinal) data.
Measurement data are the preferred type of data. Measurement
data are quantifiable, continuous data, e.g., length, height, weight, volume,
wavelength, power, time, etc. We get more information from an actual
measurement as opposed to summarized data (statistics). Nominal data
are the classification or categorization of data. Nominal data are quantita-
tive, countable, discrete, or occurrence data, e.g., inches of rainfall, # of
defects, # of failures, # of choices, # of birth defects, etc. Ordinal data are
ranked data. Ranking the birth order of den mates or ranking states by the amount of rainfall are examples of ordinal data.
The final type of data is locational data. Locational data are used to answer the question “where?” and are typically found in concentration charts or measles charts. Locational data might be considered nominal data with a locator. By the way, measles charts are locational graphs used to visually show where something is happening. For example, if we wanted, we could use a measles chart to show solar or wind power generation overlaid on a map of Germany or traffic accident rates overlaid on a map of the Washington, DC, metro area.
The sciences deal with all types of data. Measured data are more infor-
mative, descriptive, and precise than counted data are. Since continuous
data contain more information, they are preferred over discrete or dis-
continuous data. There are times when we have a choice about the type of
data we collect. For example, if we are measuring a set of parts to compare
against a drawing, we could measure the actual dimensions of the parts.
For simplicity’s sake, let’s say we have a bag of 500 stainless steel washers.
Once we had measured the thickness of all the washers, we would have
500 individual measurements. We would have a set of continuous data.
We would be able to make calculations that represent the data set. On the
other hand, we could also group the parts into discrete bins based on the
measurements. This would give us countable data. For example, we can
label the parts as “smaller than the specification,” “within the specifica-
tion,” or “larger than the specification.” In this case, we’d have a lot less
information about the individual parts. We would only know which par-
ticular bin they belonged to and nothing else. In other cases, we do not have
a choice about whether the data we collect are continuous or discontinuous.
For example, on the farm growing up, one of our morning chores was to col-
lect eggs. We harvested eggs from the dozen or so chickens each day. Each
morning, the number of eggs one of us kids had to gather was discrete or
discontinuous. However, when my father asked me to calculate the average
egg yield each week, I might have gotten a significant fractional number.
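A short sketch makes the information loss in the washer example concrete. The spec limits and simulated thicknesses below are hypothetical assumptions, used only to contrast continuous measurements with binned, countable data.

import random

random.seed(1)
spec_low, spec_high = 1.45, 1.55                            # hypothetical spec (mm)
thickness = [random.gauss(1.50, 0.03) for _ in range(500)]  # 500 simulated washers

# Continuous data: summary statistics are available.
mean = sum(thickness) / len(thickness)

# Binned data: we only know which bin each part fell into.
bins = {"smaller than spec": 0, "within spec": 0, "larger than spec": 0}
for t in thickness:
    if t < spec_low:
        bins["smaller than spec"] += 1
    elif t > spec_high:
        bins["larger than spec"] += 1
    else:
        bins["within spec"] += 1

print(f"mean thickness = {mean:.3f} mm")
print(bins)  # counts only; the individual measurements are gone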
Data must be representative. There are many occasions where it is impos-
sible to collect all the data that are available. Population is the term we use
for the set or collection of all possible objects or individuals of interest. We
are interested in a population because we want to draw some conclusions
about the characteristics of that population.
In order to learn something about a population, we might collect a sub-
set of the population data. A sample is a subset of a population. In order
to accurately represent a population, we need a random sampling of that
population. When we talk about a set of data, we could be talking about
a population or a sample. In the case of a sample, we want the data in the
sample to be representative of the whole population or at least some larger
group of that population.
Statisticians use a specific and particular language to distinguish between statistics that represent populations and statistics that represent samples. More often than not, we do not know the true value that describes a population; we, as scientists and engineers, use averages over repeated experiments to establish the reference value that we lazily refer to as the
true value (Gauch 2006). How do we draw conclusions about a data set if
we don’t have a true value to refer to? We depend on calculated statistics
to summarize the sample data available. These statistics include sample
mean, median, range, standard deviation, variance, root-mean-square
deviation, standard error, etc.
We’d like to be able to use sample data to draw conclusions about popu-
lations. Ideally, we’d like to use one or two numbers to describe or repre-
sent our whole set of data. The most common approach is to describe the
middle of the data and the variability or dispersion in the data. There are
a number of rigorous mathematical implications that result from working
with a population versus a sample of the population. We can leave that
to future bedtime reading or a statistics class. Throughout this book, we
will assume that we are talking about statistics that represent a sample.
Describing the sample is a simpler and more cost-effective way to represent
the population. In a later chapter, we’ll delve further into representative data.
Finally, data should be useful. Inclusion of irrelevant data in either reporting or analysis can lead to confusion or overly complex models.
4.2 DATA BASICS
Before we delve into more advanced topics related to data, there are several data topics I’d like to review: significant digits, and scales and units. I can’t tell you how many college-level lab reports, and even reports from engineers at all levels of education, I’ve reviewed with 10 to 12 significant digits in the tables or with scales and units omitted entirely. These may seem like elementary topics, but these mistakes are easy to make. Spreadsheet/workbook software applications make it easy to do calculations that default to displaying as many digits as there is room in the column. We focus on the numerical value and the calculation or measurements and forget the physical quantity that is of concern. At every step in our experimentation, we must simultaneously keep in mind both the big-picture problem we want to solve and the ensuing details of calculations and/or measurements.
4.2.1 Significant Digits
Significant figures are defined as all non-place-holder digits in a number. What does this mean? It is probably easier to demonstrate with a few examples. Given the number 123.456, there are six significant figures in this expression. However, if the number were written as 123.4560, there may be either six or seven significant figures, depending on whether the 0 in the ten-thousandths place was measured or is just a place holder. The same is true for numbers on the left of the decimal position; the number 10 could have one or two significant figures depending on whether the 0 was measured or whether it is just a place holder. Likewise, with 100, 1,000, and 10,000, the trailing zeros are ambiguous, so the number of significant figures could be as few as one. In order to make the number of significant figures obvious, an alternative expression might be 10,000 = 1 × 10⁴ for only one significant figure or 10,000 = 1.0 × 10⁴ for two significant figures. Scientific notation eliminates any ambiguity in the number of significant digits.
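A quick Python sketch (illustrative only) shows how scientific-notation formatting pins down the number of significant figures, and how easily a calculator-style result overstates them.

value = 10000

print(f"{value:.0e}")  # 1e+04     -> one significant figure
print(f"{value:.1e}")  # 1.0e+04   -> two significant figures
print(f"{value:.3e}")  # 1.000e+04 -> four significant figures

# The reverse problem: a quotient with far more digits than are meaningful.
quotient = 123.456 / 7.89
print(quotient)           # 15.647148288973383, not all digits are meaningful
print(f"{quotient:.3g}")  # 15.6, rounded to three significant figures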
we use is referenced to the freezing point of water. The units we use should
be the most appropriate for the intended purposes. Going between dif-
ferent standards is easy but may cause trouble. Recall the classic mistakes
made by NASA scientists using both metric and English units. In the case
of the Mars orbiter, two different teams working on the project were using
different units, resulting in a $125 million loss (Conradt 2010). In some
cases, we may need to work with multiple scales, but it is certainly safer
to identify the most convenient scale in each situation and stick with one
system.
4.3 VARIABLES
Variables are all those inputs or outputs that we can vary in our experi-
ments. All the inputs and outputs listed in an Input–Process–Output dia-
gram could be considered variables. Whether we are varying the inputs/
outputs by controlling, measuring, ignoring, or manipulating/managing,
it is critical that we understand the role that each plays in our experiments. The goal of scientific experimentation is to examine the relation-
ship between variables. Whether we are attempting to quantify, qualify,
establish, study, or determine variable relationships, our experiments will
always involve them.
Historically, we have divided variables into two broad categories: depen-
dent and independent. Independent variables are those variables that we
control, vary, change, or manipulate in some way while dependent vari-
ables are those we measure. In a broad sense, we could think of indepen-
dent variables as inputs and dependent variables as outputs. From an
experimental perspective, the independent variables are those we choose
to be not biased by—those that are free from our inputs (the initial condi-
tions) while we monitor our outputs (those variables that depend on the
experimental conditions).
As we saw in the Input–Process–Output diagrams, there are many fac-
tors that can be inputs, all of which may impact the results or dependent
variables. We typically choose one or a few of the input variables to change
in our experiments to study the effect or impact on our outputs. What
about all the other input factors that we listed? They are still independent,
free, unbiased variables, but we aren’t intentionally varying them as a part
of the experiment. We will divide all our inputs into three categories: con-
stants (C), noise (N), and variables (X) (Ishikawa 1987, Wortman et al.
2007).
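As a sketch of this bookkeeping, the snippet below tags the inputs of a hypothetical belt-furnace experiment; the variable names and the C/N/X assignments are my illustrative assumptions, not from the text.

# Inputs of a hypothetical belt-furnace experiment, tagged as
# constants (C), noise (N), or experimental variables (X).
inputs = {
    "belt speed":            "X",  # intentionally varied
    "set point temperature": "X",  # intentionally varied
    "H2 flow rate":          "C",  # held constant and monitored
    "part loading pattern":  "C",  # fixed by procedure
    "ambient humidity":      "N",  # uncontrolled noise
    "operator":              "N",  # uncontrolled noise
}

for name, role in inputs.items():
    print(f"{role}: {name}")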
m = 50 ± 5 g (4.1)
What does this really mean? Does the scale or balance being used to
make the measurement have an uncertainty of 5 g? Does the scale have an
uncertainty of 10 g? Is 5 g the standard deviation or the standard error? Is
it the expanded uncertainty, e.g., ±2σ or ±3σ, standard uncertainty, u, or
combined uncertainty, uC? Is the 5 g expression simply the experimenter’s
best guess at uncertainty?
It is not obvious what this notation represents, and this makes it dif-
ficult for experimenters, much less decision makers such as engineering
managers or the marketing department in a company, to compare results.
According to University of North Carolina physicist Professor David
Deardorf, “The interpretation of u in x ± u is not consistent within a field
of study, let alone between fields of study, and the meaning is generally
not specified” (Deardorf 2016). “The ± format should be avoided whenever
possible because it has traditionally been used to indicate an interval cor-
responding to a high level of confidence and thus may be confused with an
expanded uncertainty” (Deardorf 2016). However, this notion is commonly
used in most fields of study. In many cases, the ± expression is the expected
format for uncertainty, even though no one really knows what it represents.
If this is a requirement for a publication, paper, or memo, we should include
an explanation of exactly what is meant by the ± format in our work.
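One way to honor that advice is to compute and state the uncertainty explicitly. The sketch below uses hypothetical repeated mass readings; the coverage factor k = 2 is a common convention, not a requirement.

import statistics as stats

masses = [49.8, 50.3, 50.1, 49.6, 50.4, 50.0, 49.9, 50.2]  # grams (hypothetical)

mean = stats.mean(masses)
s = stats.stdev(masses)     # sample standard deviation
u = s / len(masses) ** 0.5  # standard error of the mean
U = 2 * u                   # expanded uncertainty with coverage factor k = 2

print(f"m = {mean:.2f} g +/- {U:.2f} g "
      f"(expanded uncertainty, k = 2, from the standard error of the mean)")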
This is not a subject that teachers and instructors usually spend a lot
of time dealing with. However, in the late 1990s, as metrology (the sci-
ence of measurement) became more of a science in and of itself, seven
TABLE 4.2
Examples of Simple Experimental Techniques for Increasing the Signal and Decreasing the Uncertainty in an Experiment

Repeat a measurement. Pros: determines if the measurement system is adequate. Cons: may increase time.
Repeat an experiment. Pros: decreases bias and noise. Cons: increases time and resources; may increase cost.
Randomize samples. Pros: decreases bias and noise. Cons: may increase cost, time, and/or resources.
Randomize experiments. Pros: decreases bias and noise. Cons: may increase cost, time, and/or resources.
Increase sample size. Pros: decreases bias and noise. Cons: increases time and resources; may increase cost.
Add covariates. Pros: decreases bias and noise. Cons: increases time and resources; may increase cost; potentially more complex to analyze.

Source: Slutz, S. and K. Hess, Increasing the Ability of an Experiment to Measure an Effect, 2016, http://www.sciencebuddies.org/science-fair-projects/top_research-project_signal-to-noise-ratio.shtml.
case, we, as scientists and engineers, must be careful not to repeat experi-
ments unnecessarily.
Hugh Gauch, an agri-scientist at Cornell, published a paper in American Scientist in which he calculated the number of repeated experiments required so that the average of the repeats is more accurate than a single experiment (Gauch 2006). Gauch’s calculations are plotted in Figure 4.1. We see that by repeating an experiment one additional time, the data are more accurate 60.8% of the time. If we repeat the experiment five times, the data are more accurate 73.2% of the time. What about the other 39.2% or 26.8% of the time, respectively? To achieve 90% confidence of success, the experiment would need to be repeated 40 times. To achieve 95% confidence of success, the experiment would need to be repeated 162 times. The good news is that as we increase the number of repeats, we see an improvement; however, beyond a certain point, it is no longer worth it. I am not advocating that all experiments be repeated 162 times or even 40 times; this is more to make us aware that replication has its limitations.
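Gauch’s percentages can be checked with a small Monte Carlo sketch (my own, under the assumption of purely random, normally distributed error); it compares the average of n repeats against a single independent measurement.

import numpy as np

rng = np.random.default_rng(0)

def prob_mean_beats_single(n, trials=50_000):
    # Errors are standard normal; the true value is 0.
    repeats = rng.normal(size=(trials, n))  # n repeated measurements per trial
    single = rng.normal(size=trials)        # one measurement for comparison
    # Fraction of trials where the average lands closer to the true value.
    return np.mean(np.abs(repeats.mean(axis=1)) < np.abs(single))

for n in (2, 5, 40, 162):
    print(n, round(float(prob_mean_beats_single(n)), 3))
# Prints values close to 0.608, 0.732, 0.90, and 0.95, matching Figure 4.1.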
The larger the sample size, the more confidence we can have in the
results (see Figure 4.2). The more samples of a whole population we test,
[Plot: confidence of success (%), 0 to 120, vs. # of experimental repeats, 1 to 1000 on a logarithmic scale.]
FIGURE 4.1
The law of diminishing returns in confidence gained simply from repeating experiments, assuming the error in the data is all random. (From Gauch, H.G., Am. Sci., March–April, 133–141, 2006.)
[Plot: confidence interval (%), 0% to 100%, vs. sample size N, 0 to 1000.]
FIGURE 4.2
As sample size increases, the range in the confidence interval decreases. (From Gauch,
H.G., Am. Sci., March–April, 133–141, 2006.)
the more representative of the population our results actually are. A good estimate of the confidence interval for a measurement with N samples is

C.I. = 1/√N. (4.2)

For example, a sample of N = 100 gives a confidence interval of roughly 1/√100 = 10%.
4.5.2 Reducing Uncertainty
Let’s deal with what is meant by true value and reference value, uncertainty, and error. First, uncertainty tells us the range of values within which the ‘true value’ can be said to lie, with a specified level of confidence. (I’m using quotes around true value because we know there is no such thing.) In order to interpret data correctly and draw valid conclu-
sions, we must indicate uncertainty and deal with it properly. For the result
of a measurement to have clear meaning, the value should not consist of the measured value alone; it must be accompanied by a statement of its uncertainty.
4.6 KEY TAKEAWAYS
This chapter has laid the foundation for the upcoming chapters. We dis-
cussed different required characteristics of data that we will use. We looked
at variables. We covered the different types of input variables that give us
our output variables. Being able to identify whether a variable is in-control
or out-of-control (noise) is an important beginning to experimentation. In
Section 4.4, we saw that the measurement data we collect are both signal
and uncertainty. There is no such thing as a true value. Characterization of
uncertainty is important because it tells us how much we can rely on our
signal. We can become more confident in our data by strengthening the
signal or reducing the uncertainty. Uncertainty is composed of a number of different components. An uncertainty estimate should address both sys-
tematic and random variation. Including uncertainty with measurements
is the most appropriate means of expressing the truthfulness of the results.
As we mature as engineers and scientists, so should our experimental
sophistication in the characterization of uncertainty. The next three chap-
ters will delve into approaches to minimize uncertainty introduced by
unintentional variation, systematic variation, and random variation.
P.S. Test your understanding of the chapter by examining an experimental setup. Create an input–process–output diagram. Label each of the inputs with a C for controlled variables, N for noise variables, and X for process variables. Consider what it would take to move the noise variables to controlled.
REFERENCES
ANSI. 2016. American National Standards website. http://www.ansi.org. National
Conference of Standards Laboratories website. http://www.ncsli.org. ANSI/NCSL
Z540-2 is a handbook written to assist with calibration laboratories and users of
measurement and test equipment.
Conradt, S. 2010. The Quick 6: Six Unit Conversion Disasters. http://mentalfloss.com/article/25845/quick-6-six-unit-conversion-disasters.
Deardorf, D. 2016. Introduction to Measurements & Error Analysis. http://user.physics.unc.edu/~deardorf/uncertainty/UNCguide.html.
Gauch, H. G. 2006. Winning the Accuracy Game. The American Scientist March–April:
133–141.
GUM. 2009. Evaluation of Measurement Data—Guide to the Expression of Uncertainty
in Measurement. Paris: Bureau International des Poids et Mesures. JCGM 100:2008.
http://www.bipm.org/en/publications/guides/gum.html.
Helft, M. 2014. Can Big Data Cure Cancer? Fortune 170(2):70–78.
Ishikawa, K. 1987. Guide to Quality Control. Tokyo: Asian Productivity Organization.
NIST. 2006. National Institute of Standards and Technology NIST/SEMATECH e-Hand-
book of Statistical Methods. http://www.itl.nist.gov/div898/handbook/2006.
Slutz, S. and K. Hess. 2016. Increasing the Ability of an Experiment to Measure an Effect.
http://www.sciencebuddies.org/science-fair-projects/top_research-project_signal
-to-noise-ratio.shtml.
Taylor, B. N. and C. E. Kuyatt. 1994. Guidelines for Evaluating and Expressing the
Uncertainty of NIST Measurement Results. Natl. Inst. Stand. Technol. Tech. Note
1297, Washington. http://physics.nist.gov/Pubs/guidelines/outline.html. This is
sort of a guide to the Guide to the Expression of Uncertainty in Measurement,
GUM. Website: http://physics.nist.gov/cuu/uncertainty/basic.html.
VIM. 2012. International Vocabulary of Metrology—Basic and General Concepts and
Associated Terms (VIM 3rd edition). Paris: Bureau International des Poids et
Mesures. JCGM 200:2012. http://www.bipm.org/en/publications/guides/vim.html.
Wheelan, C. 2013. Naked Statistics: Stripping the Dread from the Data. New York: W. W.
Norton & Company.
Wortman, B., W. Richardson, G. Gee, M. Williams, T. Pearson, F. Bensley, J. Patel,
J. DeSimone, and D. Carlson. 2007. The Certified Six Sigma Black Belt Primer. West
Terre Haute, IN: The Quality Council of Indiana.
5
Oops! Unintentional Variation
All men make mistakes, but a good man yields when he knows his course
is wrong, and repairs the evil. The only crime is pride.
Sophocles, Antigone
5.1 HISTORY OF MISTAKES
It probably isn’t necessary to convince you that new engineers and sci-
entists make mistakes. However, most of us don’t realize that even great
scientists and engineers make mistakes. Some of the biggest names in sci-
ence and engineering have made basic beginner mistakes. As a matter
of fact, some of the biggest mistakes have led to great breakthroughs in
science. Dr. Mario Livio, an astrophysicist and author at Space Telescope
Science Institute in Baltimore, Maryland, wrote a book about 12 of these
great mistakes in his book Brilliant Blunders. Included in the roll call of
scientists on this list are the rock stars of science: Charles Darwin, Linus
Pauling, Lord Kelvin, and Albert Einstein. According to Dr. Livio, 20 of
Einstein’s original papers contain mistakes (Livio 2013).
Why spend a whole chapter on mistakes? Here’s an example from medi-
cine. In the mid-1980s, Israeli scientists found that an intensive care spe-
cialist performs an average of 178 individual tasks each day. These tasks
range from administering drugs to suctioning lungs, all of which have
some amount of associated risk. The amazing thing is that the doctors
and nurses were found to make errors only 1% of the time. However,
this amounted to two errors per day per patient. Of the more than 150,000
deaths each year following surgery, studies repeatedly show that roughly
50% of those deaths and major complications are avoidable (Gawande
2010). This example isn’t experimentation in the true sense of the word,
but if we take the series of actions or steps that hospital staff take each day
and put them in a lab, the parallels become clear.
We might argue that this study is old since it is from the 1980s. Surely,
we perform better than this now. In 2013, Dr. John James pub-
lished a review article in the Journal of Patient Safety entitled “A New,
Evidence-Based Estimate of Patient Harms Associated with Hospital
Care” (James 2013). He estimates that there are 440,000 preventable mis-
takes that contribute to the death of patients each year in hospitals. These
440,000 deaths are roughly one-sixth of all deaths that occur in the United
States each year. In other words, a significant number of deaths in hospi-
tals could be avoided with procedural changes to eliminate mistakes. The
knowledge exists; however, steps are skipped and mistakes are made. In
experimentation, we perform thousands of actions to carefully prepare
our work. What if we had only a 1% error rate? Would our work be
repeatable and reproducible?
Example
There are many scientific practices that have documented procedures for
how to perform sample preparation. However, in 1951, when scientists
around the world were trying to grow cells outside of a human body, little
was known about the best way to grow cells. At Johns Hopkins University
Hospital, George and Margaret Gey were among these scientists. Many ver-
sions of the perfect culture to grow cells were tested. One technique in the
Gey lab, developed by Margaret from her days in surgical training, involved
chicken bleeding. Margaret worked out the procedure and provided step by
step instructions for any researcher who wanted to use it.
Additionally, contamination was an ongoing problem. Bacteria were con-
stantly being introduced into the samples via unwashed hands, breath, dust
particles, etc., which killed the cells. Through Margaret’s surgical training,
she knew the most up-to-date practices regarding sterility. Like those in the Gey lab, most scientists working on this problem were biologists who
knew nothing about contamination at that time. Margaret taught everyone
in the Gey lab, from her husband George to the lab techs to the gradu-
ate students and scientists, about preventing contamination. It is said that
she is the “only reason the Gey lab was able to grow cells at all.” The cells
they eventually grew and shared with labs around the world were from a
young mother who died from cervical cancer, Henrietta Lacks. The cells
were therefore given the name HeLa, using the first two letters of her first
and last names.
was around. Although the optical setup was on a stable table, the signal from
my plasma was very low under certain conditions compared to the noise.
The experiments were incredibly complex to set up. The lab was a shared
space between several professors at the University of Michigan in the base-
ment of Naval Architecture and Marine Engineering Building. I knew the
schedule of most of the other graduate students and would coordinate with
them. One Saturday while running my experiment, I was startled when all
the overhead lights suddenly came on. I screamed from the back of the lab
(roughly the size of a football field). Professor Ron Gilgenbach, now chair of the Nuclear Engineering and Radiological Sciences department, had stopped by to
check on his experiment. As a result of the shock, both he and I were in
full amygdala activation mode. We were both able to laugh about it min-
utes later. The point of this story is that this type of environmental varia-
tion could have resulted in either systematic or random shifts in my data.
Other physical examples of similar shifts might be measurements made in
different environments where the results are sensitive to vibrations, drafts,
humidity, changes in temperature, electronic noise, etc.
Another common source of unintentional variation is inconsistent
measurements. Inconsistent measurements could occur when the per-
son making the measurement doesn’t calibrate or zero the equipment.
Hysteresis may also result in variation if the equipment has some memory
effect from the previous measurement. A poor electrical connection may
result in errant values on the equipment display. As graduate students, we
will often make our own measurements, prepare our samples, and run
our own tests. However, occasionally, younger graduate students, under-
graduates, or other support staff are involved in some or all of the steps.
Unintentional variation can be added to the data if there is a lack of train-
ing, skill, or overall physical ability or operations are performed in a dif-
ferent sequence. Misreading the scale divisions on an instrument display,
whether this is due to reading the wrong number, miscounting the scale,
or the parallax effect, is an example of a case where inconsistent measure-
ments might be introduced. This variation may result from the distance between the person making the measurement and the scale or indicator being read. Uncertainty might be added
by the angle of view while we are making the reading on a burette, pipette,
column, or beaker. The parallax effect is measurement variation where the
data are collected with our eyes at different angles, which could result in
either a systematic shift or random shifts in the results. If we consistently
read the scale with our eyes too low, the values we read may be too high
and vice versa. The parallax effect is illustrated in Figure 5.1. Therefore, it
is essential that everyone working on the experiment performs the work
the same way each time. The way in which equipment is operated has a
bearing on the quantity, quality, and consistency of the measurements.
Typical reaction times for most people are between 200 and 300 mil-
liseconds. If we want to distinguish times on the order of seconds for
our experiments, we need a more consistent and reliable reaction time to
events. Some measurements vary with time: equipment may need time to warm up, reach equilibrium, or recover from the prior measurement before use. Lag time may also result in inconsis-
tent measurements either made by the same person or between people.
For measuring the time between events, we can easily find equipment
that provides measurements to the millisecond. If these measurements
are performed by hand, the unintentional variation in the measurements
will be significant. We can avoid these errors by moving to digital data acqui-
sition systems. However, within computer data acquisition measurement
[Illustration: a thermometer scale from 35°C to 41°C read along three sightlines, eye too high, eye in the correct position (at the meniscus), and eye too low, each giving a different reading.]
FIGURE 5.1
Illustration of the parallax effect demonstrating how incorrect positions when reading
a scale can result in incorrect measurements. Since the molecules of certain liquids are
attracted to the sides of the beaker, this surface tension decreases the further away from
the sidewalls we get. The surface tension effect creates a concave shape in the liquid, the lowest point of which is known as the meniscus.
matter was only then beginning to find a lucrative niche with semicon-
ductor etch and deposition technologies. The designers of semiconductor
processing equipment thought they were building their process chambers
the same way each time, but semiconductor fabrication facilities (affec-
tionately known as fabs) around the world would have problems matching
between chambers and between fabs. They even had difficulty matching
one run to the next. This was a costly issue for the industry. Intel, Hitachi,
Siemens, and other companies instituted “copy exact” worldwide.
In an effort to address this global concern from fabrication facilities and
semiconductor equipment manufacturers, researchers from the National
Institute of Standards and Technology (NIST), along with academ-
ics around the world, designed a tabletop research experimental plasma
chamber that could be duplicated at each lab. Plans were drawn up and
a total of eight universities and NIST each built the Gaseous Electronics
Conference Reference Cell (GEC 2005). The plasma physics community,
along with the scientists and engineers at NIST, saw this as an opportu-
nity to develop a deeper understanding of the fundamental interactions
between hardware components and the plasma (Brake et al. 1995). Each
research team participated in the design of a reference chamber. We each
used the same plans to build a plasma tool. Large optical windows were
added as viewports to the plasma chamber to allow for optical diagnos-
tic comparisons. Working out the details of matching the hardware such that the process results matched from chamber to chamber required effective communication among the team members. Effective communication was
achieved through a variety of venues. Clearly defined specifications for the
mechanical and electrical components were required. There were regu-
lar meetings, conference calls, and data sharing between groups. These
collaborative research efforts resulted in numerous papers, master’s and
PhD degrees for many students, and a much clearer picture of just how impor-
tant every detail was in matching process performance for equipment
suppliers. It was through sharing data, plans, and practices that a better
understanding of chamber matching became a reality.
Effective communication is probably the single greatest weapon we
can wield against unintentional variation. Whether we are working with
research groups or colleagues around the world or next door, whether
we do all the work ourselves or among our research or work group, it is
essential that we all perform the work in the same way. The best means
for accomplishing this goal is through documentation. Specifications,
requirements, protocols, and checklists are all means of clearly defining a
Example
Charles Darwin, credited with the development of the theory of evolu-
tion, made amateurish mistakes in his data collection and notes. For a
scientist who kept a fairly detailed journal of his travels to the Galapagos
Archipelago, he had many glaring omissions of details. He attempted to
fill in the details from memory later, when he realized that they might be important, but his post-travel journal entries could never be verified. I’m
referring to Darwin’s finch collection from the different islands, although
at the time, he didn’t realize they were all finches. His purpose in collecting
the birds which all looked different—from beak shape and size to feather
color—was to send them to John Gould, the head of The Zoological Society
and an eminent British ornithologist. In Darwin’s Ornithological Notes, he
details the location of only 3 of the 31 species he brought back. After study-
ing these birds, Gould concluded they were indeed all finches. Darwin’s
careless note-taking and sketchy, poorly detailed data related to the location of the
collected samples could have cost us the evolutionary theory. These finches
had evolved to harvest the food on the islands where they were living. Some
of the finches harvested seeds for nourishment, and others harvested insects. The island
terrains varied and the birds evolved to survive. It appears these finches
were some of the earliest immigrants as they show the most advanced evo-
lution. Although Darwin was a knowledgeable and experienced taxono-
mist, he made a similar mistake with tortoises. The vice-governor of the
archipelago, Nicholas Lawson, pointed out that “the tortoises differed from
the different islands, and that he could with certainty tell from which island
any one was brought.” As with the finches, Darwin didn’t appreciate what
he had because the 30 adult tortoises brought on board the ship were eaten
and discarded by the ship’s crew. This mistake is so obvious and glaring
in hindsight, but remember Darwin didn’t know what he was looking for
at the time of his travels. The lesson we can take from this is to record as
exactly and precisely as possible the details and conditions of our experi-
ments. Those things that seem unimportant at the time may actually be the
key to a breakthrough. (Sulloway 1982)
In 1935, Boeing's Model 299 bomber prototype crashed on a demonstration flight after the crew forgot to release the locks on the elevator and rudder controls. Rather than sending pilots back for more training, Boeing decided to create a checklist to deal with the complex details so even an expert would not need to hold it all in memory. With
the aid of this checklist, the Model 299 flew 1.8 million miles without
one accident. The simple checklist has been honed and refined by Boeing
Corporation for all their aircraft. They’ve perfected both the art of the
checklist and the engineering and flying of aircraft. Boeing is the checklist
factory (Boorman 2000, 2001, Gawande 2010).
Checklists are not procedures. They are tools with simple steps that get
easily missed, but in the case of an airplane or surgery, missing one step
can lead to fatal consequences. In the chemical and physical sciences, we
aren’t necessarily looking for checklists to save lives but to ensure con-
sistency in our experiments. We want to make sure that all the steps that
might be crucial to minimizing variation are followed. They are not meant
to be detailed operating instructions (see the next section) so that any-
one walking in off the street can perform the task. Checklists should be
written in the common language of the profession. They are written for
experts on a specific task.
If all persons involved in the problem-solving study are well-trained professionals, a checklist may prove adequate for creating uniformity and
eliminating unintentional variation. However, there may be cases where it
is essential to have more detail, and in these cases we want to have operat-
ing procedures for each step of our experiment.
5.3.3 Input–Process–Output Diagrams
A wise investment of time prior to beginning an experiment is map-
ping out the Input–Process–Output diagram and creating a plan to
manage the control (C) variables and minimize the impact of noise (N) variables.
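Although the book's diagrams are graphical, the same bookkeeping can be captured in a few lines of code. Below is a minimal sketch, with an invented plasma-etch example, of recording each input as control (C), noise (N), or intentionally varied (X) together with its management plan; the process and variable names are hypothetical.

```python
# A minimal sketch (hypothetical process and variable names) of recording
# an Input-Process-Output plan as data so it can be reviewed and reused.
process = "plasma etch"                      # hypothetical example
outputs = ["etch rate", "uniformity"]

inputs = {
    # variable: (classification, management plan)
    "RF power":         ("X", "intentionally varied per the experiment plan"),
    "chamber pressure": ("C", "control: held constant and logged each run"),
    "wall temperature": ("N", "noise: monitored and recorded, not controlled"),
}

for name, (kind, plan) in inputs.items():
    print(f"{process} input {name!r} ({kind}): {plan}")
print(f"outputs: {', '.join(outputs)}")
```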
5.4 DYNAMIC MEASUREMENTS
The time sequence of data should be recorded. Record all the informa-
tion about the collection of the data in addition to the data. Some mea-
surements are time dependent, for example, the stabilization time on
a meter or the hysteresis effect. In order to avoid having these effects
contribute to unintentional variation in our results, these effects
should be characterized and well understood. It is only then that we
can create operating procedures that control these known effects.
Time-dependent or dynamic measurements often require advanced
mathematics (Holman 2001, Coleman and Steele 1999). Advanced texts
have thorough coverage of dynamic measurements. In early experi-
mentation, it is best to make every effort to reach a steady state before
making a measurement.
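As a concrete illustration of waiting for steady state, here is a small sketch (ours, not from the cited texts) that flags a meter as stable once the last several readings scatter by less than a chosen fraction of their mean; the window size and tolerance are arbitrary placeholders to be tuned for the instrument at hand.

```python
import statistics

def is_steady(readings, window=10, rel_tol=0.01):
    """Crude steady-state check (illustrative thresholds): True when the
    last `window` readings scatter by less than rel_tol of their mean."""
    if len(readings) < window:
        return False
    recent = readings[-window:]
    mean = statistics.fmean(recent)
    return statistics.pstdev(recent) < rel_tol * abs(mean)

# Wait for a meter to settle before logging the measurement.
log = [10.8, 10.3, 10.1, 10.05, 10.02, 10.01, 10.0, 10.0, 9.99, 10.0, 10.01, 10.0]
print(is_steady(log))  # True once the trailing readings have settled
```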
5.5 BAD DATA
Unintentional variation can be caused by accidents, carelessness, or
improper, poor, or biased technique and may contribute to variation in
the experimental results. Misreading and intermittent mechanical mal-
function can cause readings well outside of expected random statistical
distribution about the mean. One recent example that really hit home was
a misplaced decimal point recorded in a database for a sample. The data
from the sample were applied to the whole lot of material. These bad values
were published internally and resulted in many, many unhappy people.
This mistake by the person who recorded the data, and by all the people
who used the data without thinking, cost our company thousands and
thousands of dollars. No data set should include known mistakes. Values
that result from reading the wrong value or making some other mistake
should be explained and excluded from the data set. Many times, I’ve
had students in labs and even new engineers deliver reports that include
known bad data. The data and analysis reported contain mistakes and yet
are presented in engineering meetings. If a reading varies greatly from the
true or accepted value, check for unintentional variation (mistakes, blun-
ders, etc.). Poor repeatability and reproducibility (covered in Chapter 6)
are also indications of the unintentional variation at work.
• Data entry errors can be avoided with automated data collection, but
even automated collection requires the data to be reviewed.
• As we review our data and begin to perform calculations, we must remember to take care with the significant digits. Avoid unnecessary rounding that will reduce measurement sensitivity. Calculations based on the data should include at least one more decimal position than the data point readings. Rounding data will affect the standard deviation of the data but will have little effect on the mean, as the sketch below illustrates.
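A quick simulation (ours, with invented numbers) makes the rounding point concrete: coarsely rounding 1,000 readings barely moves the mean but drastically changes the standard deviation.

```python
import random
import statistics

random.seed(1)
# 1,000 simulated readings of a nominal 3.40 mm feature (invented numbers)
raw = [random.gauss(3.40, 0.015) for _ in range(1000)]
# The same readings as a coarse gauge would report them (0.1 mm least count)
rounded = [round(x, 1) for x in raw]

print(f"mean:  raw={statistics.fmean(raw):.4f}  rounded={statistics.fmean(rounded):.4f}")
print(f"stdev: raw={statistics.stdev(raw):.4f}  rounded={statistics.stdev(rounded):.4f}")
# The means agree closely; the standard deviations do not.
```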
Although no data set should include bad data or mistakes, the removal
of data from a data set should not be done in a cavalier fashion. Often, the
final conclusions drawn from an experiment can be significantly affected
by mistakes. Removing data can give the impression of data “fixing” or
result in a missed discovery. There are cases where the data that are unex-
plainable are actually the most interesting part, as was the case with the discovery of heavy-fermion superconductors. In 1975, Bell Labs scientists were
studying the magnetic and crystal-field properties of UBe13. In their
search for compounds to use with nuclear cooling and nuclear ordering,
they measured a superconducting transition at 0.97 K. These results were
inconsistent with their expectations. The measurements were thought to
be due to contamination of the uranium filament used in the experiment
because it didn’t fit the expected pattern of temperature-independent sus-
ceptibility or magnetic ordering. The experimenters completely missed
the discovery of heavy-fermion superconductivity (Chu 2011).
Astronomer Fred Hoyle, for example, clung to his belief in a steady-state universe even when faced with data from Georges Lemaître and Edwin Hubble that the universe was expanding (Livio 2013). The Russian chemist Dmitri Ivanovich Mendeleev, who gave us the periodic table
of elements, believed that the atom was the smallest particle. According to
historian and author Barbara Goldsmith, Mendeleev stubbornly refused
to believe Henry G. J. Moseley when he claimed to have discovered a
smaller particle, the electron. Goldsmith also wrote of Pierre Curie’s
antagonistic relationship with fellow physicist Ernest Rutherford. Pierre
stubbornly clung to his own theories about radioactive elements. The two
scientists aired their dispute publicly. Fortunately, and unlike Hoyle and
Mendeleev, Curie finally duplicated Rutherford’s experiments and con-
ceded (Goldsmith 2005).
We employ intuition and hunches daily. Psychologists believe that “intu-
ition is a rapid-fire, unconscious associating process … The brain makes
an observation, scans its files, and matches the observation with exist-
ing memories, knowledge, and experiences” (Brown 2010). According to
Christof Koch, president and chief scientific officer of the Allen Institute for Brain Science, “Intuition arises within a circumscribed cognitive
domain. It may take years of training to develop, and it does not eas-
ily transfer from one domain of expertise to another” (Koch 2015).
Unfortunately, my skills in Scrabble do not transfer to the New York Times
crossword puzzle. The 1978 Nobel Memorial Prize in Economic Sciences
winner Herbert Simon defines intuition: “The situation has provided a
cue; this cue has given the expert access to information stored in memory,
and the information provides the answer. Intuition is nothing more and
nothing less than recognition.” Simon’s definition “reduces the apparent
magic of intuition to the everyday experience of memory” (Simon 1992).
In Thinking, Fast and Slow, the 2002 Nobel Memorial Prize in Economic
Sciences winner Professor Daniel Kahneman reviews some of the research
addressing the “marvels and flaws of intuitive thinking” (Kahneman
2011). “Intuitive answers come to mind quickly and confidently, whether
they originate from skills or heuristics.” The solution is to “slow down and
construct an answer.” However, knowing all this doesn’t make a differ-
ence. He goes on to say, “Except for some effects that I attribute mostly
to age, my intuitive thinking is just as prone to overconfidence, extreme
predictions and the planning fallacy as it was before I made a study of these
issues.” Whether we are trained scientists and engineers or armchair ones, we are forecasting, predicting machines. While driving to work, we are constantly anticipating what the other drivers will do. At sporting
events, we see skilled athletes predicting the next move of their oppo-
nents. We predict the reactions of our partner or spouse when we deliver
bad news. Our quick predictive judgments are based on data from past
experiences, Lazy System 1 thinking. As engineers and scientists, we must
deliberately and logically use our models and calculations to predict and
theorize certain performance based on past experiences or experiments.
This is using our knowledge of subject matter to guide our work, System
2 thinking. Dr. Khorasani describes this type of predictive judgment as
“the guiding light that helps the researcher” (Khorasani 2016). However,
most other predictions use intuition. These judgments are based on skill
and expertise or intuitions that are “sometimes subjectively indistinguish-
able” from skill and expertise but “arise from the operation of heuristics
that often substitute an easy question for the harder one that was asked”
(Kahneman 2011).
This field of research on the role of intuition and hunches is rife with
debate. There remain a number of scholars who value human judgment
over algorithms. The scientists who are studying the role of insightful
behavior in problem solving have found that intuition is the result of
expertise rather than sudden realizations (Lung and Dominowski 1985,
Wan et al. 2011). Repeated studies support the accumulation of knowl-
edge in problem solving as the result of a gradual process (Bowers et al.
1990, Weisberg 1993). The hunches of scientists may be dependent on
the accumulation of information from the problem. We often deal with
new situations on the basis of what we’ve done in similar situations in
the past. Weisberg calls this “continuity of thought.” He uses Thomas
Edison’s development of the kinetoscope as an example of this type of
thinking. Edison’s invention of the kinetoscope is based on his earlier
invention of the phonograph (Weisberg 1993). Professor Weisberg shows
that ideas and intuition come from the accumulation and acquisition of
information and experiences about the problem we are attempting to
solve. Solutions don't appear out of the blue; they don't hit us like a lightning bolt or the wave of a magic wand.
Psychologist Paul Meehl analyzed studies of clinical predictions based
on subjective impressions from trained professionals (professions where
judgment is required at work). In one study, he found that statistical algo-
rithms were more accurate than 11 of the 14 counselors. The number of
similar studies comparing algorithms to humans continues to grow. In
60% of roughly 200 studies, the algorithm is more accurate. The remainder
of the studies found a tie between humans and algorithms (Meehl 1986).
As new scientists and engineers, we want to rely on data and facts and let
our intuition develop.
5.6.2 Paradigms
In addition to personal beliefs and humanness, Professor Thomas Kuhn,
in his classic text The Structure of Scientific Revolutions, identifies scientific
paradigms that also limit our ability to even see anomalous results (Kuhn
1962). Kuhn defines normal science as “research firmly based upon one or
more past scientific achievements, achievements that some particular scien-
tific community acknowledges for a time as supplying the foundation for its
further practices.” Normal science defines our paradigm today. Normal sci-
ence is different for us today than it was for Aristotle or for Galileo or for
Isaac Newton or for Benjamin Franklin or for Marie Curie. Where Roentgen
saw x-rays, Lord Kelvin saw an “elaborate hoax.” Where Antoine-Laurent
Lavoisier saw oxygen, Joseph Priestley saw dephlogisticated air. Where
Newton saw light as material corpuscles, we now see light as photons with
characteristics of both waves and particles (Kuhn 1962). In Kuhn’s definition,
normal science limits what we can see, as these examples show. Just as these
great scientists and engineers from history operated within a certain para-
digm, we do as well. It is important that we acknowledge our own System
1 thinking—the paradigms, assumptions, rules, ideas, thoughts, and preju-
dices that could limit our contribution.
5.7 KEY TAKEAWAYS
Unintentional variation will happen in experiments. Therefore, a solid
experimental protocol is a good insurance policy against mistakes. While
there are “no perfect stories,” mistakes can be minimized with good lab
practices, maintenance, inspection, training, and robust experimental
planning. Create checklists, operating procedures, and work instructions
to minimize variation. The primary purpose of a written protocol is to
minimize variation. If we are doing the setup, running the experiment, and making the measurements ourselves, having a detailed procedure may not be necessary. However, I'd recommend it anyway. If we have
details of exactly what we did at each step documented, we can always
retrace steps if the experiment needs to be reproduced at a later time. The
procedures could be somewhat generic, even covering several processes in
text or graphic format. The detail in each discrete step will vary depending
on the situation, but we can include information about defects to avoid,
safety hazards or precautions, required tooling or consumables, and any
information that ensures the process will be performed in a standard way.
We may choose to describe the process at a general level or provide details
and a step-by-step sequence of activities. Flow charts may also be useful to
show relationships of process steps.
In this chapter, we also looked at the inadvertent effect that intuition,
beliefs, bias, and priming can have on our experiments. The topics are fairly
new to physical scientists and engineers, and therefore, we should keep
our eyes open to possible variation introduced from these phenomena.
P.S. Try creating a standard operating procedure for a piece of metrology
equipment. Have several people try out your procedure. If possible, have both
experienced and inexperienced people perform the procedure. What did you
learn? Was it difficult to write? Did you need to make improvements?
REFERENCES
Apgar, V. 1953. A Proposal for a New Method of Evaluation of the Newborn Infant.
Current Researches in Anesthesia and Analgesia 32:260–267.
Bowers, K., G. Regehr, C. Balthazard, and K. Parker. 1990. Intuition in the Context of
Discovery. Cognitive Psychology 22:72–110.
Brackenridge, J. B. and M. A. Rossi. 1979. Johannes Kepler’s On the More Certain
Fundamentals of Astrology, Prague 1601. Proceedings of the American Philosophical
Society 123(2):85–116.
Brake, M. L., J. T. P. Pender, M. J. Buie, A. Ricci, and J. Soniker. 1995. Reactive Ion Etching in the Gaseous Electronics Conference RF Reference Cell. Journal of Research of the National Institute of Standards and Technology 100(4).
Brown, B. 2010. The Gifts of Imperfection: Your Guide to a Wholehearted Life. Center City,
MN: Hazelden Publishing.
Chu, C. W. 2011. The Evolution of HTS: Tc-Experiment Perspectives. BCS: 50 Years ed. L.
N. Cooper and D. Ė. Feldman. Hackensack, NJ: World Scientific.
Coleman, H. W. and W. G. Steele. 1999. Experimentation and Uncertainty Analysis for
Engineers. 2nd Ed. New York: John Wiley & Sons.
Deming, W. E. 1982. Out of the Crisis. Cambridge, MA: Massachusetts Institute of Technology, Center for Advanced Engineering Study.
Dolnick, E. 2011. Clockwork Universe: Isaac Newton, the Royal Society and the Birth of the
Modern World. New York: HarperCollins.
Feynman, R. P. 1985. Surely You’re Joking, Mr. Feynman: Adventures of a Curious
Character. New York: W. W. Norton.
Finster, M. and M. Wood, 2005. The Apgar Score Has Survived the Test of Time.
Anesthesiology 102:855–857.
Gawande, A. 2010. Checklist Manifesto: How To Get Things Right. New York: Metropolitan
Books/Henry Holt and Company. References used from Dr. Gawande’s book
include the following:
Boorman, D. J. 2000. Reducing Flight Crew Errors and Minimizing New Error
Modes with Electronic Checklists. Proceedings of the International Conference on
Human Computer Interactions in Aeronautics. Toulouse: Editions Cepaudes. 57–63.
Boorman, D. J. 2001. Today’s Electronic Checklists Reduce Likelihood of Crew
Errors and Help Prevent Mishaps. ICAO Journal 56:17–20.
Luby, S. P. et al. 2005. Effect of Handwashing on Child Health: A Randomized
Controlled Trial. Lancet 366:225–233.
Meilinger, P. S. 2004. When the Fortress Went Down. Air Force Magazine October. pp. 78–82.
Thalmann, M., N. Trampitsch, M. Haberfellner et al. 2001. Resuscitation in
Near Drowning with Extracorporeal Membrane Oxygenation. Annals of Thoracic
Surgery 72.
GEC. 2005. Gaseous Electronics Conference Radio-Frequency (GEC RF) Reference Cell.
Journal of Research of the National Institute of Standards and Technology 100(4).
The special issue contains the collaborative work of twelve research groups from
around the world.
Gladwell, M. 2005. Blink: The Power of Thinking Without Thinking. New York: Back Bay
Books.
Goldsmith, B. 2005. Obsessive Genius: The Inner World of Marie Curie. New York:
W. W. Norton.
Gregoire, C. 2014. 10 Things Highly Intuitive People Do Differently. The Huffington Post
March 19. www.huffingtonpost.com.
Gregory, A. 2016. 7 Tips for Writing an Effective Instruction Manual. http://www.sitepoint
.com/7-tips-for-writing-an-effective-instruction-manual/.
Hess, E. D. 2014. Learn or Die: Using Science to Build a Leading-Edge Learning
Organization. Columbia: Columbia Business School Publishing.
Holman, J. P. 2001. Experimental Methods for Engineers. 7th Ed. New York: McGraw Hill
Higher Education.
Hutson, M. 2015. The Science of Superstition. The Atlantic.
Ishikawa, K. 1991. Guide to Quality Control. Japan: Asian Productivity Organization.
James, J. T. 2013. A New, Evidence-based Estimate of Patient Harms Associated with
Hospital Care. Journal of Patient Safety 9(3):122–128.
Johnson, G. 2008. The Ten Most Beautiful Experiments. New York: Alfred A. Knopf.
Kahneman, D. 2011. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Kelemen, D., J. Rottman, and R. Seston. 2013. Professional Physical Scientists Display
Tenacious Teleological Tendencies: Purpose-Based Reasoning as a Cognitive
Default. Journal of Experimental Psychology: General 142(4):1074–1083.
Khorasani, F. 2016. The Elements of Effective Investigation. Unpublished.
Koch, C. 2015. Intuition May Reveal Where Expertise Resides in the Brain. Scientific
American, May/June: 25–26.
Kuhn, T. S. 1962. The Structure of Scientific Revolutions. Chicago: The University of
Chicago Press.
Livio, M. 2013. Brilliant Blunders: From Darwin to Einstein—Colossal Mistakes by Great Scientists That Changed Our Understanding of Life and the Universe. New York: Simon & Schuster.
Lung, C. and R. L. Dominowski. 1985. Effects of Strategy Instructions and Practice on
Nine-Dot Problem Solving. Journal of Experimental Psychology: Learning, Memory
and Cognition 11(4):804–811.
Meehl, P. 1986. Causes and Effects of My Disturbing Little Book. Journal of Personality
Assessment 50:370–375.
Sandberg, S. 2013. Lean In: Women, Work and The Will to Lead. New York: Alfred
A. Knopf. Reference contained therein:
Danaher, K. and C. S. Crandall. 2008. Stereotype Threat in Applied Settings Re-Examined. Journal of Applied Social Psychology 39(6):1639–1655.
Simon, H. A. 1992. What is an Explanation of Behavior? Psychological Science 3:150–161.
Skloot, R. 2010. The Immortal Life of Henrietta Lacks. New York: Crown Publishing
House/Random House.
Sulloway, F. J. 1982. Darwin and His Finches: The Evolution of a Legend. Journal of
History of Biology 15(1):1–53.
Texas A&M website. 2016. Guide to Writing Standard Operating Procedures. http://oes
.tamu.edu/new/templates/…/SOPs_How_to_Write.pdf.
Wan, X., H. Nakatani, K. Ueno, T. Asamizuya, U. Chen, and K. Tanaka. 2011. The Neural
Basis of Intuitive Best Next-Move Generation in Board Game Experts. Science 331:
341–346.
Weisberg, R. W. 1993. Creativity: Beyond the Myth of Genius. New York: W. H.
Freeman.
Willard, A. K. and A. Norenzayan. 2013. Cognitive biases explain religious belief, para-
normal belief, and belief in life’s purpose. Cognition 129:379–391.
Wortman, B., W. Richardson, G. Gee, M. Williams, T. Pearson, F. Bensley, J. Patel,
J. DeSimone, and D. Carlson. 2007. The Certified Six Sigma Black Belt Primer. West
Terre Haute, IN: The Quality Council of Indiana.
6
What, There Is No Truth?
Measurement is the first step that leads to control and eventually to improvement. If you can't measure something, you can't understand it. If you can't understand it, you can't control it. If you can't control it, you can't improve it.
H. James Harrington
Measurement affects every part of our lives. Each package of food has the
amount of food written in multiple units. Every doctor visit, even a hang-
nail, is another opportunity to be weighed. Each time we drive our cars, we are monitoring multiple measurements: speed, temperature, engine revolutions, battery charge or fuel level, etc. The cost of a product can be based
on its weight as can the cost of shipping. In the daily news, we hear or read
of the conclusions that scientists and/or engineers have drawn as a result
of some measurement that has taken place. Whether a measurement is
needed for experimentation, development, or manufacturing, the uncer-
tainty inherent in the measurement is a source of variation. It is critical
that we be able to separate instrument errors from other experimental
errors. In an effort to reduce overall uncertainty, the measurement sys-
tem variation should be one of the first things characterized. In this chap-
ter, we will begin with establishing a common language for describing
the measurement system, then look at standards and calibration and tool
matching. We will walk through setup of a measurement system analysis
and look closely at the analysis portion. Finally, we pull back to the big
picture and look at the global issues surrounding measurements.
6.1 MEASUREMENT EVOLUTION
Measurement is considered the hallmark of human intellectual achievement. Evidence of measurement tools dates back to prerecorded history. Certainly, some of our earliest artifacts record examples of the use of a scale for relative measurement of an object's weight. The earliest evidence dates back to 2400 to 1800 bce in Pakistan's Indus River Valley. In these prebanking days, smooth stone cubes were used as weights in balance scales. In Egypt, the
scales and stones were used for gold trade—mining yields, cataloging ship-
ments, etc. No scales have survived, or at least none have been recovered to date, but multiple sets of weighing stones have been found. The Egyptian hieroglyphics and
murals from that time indicate the widespread use of scales in trade.
For most of history, time has been a vague quantity. Nature provided the early measurements. As the sun moved through the sky, the shadows
cast by a sundial provided the time of day. Figure 6.1 is a photograph of
a sundial prominently displayed in the courtyard of Heidelberg Castle
in Heidelberg, Germany. The Roman numerals show the time. Notice
the astrological signs to provide the time of the year. The first mechani-
cal clocks were recorded in the fourteenth century. With the advent of
mechanical clocks, our understanding of time advanced to include hours
and minutes and seconds. Figure 6.2 is an example of an early clock in a
tower at Heidelberg Castle with Roman numerals for the digits. Notice the
hands have astronomical references to the moon and the sun.
FIGURE 6.1
Sundial at Heidelberg Castle in Heidelberg Germany.
FIGURE 6.2
Clock on the Clock Tower of Heidelberg Castle.
The science of measurement has evolved over the centuries, as measurement became more important in commercial interactions. With increased global trade and manufacturing came the need for standardized measurement practices, which has given rise to international standards organizations. Metrology, the science of measurement, derives from the Greek words metron (“measure”) and logos (“word” or “reason”); together, the two words give us logical measurement (Metrology 2016).
Every piece of data we collect is filtered through a measurement system.
Almost always, there is some type of gauge involved, a person or computer,
and a procedure or method by which the person or computer collects or
interprets measurements. A gauge will consist of a detector, which senses the signal and converts it to a mechanical or electrical form (either digital or analog); a signal modifier, such as a filter or amplifier; and an indicator, which will record or control the resultant signal (Holman 2001). We also know that
all measurements are a combination of the actual effect (true value) and some
uncertainty. The variation in our measurements is most likely some combina-
tion of both random and systematic errors. With random causes, we expect
that the measurements could be on either side of the actual or true value.
Systematic sources of variation will shift the measurements such that they are not centered on the actual or true value: the whole distribution of measurements moves in one direction, to one side of the “true value.”
Recall from Section 4.5.2, where we distinguished the types of uncertainty,
measurement uncertainty is lumped into Type B (variation not due to ran-
dom variation). In other words, measurement uncertainty characterizes the
range of values within which the true value is asserted to lie with some level
of confidence.
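To make the random/systematic distinction concrete, here is a small simulation sketch (invented numbers, not from the book): the random part scatters readings around a center, while the systematic part shifts that center away from the true value.

```python
import random
import statistics

random.seed(42)
TRUE_VALUE = 100.0
BIAS = 0.8   # hypothetical systematic offset (e.g., an uncalibrated gauge)

# Each reading = true value + systematic shift + random scatter
readings = [TRUE_VALUE + BIAS + random.gauss(0, 0.5) for _ in range(500)]

mean = statistics.fmean(readings)
print(f"mean of readings:  {mean:.3f}")               # near 100.8, not 100.0
print(f"apparent bias:     {mean - TRUE_VALUE:.3f}")  # the systematic part
print(f"random spread:     {statistics.stdev(readings):.3f}")  # about 0.5
```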
6.2 PROBLEMS
6.3 DEFINITIONS
In this section, we will review what we know or think we know about mea-
surement terminology since there is a lot of confusion surrounding the
definitions. The first step in communicating the results of a measurement
1/10 of the least count. However, if the scale is small, we may only feel
confident that we can estimate it to the nearest 1/2 of the least count.
In still other cases, we may not feel confident in estimating to anything
less than the least count. When selecting a gauge for use in an experi-
ment or process, the Rule of Ten should be used. This rule states that
the smallest increment of measurement for the device should be less
than or equal to 1/10 of the tolerance. The gauge should be sensitive enough to detect differences in measurement as slight as 1/10 of the total tolerance specification or process spread, whichever is smaller. Inadequate discrimination will affect both the accuracy and precision of an operator's reported values.
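The Rule of Ten is easy to encode as a quick sanity check. A minimal sketch, with a hypothetical helper name and illustrative caliper numbers:

```python
def passes_rule_of_ten(resolution, tolerance, process_spread=None):
    """True if the gauge's smallest increment is no more than 1/10 of the
    tolerance (or of the process spread, whichever is smaller)."""
    limit = tolerance if process_spread is None else min(tolerance, process_spread)
    return resolution <= limit / 10

# A caliper with a 0.01 mm least count on a 0.25 mm tolerance: adequate.
print(passes_rule_of_ten(resolution=0.01, tolerance=0.25))  # True
# The same caliper on a 0.05 mm tolerance: inadequate discrimination.
print(passes_rule_of_ten(resolution=0.01, tolerance=0.05))  # False
```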
A measured value is meaningless without some statement of its accuracy. Accuracy is defined as the closeness of agreement between a measured value and the reference value; in other words, it is the closeness of agreement between the average value obtained from a large series of test results and the accepted reference value. All that ever exists is a series of measurements, so deviation from the reference value is a lack of accuracy. Accuracy is measured against an unbiased reference value and is normally reported as the difference between the average of a number of measurements and the reference value. Checking a micrometer with a gauge block is an example of an accuracy check. Accuracy is an expression of the lack of error and is largely affected by systematic error. From
our normal distribution discussions, accuracy is our location varia-
tion indication. There are multiple definitions for accuracy, however,
and in an effort to avoid confusion, the Measurement System Analysis
Reference Manual recommends that we avoid using the term accuracy
and use bias instead. I will therefore attempt to be consistent with that
guidance here as well.
Precision is the closeness of agreement between independent mea-
surements of a quantity under the same conditions. It is a measure of
how well a measurement can be made without reference to a theoretical
or reference value. The number of divisions on the scale of the measur-
ing device generally affects the consistency of repeated measurements
and, therefore, the precision. Since precision is not based on a true
value, there is no bias or systematic error in the value, but instead it
depends only on the distribution of random errors. The precision of
a measurement is usually indicated by the uncertainty or fractional
relative uncertainty of a value. Precision is the closeness of agreement
between independent measurements. Precision is largely affected by
FIGURE 6.3
Illustration of experimental results demonstrating accuracy and precision as related to
random and systematic errors. The bull's-eye in the center represents the “true value” or
target value or reference value that we hope to achieve in our experiment.
FIGURE 6.4
Sketch of experimental results from Figure 6.3 but without the target. Since we do not
know “true value,” this corresponds to our experimental situation most of the time.
Most of us learned in school that the acceleration due to gravity is 9.8 m/sec² and the speed of light is 3.0 × 10¹⁰ cm/sec. However, there is no “true” value for gravity or the speed of light, as Figure 6.5 shows. Figure 6.5a and b show the different measurements of the speed of light over the years. The results are dependent on the method used in the experiment. The initial speed of light measurements were astronomical; then a rotating wheel, and later a rotating mirror, allowed more consistent measurements with reduced uncertainty. Most recently, microwave interferometry produced the value accepted today, 299,792 km/sec. “If two methods
of measuring the speed of light, or for measuring anything were in
statistical control, there might well be differences of scientific impor-
tance. On the other hand, if the methods agreed reasonably well, their
agreement could be accepted as a master standard for today” (Deming
1982). This is exactly what happened. Today, this value is used to define
the meter by the Bureau International des Poids et Mesures: “The meter is the length of the path travelled by light in a vacuum during a time interval of 1/299,792,458 of a second” (BIPM 1983).
FIGURE 6.5
Measurements of the speed of light from (a) 1675 to 1983 and (b) 1923 to 1983. (Source:
Halliday, D., Resnick, R., Fundamentals of Physics, John Wiley & Sons, Inc., New York,
1970.)
6.4 MEASUREMENT SYSTEM
There are five characteristics of a measurement system that we are con-
cerned with: bias, stability, linearity, reproducibility, and repeatability.
Accuracy is indicated by bias, stability, and linearity, while precision is
primarily quantified with reproducibility and repeatability.
Prior to beginning a measurement system analysis, we should con-
firm that the instrument will work for the purposes intended. This can
be determined with the three indicators of accuracy: bias, linearity,
and stability. Bias is the difference between the average value of the
large series of measurements and the accepted reference value. Bias
is equivalent to the total systematic error in the measurement and a
correction to negate the systematic error can be made by adjusting for
the bias. Measurements can vary from true value either randomly or
systematically. Linearity describes how consistent the bias of the mea-
surement system is over its range of operation. Stability describes the
ability of a measurement system to produce the same measurement
value over time when the same sample is being measured. These three
indicators of accuracy are shown in Figures 6.6a and b, 6.7a and b, and
6.8a and b.
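As a concrete illustration, bias is just the difference between the average of repeated readings and a reference standard. A minimal sketch with invented gauge-block numbers:

```python
import statistics

REFERENCE = 25.400  # mm: certified length of a (hypothetical) gauge block

measurements = [25.412, 25.409, 25.415, 25.411, 25.408,
                25.413, 25.410, 25.414, 25.409, 25.412]

bias = statistics.fmean(measurements) - REFERENCE
print(f"bias = {bias:+.4f} mm")  # about +0.011 mm of systematic error

# Linearity and stability extend the same calculation: repeat the bias
# estimate at several points across the operating range (linearity) and
# at several times (stability) and check that it stays consistent.
```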
In gauge terminology, repeatability is often substituted for precision.
However, precision cannot be expressed with one value. The precision
of a gauge or measurement system describes how “close” the values are
to one another. Precision is the random error piece of the measure-
ment system and is represented by the width (standard deviation) in
our normal distribution. Precision is expressed with repeatability and
reproducibility.
Repeatability is the precision determined under conditions where the
same operator uses the same methods and equipment to make measure-
ments on identical parts. Repeatability is the ability to repeat the same
measurement by the same operator at or near the same time, as is illus-
trated in Figure 6.9. In other words, getting consistent results repeatedly
FIGURE 6.6
(a) Sketch illustrating bias in a measurement system with data points. (b) Sketch illustrat-
ing bias in a measurement system with a distribution.
means having the same measurement, same operator, and close to the
same time. The repeatability contribution to precision is known as the
equipment variation.
Reproducibility is the precision determined under conditions where a
different operator uses the same methods but different equipment to make
measurements on identical specimens. In other words, it is the reliability
of a gauge system or similar gauge systems to reproduce measurements.
The reproducibility of a single gauge is customarily checked by com-
paring the results of different operators taken at different times. Gauge
FIGURE 6.7
(a) Sketch illustrating stability in a measurement system with data points. (b) Sketch illus-
trating stability in a measurement system with a distribution.
FIGURE 6.8
(a) Sketch illustrating linearity in a measurement system with data points. (b) Sketch
illustrating linearity in a measurement system with a distribution.
FIGURE 6.9
Sketch illustrating repeatability in a measurement system.
FIGURE 6.10
Sketch illustrating reproducibility in a measurement system.
6.6 MEASUREMENT MATCHING
There may be situations where we need to use two different measurement
tools to perform our experiments. These metrology tools may come from
the same manufacturer and may even have sequential serial numbers.
However, it is almost certain that these two tools will not perform exactly
the same and will therefore give different results. It isn’t important that
these tools give exactly the same results; however, it is critically important
that we know how far the results depart from one another. It is important
that the equipment is in statistical control.
The semiconductor industry, with metrology tools costing sometimes
more than ten million US dollars, has faced tremendous pressure to have
their tools perform exactly the same. In order for the measurement to be
performed exactly the same, the robotic system handling the wafer needs
to be exacting. Tool matching, not only from run-to-run but also between
tools, is critical. One example of all that goes into matching metrology
tools is described by Dr. Clive Hayzelden. Dr. Hayzelden describes the
matching process between two ellipsometers. The process involves mea-
suring a film (typically an oxide film) thickness in five or more locations
using the same silicon wafer every eight hours over a five-day period. Both
dynamic (cycling the wafer in and out of the tool) and static repeat-
ability tests are performed. Static repeatability provides measurement-
to-measurement variation, while dynamic repeatability captures the
variation in robotic accuracy and focus. The stability of the measurement
tools is determined from performing the same measurement over time.
How well the tools match one another is determined by comparing the
mean and standard deviation for each measurement site (Hayzelden 2005).
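The comparison Hayzelden describes reduces to computing per-site means and standard deviations on each tool. A small sketch with invented oxide-thickness readings:

```python
import statistics

# Invented readings: the same wafer measured at fixed sites on two tools.
tool_a = {1: [100.2, 100.3, 100.1, 100.2],   # site: thickness readings (nm)
          2: [101.0, 101.1, 100.9, 101.0]}
tool_b = {1: [100.5, 100.6, 100.4, 100.5],
          2: [101.2, 101.4, 101.3, 101.3]}

for site in tool_a:
    offset = statistics.fmean(tool_b[site]) - statistics.fmean(tool_a[site])
    sd_a = statistics.stdev(tool_a[site])
    sd_b = statistics.stdev(tool_b[site])
    print(f"site {site}: offset = {offset:+.2f} nm, "
          f"stdev A = {sd_a:.3f} nm, stdev B = {sd_b:.3f} nm")
```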
6.7 ANALYSIS METHODS
There are three methods that can be used to quantify error in a mea-
surement system: the range method, the average and range method, and
the analysis of variance method, often referred to as ANOVA. Table 6.1
compares the three methods for measurement system analyses. The most
accurate method is the analysis of variance method because it allows for
the quantification of repeatability, reproducibility, part variation, per-
son variation, and the interaction between the part and people variation.
Although the calculations in a measurement system analysis involve only
simple mathematical functions (addition, subtraction, multiplication, and
division) for each of the methods, the analysis of variance method is the
most complex.
Let’s walk through the basis for the measurement system analysis
model. With anything we are measuring, whether it's a part dimension or a material property, the Measured Value can be modeled as the sum of the (Mean Value + Bias) plus contributions from Within-Part Variation, Reproducibility, and Repeatability.
TABLE 6.1
Comparison of Three Different Methods of Measurement System Analysis

Method: Range (R)
  Pros: Easy and quick approximation of measurement variability
  Cons: Cannot distinguish repeatability and reproducibility

Method: Average and range (X̄ and R)
  Pros: Provides information about causes of measurement error; able to distinguish between repeatability and reproducibility
  Cons: Cannot distinguish any interaction between repeatability and reproducibility

Method: Analysis of variance (ANOVA)
  Pros: Most accurate
  Cons: Computationally more difficult; typically performed with computer
The Mean Value and Bias are placed in parentheses to stress that they
cannot be separately distinguished unless a master gauge is used.
The contribution to the Measured Value from Within-Part Variation,
Reproducibility, and Repeatability is random. This model is the basis of
the mathematical model used for the development of the analysis of vari-
ance method. A detailed development of the mathematical model can
be found in the Measurement System Analysis Reference Manual (AIAG
2010).
All the methods (and examples presented) ignore the Within-Part
Variation term (out-of-roundness, out-of-flatness, diametrical taper, etc.)
as the data gathering process becomes vastly more cumbersome (AIAG
2010). In order to minimize the impact of the Within-Part Variation effect,
it is best to capture the maximum within-part variation prior to begin-
ning our measurement system analysis. In addition, confirm that the par-
ticular characteristic or property that we are interested in understanding
in the measurement system analysis has a much greater effect than the
within-part variation.
6.7.1 Setup
The most important part of the measurement system analysis, indepen-
dent of the method that we select, is the detail of the setup. The measure-
ment system analysis is useful in determining the amount and types of
variation in a measurement system and how it performs in its operational
environment (as opposed to the manufacturer development lab). We want
to allocate the variation to the two categories, repeatability and reproduc-
ibility, as we’ve defined earlier in the chapter. In most practical situations,
it is important that we have a well-characterized measurement system (i.e.,
known bias, repeatability, linearity, reproducibility, and stability) within
reasonably established limits.
TABLE 6.2
Measurement System Analysis Planning Tool
# Step
1 Create an Input–Process–Output diagram and identify each input with (C) for
constant, (N) for noise, or (X) for intentionally varying.
2 Identify how many people will be involved in the study and who they are. Try
to select people who normally make or will be making these measurements.
3 Select the sample parts. Determine the number of parts. Label the parts. We
will want to select typical parts that are really representative. Remember we
are trying to capture the full range of variation that exists within the parts. A
good labeling or identification system is important because the parts will be
measured multiple times by multiple people. (We may want to try to make
this a blind study.)
4 Decide how many times measurements will be repeated. The more critical the
dimension, the more measurements we may want to make in order to
increase our confidence in the measurements.
5 Ensure that we have adequate sensitivity with our gauge.
6 Confirm that the measurement procedure is well defined and that each person
participating in the study is well trained on the procedure.
7 Create a template for logging the measurements. Our template should detail
the order of measurements, people, and parts. Stick as close to this template
as possible. (This can be done in Excel or other spreadsheet format or using a
statistical software package like JMP.)
8 Begin the measurements in the predetermined randomized order of people
and parts.
TABLE 6.3
Definition of Notation Used in the Measurement System Analysis Reference Manual

k: number of people making measurements
r: number of repeated measurements each person makes
n: number of parts being measured
m = r·k: total number of measurements for each part
The average and range method requires only simple calculations. I've summarized the steps in this section, but details and examples can be found in the reference manual.
1. Calculate the average and range for each row and column (in Excel or another spreadsheet, per the template).
2. Calculate the average of the averages for each row and column. This gives us the average of the measurements made by each of the $k$ people and the average for each of the $n$ parts. To be consistent with the Measurement System Analysis Reference Manual template, let $\bar{X}_A$, $\bar{X}_B$, $\bar{X}_C$, ... represent the average measurement performed by each of the $k$ measurers A, B, C, etc., and let $\bar{X}_p$ denote the average measurement over all $r \cdot k$ measurements for part $p$, where $p = 1$ to $n$.
3. Calculate the averages of the ranges using similar subscripting notation as in Step 2. Let the averages of the ranges be $\bar{R}_p$, $\bar{R}_A$, $\bar{R}_B$, $\bar{R}_C$.
4. Compute the average of all the part averages, $\bar{\bar{X}}$, and the average of all the ranges, $\bar{R}$, using the following formulas:

$$\bar{\bar{X}} = \frac{\sum_{p=1}^{n} \bar{X}_p}{n} \tag{6.5}$$

$$\bar{R} = \frac{\bar{R}_A + \bar{R}_B + \bar{R}_C + \cdots}{k} \tag{6.6}$$

From these, the variation estimates are

$$EV = \bar{R} \times K_1 \tag{6.7}$$

$$AV = \sqrt{(X_{Diff} \times K_2)^2 - \frac{EV^2}{n \cdot r}} \tag{6.8}$$

$$GRR = \sqrt{EV^2 + AV^2} \tag{6.9}$$

$$PV = R_p \times K_3 \tag{6.10}$$

$$TV = \sqrt{GRR^2 + PV^2} \tag{6.11}$$

where $X_{Diff}$ is the difference between the largest and smallest of the measurer averages $\bar{X}_A$, $\bar{X}_B$, $\bar{X}_C$, ..., and $R_p$ is the range of the part averages.
Repeatability shows up in the range values. This is our equipment variation, but the individual range averages may indicate differences in the operators. In this example, $\bar{R}_A$ is less than $\bar{R}_B$ and $\bar{R}_C$. This tells us that A may have done better at getting the same answer upon repeated measurements of the same part than B or C did. Investigating the difference between A, B, and C's methods might provide an opportunity to reduce variation.
Reproducibility can be thought of as a measure of operator/technician
variation. It is the amount of variation in the readings from differ-
ent measurement systems measuring the same material/parts. This is
important because most of the time, in industry and in labs, we have
different operators making measurements that are considered the same
as other operators’ measurements. We could also use reproducibility to
measure changes in the measurement system. For example, if the same
person is making the measurements but using two different methods,
the reproducibility calculation will show variation due to changes in the
methods. Reproducibility is the variation that occurs between the over-
all average measurements for the different operators (appraisers). It is
reflected in the $\bar{X}$ values and the $X_{Diff}$ value. If, for instance, $\bar{X}_A$ and $\bar{X}_B$ are close and $\bar{X}_C$ is very different, it would appear that C's measurements are biased. We'd have to investigate further to reduce this variation.
Once we have completed the data collection, the next step is to complete
the GRR report. The quantity labeled EV, equipment variation, is an esti-
mate of the standard deviation of the variation due to repeatability. The
quantity labeled AV, appraiser variation, is an estimate of the standard
deviation of the variation due to reproducibility. The quantity labeled GRR
is an estimate of the standard deviation of the variation due to the mea-
surement system. The quantity labeled PV is an estimate of the standard
deviation of the part-to-part variation. The quantity labeled TV is an esti-
mate of the standard deviation of the total variation in the study.
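These report quantities are straightforward to script. Below is a minimal Python sketch of the average-and-range calculation from Equations 6.5 through 6.11; the function name and data layout are ours, and the K constants shown (for a study with 3 trials, 3 appraisers, and 5 parts) are the commonly tabulated values for that study size, so confirm them against the current edition of the reference manual before relying on the output. The %GRR figure it returns is interpreted against the acceptance thresholds discussed next.

```python
import statistics

# K constants depend on study size. These are the commonly tabulated values
# for 3 trials, 3 appraisers, and 5 parts; verify against the current
# Measurement System Analysis Reference Manual before use.
K1, K2, K3 = 0.5908, 0.5231, 0.4030

def grr_report(data, k1=K1, k2=K2, k3=K3):
    """Average-and-range GRR sketch. data[appraiser][part] = list of readings."""
    appraisers = list(data)
    parts = list(data[appraisers[0]])
    n = len(parts)                          # number of parts
    r = len(data[appraisers[0]][parts[0]])  # trials per part

    # Average range and overall average for each appraiser (Steps 2 and 3)
    r_bars = [statistics.fmean(max(data[a][p]) - min(data[a][p]) for p in parts)
              for a in appraisers]
    x_bars = [statistics.fmean(x for p in parts for x in data[a][p])
              for a in appraisers]
    part_means = [statistics.fmean(x for a in appraisers for x in data[a][p])
                  for p in parts]

    r_bar = statistics.fmean(r_bars)              # Equation 6.6
    x_diff = max(x_bars) - min(x_bars)
    rp = max(part_means) - min(part_means)

    ev = r_bar * k1                               # Equation 6.7
    av2 = (x_diff * k2) ** 2 - ev ** 2 / (n * r)  # Equation 6.8 (0 if negative)
    av = max(av2, 0.0) ** 0.5
    grr = (ev ** 2 + av ** 2) ** 0.5              # Equation 6.9
    pv = rp * k3                                  # Equation 6.10
    tv = (grr ** 2 + pv ** 2) ** 0.5              # Equation 6.11
    return {"EV": ev, "AV": av, "GRR": grr, "PV": pv, "TV": tv,
            "%GRR": 100 * grr / tv}
```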
If the GRR is under 10%, the measurement system is acceptable; if it is between 10% and 30%, the measurement system may be acceptable depending on the importance of the application. If the GRR% is more than 30%, the measurement system needs improvement. In this case, the whole
process should be examined to determine where the problems are and
how they can be corrected. There are many reasons that a measuring sys-
tem could give erroneous results (variation) (AIAG 2010, Wortman et al.
2007). Table 6.4 shows how these items might appear in terms of repeat-
ability and reproducibility.
TABLE 6.4
Sources of Repeatability and Reproducibility Error

Source of Variation | Repeatability | Reproducibility
Part, sample, or material variation | Within part, sample, or material | Between parts, samples, or materials
Equipment variation | Within instrument | Between instruments
Standards | Within standards | Between standards
Procedural variation | Within the procedure | Between procedures
Appraiser variation | Within appraiser | Between appraisers
Environment | Within environment | Between environments
Assumptions | Violations of stability and proper operation | Violation of assumptions in the study
Application | Part size, position, observation error | Part size, position, observation error
Software variation | Within an instrument | Between instruments
Laboratory variation | Within laboratory | Between laboratories
There are times when testing is destructive such that it prevents retest-
ing. In these cases, sample or material variation accounts for all the
variation within and between samples. Sample variation would account
for variation due to form, position, surface finish, or any inconsistency
within the sample. Equipment variation can be identified and quantified
through measurement system analysis. Equipment variation may show
up as a fixed error shift from the true value or it may show up with the
slow measurement changes over time as with signal drift. Standard vari-
ation is unlikely but should be considered. The standards should be more
stable than the measurement process. Procedure variation occurs when
standard operating procedures are not followed or are not error-proofed.
Appraiser variation may occur when one appraiser uses the same gauge and same standard operating procedure but still observes variation, or it may occur when the measurement system analysis is performed and variation occurs between the different appraisers. Environmental variation may occur within an environment due to short-term changes in the environment, or between environments due to differences over time caused by changes in the environment. Examples of environmental factors include temperature, humidity, lighting, cleanliness, etc. Software variation within a program may be a result of variation in the formulas or algorithms, which may result in errors even with identical inputs.
6.9 KEY TAKEAWAYS
Measurements impact all areas of our lives. Our measurements will contain
both random and systematic variation. Measurement systems account for
most or at least a large portion of the Type B uncertainty contribution. In
order for us to have confidence that we can repeat our experimental results
and that our work can be duplicated by other researchers, corporations, or
customer groups, we want to ensure that the equipment we use for measure-
ment is properly sensitive, calibrated, and well characterized. A measure-
ment system analysis allows us to do just that. The Measurement System
What, There Is No Truth? • 147
REFERENCES
ASTM. 2016. American Society for Testing and Materials International. www.astm.org.
AIAG. 2010. Measurement System Analysis Reference Manual. Chrysler Corporation,
Ford Motor Company, and General Motors Corporation, 1998, 2003 and 2010. This
document is being updated regularly. When performing a gauge study, the latest
version should be used as the definitions are honed in the science of metrology.
BIPM. 1983. Bureau International des Poids et Mesures (International Bureau of Weights
and Measures). www.bipm.org. See Resolution 1 of the 17th CGPM from 1983.
Deming, W. E. 1982. Out of the Crisis. Cambridge, MA: Massachusetts Institute of
Technology, Center for Advanced Engineering Study.
Dolnick, E. 2011. Clockwork Universe: Isaac Newton, the Royal Society and the Birth of the
Modern World. New York: HarperCollins.
Duncan, A. J. 1986. Quality Control and Industrial Statistics. 5th Ed. Homewood, IL: Irwin.
Goldsmith, B. 2005. Obsessive Genius: The Inner World of Marie Curie. New York: W. W.
Norton.
Hayzelden, C. 2005. Gate Dielectric Metrology. Handbook of Silicon Semiconductor
Metrology. ed. Alain C. Diebold. New York: Taylor & Francis.
Holman, J. P. 2001. Experimental Methods for Engineers. 7th Ed. New York: McGraw-Hill
Higher Education.
ISO (International Standards Organization). 2010. Document ISO/CD 22514-7: Capability
and Performance—Part 7: Capability of Measurement Processes. Geneva. http://
www.iso.org.
Marciano, J. B. 2014. Whatever Happened to the Metric System: How America Kept Its
Feet. New York: Bloomsbury.
Metrology. 2016. http://www.french-metrology.com/en/history/history-mesurement.asp.
Mlodinow, L. 2008. The Drunkard’s Walk: How Randomness Rules Our Lives. New York:
Pantheon Books.
Randall, L. 2011. Knocking on Heaven’s Door: How Physics and Scientific Thinking
Illuminate the Universe and the Modern World. New York: HarperCollins.
Taylor, J. R. 1982. An Introduction to Error Analysis: The Study of Uncertainties in Physical
Measurements. 2nd Ed. Sausalito, CA: University Science Books.
VIM. 2012. International Vocabulary of Metrology—Basic and General Concepts and
Associated Terms. 3rd Ed. Paris: Bureau International des Poids et Mesures. JCGM
200:2012. http://www.bipm.org/en/publications/guides/vim.html.
Wortman, B., W. Richardson, G. Gee, M. Williams, T. Pearson, F. Bensley, J. Patel,
J. DeSimone, and D. Carlson. 2007. The Certified Six Sigma Black Belt Primer. West
Terre Haute, IN: The Quality Council of Indiana.
http://taylorandfrancis.com
7
It’s Random, and That’s Normal
THE
NORMAL
LAW OF ERROR
STANDS OUT IN THE
EXPERIENCE OF MANKIND
AS ONE OF THE BROADEST
GENERALIZATIONS OF NATURAL
PHILOSOPHY—IT SERVES AS THE
GUIDING INSTRUMENT IN RESEARCHES
IN THE PHYSICAL AND SOCIAL SCIENCES AND
IN MEDICINE AGRICULTURE AND ENGINEERING—
IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE
INTERPRETATIONS OF THE BASIC DATA OBTAINED BY
OBSERVATION AND EXPERIMENT
Jack Youden
Earlier in this book, we saw that uncertainty can be broadly divided into
systematic variation or random variation. Systematic variation may be a
result of a measurement system or method, but random variation is an
inherent part of any measurement. Random variation occurs naturally in
nature. No two snowflakes are exactly the same; no two flowers are exactly
the same even when grown on the same plant, just as two children born
from the same parents are not the same. Even identical twins are not 100%
carbon copies of one another. (Clones are beyond the scope of this work.) No two machined parts are exactly the same. No two measurement systems (no matter how much they cost) are exactly the same. Assuming instruments are calibrated and in good operating condition, repeated
measurements of the same sample will vary around a value. These mea-
surements will form a characteristic symmetric pattern even in the absence
of systematic effects purely due to random experimental error. Because
we cannot completely eliminate all variation, we must master quantifica-
tion of variation. In this chapter, we will look closer at random variation
and our propensity to see patterns in random events. Once we quantify
random variation in the data we are analyzing, we can leave it alone and
stop trying to make unnecessary adjustments to the process until we see
variation that is outside of the quantified and characterized random varia-
tion. This approach allows us to understand the natural capabilities of our
system and make better decisions on engineering tolerances and designs.
7.1 PATTERNS
From where we stand the rain seems random. If we could stand somewhere
else, we would see the order in it.
Tony Hillerman
in Coyote Waits, 2009
FIGURE 7.1
Random cloud patterns on an afternoon in the Sonoran Desert. What do you see?
(Courtesy of Mary M. Walker, used with permission.)
Photographic “orbs,” created when a camera flash reflects off dust particles, are supposedly proof that ghosts exist (Novella 2016). These are examples of pareidolia: seeing a familiar pattern in random data.
We unconsciously see patterns in our daily lives in order to organize our
actions and predict responses of the people around us. This is evidenced in
our superstitions. The absence of 13th floors in hotels in the United States
and full moon phenomena are all examples of superstitious patterns. We
consciously find patterns (and assign meaning to them) in clouds, in stars,
and all around us, even when they don’t really exist.
We also look for patterns in our everyday lives—independent of whether
they are true patterns or just random events. We examine these things as
if they were patterns and not randomness. When viewing or analyzing
data, our Lazy System 1 will cause us problems with the common logical
fallacy of confusing correlation with causation. Recall our Lazy System 1
discussion from Chapter 5 (Kahneman 2011, Novella 2016). Much time and effort goes into “snooping out” fake patterns. The human brain processes and recalls information using pattern recognition. “Science is
partly the task of separating those patterns that are real from those that
are accidents of random clumpiness. … The only way to navigate through
the sea of patterns is with the systematic methods of logic and testing that
collectively are known as science” (Novella 2007). We want to understand
our world, make sense of it, and yet we keep running into randomness
and the role it plays in our lives, our experiments, and our data.
We make assumptions about the world in an attempt to make it more
predictable. We continually try to make the world fit into this model
of patterns that we’ve built from our experiences. We are really good at
pattern recognition. The very nature of the human brain function is pat-
tern recognition. “Random information is likely to contain patterns by
chance alone,” Carl Sagan said. “Randomness is clumpy.” Think of rolling
two dice. How likely is it that you will roll double sixes three times back to back? The odds are (1/36)³, about 1 in 46,656, so it may not happen often, but if you roll dice enough it will happen eventually. Our Lazy System 1 is lousy at recognizing when these patterns
are real and when they are not (Kahneman 2011). As scientists, we are tasked with distinguishing real patterns from random ones. “Our ‘common sense’
often fails to properly guide us, apparently being shaped by evolution to err
hugely on the side of accepting whatever patterns we see” (Novella 2016).
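A short simulation (ours, not from the book) shows how clumpy randomness really is: streaks of consecutive double sixes turn up in a long enough run of perfectly fair dice.

```python
import random

random.seed(7)
rolls = [(random.randint(1, 6), random.randint(1, 6)) for _ in range(100_000)]

streak = longest = 0
for roll in rolls:
    streak = streak + 1 if roll == (6, 6) else 0
    longest = max(longest, streak)

print(f"longest run of consecutive double sixes: {longest}")
# In 100,000 fair rolls, a run of two or three double sixes in a row is
# unremarkable; randomness is clumpy.
```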
Recognition of patterns is not a bad thing. It is natural to observe pat-
terns and correlations. In fact, it is difficult to unsee a pattern or trend that
we see. My favorite cartoonist, the creator of Dilbert, Scott Adams, actually encourages us to look for, pay attention to, and leverage patterns in our lives. He cautions, and I agree, to do this carefully (Adams 2013). Although
random variation may initially appear to have a pattern, causation should
be established using scientific methods and statistically valid techniques.
Recently, a new field of research and study has developed whose primary
objective is pattern hunting (data mining, data analytics, etc.). As digital
technologies allow us to tap into old data, we now have unprecedented
opportunities to use information collected on paper forms and in files.
These data open doors for data mining and data analytics with
sophisticated specialized algorithms. These data are applied
in fields from marketing to medicine; even social scientists use pattern
hunting in their research (Brown 2010, Dormehl 2014).
7.2 SIMPLE STATISTICS
Variation exists; therefore, we need some simple methods of describing it.
We can use a histogram to graphically display the data, but there are times
when we need a numerical summary as well. The most common measure of
the center of a data set is the mean, μ, of a set of n values:

$$\mu = \frac{\sum_{i=1}^{n} x_i}{n}. \quad (7.1)$$
Another way to represent the middle of a data set is with the median.
The median of a data set is found by first sorting the sample data from
smallest to largest; then the median is the middle value, if there is an odd
number of data points, or the average of the two middle values if there is
an even number of data points.
Let’s take a simple example. I roll two dice six times. By summing the
two dice from each roll, I get a series of six numbers: 9, 11, 11, 5, 10, and 7.
The average or mean is calculated to be μ = 53/6 ≈ 8.83. In this example, if
we sort the numbers from smallest to largest, we get 5, 7, 9, 10, 11, 11. We
have six data points (even), and the two middle values are 9 and 10, so the
median is m = 9.5.
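The arithmetic above is easy to verify with Python's standard library; a minimal sketch:

```python
# Verify the mean and median of the six dice rolls from the text.
from statistics import mean, median

rolls = [9, 11, 11, 5, 10, 7]
print(mean(rolls))    # 8.833..., i.e., 53/6
print(median(rolls))  # 9.5, the average of the middle values 9 and 10
```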
The mean and median are both measures of the center of a data set; dif-
ferent aspects of the data affect them. How are the mean and the median
different? The median separates the data into two equal parts. The data
values matter for ranking only; the magnitudes of the values on either
side of the median do not affect the median itself. Each data
point is given equal weight independent of how extreme it is. In
other words, the median is independent of the tail values. The median
wouldn’t change if our value of 11 was replaced with 24 in the previous
example. This is not true for the mean. The mean is sensitive to extreme
values. Very different data sets could have the same mean and median but
look completely different.
Imagine a horizontal line as a seesaw. If we placed one-pound weights
on a horizontal line at the values of the data set, and the axis itself had
negligible weight, the mean would be the point on the horizontal axis
that is in balance. The mean acts as a fulcrum to balance the system of
weights. If there were many data points, our seesaw would begin to look
like the frequency diagram in Figure 7.2.
FIGURE 7.2
Frequency diagram of the Ti rod length measurements from Table 7.1.
TABLE 7.1
30 Measurements of Length on the Titanium Part
3.54 3.5 3.46 3.31 3.46 3.15
3.63 3.45 3.33 3.46 3.12 3.31
3.68 3.33 3.35 3.28 3.29 3.26
3.67 3.37 3.33 3.3 3.38 3.45
3.57 3.35 3.23 3.3 3.34 3.44
The easiest dispersion measure that comes to mind might be the range.
We can use range to measure dispersion in a data set, which gives us a feel
for the spread in the data. The range, R, is simply the difference between
the largest and smallest values in the set of numbers or the maximum and
minimum values.
The range for the data in Table 7.1 and Figure 7.2 is 0.56 mm. A disad-
vantage of the range is that it completely relies on the extreme data points.
The range tells us how far apart the boundaries are, but nothing about
what’s in between the boundaries.
Let’s recap the discussion so far; in this section we’ve discussed two val-
ues that are commonly used to describe the center or middle of a data
set and one value that we can calculate to provide information about the
boundaries of our data set. However, the mean, median, and range,
alone or together, do not give us enough information about the data to com-
pletely represent the whole set of data. We need a different parameter to
completely describe a random set of numbers. The mean, a measure of
central tendency, will give us the location of the center of our data, but we
still need something to accurately describe the dispersion. Recall that the
mean, μ, of a set of n values is
$$\mu = \frac{\sum_{i=1}^{n} x_i}{n}. \quad (7.3)$$
Note that μ is also the highest point on the frequency distribution his-
togram. The data are roughly symmetric about this mean: thick
with data points at the center and sparser at either end.
Recall from an earlier paragraph that one disadvantage of simply using the
range as our measure of dispersion is that it puts so much weight on the
extreme values. These extreme values tell us little or nothing about what’s
happening in the center of the data set. Another disadvantage in using the
range to represent the spread in the curve is that from the range we know
nothing about how narrow or flat the distribution of data is. We need a
different variable to accurately and uniquely identify the curve. We need
the standard deviation. The standard deviation is a measure of variation
about the mean. The spread in the data can be represented by the standard
deviation, σ:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}. \quad (7.4)$$
With the standard deviation, we are no longer subjected to the effects of
extreme values. The standard deviation tells us if the distribution of the
data is narrow or wide. The standard deviation tells us exactly how the
data are dispersed. The dispersion or width of the curve of measured data
will vary depending on many factors. The primary factors that affect the
dispersion are type of measurement performed, care used in perform-
ing the measurements, and quality of the equipment used to make the
measurements. By the way, another advantage of the standard deviation is
that we can also say what proportion of our measurements falls within any
specified limits, which we will discuss further in the next section.
What have we done so far with simple statistics? We’ve learned that with
the mean and standard deviation for a random data set, we can com-
pletely describe both where the data are centered and how the data are
dispersed. Now let’s go a step further.
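As a sketch of these summary statistics in code, here is the Table 7.1 data run through NumPy (my choice of tool; the book's own examples use JMP). Note that np.std defaults to the population form of Equation 7.4, dividing by n:

```python
# Summary statistics for the 30 Ti rod length measurements (Table 7.1).
import numpy as np

lengths = np.array([
    3.54, 3.50, 3.46, 3.31, 3.46, 3.15,
    3.63, 3.45, 3.33, 3.46, 3.12, 3.31,
    3.68, 3.33, 3.35, 3.28, 3.29, 3.26,
    3.67, 3.37, 3.33, 3.30, 3.38, 3.45,
    3.57, 3.35, 3.23, 3.30, 3.34, 3.44,
])

print(lengths.mean())                 # center: ~3.39 mm (Equation 7.1)
print(lengths.max() - lengths.min())  # range: 0.56 mm
print(lengths.std())                  # population sigma: ~0.14 mm (Eq. 7.4)
```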
7.3 IT’S NORMAL
In the presence of randomness, regular patterns can only be mirages.
Daniel Kahneman
Sir Isaac Newton was one of the first and for many years the only sci-
entist to use the mean value to represent his measurements (Mlodinow
2008). It wasn’t until Marquis Pierre-Simon de Laplace, born 22 years
after Newton’s death, that experimental physics began to become “mathe-
matized.” Laplace’s work, along with that of Antoine Lavoisier and Charles-
Augustin de Coulomb, led to the then new field of mathematical statistics and one of
the most important mathematical descriptions of all time, the normal dis-
tribution, also known as a bell curve or Gaussian distribution. The initial
characterization can be credited to Abraham De Moivre’s The Doctrine
of Chances, in which he describes the bell shape of the curve. The curve
is named for Carl Friedrich Gauss, an eighteenth century German math-
ematician, who demonstrated that repeatedly measuring the same astro-
nomical phenomenon produced a continuous pattern. Gauss was the first
to use the pattern that became known as the normal distribution. (His
derivation was invalid by his own admission.) Laplace’s contribution was
connecting the central limit theorem and this continuous normal dis-
tribution. Belgian astronomer Adolph Quetelet established the connec-
tion between the histogram and the bell curve in 1870, toward the end
of his own life. In the 200 years from Newton and De Moivre through the
lives of Gauss, Lavoisier, Laplace, and Quetelet, the mathematical descrip-
tions of randomness were developed.
Here’s where the beauty of the discussion in the last section begins to
become obvious. Recall that given a mean and standard deviation, we
can completely describe a set of random data. Given a mean and stan-
dard deviation, we can also create the equation for a normal distribution,
a Gaussian or bell shaped curve that fits the data set. The mathematical
equation for the normal distribution is given by
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right]. \quad (7.5)$$
These two values allow us to draw a bell-shaped curve over the frequency
diagram, creating a normal distribution (see Figure 7.3). The ends of the
normal distribution are called tails. Note that the mean and median are the
same for a normal distribution curve. The mean defines the center or peak
of the distribution, while the standard deviation gives the shape of the curve.
The normal distribution is both beautiful and powerful. The normal dis-
tribution is a well-defined curve given by Equation 7.5, which is determined
FIGURE 7.3
Frequency diagram fitted with a normal curve of the length measurements from Table 7.1.
The normal curve fitted to the data gives a mean of 3.39 mm and a standard deviation of
0.14 mm.
simply by knowing the mean and standard deviation. The curve will be
symmetrically drawn around the mean value in a bell shape. The mean,
μ, is the middle of the distribution and the measure of the spread (disper-
sion) in the data is the standard deviation, σ.
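For readers who want to evaluate Equation 7.5 directly, here is a minimal sketch using the fitted values from Figure 7.3 (μ = 3.39 mm, σ = 0.14 mm):

```python
# Evaluate the normal (Gaussian) density of Equation 7.5 at a point.
import math

def normal_pdf(x, mu=3.39, sigma=0.14):
    """Normal density with the mean and sigma fitted in Figure 7.3."""
    coeff = 1.0 / (math.sqrt(2 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

print(normal_pdf(3.39))  # the peak of the curve, at the mean
print(normal_pdf(3.67))  # two standard deviations out: much smaller
```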
Notice in Figure 7.3, the overlaid curve isn’t an exact fit to the fre-
quency distribution of the data. With a limited or sample data set, it is
unlikely that any curve will fit the frequency diagram exactly. However,
in addition to the visual or eye-ball test for fit, there is a more rigorous
test of fit of the curve to the data called a goodness-of-fit test. Software
packages like JMP® will automatically calculate this for you (JMP 2016).
We can also use a normal probability plot and a Tukey outlier box plot to
support our use of the normal distribution or the decision not to use it.
Also, I should mention that as you delve deeper into the topic of distribu-
tions, you will learn that there are many different distributions, and it is
important that you select the best distribution for your data. We will limit
the discussion in this chapter to the normal distribution.
Now a bit more about the goodness-of-fit testing. Statisticians use some-
thing called hypothesis testing to determine goodness of fit. Hypothesis
testing is used for many other tests as well, but we’ll limit our discussion to the
context of goodness of fit. For goodness of fit, the null hypothesis states
that the data are from a normal distribution. The goodness-of-fit test will
calculate a p value, which can be used to determine whether to reject the
null hypothesis or not. Typically, if the p value is small (<0.05), the null
hypothesis can be rejected. If you’ve ever spoken with a statistician, you
know that they are very noncommittal. The hypothesis test only allows
us to reject the null hypothesis; it says nothing about whether the null
hypothesis is actually true. Are you having fun yet? For the data in
Table 7.1, the goodness-of-fit test gives us a p value of 0.29, which is greater
than 0.05. We still don’t know if the data are from a normal distribution,
but we cannot reject the normal distribution either. More information can
be found on p values and hypothesis testing in the references provided in
Chapter 12.
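The p value of 0.29 above came from JMP. A similar normality check can be sketched with SciPy's Shapiro-Wilk test, one common goodness-of-fit test; the exact p value may differ depending on which test a given package applies:

```python
# Goodness-of-fit check for normality on the Table 7.1 data.
from scipy import stats

lengths = [3.54, 3.50, 3.46, 3.31, 3.46, 3.15, 3.63, 3.45, 3.33, 3.46,
           3.12, 3.31, 3.68, 3.33, 3.35, 3.28, 3.29, 3.26, 3.67, 3.37,
           3.33, 3.30, 3.38, 3.45, 3.57, 3.35, 3.23, 3.30, 3.34, 3.44]

stat, p_value = stats.shapiro(lengths)  # Shapiro-Wilk normality test
if p_value < 0.05:
    print("Reject the null hypothesis that the data are normal.")
else:
    print(f"Cannot reject normality (p = {p_value:.2f}).")
```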
I mentioned two other indicators that we can use to test whether a data
set is from a normal distribution: the normal quantile plot and the Tukey
outlier box plot. For the data in Table 7.1, the normal probability plot and
the Tukey outlier box plot are shown in Figure 7.4. The normal quantile plot
allows us to graphically determine whether or not a data set can be approx-
imated by a normal distribution. If the data can be fitted with a diagonal
line, the normal distribution is a good fit. Departures from
FIGURE 7.4
Frequency diagram fitted with a normal curve of the length measurements from Table 7.1
along with a normal probability plot and a Tukey outlier box plot.
this straight diagonal line indicate departures from normality. The normal
quantile plot in Figure 7.4 plots the Ti rod length data along the x axis and
the probability (0 to 1) from something called the cumulative distribution
function on the y axis. The secondary scale on the y axis plots the quantiles
from the standard normal distribution, where μ = 0 and σ = 1. Quantiles
simply split the data into bins based on percentages, where the median is the
50th percentile and the 25th and 75th percentiles are called the quartiles.
The other graph in Figure 7.4 is the Tukey outlier box plot, which is use-
ful in identifying potential outliers. If an outlier exists in the data set, it
will be highlighted in the Tukey outlier box plot (Tukey 1977). The Tukey
outlier box plot divides the data into four groups. The box contains 50%
of the data. Whiskers extend from each end of the box and mark the range
of mild outliers. Any data outside the whiskers are consid-
ered extreme outliers. In this case, there were no outliers highlighted by
the Tukey outlier box plot. I will leave further explanation of this
graph to your own research (see Chapter 12) and
the statistical software package you choose to use.
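Both graphical checks can be reproduced outside of JMP. A minimal sketch with SciPy and Matplotlib (my library choices, not the book's):

```python
# Normal quantile plot and Tukey-style box plot for the Table 7.1 data.
import matplotlib.pyplot as plt
from scipy import stats

lengths = [3.54, 3.50, 3.46, 3.31, 3.46, 3.15, 3.63, 3.45, 3.33, 3.46,
           3.12, 3.31, 3.68, 3.33, 3.35, 3.28, 3.29, 3.26, 3.67, 3.37,
           3.33, 3.30, 3.38, 3.45, 3.57, 3.35, 3.23, 3.30, 3.34, 3.44]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
stats.probplot(lengths, dist="norm", plot=ax1)  # points near the line => normal
ax1.set_title("Normal quantile plot")
ax2.boxplot(lengths)                            # whiskers at 1.5 x IQR (Tukey)
ax2.set_title("Outlier box plot")
plt.show()
```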
As I alluded to earlier in the chapter, the standard deviation divides the
normal curve into equal length multiples of the standard deviation about
the mean as shown in Figure 7.5. We see that one standard deviation on
either side of the mean represents 68.27% of the population (Figure 7.5a).
In other words, we expect that 68% of the population will be within
one standard deviation on either side of the mean. The ordinates erected at
one standard deviation on either side of the mean include 68.27% of the area
under the curve. Two standard deviations on either side of the mean
represent 95.45% of the population (Figure 7.5b). Likewise, three standard
deviations on either side of the mean represent 99.73% of the population
of the random data set.
This is what’s so beautiful about the normal distribution. There are
many normal curves, but they all share the same characteristic density
properties described with the 68–95–99.7 rule, which is also called
the Empirical Rule. One standard deviation on either side of the mean
contains 68% of the data (from μ − 1σ to μ + 1σ). Two standard deviations
on either side of the mean contain 95% of the data (from μ − 2σ to μ +
2σ). Three standard deviations on either side of the mean contain 99.7%
of the data (from μ − 3σ to μ + 3σ). For a normal distribution, almost
all the data are contained within three standard deviations of the mean
(Figure 7.5c) and the complete area under the normal curve represents
100% of the population.
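These percentages fall straight out of the standard normal cumulative distribution function; a quick check with SciPy:

```python
# The Empirical (68-95-99.7) Rule from the standard normal CDF.
from scipy.stats import norm

for k in (1, 2, 3):
    inside = norm.cdf(k) - norm.cdf(-k)       # area within k sigma of the mean
    print(f"within {k} sigma: {inside:.4f}")  # 0.6827, 0.9545, 0.9973
```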
FIGURE 7.5
Normal distribution curve showing the data within (a) one standard deviation of the
mean accounts for 68.27% of the population, (b) two standard deviations of the mean
accounts for 95.45% of the population, and (c) three standard deviations of the mean
accounts for 99.73% of the population.
For a sample of n measurements drawn from a larger population, the
sample mean and sample standard deviation are

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad (7.6)$$

and

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}. \quad (7.7)$$
The mean and standard deviation are now the summary statistics
for the sample, which in turn represents the population. Notice that the
denominator for the standard deviation has changed for the sample. Using
“n − 1” in the denominator compensates for underestimating the disper-
sion or spread in the population. Because “n − 1” is just a bit smaller than
“n,” this will make the standard deviation larger for the sample.
Now, let’s look at how to determine the dispersion in the sample means.
In other words, we want to know how closely all the sample mean values
cluster around the population mean. To answer this question, we need to
look at the standard error, SE. The standard error is defined as
$$SE = \frac{s}{\sqrt{n-1}}, \quad (7.8)$$
where n is the sample size and s is the standard deviation. It is impor-
tant to not get standard deviation and standard error mixed up. Standard
deviation measures the dispersion in the population, while standard error
measures the dispersion in the sample means. Notice that they are related.
The standard error depends on both the standard deviation of the sample
and the sample size. A large standard error indicates that we have a large
standard deviation or a small sample size. In other words, the sample
means are not clustered but are potentially highly spread out around the
population mean (Wheelan 2013).
Now, going back to our powerful central limit theorem, we know that
the sample means are normally distributed. This tells us that 68.2% of the
means lie within 1 standard error of the population mean, 95.4% lie within
2 standard errors, and 99.7% lie within 3 standard errors.
We might also be interested in relative variability, where the most com-
mon measure is the coefficient of variation, which is simply the ratio of the
standard deviation to the mean.
$$CV = \frac{s}{\bar{x}}. \quad (7.9)$$
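A minimal sketch pulling Equations 7.6 through 7.9 together in NumPy, reusing the six dice rolls from earlier in the chapter as a stand-in sample. The standard error line follows Equation 7.8 as printed, with the square root of n − 1 in the denominator; many texts use the square root of n instead:

```python
# Sample statistics: mean, standard deviation, standard error, CV.
import numpy as np

sample = np.array([9, 11, 11, 5, 10, 7], dtype=float)
n = len(sample)

x_bar = sample.mean()          # Equation 7.6, sample mean
s = sample.std(ddof=1)         # Equation 7.7, with the n - 1 denominator
se = s / np.sqrt(n - 1)        # Equation 7.8, standard error (as printed)
cv = s / x_bar                 # Equation 7.9, coefficient of variation

print(x_bar, s, se, cv)
```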
FIGURE 7.6
Frequency diagram of how many bends it took for Ben’s paper clips to fail (break). The
data have been fit with a normal curve defined by a mean number of bends of 5.9 with a
standard deviation of 1.2.
FIGURE 7.7
Frequency diagram showing Ben’s and Melisa’s bending 30 paper clips to failure. The data
are fit with a normal curve defined by a mean number of bends for Melisa’s breaks of 6.5
with a standard deviation of 2.4.
top of the tail. This was interesting. Did I happen to grab three paper clips
that were twice as strong as the others in the container or was something
else going on? Should we assume that the manufacturer of the paper clips
mixed materials or made a few of the paper clips thicker? When there are
outliers in a data set, it is important to try to identify the root cause.
When I reviewed the data in the order in which
I broke the paper clips, I noticed that these outliers were my first three
attempts at bending the paper clips to breaking.
It could be that I didn’t have my bending technique perfected initially.
Therefore, because I had plenty of paper clips, I just repeated the whole
experiment. I was now an experienced paper clip breaker. The second set
of break attempts can be found in Figure 7.8. The data are fit with a normal
curve defined by a mean number of bends for my second attempt of 5.9
and the standard deviation of 1.4. Notice that the distribution in Figure 7.8
has the same mean as Ben’s experimental results of 5.9, but Ben still has
a slightly smaller standard deviation. This simple example is perfect for
illustrating that nonrandom sources can lead to false conclusions or skew
a data set. It is never enough to only examine a data set in one way and
draw conclusions. If possible, plot the data multiple ways, looking specifi-
cally for nonrandom sources in the data.
The best way to address outliers is to repeat the experiment. With
enough repeats, the outliers in the tail of the distribution have less effect
on the mean and standard deviation of the distribution. Technically,
although we reran the experiment, we wouldn’t want to throw out the data
FIGURE 7.8
Frequency distribution of Melisa’s second round of bending 30 paper clips to failure. The
data are fit with a normal curve defined by a mean number of bends for Melisa’s second
attempt of 5.9 and a standard deviation of 1.4.
from the first experiment and only use the second round, unless we knew
that something had gone wrong. In this example, we see that it’s simply
my lack of practice and inconsistent method that caused the variation.
Therefore, I may want to include both experimental trials worth of data.
This new combined frequency diagram and normal distribution curve are
shown in Figure 7.9. Notice that the outliers still have an effect on the
results even with double the number of data points. The data are fit with a
normal curve, which is defined by a mean of 6.2 and a standard deviation
of 1.96 for my combined attempts. The mean and standard deviation for
the combination of the trials are in between the results for the individual
first (s = 2.4) and second (s = 1.4) experimental trials.
There are times when we cannot repeat experiments. If something was
obviously wrong, for example, recording the wrong units or missing a
decimal place, these measurements should be discarded. We should never
include data that we know are wrong in our analysis. However, there are
times when outliers exist and there is no explainable or obvious reason
to exclude them. One statistically based method for “throwing away”
unreasonable data points is named Chauvenet’s criterion. Chauvenet’s cri-
terion provides an acceptable inclusion range about the mean for a data
set. Stated another way, Chauvenet’s criterion specifies the location on
the tail of the distribution beyond which we can reject those data points.
The criterion specifies that any point may be rejected if the probability of
FIGURE 7.9
Frequency distribution of Melisa’s first and second round of bending paper clips to failure.
obtaining the point is less than 1/(2n), where n is the number of data points.
“If the expected number of measurements at least as bad as the sus-
pect measurement is less than ½, then the suspect measurement should be
rejected” (Lin and Sherman 2007).
There are multiple methods for implementing Chauvenet’s criterion
(Coleman and Steele 1999, Kirkup 2002, Lin and Sherman 2007, Taylor
1982). One of the simplest is presented by Lin and Sherman (2007) and
doesn’t involve looking points up on normal distribution tables. Assume
we have a dataset and would like to identify any outliers. For all n data
points, calculate the mean and standard deviation. Use the following
expression to reject a suspicious data point, xi:
$$n \times \operatorname{erfc}\!\left(\frac{|x_i - \bar{x}|}{\sqrt{2}\,s}\right) < \frac{1}{2}. \quad (7.10)$$
The function erfc(x) is the complementary error function. This function
can be calculated easily using online calculators. After using Chauvenet’s
criterion, if a point is thrown out, we will have a completely new distribu-
tion with a different number of data points, n′, a new mean, x ′ , and a new
standard deviation, s′. We now have two distributions, one with an outlier
and one without the outlier. It’s important to make every effort to under-
stand what happened and why the data set contained outliers to begin
with. An example lab report from Alex Cress and Briana Fees demonstrat-
ing this technique is shown in Figure 7.10. They participated in an experi-
ment to measure the hardness of stainless steel discs using a Rockwell
Hardness Tester in Material Engineering 210 at San Jose State University.
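Before the word of caution that follows, here is a minimal sketch of Equation 7.10 using SciPy's complementary error function; the data set at the end is hypothetical, chosen so that one wild value gets flagged:

```python
# Chauvenet's criterion (Equation 7.10): flag points whose expected
# count, given the fitted normal, is less than one half.
import numpy as np
from scipy.special import erfc

def chauvenet_outliers(data):
    """Return the values rejectable under Chauvenet's criterion."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    mean, s = data.mean(), data.std(ddof=1)
    t = np.abs(data - mean) / s          # distance in standard deviations
    expected = n * erfc(t / np.sqrt(2))  # expected count at least this extreme
    return data[expected < 0.5]

print(chauvenet_outliers([10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 14.5]))  # [14.5]
```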
Now for a word of caution, this process is somewhat controversial
among some scientists and engineers who do not approve of its use.
Dr. John Taylor’s book, Introduction to Error Analysis, has a great discus-
sion of this use of Chauvenet’s criterion and the controversy surround-
ing its use (Taylor 1982). For practical purposes, I think we need some
test to account for data that are unreasonable, but as scientists and engineers,
you will need to develop your own stand on this. I would add that any-
time Chauvenet’s criterion or other similar techniques are used, it should
be mentioned so that everyone who reads or hears your findings, results,
and/or conclusions will be aware. Throwing away data points is not some-
thing to be done without incontrovertible explanations. As computational
output has become so easy in recent years, building in an error checker is a
1.0 Introduction
Random variation is inherent in any process or measurement. With
a large enough sample of repeated measurements, if the variation is
purely random in nature, the data can be fit with a normal distribu-
tion. The normal distribution is defined by two quantities, the mean
of the distribution and the standard deviation in the data. By con-
trolling all critical inputs and minimizing other process variation,
the random variation can be estimated with the mean and standard
deviation. In this experiment, the hardness of 30 discs machined
from 304L stainless steel rod was measured. The lab objective was to
quantify the random variation of the hardness measurement process.
2.0 Material
The Carpenter Technology Corporation Certificate of Conformance
for stainless steel rod showed a hardness of 84 HRB with the ele-
mental weight percentages shown in Table 1, with the balance being
iron.
A rod of 304L low carbon stainless steel was machined into discs.
FIGURE 7.10
Sample lab report from Alex Cress and Briana Fees demonstrating the use of Chauvenet’s
criterion. (Continued)
3.0 Measurement
Hardness measurements were performed in the HRB scale using an
uncalibrated Wilson 500 series Rockwell Hardness tester and a 1/16”
ball tip. The samples were split into 6 groups of 5 samples at random,
with each measurement performed one time on all 30 samples. The
measurements purposefully targeted areas away from any edge machin-
ing effects.
4.0 Method
An Input-Process-Output diagram (Figure 1) was created to help
confirm that the only input variable we were deliberately changing
was the stainless steel discs. Input items are labeled with C for con-
trolled, N for noise or uncontrolled and X for intentional variation.
[Figure 1: Input-Process-Output diagram]
The normal quantile plot can be used as a test for normality. Here,
if the data is normally distributed, the normal probability plot will
be represented by a linear, diagonal line. This allows for a visual
evaluation of how well a data set is normally distributed, and helps
identify possible outliers in the data set. The Tukey outlier box plot
can be used to further identify outliers in the data. We can quickly
visualize the 1st and 3rd quartiles bounding the central 50% of data
and as a larger probability range defined by ‘whiskers’, as well as how
the data falls within these ranges. When data falls outside of this
Figure 2: a) Distribution of measurement data and Tukey outlier box plot of hard-
ness measurement data, showing a mean hardness of 85.8 with a standard devia-
tion of 1.7, b) normal quantile plot of hardness data.
Both the histogram and the outlier box plot clearly show a data
point that is disconnected from the distribution, a suspected outlier.
Chauvenet’s criterion is a test that can be used for the determination
of whether data that is shown to be significantly distant from the
mean is ‘ridiculously improbable’. By applying Chauvenet’s criterion,
the outlier can be evaluated to determine ‘reasonableness’.
Assume that we have made N measurements of quantity x. In
our case, we have made 30 measurements of hardness. The mean is
x = 85.8 and standard deviation σ = 1.7. Both from visual observation
of the distribution and using the outlier box plot, the data point x =
79.8 is a suspicious measurement. In other words, 79.8 looks different
from the rest of the population and is far away from the mean. The
quantity t, the number of standard deviations by which our suspect data
point differs from the mean, is defined mathematically in Equation 1.
$$t = \frac{|\bar{x} - n|}{\sigma} \qquad \text{Equation 1}$$
In this case, the calculated value for t is 3.5, indicating that the sus-
pect data point is 3.5 standard deviations away from the mean. Next,
we want to calculate the probability po that a point lies outside
the main distribution by at least t standard deviations. po is given by
Equation 2, where the error function, erf, operates on t.

$$p_o = 1 - \operatorname{erf}\!\left(\frac{t}{\sqrt{2}}\right) \qquad \text{Equation 2}$$

$$P = N p_o \qquad \text{Equation 3}$$
Applying these equations to our data set for the outlier as n = 79.8,
we find a value for P = 0.0143, which is well below the traditional
cutoff of 0.5. Using Chauvenet’s criterion, we can say that the data
measurement of n = 79.8 doesn’t meet our ‘reasonableness’ criteria.
After removing this measurement from the data set, the data can be
replotted. By repeating the process once again of fitting the data to a
normal curve, the new distribution can be seen in Figure 3. As expected,
the mean was only slightly affected by the outlier but the standard devi-
ation was strongly affected. The mean hardness measurement increased
from 85.8 to 86.0, while the standard deviation dropped from 1.7 to 1.3.
Figure 3: a) Distribution and Tukey outlier box plot of modified hardness data,
showing a mean of 86.0 and a standard deviation of 1.3, b) normal quantile plot of
hardness data with the outlier removed.
Although the data set now looks better with a tighter distribution,
the outlier box plot has identified another potential outlier at 89.1.
While repeated use of Chauvenet’s criterion is discouraged by many
scientists, performing the same calculations with this outlier against
the original data set gives some interesting results. Using the value
of n = 89.1, we find a value for P of 1.57, which is > 0.5, thereby pro-
hibiting a reasonable exclusion.
6.0 Conclusions
The random variation of hardness measurements on 30 samples of a
machined 304L stainless steel rod was measured. The mean hardness
was determined to be 86.0 with a standard deviation of 1.3. The mean
hardness measurement is in reasonable agreement with the value pro-
vided by Carpenter (84.0). The standard deviation magnitude is indica-
tive of random variation or noise in the measurement inputs. Although
outliers were identified in the measurement data using the Tukey outlier
box plot, the application of Chauvenet’s criterion showed the farthest
data point from the mean was more than 2 standard deviations away.
Chauvenet’s criterion is not ‘proof’ that this data point was an outlier.
The data point was outside 2 standard deviations, and no reasons were
found for why this sample would be very different from the others from
this rod. We removed the point from the data and recalculated our
results. The hardness measurements for the samples ultimately fit a nor-
mal distribution, with no indication of any inputs which would lead to
systematic variation in the hardness value data. With the random varia-
tion characterized for the hardness measurements, future experiments
can now be performed on these samples allowing us to be more confi-
dent in any changes to hardness due to additional processing.
prudent and reasonable approach. For example, for his PhD thesis, David
R. Wagner used a combination of analytical techniques and built-in
error checking to ensure the best results possible (Wagner 2013). Also, we
should use Chauvenet’s criterion only on a normal distribution; it doesn’t
work well on multimodal distributions. With multiple modes in the distri-
bution, it may be the case that we have different data sources that need to
be better understood before analysis (Lin and Sherman 2007).
Throwing away data, especially outliers, could be a big mistake. The outliers
could be the most interesting part of a data set. These are often points worthy
of investigation in order to understand why they differ. It is the outliers that
might lead to significant discoveries. One such example was described in a
Scientific American article (Benedick 1992). As you may know, satellites mea-
sure the ozone level over Antarctica regularly. In the early 1980s, a significant
seasonal drop in ozone levels over Antarctica was detected. Scientists analyz-
ing the data subsequently spent two years rechecking their satellite data. The
scientists discovered that the satellites had in fact been measuring and record-
ing a drop in ozone levels over time. However, the program used to ana-
lyze the data was written to reject outliers and treat them as anomalies. If
the computer had been programmed to highlight the outliers, the scientists
could have investigated the outliers on the first occurrence. A lot of time and
resources were wasted by “throwing away” the outliers.
$$SK = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}}. \quad (7.11)$$
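Equation 7.11 is Pearson's median skewness; a minimal sketch:

```python
# Pearson's median skewness (Equation 7.11): the sign indicates which
# tail is longer (negative means the mean sits below the median).
import numpy as np

def pearson_skew(data):
    data = np.asarray(data, dtype=float)
    return 3 * (data.mean() - np.median(data)) / data.std(ddof=1)

print(pearson_skew([9, 11, 11, 5, 10, 7]))  # negative for these dice rolls
```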
FIGURE 7.11
Frequency histogram of measurements made over 50 days fit with a normal distribution
curve with a mean of 10.9 and a standard deviation of 1.06.
FIGURE 7.12
Normal distribution of a measurement made repeatedly over time. Notice that the dis-
tribution of the data is normal, but a scatter plot of the measurement over time shows a
systematic variation.
7.6 KEY TAKEAWAYS
We humans see patterns all around us. In many cases, it is good to pay
attention to patterns on a personal level, e.g., foods that make us feel sleepy
or energize us (Adams 2013). However, not all patterns are real; we
want to pay attention to them, but be cautious. When it comes to recog-
nizing patterns, it is essential that we not let clumps of data convince us
that anything other than a normal (as in Gaussian) pattern exists when it
doesn’t. As scientists and engineers, we must use our critical thinking skills
with all data. Randomness shows up in all measurements. Randomness is
expected. It’s normal and that’s normal. With our experimental results,
it is important that we characterize and quantify the random varia-
tions. Randomness can be characterized with experimental duplication.
The more replicates, the better. However, when there are at least 30 ran-
domly selected sample means from a population, the power of the central
limit theorem is on our side. Finally, normal distributions can be used to
describe a set of random data provided that there is no systematic vari-
ation in the measurement. It is important to explore the data set using
various analysis techniques in order to root out any nonrandom sources.
Happy experimenting! May the central limit theorem be with you.
P.S. We have reached another milestone in the book. Let’s take a moment
to reflect on the big picture. We have a problem to solve. We are attempt-
ing to accurately measure an experimental response; therefore, we want to
control as much of the variation as possible to minimize any uncertainty
in our experimental findings. We’ve learned three ways to deal with varia-
tion: (1) eliminate or minimize unintentional variation through checklists
or standard operating procedures, (2) thoroughly characterize our mea-
surement system to quantify systematic variation, and (3) quantify any
random variation. Now, in Chapters 8 and 9, we will discuss options to
exploit certain types of variation (intentional variation) in our experi-
ments in order to explore certain effects.
REFERENCES
Adams, S. 2013. How to Fail at Almost Everything and Still Win Big: Kind of the Story of
My Life. New York: Portfolio/Penguin.
Benedick, R. 1992. Essay: A Case of Déjà vu. Scientific American 266:160.
Brown, B. 2010. The Gifts of Imperfection: Your Guide to a Wholehearted Life. Center City,
MN: Hazelden Publishing.
Catmull, E. 2014. Creativity, Inc.: Understanding the Unseen Forces That Stand in the Way
of True Inspiration. New York: Random House.
Coleman, H. W. and W. G. Steele, Jr. 1999. Experimentation and Uncertainty Analysis for
Engineers. 2nd Ed. New York: John Wiley & Sons.
Deming, W. E. 1982. Out of the Crisis. Cambridge, MA: Massachusetts Institute of
Technology.
Dormehl, L. 2014. The Formula: How Algorithms Solve Our Problems…and Create More.
New York: Penguin Group.
Flaig, J. J. 2016. A Bell Shaped Distribution Does NOT Imply Only Common Cause
Variation. The Quality Technology Corner. www.d577289.u36.websitesource.net
/articles/BellCurveNotRandom.htm.
Hillerman, T. 2009. Coyote Waits. New York: Harper.
8
Experimenting 101
We now have all the pieces of the puzzle to create a reliable, repeatable
experiment. We know that we need to control our experimental setup and
procedures. We know that we must have a well-characterized measure-
ment system in which we have quantified repeatability and reproducibil-
ity. We know that random variation will play a role in our results and how
to quantify its contribution. Now, we can confidently begin to explore and
experiment by intentionally manipulating variables. The first seven chap-
ters of this book were just a setup to give us confidence as experimental
problem solvers.
Although there are a number of different mathematical techniques that
can be used, we’ll stick with one in this chapter. We’ll look at one-factor-
at-a-time experimentation and use regression analysis to build a model
of the experimental process space. This is a great starting point for any
experimentalist. We change one factor and record the effects. We did this
in high school and college labs most likely without realizing it. Any time
we’ve fit our data with a line and displayed the equation for that line, we’ve
built a model of the experimental process space. Most of the time, we
use some form of regression analysis to create that line. Although many
graphic software programs now make this very easy, there is sophisticated
mathematics behind the development. It is important that we understand
where it all comes from and why we get the results that we get.
8.1 TORTURING NATURE
Recent history (okay, the last few hundred years of history) is rich with
scientists eager to learn the “eternal laws that govern the universe”
(Dolnick 2011). This history reveals an evolution in the methods that sci-
entists have used to gain this knowledge. Our history begins with detailed
observations and evolves to measurement of those observations. Galileo
introduced us to indirect measurement techniques in the sixteenth and
seventeenth centuries. Sir Francis Bacon brought us empiricism and the
scientific method. Nature must be “put to the torture,” Bacon declared. By
the mid-seventeenth century, the dozen founding members of the Royal
Society were calling for experimentation, creating artificial situations and
recording observations. Experiments were something new. To the seven-
teenth century world, this was a radical call. The universities at the time
saw it as their responsibility “not to discover the new but to transmit a her-
itage,” according to historian Daniel Boorstin. Students who didn’t abide
by this philosophy could be fined five shillings (Boorstin 1983). Curiosity,
according to Augustine, was the equivalent of lust. The men of the Royal
Society wanted to probe, poke, and test, not passively observe the world
from behind a curtain (Dolnick 2011).
Our curiosities lead us toward a deeper understanding of the world
inside us, around us, and beyond us. Although it may seem that the Royal
Society and Isaac Newton existed long ago in an entirely different era,
we are still, hundreds of years later, trying to create, optimize, and teach
repeatable, reproducible experimental practices.
y = mx + b , (8.1)
V = IR (8.2)
TABLE 8.1
Commonly Used Names for Experimental Variables

Independent, x (inputs): explanatory variable, control variable,
control parameter, key process variable.
Dependent, y (outputs): dependent variable, response variable,
output variable, response parameter.
The resistance in a circuit (or any portion of the circuit) can be deter-
mined by measuring the voltage at various currents and then determining
the slope of the line created for the different values of current and volt-
age. In this case, the current would be our independent variable or control
variable and the voltage would be the dependent variable or response vari-
able. Another example is the relationship between stress and strain for an
elastic material, where the slope gives us Young’s modulus.
σ = Eε (8.3)
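As a sketch of the voltage-versus-current example above, a least-squares line fit recovers the resistance as the slope; the (I, V) readings here are hypothetical:

```python
# Estimate resistance from measured (current, voltage) pairs: V = R*I + b.
import numpy as np

current = np.array([0.01, 0.02, 0.03, 0.04, 0.05])  # amps, independent x
voltage = np.array([1.02, 1.99, 3.05, 3.98, 5.01])  # volts, dependent y

slope, intercept = np.polyfit(current, voltage, 1)  # degree-1 (linear) fit
print(f"R is approximately {slope:.1f} ohms")       # ~100 ohms
```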
These linear relationships seem simple, yet we recall from our lab
classes that the measured data points, when graphed, do not necessarily
form a straight line but contain varying amounts of scatter. In order to
establish the relationship between the experimental variables, we drew a
line through our (x, y) data points. Our measured data (x, y) points have
embedded in them the response to the factor that was changed, random
variation, systematic variation, as well as influences from uncontrolled
factors. We now come back to something that sounds familiar.
Let’s revisit the Input–Process–Output diagram discussed earlier. This is
a wonderful tool to use with our experiment to help ensure that the “whole
environment” of the experiment is controlled. In reality, a process has four
categories of variables, all of which should be examined thoughtfully prior
to any experimentation. The variable categories are given in Table 8.2 and
illustrated in Figure 8.1. Therefore, if our experiment considers only the
control variables and the response variables, we will not have created a
reproducible experiment.
The experimental outputs are our response variables. The response
variables are those variables that are measured to evaluate the process
TABLE 8.2
Categories of Variables Contributing to Experimental Results

Response (Y), output: those variables that are measured to evaluate the
process (dependent variable).
Controlled (C), input: what we will hold constant during any
experimentation (independent variable).
Uncontrolled (N), input: anything that cannot be held constant during
the experiment (independent variable).
Key process variables (X), input: what we will vary during the
experiment (independent variable).

Source: Wortman, B., Richardson, W., Gee, G., Williams, M., Pearson, T.,
Bensley, F., Patel, J., DeSimone, J., and Carlson, D., The Certified Six Sigma
Black Belt Primer, The Quality Council of Indiana, West Terre Haute, IN, 2007.
performance. The controlled variables are all the inputs that we hold con-
stant during the experiment. Recall that we can create a standard operat-
ing procedure as an insurance policy for consistency of these variables.
The variables that cannot be controlled during the experiment all fall into
the uncontrolled input bucket (noise). We take safeguards to ensure the
FIGURE 8.1
Illustration of the relationship between different types of variables from an expanded
all-encompassing view. (From Wortman, B., Richardson, W., Gee, G., Williams, M.,
Pearson, T., Bensley, F., Patel, J., DeSimone, J., Carlson, D., The Certified Six Sigma Black
Belt Primer, The Quality Council of Indiana, West Terre Haute, IN, 2007.)
FIGURE 8.2
Experimental values (a) and graph (b) of the thermal diffusivity of alumina exposed to
different temperatures. The line is a smoothed trend line showing the shape of the data
points.
FIGURE 8.3
Experimental values and graph of the thermal diffusivity of alumina exposed to tempera-
tures between 1000°C and 1500°C. The line is a smoothed trend line showing the shape
of the data points.
$$\text{Thermal Diffusivity}\left(\frac{\text{cm}^2}{\text{sec}}\right) = 0.020 - (5.2034 \times 10^{-6}) \times \text{Temperature (°C)} \quad (8.4)$$
Once we have our model, our equation for a line that best fits our data,
y = mx + b, we can use it as our best guess at predicting y from an x value.
FIGURE 8.4
Measured values of the thermal diffusivity of alumina exposed to temperatures between
1000°C and 1500°C with a linear regression fit to the data.
Can we really use our model to predict y values within the range of x values
that we didn’t test? It goes without saying (but I’ll say it anyway) that this
line can be expected to predict only values within the range of values for x
that we investigated (see Figure 8.5). The predicted values from interpola-
tion (within the x values we tested) will provide values for the dependent
FIGURE 8.5
Overlay of measured and predicted values for the diffusivity as a function of tempera-
tures. The predicted values use the model from the regression analysis.
variable, y, that are as good as our experimental data. Can we use it for
extrapolation? The answer is a strong, resounding maybe. Proceed down
the extrapolation path with trepidation and caution. Always keep in
mind that, when using a model built for a certain experimental range of
independent variables, our model is good ONLY for that experimental
range of independent variables. We want to use our models to predict
what will happen in other areas of our experimental space. If the model
works outside of the original ranges of independent variables, how won-
derful! Figure 8.6 shows what happens in our example. The further away
from the temperature range of our original model, the worse the predic-
tion of the model. We see at 500°C that the model prediction is somewhat
close for the thermal diffusivity. However, the closer we get to 0°C, the
more the model diverges from the actual measurements. We might be
able to avoid further experimenting in that range. However, if our model
doesn’t work, we can use this model and our understanding of the prior
experimental model to create a new experimental range. In other words,
the extrapolated prediction may be used as the starting point for a follow-
up experiment.
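A minimal sketch of this interpolation-versus-extrapolation point; the diffusivity values are hypothetical stand-ins generated from the fitted model of Equation 8.4 rather than the measured data:

```python
# Fit a line over 1000-1500 C, then compare an interpolated prediction
# with an extrapolated one far outside the tested range.
import numpy as np

temps = np.array([1000, 1100, 1200, 1300, 1400, 1500])  # deg C, tested range
diffusivity = 0.020 - 5.2034e-6 * temps                 # cm^2/sec, per Eq. 8.4

m, b = np.polyfit(temps, diffusivity, 1)

print(m * 1250 + b)  # interpolation: trustworthy inside 1000-1500 C
print(m * 25 + b)    # extrapolation: near 0 C the real (nonlinear)
                     # behavior diverges badly from this line
```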
FIGURE 8.6
Overlay of measured and predicted values for the diffusivity as a function of tempera-
tures including temperatures below 1000°C. The predicted values use the model from the
regression analysis to extrapolate outside the region of the model.
Now, back to our experimental model. Let’s look at our model. Notice
whether there is a positive or negative sign in front of the key process vari-
able. The sign tells us whether we have a direct relationship or an inverse
relationship. The inverse relationship or negative slope tells us that as the
control parameter increases in our model, the response variable decreases.
The magnitude of the slope is also of interest in our model. For example,
let’s say our experimental results show an increase in corrosion rates by a
factor of 5.3 for metals exposed to dog urine. Is 5.3 a large or small value?
One question we may need to answer is how this compares to cor-
rosion rates for this same metal not exposed to dog urine under other-
wise similar conditions (the population as a whole). The reason the size
of the value 5.3 is important has to do with its significance as opposed
to the absolute numerical value. The significance is really at the heart of
our reason for investigating in the first place. We need to determine if
this result is representative. Assuming that we’ve followed all the rules of
experimentation and have a large enough data set (at least 30 samples), we
can use the central limit theorem, the normal distribution and standard
error to determine significance (Wheelan 2013). (Refer to a statistics book
for other options when working with a smaller data set.)
There is another important part of our model that we need to discuss.
Notice the “Summary of Fit” table included in Figure 8.4. Along with our
model (the “best fit” equation to our data), a value called R2 was calcu-
lated. The value of R2 is used to estimate how “good” our “best fit” is to the
data. R2 provides a measure of the variation explained by the regression
equation—the proportion of the variance in y attributable to the variance
in x. (At this point, we should be wondering about R. R is used to repre-
sent something called the Pearson product moment correlation coefficient.
R is a dimensionless number that ranges from −1.0 to 1.0, inclusively, and
reflects the extent of a linear relationship between two data sets. For more
information on this topic, I’d recommend having coffee with a statisti-
cian or consulting a statistics textbook.) R2 can vary between 0 and 1.0,
inclusively. When R2 = 0, the model that we’ve built performs no better
than the mean at predicting the relationship between our experimental
variables. When R2 = 1.0, the model predicts this relationship between the
two variables exactly. Typically, we will find that the R2 value is somewhere
in between these two extreme values.
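SciPy's linregress is one convenient way to get both the fitted line and R²; it returns Pearson's R, which we square. The y values here are hypothetical numbers in the spirit of Figure 8.4:

```python
# Linear regression with R^2 via Pearson's R.
from scipy.stats import linregress

x = [1000, 1100, 1200, 1300, 1400, 1500]
y = [0.0149, 0.0141, 0.0139, 0.0132, 0.0128, 0.0122]  # hypothetical data

result = linregress(x, y)
print(result.slope, result.intercept)
print(result.rvalue ** 2)  # R^2: proportion of variance in y explained by x
```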
Remember the discussion in Chapter 1 on the caution about conflating
correlation and causation. Refer back to Figure 8.4. In this case, we see the
R2 = 0.977. It is probably reasonable to say that the temperature change is
FIGURE 8.7
There is an R 2 correlation of 0.89 between US arcade revenue and math doctorates
awarded between 2000 and 2009. (From Vigen, T., Spurious Correlations: Correlation
Does Not Equal Causation, Hachette Books, New York, 2015.)
in part a source of causation for the diffusivity change in Al2O3 due to the
strong correlation we see between the two variables and our knowledge
of subject matter. Knowledge of subject matter tells us (recall from fresh-
man physics) that the power radiated by a body actually scales as T⁴.
However, we saw that temperature corrections over small ranges can be
treated linearly. A high R2 value simply means correlation; it doesn’t imply
causation. Without knowledge of subject matter, we are wandering around
in the dark. Take the example in Figure 8.7, which shows the correlation of
math doctorates awarded and arcade revenue, though we probably cannot
build a case for causation.
We’ve just walked through a discussion about linear regression analysis
using ordinary least squares and built a simple but beautiful linear model
for our experimental data. However, there are many times when the rela-
tionship between the two experimental variables isn’t going to be linear
or best described by a line. This doesn’t mean that linear regression isn’t
applicable. We’ve covered a simplified case of the general form of regres-
sion. Once we are comfortable and confident with the simplified case,
there are references available for more sophisticated model development
later in the book.
8.5 KEY TAKEAWAYS
The most common experimental strategy in use in the physical sciences
today remains one-factor-at-a-time experimentation. It is a good way to
get our feet wet, so to speak, in experimental problem solving. If there are
adequate resources available for experimentation, this tool is by far
the most intuitive to gain a basic understanding of experimentation and
model building from our data.
The one-factor-at-a-time technique is a perfectly good experimental
strategy, but it is limited. A one-factor-at-a-time experiment may be what
is needed when we know that the variable interactions are not complex
or if we have a large resource pool so the number of experiments that we
can perform is not limited. Limited resource situations with known or
suspected complex interactions are the occasions when designed experi-
ments of higher order are needed in our experimental toolbox. For these
situations, we’ll need Experimenting 201 in Chapter 9.
P.S. We’ll move to more complex experimentation in Chapter 9. When
complex relationships exist between variables, we need to use more sophis-
ticated techniques than one-factor-at-a-time experimentation. I learned
about complex relationships very early on the farm. My father was interested
in the tomato yield each year. Every day, I watched and watered the plants at
roughly the same time of day and recorded whether we had tomatoes or not.
I dutifully counted the tomatoes we harvested from each plant and the har-
vest date in the family garden for two consecutive summers. At the end of
each summer, I proudly showed my father this fancy prediction capabil-
ity that I had learned in school and proclaimed that I could tell him almost
exactly when the tomatoes would start producing and the yield rate over
time. This little experiment taught me a valuable lesson in experimentation.
Reflect back on your own experimental experiences, what lessons have you
learned with one-factor-at-a-time experimentation?
REFERENCES
Boorstin, D. 1983. The Discoverers. New York: Random House.
Dolnick, E. 2011. The Clockwork Universe: Isaac Newton, the Royal Society & the Birth of the
Modern World. New York: HarperCollins.
Munro, R. G. 1997. Evaluated Material Properties for a Sintered alpha-Al₂O3. Journal of the
American Ceramic Society 80:1919–1928.
NIST. 2016. National Institute of Standards and Technology Ceramic Data Portal contains
experimental data. http://srdata.nist.gov/CeramicDataPortal/Pds/Scdaos.
Vigen, T. 2015. Spurious Correlations: Correlation Does Not Equal Causation. New York:
Hachette Books.
Wheelan, C. 2013. Naked Statistics: Stripping the Dread from the Data. New York: W. W.
Norton.
Wortman, B., W. Richardson, G. Gee, M. Williams, T. Pearson, F. Bensley, J. Patel,
J. DeSimone, and D. Carlson. 2007. The Certified Six Sigma Black Belt Primer. West
Terre Haute, IN: The Quality Council of Indiana.
9
Experimenting 201
John Sall
9.1 COMPLEX PROBLEMS
In today’s world, we want to measure everything, from the number of
steps we walk to protein levels in our blood. Luke Dormehl, in his book
The Formula: How Algorithms Solve All Our Problems … and Create More,
relays the story of one such person who regularly, quantifiably moni-
tored his health only to observe a certain protein, indicative of infection,
increasing. Bringing his data to his personal physician, he was rebuked
for coming in with data and not a health problem. Weeks later, the man
was having surgery to remove his appendix (Dormehl 2014). We’ve come
a long way from leeches sucking out the “bad” blood, but maybe not as far
as we like to think.
It wasn’t until the 1950s that medicine began its transition from art
to science. In the physical sciences, we have Galileo to thank for ensur-
ing that investigations in physics “will never be the same” (Sobel 2000).
Galileo stopped looking for why natural phenomena happened and
began observing and measuring (repeatedly) what was actually happen-
ing in nature. In medicine, we have prisoner of war Dr. Archie Cochrane
to thank for introducing a scientific approach. The first evidence we
have of statistical experimentation in medicine was from the work of
Dr. Cochrane during his World War II imprisonment. He performed ran-
domized control trials on fellow prisoners (Sur and Dahm 2011). The expan-
sion and benefits of randomized control trials were further developed by
Drs. Thomas Chalmers, Ian Chalmers, and Murray Enkin in the decades
of the 1950s to 1960s. The physicians showed that even medicine is sus-
ceptible to both evidence selection and bias. For most of us today, we can’t
imagine what a radical shift this actually was, and it didn’t happen over-
night. The term evidence-based medicine was actually coined in 1991. Seriously,
you read that correctly. In 1991, Dr. Gordon Guyatt introduced a new
method for bedside teaching of residents called “Scientific Medicine” later
changed to “Evidence-Based Medicine” in an editorial he authored for the
ACP Journal Club (Guyatt 1991). Quoting Dr. Deborah Kilpatrick, chief
executive officer of Evidation Health, “Controlled clinical trials and for-
malized, evidence-based recommendations as to how medicine should be
practiced is a fairly recent phenomenon” (GE 2016). We are actually liv-
ing in the midst of this revolution—or maybe it would be better to call it
a paradigm shift—in the way scientific data will be used in the practice
of health care. The algorithms, formulas, and/or models that come out
of the systematized, scientific approach to the analysis of the volumes of
data collected about us are already impacting our lives. This is evident in
Google’s, Amazon’s, and Facebook’s use of what we click on and even how
long we hover over a particular screen, or from Apple’s iPhone or our Fitbit tracking where we go and how many steps it takes to get there.
FIGURE 9.1
Two ways to visualize the relationship between variables: (a) three graphs displaying the data (mass loss in grams versus % salinity for 10-, 15-, and 20-day runs) versus (b) all data on a single graph.
FIGURE 9.2
A third way to visualize the process space that captures the relationship of different variables (time in days versus % salinity) and easily extrapolates up to four dimensions.
9.3 SELECTING A DESIGN
Once we have an idea of our process space, before actually beginning
experimentation, we will need to select a design for our experiment. When
we are designing our experiment, either in a software program or by hand,
we want to take a chance but not be impractical in choosing the process
space. We want to assign levels to each independent variable in light of our
knowledge of the process, equipment, resources, etc. It is critical that we
not throw out our experiences in exchange for a computer program. “The
computer told me to do it” is not a valid excuse. It is beneficial, even for
experienced experimenters, to have another person critically review the
outline of the designed experimentation.
An experiment is typically designed with a goal in mind; keep that goal in mind during the design selection. There are several primary rea-
sons we want to run designed experiments with multiple variables. Two
of the most common experimental objectives for scientists and engineers
are comparing and screening. Table 9.1 lists several common objectives
for designed experiments. Notice in Table 9.1, even one-factor-at-a-time
experimentation can be improved through design. When the goal is com-
paring, we have several factors under investigation and our primary goal
is to determine which factor or factors are “significant” or the “most sig-
nificant.” In this case, we need a comparative design solution. The goal of
screening experiments is to identify the important experimental variables. Screening designs allow us to evaluate a large number of experimen-
tal variables with very few experimental runs. Therefore, typical screening
experiments involve two-level designs with varying degrees of fraction-
alization. A full factorial screening design will have us run all combina-
tions of the process (input) variables (X). A fractional factorial screening
design experiment is a fraction of a full factorial experiment. A fractional
factorial screening design allows us to quantify the changes occurring in
TABLE 9.1
Common Design Objectives and Guidelines for Selecting a Design

No. of Factors   Comparing                                 Screening
1                Randomized one-factor-at-a-time design
2 or more        Randomized block design                   Full or fractional factorial
the response (output) variable (Y) of a process while changing more than
one process (input) variable (X). With a fractional factorial, there is no
need to run every combination of experimental conditions. The fractional
factorial screening design uses confounding to consume fewer resources.
Confounding means that the value of a main effect estimate comes from
both the main effect itself and a contamination of higher order interaction
terms.
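To make confounding concrete, here is a minimal sketch in Python (the factor names A, B, and C are mine, purely illustrative) that builds a half fraction of a 2³ design from the defining relation I = ABC and verifies the alias:

from itertools import product

# Full 2^3 factorial in coded units: every combination of -1 and +1
# for three factors A, B, and C.
full = [dict(zip("ABC", levels)) for levels in product((-1, 1), repeat=3)]

# Half fraction defined by I = ABC: keep only runs where A*B*C = +1.
half = [run for run in full if run["A"] * run["B"] * run["C"] == 1]

# In every retained run, C equals A*B, so the main effect of C is
# confounded (aliased) with the A*B interaction.
for run in half:
    assert run["C"] == run["A"] * run["B"]
    print(run)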
Of course, with a fractional factorial screening design, there are trad-
eoffs. The advantage is that we run fewer experiments in less time and
with fewer resource requirements than with a full factorial screening
experiment. The results from a fractional factorial screening design will
be average responses associated with multiple factors. We must be very
careful when interpreting the results of these fractionalized screening
experiments. Just because a factor is not highly significant in a fraction-
alized design does not necessarily mean that it is not significant. The
experimental design itself is critical in ensuring that significant effects
aren’t missed. However, the fractional factorial screening design has less
power to quantify the interactions between the process (input) variables
because of the confounded effects. Many today think fractional factorials
are outdated. Although the terminology is still used, it is possible to work
through custom design and definitive screening designs to create efficient
designs without having to work through the traditional steps of fractional
factorials. I’ve included fractional factorials for comparison only.
Besides comparing and screening, there are other experimental objec-
tives that involve more advanced designs and concepts. Designed experi-
ments intended for mapping (again the cartography comparison) allow
an experimenter to discover the shape of the response surface (topogra-
phy) under investigation. These experiments are aptly named response
surface designs. The response surface design will fully explore a process
window. Typically, this type of design is used to improve or optimize a
process space or troubleshoot in a well-understood process space. This
type of experimentation is best performed with a well-understood Input–
Process–Output diagram and is rarely, if ever, the initial experiment per-
formed. An experiment intended to fully map out a process space would be
performed as a follow-up experiment to other experimentation. Response
surface designs are most effective when there are fewer than five process
(input) variables (X). These designs are resource intensive, requiring at
least three levels of every process (input) variable (X). However, quadratic
models are generated for each of the response (output) variables (Y).
Additional runs can be added to check for curvature and to correct any
experimental mishaps. There are a few decisions that we will want to
make prior to the design selection that should help make our design
selection easier. Another consideration in choosing the design is the
difficulty with which an experimental run can be changed. For example, the biologist and mathematician Sir R. A. Fisher developed split-plot designs for use in agricultural experiments (Fisher 1925). A split-plot design naturally blocks the experiment such that the blocks are themselves experiments. Dr. Bradley Jones and Professor Christopher Nachtsheim have published a paper detailing the motivation for split-plot designs (Jones and Nachtsheim 2009). Other examples can be found in Statistics for Experimenters, 2nd Edition (Box et al. 2005). The field of designed experimentation is continuing to develop. Most industrial and engineering experiments are split-plot designs, which makes the split-plot a valuable technique for scientists and engineers.
The very first things we will want to identify are the response (output)
and process (input) variables that are important to us. The response vari-
ables (Y’s) are the variables that show the observed/measured results of
our experiment. The process variables (X’s) are the independent variables
that have some type of effect on the response (output) variables (Y’s).
The levels or conditions that we use for input factors will determine the
response(s) that we eventually will measure. Both the response (output)
variables and the process (input) variables can be qualitative or quantita-
tive. Quantitative measurements (numeric and continuous) tend to be pre-
ferred by most engineers and scientists; however, there are times when the
only response possible is qualitative. Quantitative measurements are those
that give us a numeric value, a quantity. Qualitative inputs or responses
(characters and nominal) have different properties or attributes. For exam-
ple, let’s say we are interested in whether we have created a leak-free join.
Leak detectors allow us to determine whether a leak is present or not. Our
responses might be a qualitative yes (a leak is measured) or no (a leak is
not measured). However, rather than leak-free join, we may be interested
in creating and holding a vacuum. If our leak detector is sensitive enough,
we might be able to measure the actual leak rate and have a quantitative
(continuous numeric value) result to use in our analysis. There may be
times when we are working with discrete numeric input parameters rather
than continuous parameters. Discrete numeric values are more desirable
than qualitative data but less desirable than continuous data.
Once we have selected the response and input variable(s) or factors that we
are interested in, the next step is to decide on our process space. The process
space is determined by the extremes or highest and lowest values that we are
interested in exploring. For example, let’s say we are interested in the effect
of bath temperature and time on the nickel electroplating thickness onto
aluminum. In a screening experiment, the process space will be determined
by the highest and lowest values chosen for temperature and time. If we
decided to vary the time the parts are in the bath from 10 to 20 seconds and
vary the temperature from 10°C to 40°C, these settings will determine the
outer range of the experimental process window. With the range defined,
we will want to select the number of levels within that range we want to run
for the experiment. The most common are two or three levels or values. For
example, the low value for time would be 10 seconds, the high value would
be 20 seconds. For the third value, we might choose 15 seconds. The number
of values is, in part, determined by the resources (parts, time, etc.) that we
have available to experiment upon. The more values we choose to experi-
ment upon, the more confidence we can have in the process space.
With a screening design, the number of experiments we would like to
run will be determined by the resources we have available to dedicate to
the experiment. The simplest case is known as a full factorial. A full facto-
rial screening experiment will use all combinations of all the input values
that we selected. Depending on our resources, a full factorial might be
very expensive. An alternative screening design is a fractional factorial,
which can significantly reduce the number of experiments performed.
A full factorial designed experiment consists of testing all possible com-
binations of process (input) levels. The total number of different com-
binations for k factors at two testing levels is n = 2ᵏ. For example, in our
experiment with two factors and two testing values each, there will be
a total of n = 2² = 4 combinations. This allows us to create a matrix of
experimental runs.
The advantage of testing the full factorial is that we obtain informa-
tion on all main effects plus all interaction effects. The main effect is an
estimate of the effect of a factor independent of any other factors. Let’s
take the previous electroplating example with two input factors and two
values. The main effects would be time and temperature. An interaction
effect occurs when the effect of one input factor on the output depends on
the level of another input factor. The interaction effect would be the effect
of the interaction between time and temperature, written as time * tem-
perature. These effects are key to the type of model that we build from our
experiments. With n = 2² = 4 experimental runs in our plating example,
the model would include time, temperature, and time * temperature.
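If we wanted to tabulate those four runs and the interaction column ourselves, a few lines of Python would do it (the layout below is my own illustration, not a JMP output):

from itertools import product

# All n = 2^2 = 4 combinations of the coded plating factors.
print("time  temp  time*temp")
for time, temp in product((-1, 1), repeat=2):
    # The interaction column is the product of the two main-effect
    # columns, giving the time, temperature, and time * temperature
    # terms of the model.
    print(f"{time:4d}  {temp:4d}  {time * temp:9d}")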
FIGURE 9.3
Illustration of (a) the process space under investigation (cleaning solution % versus ultrasonic bath time of 5–15 minutes), (b) the same process space with a center point added, and (c) the process space with a third variable, rinse time (min), added.
We are also interested in the level of confidence we can put in our mea-
surements. One way to determine repeatability is to replicate one or more
of the experiments by performing the same experiment multiple times.
Repeated trials, or replicates, are conducted to estimate the pure trial-to-
trial experimental error or random error independent of any lack of fit error.
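As a minimal sketch of the idea, with replicate values I invented for illustration, the scatter among repeated runs at one fixed condition estimates the pure error:

import statistics

# Hypothetical results from running the same condition four times.
replicates = [12.1, 11.7, 12.4, 11.9]

# The replicate standard deviation estimates pure trial-to-trial
# (random) error, independent of any lack-of-fit error.
print(f"pure error estimate: {statistics.stdev(replicates):.2f}")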
9.5 ANALYSIS
The final and most important step is to analyze and interpret the results.
We want to confirm that the data are consistent with the experimental
assumptions and that the results are consistent with what we know about
the subject. For example, if the heat I apply keeps the water below a certain temperature, I don’t expect the water to boil. The
findings may lead to further runs or additional designed experiments.
It is important that we take the time and learn all that we can from the
results and have others ask us questions about the process as well as the
results. Software packages such as JMP that allow us to design the experi-
ments will help with all the calculations and graphs needed for the analy-
sis. However, generating graphs using advanced statistical software is only
half the battle; the analysis and conclusions are still up to us. We will still
need to translate the graphs and models into the physical realm to answer
a question or solve the problem we were interested in.
It is important that we draw conclusions from our analysis of the experi-
ment. We must ask ourselves at each step: “Do these results make sense
with what I know about the subject?” A “surprise” result doesn’t mean
that something has gone wrong, but it is important to verify our findings.
Verification can be achieved by replicating runs or whole experiments.
Depending on our experimental objective, we may want to proceed with
further experiments.
The prior chapters all build on one another. All the information
covered in the preceding chapters is important and applies here. A
properly executed experiment will ensure that the right kind of data is
collected and that there are enough data to meet the objectives of the
experiment.
Also, we may want to avoid using responses that combine two or more
process measurements. For example, a critical response in thin film etch
processes is uniformity. Uniformity can be calculated in a number of ways,
but all involve several calculations. Typically, we need to measure the pre-
etch film thickness because the incoming film will have some topography
and varying thickness. Once the etch is complete, we measure the post-
etch film thickness. The difference in the film thickness will give us the
etch depth or film removed, which is calculation number one. Because we
are interested in a uniform etch, we need to make these measurements at
multiple locations. The crudest estimate would be two locations—at the
center and edge. Typically, the measurements are performed using a pat-
tern that will cover multiple locations on the surfaces. The calculation of
uniformity might consist of anywhere from two points to tens, hundreds,
or thousands of measurement points. The most complex calculation for
uniformity uses the coefficient of variation (a measure of relative variabil-
ity). The coefficient of variation (CV) is determined by taking the ratio of
the standard deviation and the mean. This calculation provides a num-
ber that represents the relative variability of the etch depth on the silicon
wafer. Embedded in this simple estimate for the uniformity is first a difference calculation (pre-etch thickness minus post-etch thickness), then a mean and a standard deviation calculation. The calculated value for the coefficient of variation is thus several steps removed from what we actually measured.
This may be unavoidable, but the closer we stay to the actual measure-
ments, the more accurate our experimental results.
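A short sketch of the uniformity calculation just described, with invented thickness values, might look like this:

import statistics

# Hypothetical pre- and post-etch film thicknesses (nm) measured at
# the same five locations on a wafer.
pre = [502.0, 498.5, 501.2, 499.8, 500.6]
post = [401.5, 402.0, 399.8, 404.1, 400.9]

# Calculation 1: etch depth (film removed) at each location.
depth = [a - b for a, b in zip(pre, post)]

# Calculation 2: coefficient of variation = standard deviation / mean.
cv = statistics.stdev(depth) / statistics.mean(depth)
print(f"etch depth uniformity (CV): {cv:.3f}")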
We have run a designed experiment in a controlled manner. Now, we
want to measure the effect of multiple key process variables (X’s) on a par-
ticular response variable (Y). In our case, we are looking at two variables.
Because we are doing a screening experiment, by definition, our goal is
to create a model for the response variable that allows us to identify how
important each of our key process variables is to the response variables of
interest.
If we want to design an experiment to test the effect of one of our key
process variables on our response variables, what would we do? We’d
design our experiment so that we tested our key process variables at mul-
tiple levels and then measure the response. For example, if we ask the
question: Does the temperature of salt water make a difference in the cor-
rosion rate of steel? We might decide to look at samples of water from the
Pacific Ocean or we might decide to make our own solution of salt water
so that we could more accurately control the properties of the water and
the % salinity. We might do a bit of research and see that ocean, slough,
and bay temperatures vary. Once we ran our experiment, we would be
able to plot the data on a line graph with temperature as our x variable
and mass change as our y variable. From this, we could create an equa-
tion that would model the data for the temperature range in our experi-
ment. The same holds for multivariable experiments; the math just becomes more cumbersome the more dimensions we add. With a 2² or 2³ full or fractional factorial, the analysis could easily be done by hand for the experiments. As we gain experience, our experiments grow complex enough that performing the analysis by hand isn’t practical and really adds no value to the solution. However, it is important for us to
understand the analysis so that we understand the limitations.
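For the one-variable corrosion example above, the line-fitting step might look like this minimal sketch (the temperatures and mass changes are invented for illustration):

# Hypothetical salt-water corrosion data: temperature (deg C) versus
# mass change of steel samples (g).
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
mass_change = [0.12, 0.18, 0.25, 0.33, 0.41]

mean_x = sum(temps) / len(temps)
mean_y = sum(mass_change) / len(mass_change)

# Ordinary least-squares fit of the model y = b0 + b1 * x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, mass_change))
sxx = sum((x - mean_x) ** 2 for x in temps)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
print(f"mass change ~ {b0:.3f} + {b1:.4f} * temperature")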
9.6 CODED VALUES
I want to walk through an example of solving a simple designed experi-
ment by hand, but first, let me introduce a common method to simplify
the math by introducing coded or scaled values. Scaled values are simpli-
fied values used in the analysis to simplify building the model and create
standardized, scaled units for all key process variables (inputs). Coding
will allow us to work with 1 and −1 rather than the actual values of 5 and
15, for example. Even though the equations look ominous, this process is
simple. To find the coded values:

coded value = (actual value − midpoint) / half-range

where

midpoint = (high value + low value) / 2 and half-range = (high value − low value) / 2

For example, consider the key process variable time from our titanium rod cleaning example earlier, in which the high value was 15 minutes and the low value was 5 minutes. If we substitute these values into the previous equation, we get:

coded value (15 minutes) = (15 − 10) / 5 = 1 and coded value (5 minutes) = (5 − 10) / 5 = −1
We can repeat this coding procedure for the cleaning solution and the
rinse time. A full factorial designed screening experiment for these three
input factors will scale the high and low values for the clean solution, 10%
and 4%, to 1 and −1 and the rinse time from 5 and 1 minute to 1 and −1,
respectively.
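The coding arithmetic is easy to capture in a small helper function; this sketch uses the high and low values quoted above:

def to_coded(actual, low, high):
    # Scale an actual setting into the standardized -1 to 1 range.
    midpoint = (high + low) / 2
    half_range = (high - low) / 2
    return (actual - midpoint) / half_range

print(to_coded(15, 5, 15))  # ultrasonic bath time high  ->  1.0
print(to_coded(5, 5, 15))   # ultrasonic bath time low   -> -1.0
print(to_coded(10, 4, 10))  # cleaning solution high     ->  1.0
print(to_coded(1, 1, 5))    # rinse time low             -> -1.0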
Let’s return to that titanium rod cleaning example, in which the parts are cleaned in an ultrasonic bath chamber. The three key process variables (inputs) that we want to vary in
the experiment are ultrasonic bath time, cleaning solution, and rinse time.
As the engineers responsible for the experiment, we wish to identify the
key process variables affecting the removal of the trace amounts of sodium
ions from the parts. We decide to run a full factorial experiment because it
is suspected that there may be important interactions between the process
input variables that may impact the quantity of sodium ions on the parts.
We want to determine the effect of all three factors and their interactions;
therefore, a 23 full factorial must be run. We establish high and low values
based on our existing knowledge of the process and equipment. The values
and input factors are shown in Figure 9.4.
A 23 screening full factorial will contain eight different experimental
runs, which JMP will generate. Our experiments can be seen in Figure 9.5.
Normally, we want to perform the experimental runs in a randomized
order. JMP will allow randomization of the runs; however, for illustrative
purposes, the runs are sorted in a pattern from left to right.
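Outside of JMP, randomizing the run order ourselves takes only a line or two; a sketch:

import random
from itertools import product

# The eight coded runs of the 2^3 full factorial.
runs = list(product((-1, 1), repeat=3))

# Randomize execution order so that drift, warm-up, and other
# time-ordered noise cannot masquerade as a factor effect.
random.shuffle(runs)
for order, run in enumerate(runs, start=1):
    print(order, run)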
For simplicity, we will not deal with any center points or replicates in
this example. Also, to make the analysis by hand simpler, I’d like to use
the scaled or coded values. In this case, our experimental runs for analysis
would look like Figure 9.6. I am doing this only to illustrate what’s behind
the calculations. There is no need for us to deal with coded or scaled values if we are using JMP or another statistical software package.

FIGURE 9.4
High and low settings for process input factors from JMP screen shot.

FIGURE 9.5
List of eight experimental runs generated from JMP software.

Name                          Role          Values
Ultrasonic bath time (min)    Continuous    −1, 1
Cleaning solution (%)         Continuous    −1, 1
Rinse time (min)              Continuous    −1, 1
(a)

FIGURE 9.6
Coded high and low settings for process variables: (a) for the high and low settings for our key process variables and (b) the eight experimental runs of the experiment.
Once we have the design, we can run the experiments and compile the
results. (Okay, I realize it may take a long time to run the experiments and collect all the data, but I want to focus on the analysis here.) The results are shown in Figure 9.7.

FIGURE 9.7
Coded runs for the full factorial with results tabulated in JMP.

Before beginning any mathematical analysis, we want to review the data to make sure that everything seems reasonable. In this case, I observe that the longer ultrasonic bath time reduces the sodium by half. The longer rinse time also appears to help. However, the cleaning solution is not so straightforward. Since this appears to be a reasonable result and nothing looks out of the ordinary, let’s calculate the effects of each of the
key process variables. To do this, we simply sum the sodium result values
when the ultrasonic bath time is high and subtract the sum of the sodium
values when the ultrasonic bath time is low, dividing the results by 4. The
effect of ultrasonic bath time:

effect(ultrasonic bath time) = [ΣY(bath time = +1) − ΣY(bath time = −1)] / 4 = −18.3 × 10¹² atoms/cm²
What does this mean? When the ultrasonic bath time is set at the high level (15 minutes), the process removes 18.3 × 10¹² more sodium ions per square centimeter from the parts than at the low level (5 minutes). In other words, when we increase the ultrasonic bath time from 5 minutes to 15 minutes, we reduce the sodium ions by 18.3 × 10¹² atoms/cm² on the parts. All of this yield improvement can be attributed to ultrasonic bath time alone since, during the four high ultrasonic bath time experiments, the other two input factors were twice low and twice high.
Now, let’s look at the main effect of cleaning solution:

effect(cleaning solution) = [ΣY(cleaning solution = +1) − ΣY(cleaning solution = −1)] / 4 = +0.3 × 10¹² atoms/cm²
The effect of increasing the cleaning solution from the low level to the higher level results in an increase in the sodium ions by 0.3 × 10¹² atoms/cm². Increasing the rinse time from the low level to the higher level reduces the sodium ions by 2.8 × 10¹² atoms/cm².
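The same arithmetic is easy to script. In the sketch below, the eight sodium values are stand-ins I made up for illustration, listed in standard 2³ run order; the actual measured values from Figure 9.7 would go in their place:

from itertools import product

# Coded 2^3 design; column order: bath time, cleaning solution, rinse time.
runs = list(product((-1, 1), repeat=3))

# Hypothetical sodium results (10^12 atoms/cm^2), one per run; the
# measured values from Figure 9.7 would be substituted here.
sodium = [31.0, 28.0, 30.0, 30.3, 13.6, 10.5, 12.5, 7.0]

# Main effect = (sum of Y at +1 minus sum of Y at -1) / 4, which is
# the dot product of the coded column with Y, divided by 4.
names = ["ultrasonic bath time", "cleaning solution", "rinse time"]
for j, name in enumerate(names):
    effect = sum(y * run[j] for run, y in zip(runs, sodium)) / 4
    print(f"effect of {name}: {effect:+.2f}")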
Now we want to calculate the interaction terms. We’ll use our coded matrix
again for this. The coded value for the interaction terms is the product of
the two coded values for the main input factors for each run. To calculate
the coded value for the interaction term ultrasonic bath time * cleaning solu-
tion, we need to multiply the coded values for each of these input factors.
The results can then be displayed in a new column that represents the coded
values of the interaction term. Figure 9.8 shows three additional columns, one for each two-factor interaction.

FIGURE 9.8
Coded runs for the full factorial with yield results and interaction factors.

For example, let’s look at run 1. The coded
value for ultrasonic bath time is −1 and the coded value for cleaning solution
is −1. The product of these two gives us the value for the interaction term.
ultrasonic bath time * cleaning solution = (−1) × (−1) = 1
Now we’ll use the same method to calculate the effect on yield of the
interaction terms. The interaction term physically means the change in
sodium atoms present when the ultrasonic bath time and cleaning solution
values are both low or are both high, as opposed to when one is high and
the other is low. The effect of the interaction term ultrasonic bath time *
cleaning solution can be calculated:

effect(ultrasonic bath time * cleaning solution) = [ΣY(product = +1) − ΣY(product = −1)] / 4 = −5.8 × 10¹² atoms/cm²
The effect of the interaction term ultrasonic bath time * rinse time is

effect(ultrasonic bath time * rinse time) = 0.7 × 10¹² atoms/cm²
While I’ve not included this term in Figure 9.8, we can also calculate the
effect of the interaction of all three ultrasonic bath time * cleaning solution
concentration * rinse time in a similar manner to the other interaction terms:

effect(ultrasonic bath time * cleaning solution * rinse time) = 0.8 × 10¹² atoms/cm²
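Extending that sketch to the interaction terms only requires multiplying coded columns (again, the sodium values are my illustrative stand-ins, not the measured data):

import math
from itertools import product

# Coded 2^3 design; column order: bath time, cleaning solution, rinse time.
runs = list(product((-1, 1), repeat=3))
sodium = [31.0, 28.0, 30.0, 30.3, 13.6, 10.5, 12.5, 7.0]  # hypothetical

def effect(columns):
    # The coded column of an interaction is the product of its
    # factors' coded columns; the effect formula is then identical
    # to that of a main effect.
    return sum(y * math.prod(run[j] for j in columns)
               for run, y in zip(runs, sodium)) / 4

print(f"bath time * cleaning solution: {effect((0, 1)):+.2f}")
print(f"bath time * rinse time:        {effect((0, 2)):+.2f}")
print(f"three-way interaction:         {effect((0, 1, 2)):+.2f}")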
In this example, most of the interactions have little effect on the sodium
ions remaining after cleaning. The ultrasonic bath time * cleaning solu-
tion concentration interaction term shows a reduction in sodium ions by
5.8 × 10¹² atoms/cm² when either both input factors are at their lowest level
or both input factors are at their highest level. Comparison of the relative
values of each main effect and interaction terms tells us that the most sig-
nificant in order are ultrasonic bath time, ultrasonic bath time * cleaning solu-
tion concentration, rinse time, and cleaning solution concentration * rinse time.
Notice that the interaction terms were more significant than the main effect
of cleaning solution concentration. However, the interaction term ultra-
sonic bath time * cleaning solution concentration tells us that the cleaning
solution concentration is an important process (input) variable in reducing
the sodium ions from the surface of the parts.
FIGURE 9.9
Coded runs for the fractional factorial with yield results. In this case, interactions are
confounded with the main effects.
TABLE 9.2
Confounding Pattern for the Fractional Factorial Example
Effects Aliases
Ultrasonic bath time (min) = Cleaning solution (%) * rinse time (min)
Cleaning solution (%) = Ultrasonic bath time (min) * rinse time (min)
Rinse time (min) = Ultrasonic bath time (min) * cleaning solution (%)
Table 9.3 summarizes the results from both the full and fractional facto-
rial. Notice that the main effect results are not the same as the full factorial
experiment. Recall in the full factorial, we found that one of the interactions
was significant. Here in the fractional factorial, we have no way of knowing
if an interaction is significant. There are a number of conclusions that can be
drawn with regard to the effects of these variables on the final result.
1. The ultrasonic bath time has the strongest effect in both the full and
fractional factorial.
2. The cleaning solution has the smallest effect in both the full and frac-
tional factorial.
TABLE 9.3
Calculated Effects for All Terms in the Full and Fractional Factorial Examples

Effects                                                  Full Factorial Results   Fractional Factorial Results
Ultrasonic bath time (min)                               −18.3                    −17.5
Cleaning solution (%)                                    0.3                      1.1
Rinse time (min)                                         −2.8                     3.1
Ultrasonic bath time (min) * cleaning solution (%)       −5.8
Ultrasonic bath time (min) * rinse time (min)            0.7
Cleaning solution (%) * rinse time (min)                 0.8
Ultrasonic bath time (min) * cleaning solution (%)
  * rinse time (min)                                     0.8
3. The magnitude of the effect for rinse time is similar between the full
and fractional factorials but the signs are opposite. (More on this in
the next section.)
What do these numbers mean? The “average” part has 20 × 10¹² ions/cm² of sodium for the runs in this experiment. If the ultrasonic bath time is increased from our low level to the high level, we expect to reduce the sodium ions per part by 17.5 × 10¹² ions/cm². Similarly, if the cleaning solution is increased from the low to high level, we would expect to increase the sodium ions per part by 1.1 × 10¹² ions/cm². However, when the rinse time is increased from the low to high level, we would expect to increase the number of sodium ions per part by 3.1 × 10¹² ions/cm². This is not what
we see in the data or in the full factorial results. Although the strong effect
of ultrasonic bath time is reasonably estimated, the other two main effects
are not well captured by this fractional factorial model at all.
FIGURE 9.10
Comparison of actual versus predicted sodium ions (10¹² atoms/cm²) from the model for both the full (a) and fractional (b) factorial models.
FIGURE 9.11
Cube plots of the full (a) and fractional (b) factorial process space. The “predicted” values are circled for the fractional factorial.

The cube plots in Figure 9.11 provide a visual of the process space for the two sets of experiments. We can see a direct comparison between the full factorial model and the predictive capability of the fractional factorial model. The fractional factorial experiment predicts that the best experimental conditions, with the lowest value for sodium ions present on the parts, give us 9.5 × 10¹² ions/cm². This was run 2 in the fractional factorial experi-
ment. Notice that this predicted value is higher than the value we obtained
as our best result in the full factorial model, where we had more experi-
mental data on which to build our model and where our model predicted
7.0 × 10¹² ions/cm².
Let’s discuss this discrepancy between the full and fractional facto-
rial results. The full factorial tells us that rinsing longer will reduce the
sodium ions while the fractional factorial would have us use the shorter
rinse time. Let’s take a look at the confounding pattern for rinse time in
the fractional factorial. We see in Table 9.2 that the main effect term
rinse time in the fractional factorial is actually confounded with the
interaction term ultrasonic bath time * cleaning solution. From Table 9.3, recall that the ultrasonic bath time * cleaning solution interaction term had the second largest effect magnitude in the full factorial. With the fractional
factorial, we cannot tell the difference between the main effect and the
interactions. A strong interaction effect can mask the main effect that
it confounds.
9.10 NONLINEARITY, REPEATABILITY,
AND FOLLOW-UP EXPERIMENTS
When we compare this to the result from the full factorial, we see that
the lowest sodium ions per part were found under different conditions.
Although tempting, just looking at the best run conditions may be mis-
leading, as we can see from this example. Ultimately, with a screening
experiment, we want to screen for the important factors then follow up
with an experiment(s) to characterize the process space and identify an
optimum condition.
It’s clear that 7.3 × 10¹² ions/cm² is better than 35 × 10¹² ions/cm² of sodium on each part. If the specification was less than 10 × 10¹² ions/cm² of sodium
per part, for some engineers and scientists, this might be the end of the
experiment. However, for others, this might be the beginning. Options for
follow-up experiments might include the following:
3. Extend the matrix even further along the path of steepest descent.
Would additional ultrasonic bath time and rinse reduce the sodium
even further?
4. Since cleaning solution had the least effect, we might consider drop-
ping it from any further experimentation, in other words, holding it
constant.
5. Since the cleaning solution had little impact on the results, we might
take this as good news. Maybe we could relax controls on the clean-
ing solution or maybe we could further reduce the solution to 1% or
2% and save money.
There are many options for follow-up experimentation; these are only a few.
I want to spend this last paragraph discussing the path of steepest ascent/
descent. This is a part of strategic experimentation that is outlined by
Professors George Box, William Hunter, and Stuart Hunter in Statistics
for Experimenters: An Introduction to Design, Data Analysis and Model
Building. Professor Box and co-authors outline a procedure for creating
contour lines using the method of least squares within the process space.
The path of steepest ascent/descent is perpendicular to the contours (Box
et al. 2005). Another follow-up option might be to allow the path of steep-
est ascent or descent to predict an extrapolated value of interest and per-
form several exploratory runs based on the prediction. This experimental
evolution is actually a strategic use of resources that allows us to continue
to improve the experimental results.
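As a rough sketch of the idea (the coefficients below are invented; in practice they would come from the fitted first-order model), the direction of steepest descent in coded units is proportional to the negative of the model coefficients:

import math

# Hypothetical first-order model coefficients in coded units, e.g.,
# sodium = b0 + b1 * bath_time + b2 * rinse_time.
coeffs = {"ultrasonic bath time": -9.2, "rinse time": -1.4}

# The path of steepest descent moves each coded factor in proportion
# to the negative of its coefficient, perpendicular to the fitted
# contour lines.
length = math.sqrt(sum(b * b for b in coeffs.values()))
direction = {name: -b / length for name, b in coeffs.items()}
print(direction)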
9.11 KEY TAKEAWAYS
Designed experiments allow multiple process variables to be changed simultaneously while capturing complex interactions between the variables with fewer experiments. Although the ideas
and concepts involved in designed experimentation are fairly straightfor-
ward, larger designs quickly become complex. Software packages make
the computations and graphical preparation easy. It is critical that experi-
menters keep in mind that the software package doesn’t analyze the mean-
ing or interpret the results. Analysis and interpretation are still up to the
investigator.
P.S. At this point, you are ready; try a simple, inexpensive designed
experiment where you vary two or three process variables. Trial versions
of statistical software packages can be downloaded for free, which allow
easy analysis.
REFERENCES
Box, G. E. P., W. G. Hunter, and J. S. Hunter. 2005. Statistics for Experimenters: An
Introduction to Design, Data Analysis and Model Building, 2nd Edition. New York:
John Wiley & Sons.
Buie, M. J. and F. Khorasani. 1998. Using Simulation for Matrix Determination in Process
Characterization. Presented at Sematech Statistical Processes Conference in Austin,
TX.
Dormehl, L. 2014. The Formula: How Algorithms Solve All Our Problems…and Create More.
New York: Penguin Group.
Fisher, R. A. 1925. Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.
GE. 2016. What the Doctor Ordered: New Silicon Valley Startup and Stanford Health
Care Will Test Digital Device Claims. GE Report. http://www.gereports.com/post
/112786788335/what-the-doctor-ordered-new-silicon-valley.
Guyatt, G. 1991. Evidence-based Medicine. ACP Journal Club 114:A-116.
Jones, B. and C. J. Nachtsheim. 2009. Split-Plot Designs: What, Why, and How. Journal of
Quality Technology, 41(4):340–361.
Sobel, D. 2000. Galileo’s Daughter: A Historical Memoir of Science, Faith and Love. New York:
Penguin Books.
Sur, R. L. and P. Dahm. 2011. History of Evidence-based Medicine. Indian Journal of
Urology 27(4):487–489.
http://taylorandfrancis.com
10
Strategic Design: Bringing
It All Together
You’ve got to think about big things while you’re doing small things, so
that all the small things go in the right direction.
Alvin Toffler
Let’s talk about planning. You may be wondering why I’m including a
chapter on planning at the very back of the book rather than the very first
topic. There are multiple reasons. First, planning may be the single activ-
ity that we resist the most. I’ve experimented and taught experimentation
for several decades now. Even when I provide a template and stress how
important planning is, most students and new scientists and engineers
give me a “deer in the headlights” kind of stare when I want to review
their plan. The second reason for including it here is because we’ve finally
covered all the considerations for experimentation. What procedures or
checklists need to be prepared? What noise factors can I live with as a
part of the experiment? What do I need to measure and control? What
equipment will be used? Is the equipment reliable? Is the equipment
repeatable? How much systematic and random variation can I expect for
each measurement? How much random variation is there within certain
materials? What type of experiment would best answer my questions?
What type of analysis will I need to do on the data? Everything we’ve cov-
ered is a piece of the puzzle; now, in this chapter, we bring it all together
in a plan.
10.1 PROCESS OF PLANNING
There is a great quote about planning from Dwight D. Eisenhower’s address to the National Defense Executive Reserve Conference on November 14, 1957, that I always come back to. He was speaking about going to battle, but the same concepts translate to experimentation. President Eisenhower says,
I heard long ago in the Army: Plans are worthless, but planning is every-
thing. There is a very great distinction because when you are planning for
an emergency you must start with this one thing: the very definition of
“emergency” is that it is unexpected, therefore it is not going to happen the
way you are planning. So the first thing you do is to take all the plans off
the top shelf and throw them out the window and start once more. But if
you haven’t been planning you can’t start to work, intelligently at least. That
is the reason it is so important to plan, to keep yourselves steeped in the
character of the problem that you may one day be called upon to solve – or
help to solve. (Eisenhower 1957)
10.2 WHAT’S IN A PLAN?
There are lots of good reasons for developing plans prior to experiment-
ing. The primary reason for planning is being completely prepared or as
prepared as possible. Being prepared ensures that we’ve thoughtfully con-
sidered all the options available to us and that we are completely ready to begin.
Having a solid plan that addresses each of these concerns will help
ensure that we are addressing the problem at hand and we will know to
what extent we can be certain of the results. It allows us to stay on track
and focus on the big picture rather than getting lost in the details. We have
documentation of all our assumptions, justifications, and expectations.
Experimental plans can be formal or informal, detailed documents
or sketched outlines. At a minimum, the information to include in an
experimental or problem solving plan is a problem statement, required
resources, uncertainty analysis, tasks to be completed, and the schedule or
timeline. A more thorough plan might additionally include cost estimates
and design discussion, as well as data collection, analysis, and interpre-
tation considerations. Independent of the type of plan, some form of
experimental plan should be developed prior to taking any actions.
The less experienced the problem solver is, the more detailed the plan
should be. Some companies or research groups provide problem-solving
templates for scientists and engineers to use. These templates will allow us
to populate the information in each section. Other companies or research
groups can be less formal. As a side benefit to planning, remember, the
more work we do here in the planning phase, the better prepared we will
be for the final report and the less time it will take to collate, organize, and
document our accomplishments.
There are formal problem-solving structures available that can be help-
ful in creating a plan. Check with your company about a specific for-
mat. Four of the most commonly used in industry are known as A3, 8D,
DMAIC, and TRIZ. A3 problem solving was developed at Toyota in Japan; A3 is the size of the standard piece of paper on which the problem is outlined. A3 problem solving captures the problem on that single piece of paper and follows the PDCA (plan–do–check–act) outline (Matthews 2010). One advantage of using
the A3 format is that it forces us to be as concise as possible because every-
thing must fit on a single piece of paper. In 1986, a team at Ford Motor
Company developed the 8D template to capture the eight disciplines of
problem solving (Duffy 2014, Rambaud 2011). Drs. Arthur Jonath and
Fred Khorasani have published a nice example of using the 8D problem-
solving approach in the development of a medical device (Jonath and
Khorasani 2011). There are many other example cases published on the
Internet as well. Six Sigma’s structured problem-solving methodology is
known as DMAIC (Define, Measure, Analyze, Improve, and Control) and
was developed at Motorola also in 1986 (Castaneda-Mendez 2012, Cudney
and Agustiady 2016, Wortman et al. 2007). The TRIZ methodology was
developed in Russia. The initials come from the Russian name “Teoriya
Resheniya Izobreatatelskikh Zadatch,” which, translated into English,
is “Theory of Inventive Problem Solving” or “Creative Problem Solving
Method.” Genrich Altshuller, a Russian patent reviewer, came up with this
method in 1946 by studying how discoveries were made (Cerit et al. 2014,
Ekmekci and Koksal 2015). The TRIZ Foundation is a good resource for
examples of using the method (TRIZ 2016).
We’ll not go into great detail with these methodologies here (with the
exception of the DMAIC technique), but they may be helpful in creat-
ing a template and structure for problem-solving planning. I have used all
of these; however, because of my Six Sigma training, I naturally migrate
toward the DMAIC plan. I recommend learning a bit about different
problem-solving approaches to determine which might be the most valu-
able for a particular situation. Additionally, check with your company as
they may prefer one approach over another.
Before we discuss the DMAIC planning methodology, I want to state
what may be obvious to some but not so much to others. The planning document should start out very detailed in the Define and Measure
sections and become less detailed in the Analyze, Improve, and Control
sections. As more information becomes available during the experimen-
tation, revamping of the planning document will be required, most likely
several times. Don’t be afraid to do this. Remember President Eisenhower’s
advice and “throw them [plans] out the window.” The more information
we gather and learn, the more our planning document grows. Keep the
document alive and updated until the experiment is completed or the
problem is solved.
10.4 MURPHY’S LAW
Assuming that we’ve planned our experiment well, have a fully charac-
terized measurement system, and have quantified any random variation,
execution of the experiment should be straightforward. Well, don’t forget
about Murphy’s Law. Murphy’s Law roughly states that if anything can go
wrong, it will … and at exactly the wrong time. There is another form: “if
someone can get it wrong, they will.” No matter how much we plan, things
will happen that aren’t in the plan. Unexpected events occur frequently,
and when they do, we will need to decide whether to proceed or start over.
10.5 KEY TAKEAWAYS
Test plans are an important and sensible part of performing an experi-
ment. Plans save time and money, assist in getting the best results, and can
facilitate speedy test report writing. Reviewing a plan with a more experi-
enced engineer prior to performing any part of the experiment may help
us avoid costly mistakes. Most of all, it’s important to have a good balance
between planning and improvisation.
REFERENCES
Castaneda-Mendez, K. 2012. What’s Your Problem? Identifying and Solving the Five Types
of Process Problems. New York: Productivity Press/Taylor & Francis.
Cerit, B., G. Kucukyazici, and D. Sener. 2014. TRIZ: Theory of Inventive Problem Solving
and Comparison of TRIZ with the Other Problem Solving Techniques. Balkan
Journal of Electrical & Computer Engineering 2(2):66–74.
Cudney, E. A. and T. K. Agustiady. 2016. Design for Six Sigma: A Practical Approach
through Innovation. New York: CRC Press/Taylor & Francis.
11
Where to Next?
There will be opened a gateway and a road to a large and excellent science
into which minds more piercing than mine shall penetrate to recesses
still deeper.
Galileo
The prior chapters have introduced the strategic problem solving con-
cepts necessary to confidently craft experimental plans and effectively
communicate findings. Chapter 2 examined the myths related to problem
solving that can stop us from pushing forward. Chapter 3 reviewed the
importance of communication and the common tools used in commu-
nication. The types and characteristics of data, definition of uncertainty,
and an introduction to variation were presented in Chapter 4. Chapters 5
through 7 introduced three basic types of variation found in experimenta-
tion. Chapter 5 covered the importance of controlling unintentional varia-
tion with preparation of checklists and/or standard operating procedures.
Chapter 6 explored systematic variation introduced by measurement
equipment, while Chapter 7 looked at natural random variation within
an experiment. Intentional variation was covered in Chapters 8 and 9,
where the resulting data were used to build representative, descriptive
mathematical models. Chapter 10 introduced the critical nature of strate-
gic experimentation. My primary goal has been to compile the tools and
organize an overarching strategy for anyone new to problem solving and
experimentation and to provide additional resources and reference mate-
rials for further growth.
Where to next? All that’s left to do is begin experimenting. Beginning
can be the most difficult part, but once we start wrestling with these tools
and strategies, doors will begin to open. It is only by struggling through
these ideas and concepts and living in the uncertainty of whatever hap-
pens that experimental problem solving muscles begin to develop. It is
then and only then that we begin to discover for ourselves the fascinating,
amazing world of science and engineering.
There are many successful scientists and engineers who have not been exposed to strategic
problem solving. Knowledge of strategy takes that necessary knowledge of
subject matter and makes it elegant.
Although knowledge of our subject area and basic skills in our field are
critical, in order to be a successful problem solver, we must hold on to
our imagination and creativity. For scientists and engineers, knowledge
and creativity go hand in hand. Knowledge of subject matter is the sub-
stance that feeds and fuels our curiosity and creativity. As young chil-
dren, we might have been constantly creating. Many of us have become
less creative as we get older. It is possible to recapture that curiosity and
imaginative exploration by allowing ourselves to be inquisitive and curi-
ous. Experiments test out our theories and answer niggling questions.
Experiments help us solve problems. To create is to be human, and as with
other abilities, it can be developed into a spectrum of competencies.
Curiosity and creativity alone are not enough, however. It is through the
deliberate accumulation of subject knowledge that “AHA!” moments arise.
We need background information in our long-term memory in order to be more creatively efficient. As educators have long known, break-
throughs, those “EUREKA!” moments, are more likely to spring from a
larger store of background materials in our long-term memory library
(Leslie 2014). To rephrase Louis Pasteur’s famous statement, luck or seren-
dipity favors the prepared, sagacious mind.
Great scientists and engineers didn’t win an experiment lottery. Their
most impressive discoveries did not fall into their laps. Think of the broth-
ers Orville and Wilbur Wright (McCullough 2015). They weren’t discour-
aged by the myriad of problems they encountered along the way to the first
flight. Each problem was another hurdle that got them closer to the big
problem they wanted to solve. They worked for years, dedicating their lives
and sacrificing much for the accomplishment. The problem solving under-
lying creativity is cognitive thought with “very high degrees of persistence
and motivation” (Weisberg 2013). Orville and Wilbur’s persistence and
motivation likely resulted from their mindset.
Our mindset has to do with our own self-perception about how we learn
or if we can learn something new (Dweck 2007). With a fixed mindset,
we cannot grow and develop further. With a growth mindset, we can.
One must learn by doing the thing; though you think you know it, you have
no certainty until you try.
Sophocles
The main virtue of a first sketch is that it breaks the blank page. It is the
spark of life in the swamp, beautiful if only because it is a beginning. …
When we envy the perfect creations of others, what we do not see, what
we by definition cannot see, and what we may also forget when we look at
successful creations of our own, is everything that got thrown away, that
failed, that didn’t make the cut. When we look at a perfect page, we should
put it not on a pedestal but on a pile of imperfect pages, all balled or torn,
some of them truly awful, created only to be thrown away. This trash is not
failure but foundation, and the perfect page is its progeny. (Ashton 2015)
Some writers even if they do not try to publish them, do not crumple up
false starts or their failed drafts. They save every scrap of paper as if they
recognize that they will never reach perfection and will eventually have
to choose the least imperfect from among all their tries. These documents
of the creative process are invaluable when they represent the successive
drafts of a successful book or any work of a successful writer. … Creating a
book can be seen as a succession of choices and real or imagined improve-
ments. (Petroski 1982)
The seesaw process of creation, of fail, revise, fail, revise, on and on, is
common to art, writing, and experimentation. Each failure is an opportu-
nity to learn and revise for the next round of experimentation.
Great scientists have great failures. Newton, one of the greatest phys-
icists in recorded history, spent a third of his life working on alchemy.
Alfred Russell Wallace, a contemporary of Charles Darwin and indepen-
dent developer of the theory of evolution, participated in experiments to
communicate with the dead. Urbain Jean Joseph Le Verrier, the discov-
erer of Neptune, also predicted the existence of another planet, which was
incorrect. Albert Einstein couldn’t reconcile the fundamental controlling laws of quantum mechanics with his own intuitive beliefs. Thomas Edison held some 1,093 US patents. He also had 500 to 600
that failed. Here are a few quotes from Edison about failure:
• “I have not failed. I’ve just found 10,000 ways that don’t work.”
• “I’m not discouraged, because every wrong attempt discarded is
another step forward.”
• “Results! Why man, I have gotten lots of results. I know several thou-
sand things that won’t work.”
• “When I have fully decided that a result is worth getting I go ahead
of it and make trial after trial until it comes.”
• “Many of life’s failures are men who did not realize how close they
were to success when they gave up.”
The chemical lubricant WD-40 got the 40 in its name because the prior
39 formulations didn’t work. Syphilis cure Salvarsan 606 was aptly named
for the number of attempts to get it right. Linus Pauling, one of the few
scientists to receive two Nobel Prizes, said, “The best way to have a good
idea is to have lots of ideas.” Dr. Pauling knew that it takes many, many
dead ends to find a path that works.
From great scientists and engineers to newbies, we all stand on the shoul-
ders of many who came before us. Galileo, Isaac Newton, Marie Curie,
Linus Pauling, and Albert Einstein all devoted their lives to solving a big
problem. They were aided by the efforts of others. The small problems had
to be solved first before the big one could be addressed. For example, before
Einstein could come up with the theory of relativity, Faraday needed to firmly
establish the relationship between electricity and magnetism, which pro-
vided the basics for the electric engine and the concept of energy. Newton
gave us the property of matter known as mass. Lavoisier, the father of
chemistry, showed us how the mass of materials could be combined and
separated, establishing conservation of mass. Galileo provided an early
experimental attempt to measure the speed of light. It was finally James
Clerk Maxwell who helped us understand the relationship between elec-
tricity, magnetism, and light following the work of Danish astronomer Ole
Roemer and Faraday. Robert Recorde, an English textbook publisher, gave
us the equal sign. Newton and Leibniz both developed calculus in order
to explain physical phenomena they observed. Newton gave us mass times
velocity and Leibniz gave us mass times velocity squared. It was Emilie
du Châtelet who finally settled the issue and gave us the square. Einstein
harnessed the contributions of all these scientists to develop his famous
equation, E = mc2 (Bodanis 2000).
You might say that all that we will learn in our early problem solving
or experimentation is already known by others, and you might be cor-
rect. We shouldn’t let this discourage us or cause us to say, “Oh, there’s
nothing new here. It’s no big deal. I’m just a novice engineer and what
I do doesn’t really matter.” We can’t sell ourselves short. Scientists and
engineers readily admit they cannot get everything right and often mis-
takes or wrong information gets published. Too often, we take the word of
well-respected journals or academicians and never duplicate their results.
It is important that experiments be repeated—not just to verify the results
of others but more importantly so that we can discover the results for our-
selves. Replication of experiments is a wonderful way for us to discover for
ourselves what these others before us have discovered. These experiments
allow us to learn what’s important and determine where improvements or
permutations of the experiment might be of interest. When we discover
for ourselves, this discovery is then ours.
As we gain confidence and delve deeper into the mysteries of science
and engineering, there is less and less certainty. In a movie, any uncertainty created at the beginning is resolved within 120 minutes. In science and engineering, many of the mysteries we encounter may never be resolved. The only certainty is uncertainty. In school, science and
engineering problems are presented as nice little packages of answers.
Dr. Freeman Dyson, retired Princeton theoretical physicist, expert in
quantum electrodynamics and author, observed that science is not a
collection of truths but a “continuing exploration of mysteries” (Leslie
2014). What we know from science and engineering today are actually
answers to the scientific puzzles, the mysteries, solved by those who came
before us. Solving small pieces of the larger scientific puzzle that is life
can be rewarding and at the same time provoke many more questions.
For many of us, this is part of the fun of science—this never-ending quest
to put pieces of the larger puzzle together—to find those puzzle pieces
that aren’t Google-able. This “rigorous and persistent exploration of what
we don’t know” is really what keeps us curious. The inventor and audio
pioneer Ray Dolby said, “To be an inventor, you have to be willing to live
with a sense of uncertainty, to work in the darkness and grope toward
an answer, to put up with the anxiety about whether there is an answer”
(Leslie 2014). As scientists and engineers, we are explorers, adventurers,
and innovators each time we discover something unknown to ourselves.
The more we embrace the unknown and solve problems, the easier it gets.
“Part of being able to tackle complex and difficult questions is accepting
that there is nothing wrong with not knowing. People who are good at
questioning are comfortable with uncertainty” (Berger 2014). Each time
we discover for ourselves, we become more confident through our expe-
riences with experimentation. Questioning and experimenting go hand
in hand.
REFERENCES
Ashton, K. 2015. How to Fly a Horse: The Secret History of Creation, Invention, and
Discovery. New York: Doubleday.
Berger, W. 2014. A More Beautiful Question: The Power of Inquiry to Spark Breakthrough
Ideas. New York: Bloomsbury USA.
Bodanis, D. 2000. E = mc²: A Biography of the World's Most Famous Equation. New York:
Walker & Company.
Duckworth, A. 2016. Grit: The Power of Passion and Perseverance. New York: Simon &
Schuster.
Dweck, C. S. 2007. Mindset: The New Psychology of Success. New York: Random House.
Goldsmith, B. 2005. Obsessive Genius: The Inner World of Marie Curie. New York: W. W.
Norton.
Leslie, I. 2014. Curious: The Desire to Know and Why Your Future Depends on It. New
York: Basic Books.
McCullough, D. 2015. The Wright Brothers. New York: Simon & Schuster.
Oakley, B. 2014. A Mind for Numbers: How to Excel at Math and Science (Even if You
Flunked Algebra). New York: Jeremy P. Tarcher/Penguin.
Petroski, H. 1982. To Engineer Is Human: The Role of Failure in Successful Design. New
York: Vintage Books/Random House.
Weisberg, R. W. 1993. Creativity: Beyond the Myth of Genius. New York: W. H. Freeman
& Co.
12
One More Thing…
… like any skill, becoming very good at scientific reasoning requires both
practice and talent. But becoming tolerably good requires mainly practice
and only a little talent. And for most people tolerably good is good enough.
So work at developing your skills little by little.
Ronald N. Giere
The ideas and concepts gathered in this book are from my own experiences. However, in my research and preparation for this book, I discovered many valuable and wise references from a variety of fields. Each new reference I found sent me in multiple directions to read additional authors. I am grateful to them all for the valuable contributions they have made, not only to the body of literature on the topics discussed here but also to my own work. The following list contains the books and papers I enjoyed the most.
12.1 REFERENCES ON EXPERIMENTATION
A wonderful reference with fun experiments that makes great reading is Thinking, Fast and Slow by Daniel Kahneman. Professor Kahneman is a psychologist whose experiments helped found behavioral economics. His book is an easily read collection of his experiments. Even for those of us in the physical sciences, reading about experiments in other areas can be a source of inspiration and enjoyment. Behavioral economists have fun with their experiments, and it shows. Several other examples in my library referenced herein include Predictably Irrational by Dan Ariely and Freakonomics by Steven D. Levitt and Stephen J. Dubner.
12.2 REFERENCES ON COMMUNICATION
The writings of Professor Roald Hoffmann, Nobel Laureate in chemistry, tell me that he loves science. Jeffrey Kovac and Michael Weisberg collected his writings in Roald Hoffmann on the Philosophy, Art, and Science of Chemistry. I can't write this without giving physics equal time: Professor Richard Feynman's books are also enjoyable, and The Pleasure of Finding Things Out is a great first read for someone new to Feynman. For additional information on the topic of data displays, Stephen Few's and Professor Edward Tufte's books are invaluable. They are all wonderful and stress the codependence of language and graphics in communication. The other book that I'd recommend is Carmine Gallo's Talk Like TED: The 9 Public-Speaking Secrets of the World's Top Minds. Public speaking takes practice, and TED talks, freely available on the Internet, are a great way to watch strong speakers, some well known and others less so. If your activities do not give you opportunities to present to an audience, Toastmasters International, an educational nonprofit established specifically to help its members develop public speaking and communication skills, is a wonderful resource, with clubs established all around the world.
Jeffrey and Laurie Ford have written a powerful book on communication. The Four Conversations: Daily Communication That Gets Results is a useful tool for further developing the requesting and questioning conversations we have with our assistants, peers, and managers.
12.3 REFERENCES ON ERROR ANALYSIS
Time-dependent measurements cannot always be avoided, and when they can't, I'd recommend beginning with either Experimental Methods for Engineers by J. P. Holman or Experimentation and Uncertainty Analysis for Engineers by Hugh W. Coleman and W. Glenn Steele.
12.4 REFERENCES ON CHECKLISTS
I found Dr. Atul Gawande's The Checklist Manifesto: How to Get Things Right an invaluable reference on this topic. A fascinating story that covers the development of good lab practices can be found in Rebecca Skloot's The Immortal Life of Henrietta Lacks.
12.5 REFERENCES ON MEASUREMENTS
As far as I know, nothing completely entertaining has been written about
measurement system analysis. Therefore, I will just refer you to the mea-
surement system analysis manual, and if that isn’t enough, try reading
through Design and Analysis of Gauge R&R Studies: Making Decisions
with Confidence Intervals in Random and Mixed ANOVA Models (ASA-
SIAM Series on Statistics and Applied Probability) by Richard K. Burdick,
Connie M. Borror, and Douglas C. Montgomery.
12.6 REFERENCES ON RANDOMNESS
Randomness is fascinating, and like chaos, there have been many won-
derful books written on the topic. A few of the books that I’ve really
enjoyed include Naked Statistics: Stripping the Dread from the Data by
Charles Wheelan and The Drunkard’s Walk: How Randomness Rules Our
Lives by Leonard Mlodinow. Although these books are written by super
smart professors from Dartmouth and California Institute of Technology,
they’ve been able to write about random statistical phenomena with an
entertaining and historical slant. I found Creativity, Inc. by Ed Catmull enjoyable because he writes about the randomness of how creative work progresses. If you are looking for a relaxing book filled with advice from a
cartoonist, then Scott Adams’ How to Fail at Almost Everything and Still
Win Big: Kind of the Story of My Life is worth a read. Mr. Adams’ book
encourages systems thinking and looking for patterns in life. To gain a
more technical understanding of the intricacies of the normal distribution, one of the best resources that I've found is An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements by John R. Taylor.
Professor Taylor covers the normal distribution and develops proofs of
what we know about normal distributions. He also has a chapter dedicated
to Chauvenet’s criterion. His book would be a great asset to an experimen-
tal physical scientist’s library.
12.7 REFERENCES ON STATISTICS
AND DESIGNED EXPERIMENTATION
Naked Statistics: Stripping the Dread from the Data by Professor Charles
Wheelan is an excellent introductory reference on regression analysis.
His examples can be ridiculous and may leave you groaning, but you will
smile and maybe even laugh on occasion. How many other statistics books
can you say that about? Other good references for one-factor-at-a-time
experimentation are the early algebra and calculus books. The bible on
designed experimentation is Statistics for Experimenters: An Introduction
to Design, Data Analysis, and Model Building by George E. P. Box, William
G. Hunter, and J. Stuart Hunter. You may find the book difficult to navi-
gate, but stick with it. Another text that might be more approachable is DOE Simplified: Practical Tools for Effective Experimentation by Mark J. Anderson and Patrick J. Whitcomb.
12.8 REFERENCES ON CURIOSITY,
CREATIVITY, AND FAILURE
The books How to Fly a Horse by Kevin Ashton, Curious: The Desire to
Know and Why Your Future Depends on It by Ian Leslie, and A More
Beautiful Question: The Power of Inquiry to Spark Breakthrough Ideas by
Warren Berger solidified my thoughts on creativity and genius. Professor
Carol Dweck's Mindset and Professor Barbara Oakley's A Mind for Numbers confirmed, with research and data, my own experience that perseverance is critical to success. Scott Berkun's The Myths of Innovation does a great job of dispelling many of the myths surrounding the creative
process. Professor Henry Petroski’s articles and books are educational
and enjoyable.
Books about great scientists and engineers who failed are fascinating. They remind me how difficult it is, and what it takes, to achieve great things.
The Wright Brothers by David McCullough is an incredible read. Brilliant
Blunders by Mario Livio and Einstein’s Mistakes: The Human Failings of
Genius by Hans C. Ohanian are good places to begin.
In Gratitude
I gratefully acknowledge the contributions of the following friends and
colleagues: