Chapter 10 Measurement and Statistics
Modeling
Chapter 8:
Measurement and statistics
Purpose:
This section is about the methodology of measurement: what goes into designing an
experiment, gathering some numbers, interpreting the results, and presenting those
results to management in a way that allows them to make the necessary decisions.
Warm-up Experiment:
Divide into teams and measure the length of an object in the classroom. To do so you
will need to make team decisions about tools, techniques, and reporting metrics.
FUNDAMENTAL QUESTIONS ABOUT MEASUREMENT:
i. What kind of accuracy can you expect from a computer (or any other)
measurement?
ii. When you make a measurement, can you believe the result? How sure are you
of the result?
iii. How should you state the result of an experiment? How do you reflect your
belief in its accuracy?
iv. How do you extrapolate from what you know to what you'd like to know?
v. Should you always know the result of a measurement before you make it?
vi. How do you figure out dependencies; how does one variable depend on
another?
vii. So after all this talk about the details of measurement, how do you actually
design an experiment?
1. What kind of accuracy can you expect from a computer (or any
other) measurement?
Here are some factors that lead to experimental variation:
Example: You want to measure the time required to execute a routine and
have available a system call named get_time_of_day. get_time_of_day
returns time in units of 1/65535 seconds ≈ 15 microseconds. The time
required to execute the get_time_of_day routine itself is 100
microseconds. What is the shortest routine that can be measured with
this tool? How would you do it?
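One common approach, sketched here, is to call the routine many times between two timer reads so that the timer's tick and its own call overhead become a small fraction of the total. This is only a sketch: it uses Python's time.perf_counter as a stand-in for get_time_of_day, the function names are hypothetical, and the error bound (one tick plus one timer-call overhead) is a simplifying assumption.

```python
import time


def min_timed_interval_us(tick_us, timer_overhead_us, rel_error):
    """Shortest total interval whose timer error stays within rel_error.

    Assumption: the worst-case error of one timed interval is one clock
    tick plus the cost of one timer call.
    """
    return (tick_us + timer_overhead_us) / rel_error


def time_per_call_us(routine, repetitions):
    """Time `repetitions` back-to-back calls; return the average per call in microseconds."""
    start = time.perf_counter()
    for _ in range(repetitions):
        routine()
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / repetitions


if __name__ == "__main__":
    tick_us = 1e6 / 65535          # timer resolution from the example
    overhead_us = 100.0            # cost of one timer call from the example
    needed = min_timed_interval_us(tick_us, overhead_us, rel_error=0.10)
    print(f"Time at least {needed:.0f} us of work for 10% accuracy")
    print(f"Empty routine: {time_per_call_us(lambda: None, 100_000):.3f} us per call")
```

A routine shorter than the computed interval can still be measured by raising the repetition count until the loop as a whole runs that long; the loop overhead itself then needs to be measured and subtracted.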
Bottom Line: Never believe a real system number to better than 5 - 10%.
Artificial numbers can sometimes be repeated to 1 - 2%, but are
susceptible to spurious factors.
2. When you make a measurement, can you believe the result? How
sure are you of the result?
Suppose you make several determinations of some measure. If you can answer yes to
the following questions, then you can have some faith in your measurement:
• Can you explain why the numbers vary? (“Handwaving” isn't allowed here, but “statistics”
may be a valid answer.)
• If variations are greater than 10%, can you figure out what's causing the variation and could
you eliminate it if time allowed?
• If the granularity of your tool is greater than the measurement variations, is that acceptable?
(Your granularity then becomes your uncertainty.)
The determinations are $m_1, m_2, \ldots, m_n$, with mean $M = \frac{1}{n}\sum_{i=1}^{n} m_i$.

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(M - m_i)^2} \qquad\qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(M - m_i)^2}$$
The first form of the Standard Deviation is the form of the underlying data. The second form is that of
the measured data. They are the same for an infinite amount of data and close enough for a large set
of numbers.
NOTE: Use of these equations assumes that the measurements are independent of each other.
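A minimal sketch of the calculation, assuming the two forms are the usual population (divide by n) and sample (divide by n - 1) standard deviations. The function name and the data are hypothetical.

```python
import math


def describe(measurements):
    """Return (mean, population std dev, sample std dev) of the measurements."""
    n = len(measurements)
    mean = sum(measurements) / n
    squared_deviations = sum((mean - m) ** 2 for m in measurements)
    sigma_population = math.sqrt(squared_deviations / n)    # underlying-data form
    s_sample = math.sqrt(squared_deviations / (n - 1))      # measured-data form
    return mean, sigma_population, s_sample


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up measurements
    mean, sigma, s = describe(data)
    print(f"mean={mean:.3f}  sigma={sigma:.3f}  s={s:.3f}")
```

The two results differ noticeably for a handful of measurements and converge as n grows, which is the point of the remark about large data sets.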
You’ve just measured the performance of the latest release of your product. The numbers are better
than they were when you measured them on the last release. But what does “better” mean?
How do you show that, of two sets of numbers with lots of uncertainty in each, one set really
is better than the other?
First of all, here’s the easy way. With your two sets, calculate their means and their confidence
intervals (the % confidence you use is up to you). Plot these results as shown in the
three examples below:
A. The confidence intervals don’t overlap. The results are different from each other.
B. The mean of one set is within the confidence interval of the other set. The two sets are NOT different.
C. The confidence intervals overlap, but the mean of each set is not inside the CI of the other set. Need to do a more complex test.
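The visual check for cases A, B, and C can also be done numerically. The sketch below assumes a normal-approximation confidence interval (z = 1.96 for roughly 95%); for small samples a Student t quantile would be more appropriate. Function names and data are hypothetical.

```python
import math
import statistics


def confidence_interval(samples, z=1.96):
    """Return (mean, low, high) using mean +/- z * s / sqrt(n)."""
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, mean - half_width, mean + half_width


def compare(set_a, set_b, z=1.96):
    """Classify two sets of measurements as case A, B, or C."""
    mean_a, low_a, high_a = confidence_interval(set_a, z)
    mean_b, low_b, high_b = confidence_interval(set_b, z)
    if high_a < low_b or high_b < low_a:
        return "A: different"
    if low_b <= mean_a <= high_b or low_a <= mean_b <= high_a:
        return "B: not different"
    return "C: needs a more complex test"


if __name__ == "__main__":
    old_release = [20, 21, 19, 20, 20]    # made-up measurements
    new_release = [10, 11, 9, 10, 10]
    print(compare(old_release, new_release))
```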
3. How should you state the result of an experiment? How do
you reflect your belief in its accuracy?
Pat has developed a new product, "rabbit", whose performance she wishes to determine.
There is special interest in comparing the new product, rabbit, to the old product, turtle, since
the product was rewritten for performance reasons. (Pat had used Performance Engineering
techniques and thus knew that rabbit was "about twice as fast" as turtle.) The measurements
showed:
Performance Comparisons
Which of the following statements reflect the performance comparison of rabbit and turtle?
o Rabbit is 100% faster than turtle.
o Rabbit is twice as fast as turtle.
o Rabbit takes 1/2 as long as turtle.
o Rabbit takes 1/3 as long as turtle.
o Rabbit takes 100% less time than turtle.
o Rabbit takes 200% less time than turtle.
o Turtle is 50% as fast as rabbit.
o Turtle is 50% slower than rabbit.
o Turtle takes 200% longer than rabbit.
o Turtle takes 300% longer than rabbit.
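The arithmetic behind statements like these can be sketched as follows. The definitions are assumptions: "faster" is taken as a ratio of rates (1/time), and "less time" or "longer" as a change in elapsed time relative to the product named after "than". The function name and the example times are hypothetical, not the measured values.

```python
def compare_times(old_time, new_time):
    """Express one pair of elapsed times as several comparison figures."""
    return {
        "speedup": old_time / new_time,                          # "new is N times as fast"
        "new_percent_faster": (old_time / new_time - 1) * 100,
        "new_percent_less_time": (1 - new_time / old_time) * 100,
        "old_percent_longer": (old_time / new_time - 1) * 100,
        "old_percent_as_fast": (new_time / old_time) * 100,
    }


if __name__ == "__main__":
    # Hypothetical times in seconds, chosen only to show the arithmetic.
    for name, value in compare_times(old_time=2.0, new_time=1.0).items():
        print(f"{name}: {value:g}")
```

With these definitions, "percent less time" can never reach 100, since that would mean the new product takes no time at all.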
• The guiding principle in stating a result is to keep it simple.
• State the accuracy using the same methods we've just discussed. Use Means, Standard
Deviations, and Confidence Intervals.
• It goes without saying that reflecting your belief in the accuracy presupposes you’ve done the
experiment correctly. Some simple guidelines:
A. You always do the experiment wrong the first five times. Through experience you learn to look critically
at your result to see if it makes sense. If not, then you go figure out what went wrong. Usually it’s some
parameter that wasn’t controlled.
C. Watch out for interactions between parameters. Changing one parameter may cause some
other parameter to change as well.
E. Get someone else to check your results – by the time you finish a measurement you have too much
invested in it and are very likely to miss something obvious.
4. Can one number represent the performance of a product?
Recall:
Mean or Expected Value: $\text{Mean} = E(x) = \sum_{i=1}^{n} p_i x_i = \int x f(x)\,dx$ (the sum for discrete values, the integral for a continuous density)
Median: That value for which there’s an equal probability of being above it and below it.
Mode: The most likely value. The value with the highest probability.
[Figure: a distribution with the Mode, Median, and Mean marked.]
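A short sketch of how the three summary numbers can disagree on the same data, using Python's standard statistics module. The sample values are made up and deliberately include one large outlier.

```python
import statistics


def summarize(samples):
    """Return (mean, median, mode) for a list of measurements."""
    return (
        statistics.mean(samples),
        statistics.median(samples),
        statistics.mode(samples),
    )


if __name__ == "__main__":
    response_times = [1, 2, 2, 3, 4, 5, 11]    # made-up values with one outlier
    mean, median, mode = summarize(response_times)
    print(f"mean={mean}  median={median}  mode={mode}")
```

The single outlier pulls the mean above the median, and the median above the mode, so no one of the three numbers describes the whole set.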
5. When have you measured enough?
This is really two questions:
a) When have you measured enough to get the accuracy of answer that
management expects at this time?
This is a matter of setting the correct expectations before you start. Many times
the answer is in response to a “what if” question – you can get the
appropriate accuracy in one hour. Other times you’ll need weeks of
design/setup/measurement/analysis to get the expected accuracy.
NOTE: Only a small amount of the total experimental time is in the measurement.
Most time goes for design and elimination of unwanted factors. So this
question could be stated as “How complicated should an experiment be?”
b) When have you measured enough to get the degree of accuracy you
expected for the experiment?
Confidence interval width $\propto \frac{1}{\sqrt{n}}$
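The relationship between the number of required samples and experimental parameters is:
$$n = \left(\frac{100\, z\, s}{r\, \bar{x}}\right)^2$$
where $n$ is the number of measurements, $z$ is the normal quantile for the chosen confidence level, $s$ is the standard deviation of the measurements, $\bar{x}$ is their mean, and $r$ is the desired accuracy in percent of the mean.
NOTE: The more accuracy you want (smaller r), or the more your measurements vary (larger s), the more measurements you need.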
NOTE: If your numbers all come out the same, stop. Measurement uncertainty is not the largest part
of the error in your metric.
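A sketch of the sample-size calculation, assuming the usual form n = (100 z s / (r x̄))², where r is the desired accuracy in percent of the mean and z is the normal quantile for the chosen confidence (1.96 for about 95%). The function name and the pilot numbers are hypothetical.

```python
import math


def required_samples(mean, stdev, accuracy_percent, z=1.96):
    """Number of measurements needed for +/- accuracy_percent of the mean."""
    n = (100.0 * z * stdev / (accuracy_percent * mean)) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Made-up pilot run: mean 20, standard deviation 5, target +/- 5%.
    print(required_samples(mean=20.0, stdev=5.0, accuracy_percent=5.0))
    # Halving the allowed error roughly quadruples the work.
    print(required_samples(mean=20.0, stdev=5.0, accuracy_percent=2.5))
```

The mean and standard deviation have to come from somewhere, so in practice a small pilot run supplies them before the main experiment is sized.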
6. How do you extrapolate from what you know to what you'd like to
know?
Often we need a result that is unmeasurable, or would require eons to determine. Is it legal to guess?
Answer:
Sure - as long as you also estimate the uncertainty of your guess.
Here are a few practice situations that will help you improve your powers of estimation. Remember,
there is no RIGHT answer.
1. Estimate how many people will come to this class next week. More important than the answer are
the assumptions you use for your answer.
2. Approximately how many cars were in the parking lot outside this building when you came in
tonight? How many are there now?
4. I recently saw a lawn service truck that had printed on its side “Over 7 trillion blades cut.” Is this
a reasonable claim for them to make?
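One simple way to attach an uncertainty to a guess is to give each assumption a low and a high value and multiply the bounds through. The sketch below does that; the function name is hypothetical and the factor ranges in the usage example are placeholders, not data.

```python
import math


def estimate_range(factors):
    """Multiply (low, high) guesses for each factor.

    Returns (low, best, high), where `best` is the geometric mean of the
    bounds, a common choice when a guess spans orders of magnitude.
    """
    low = math.prod(f[0] for f in factors)
    high = math.prod(f[1] for f in factors)
    return low, math.sqrt(low * high), high


if __name__ == "__main__":
    # Placeholder ranges: each pair is a (low, high) guess for one assumption.
    guesses = [(1, 4), (10, 1000)]
    low, best, high = estimate_range(guesses)
    print(f"between {low:g} and {high:g}, best guess about {best:g}")
```

Writing the factors down this way also records the assumptions, which the exercises above treat as more important than the answer.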
7. How do you know what tools to use?
• We'll do a lot more on tools later, but for right now, the best answer
is to measure the simplest way possible.
• Make sure the granularity of the tool is finer (smaller) than the required uncertainty.
8. Is everything in a computer measurable? No
9. How do you know what to measure?
• This is the hardest question of all. To know what to measure you must have a picture or
model of your product. Most of the rest of this course will deal with various kinds of
pictures.
• Often an adequate model is a causal one: first procedure A executes; this causes
hardware B to produce an effect; then interrupt code handles the hardware result; etc.
• Example: You wish to design an experiment that will measure the time required to
execute a program on various Intel processors. What parameters would you need to vary
to try different processors and configurations? DESIGN THE TESTS TO BE RUN.
10. Should you always know the results of a measurement before you
make it?
11. How do you figure out dependencies; how does one variable
depend on another?
This whole topic is something called linear regression. It says that if you can plot two
variables, x and y, and there’s a simple relationship between the variables, then you can
define the dependency between them.
A linear regression means that we can fit a line of the form y = a + bx. The quality of the
fit (error) is usually defined as the sum of the squared y distances between the fitted line
and the experimental data; least-squares regression chooses a and b to minimize that sum.
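A minimal least-squares sketch for y = a + bx, written without external libraries. It uses the standard closed-form solution (slope = covariance of x and y divided by variance of x); the function name and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b, sse) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    # Sum of squared vertical distances between the data and the line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


if __name__ == "__main__":
    block_sizes = [1, 2, 3, 4]             # made-up x values
    transfer_times = [3.1, 4.9, 7.2, 8.8]  # made-up y values
    a, b, sse = fit_line(block_sizes, transfer_times)
    print(f"y = {a:.2f} + {b:.2f}x, squared error {sse:.3f}")
```

A small squared error relative to the spread of y suggests the straight line is a reasonable description; a large one suggests the relationship is not linear or that another variable is involved.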
12. So after all this talk about the details of measurement, how do you actually
design an experiment?
3. Select Metrics
a. What are the criteria you want to use to compare performance? This is still not a
quantifiable value, but simply what it is you will measure. This could be a speed metric, or
an accuracy metric.
4. List Parameters
a. What parameters affect performance? If you’re measuring disks, then the model of disk
determines its seek time, its rotational latency, etc. This is a system parameter.
b. The kind of test you use, determined by the workload you use, can also define parameters.
These might be requested I/Os per second, random or sequential blocks, etc.
8. Design Experiments
a. What experiments will you do to collect the data you want?
b. This means selecting the actual values to be used as factors. If one
of your factors is the type/model of disk, then how many different
disks will you use?
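Once the factors and their values are chosen, the list of runs can be generated mechanically. The sketch below enumerates a full factorial design (every combination of every level) with itertools.product. The factor names and levels are hypothetical placeholders, and a full factorial is only one possible design.

```python
from itertools import product


def full_factorial(factors):
    """Return one dict per experimental run, covering every combination of levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]


if __name__ == "__main__":
    # Hypothetical factors and levels for a disk experiment.
    factors = {
        "disk_model": ["disk_A", "disk_B"],
        "access_pattern": ["random", "sequential"],
        "requested_ios_per_second": [100, 500, 1000],
    }
    runs = full_factorial(factors)
    print(f"{len(runs)} runs")
    for run in runs[:3]:
        print(run)
```

The run count is the product of the level counts, so each added factor or level multiplies the measurement time; that is usually what limits how many disks or workloads are practical.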
9. Make A Guess What The Result Will Be
a. Many people take a measurement and say “Oh, that must be right.” The best way to be able
to make that statement is to have understood what should happen and then either get what
you expected or not.
b. If you get what’s expected, then you can be confident that:
• Your picture of how the system is working is sound.
• You did your measurements correctly.
c. If you DON’T get what’s expected, then at least one of the following is true:
• You didn’t understand the system and so you need to form a new picture.
• You did the measurement wrong – there’s some experimental error.