Using PivotT
P. Batchelor
Assume you work for a small travel agency for which you need to mass-mail a travel
brochure. Funds are limited, so you want to mail the brochure to people who spend
the most money on travel. From information in a random sample of 925 people, you
know the gender, the age, and the amount these people spent on travel last year.
How can you use this data to determine how gender and age influence a person's
travel expenditures? What can you conclude about the type of person to whom you
should mail the brochure?
How can you use a PivotTable to summarize grocery sales at several grocery stores?
Assume you work for a manufacturer that sells microchips globally. You are given
monthly actual and predicted sales for Canada, France, and the United States for
Chip 1, Chip 2, and Chip 3. You are also given the variance, or difference, between
actual and budgeted revenues. For each month and each combination of country
and product, you would like to display the following data: actual revenue, budgeted
revenue, actual variance, actual revenue as a percentage of annual revenue, and
variance as a percentage of budgeted revenue. How can you display this
information?
What is a PivotTable?
In numerous business situations, you need to analyze, or "slice and dice," your data to
gain important business insights. If we sell different grocery products in different stores at
different points in time, we might have hundreds of thousands of data points to track.
PivotTables let us quickly summarize our data in almost any way imaginable. This is
referred to as "slicing and dicing data." For example, for our grocery store data, we could
use a PivotTable to quickly determine the following:
In the travel agency example, for instance, you would like to slice the data so that you can
determine whether the average amount spent on travel is influenced by age or gender or
by both factors. In the station wagon example, we'd like to compare the fraction of large
families that buy a station wagon to the fraction of small families that purchase a station
wagon. In the microchip example, we'd like to determine our total Chip 1 sales in France
during April, and so on. A PivotTable is an incredibly powerful tool that can be used to slice
and dice data. The easiest way to understand how a PivotTable works is to walk through
some examples.
How can I use a PivotTable to summarize grocery sales at several grocery
stores?
The Data worksheet in the file Groceriespt.xlsx contains more than 900 rows of sales
data. (See Figure 1.) Each row contains the number of units sold and the revenue for a product
at a store, as well as the month and year of the sale. The product group (either fruit, milk,
cereal, or ice cream) is also included. We would like to see a breakdown of sales during
each year of each product group and product at each store. We would also like to be able
to show this breakdown during any subset of months in a given year (for example, what
the sales were during January through June).
Before creating a PivotTable, we must have headings in the first row of our data. Notice
that our data contains headings (Year, Month, Store, Group, Product, Units, and Revenue)
in row 2. Place your cursor anywhere in your data and on the Insert tab, in the Tables
group, click PivotTable. Microsoft Office Excel will open the Create PivotTable dialog box
and try to guess your data range. (In our case, Excel correctly guessed that our data range
was C2:I924.) (See Figure 2.) By selecting Use An External Data Source, you can also refer
to a database as a source for your PivotTable.
Figure 2. The Create PivotTable dialog box
After clicking OK, you will see the PivotTable Field List dialog box shown in Figure 3.
You fill in the PivotTable Field List dialog box by dragging PivotTable headings or fields into
the desired boxes, or zones. This step is critical to ensuring that the PivotTable will
summarize and display the data in the manner you wish. The four zones are as follows:
Row Labels. Fields dragged here will be listed on the left side of the table in the
order they are dragged. For example, we dragged to the Row Labels box the fields
Year, Group, Product, and Store, in that order. This causes Excel to summarize
data first by Year, then by product Group within each year, then by
Product within each group, and finally by Store for each product. You can at
any time drag a field to a different zone or reorder the fields within a zone by
dragging a field up or down in a zone or by clicking the arrow to the right of the field
label.
Column Labels. Fields dragged here will have their values listed across the top row
of the PivotTable. To begin, we will have no fields in the Column Labels zone.
Values. Fields dragged here are summarized in the body of the PivotTable. In our
example, the Units and Revenue fields are placed in the Values zone.
Report Filter. In Excel 2007, Report Filter is the new name for the old Page Field area.
For fields dragged to the Report Filter area, we can easily pick any subset of the
field values so the PivotTable will show calculations based only on that subset of
field values. In our example, we dragged Month to the Report Filter area. Then we
can easily select any subset of months, for example January through June, and our
calculations are based on only those months.
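To make the roles of the zones concrete, here is a minimal Python sketch of the kind of summarization a PivotTable performs. It is an illustration under stated assumptions, not a description of how Excel works internally: the function name pivot, the sample rows, and the product names are hypothetical, and only the field names (Year, Month, Store, Group, Product, Units, Revenue) come from the worksheet.

```python
from collections import defaultdict

# Hypothetical sample rows; the real data is in Groceriespt.xlsx.
rows = [
    {"Year": 2006, "Month": "January", "Store": "east", "Group": "milk",
     "Product": "brand A", "Units": 10, "Revenue": 30.0},
    {"Year": 2006, "Month": "July", "Store": "east", "Group": "milk",
     "Product": "brand A", "Units": 5, "Revenue": 15.0},
    {"Year": 2006, "Month": "February", "Store": "west", "Group": "milk",
     "Product": "brand B", "Units": 4, "Revenue": 10.0},
    {"Year": 2007, "Month": "March", "Store": "east", "Group": "cereal",
     "Product": "brand C", "Units": 2, "Revenue": 8.0},
]


def pivot(rows, row_fields, value_fields, report_filter=None):
    """Sum each value field for every combination of the row fields.

    report_filter maps a field name to the set of values to keep,
    which plays the role of the Report Filter zone.
    """
    totals = defaultdict(lambda: [0] * len(value_fields))
    for row in rows:
        if report_filter and any(
            row[field] not in keep for field, keep in report_filter.items()
        ):
            continue  # this row is excluded by the filter
        key = tuple(row[field] for field in row_fields)
        for i, field in enumerate(value_fields):
            totals[key][i] += row[field]
    return dict(totals)


first_half = {"January", "February", "March", "April", "May", "June"}
summary = pivot(rows, ["Year", "Group"], ["Units", "Revenue"],
                report_filter={"Month": first_half})
for key, (units, revenue) in sorted(summary.items()):
    print(key, units, revenue)
```

Passing ["Year", "Group", "Product", "Store"] as row_fields would reproduce the nesting order used in this example.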
Our completed PivotTable Field List dialog box is shown in Figure 4. The resulting
PivotTable is shown in Figure 5 and in the All Row Fields worksheet of the workbook
Groceriespt.xlsx. Before discussing the PivotTable, here's some advice on navigating
workbooks (like this one) that contain many worksheets. In the lower-left corner of your
screen (to the left of the worksheet names), you will see four arrows. Clicking the
leftmost arrow takes you to the first worksheet; clicking the rightmost arrow shows the
last worksheet; and clicking the other arrows moves you one worksheet to the left or
right.
Figure 4. Completed PivotTable Field List dialog box
To see the Field list, you need to be in a field in the PivotTable. If you do not see the Field
list, right-click any cell in the PivotTable and select Show Field List.
Our resulting PivotTable is in the All Row Fields worksheet. (See Figure 5.) In row 6, we see
that 233,161 units were sold for $702,395.82 in 2007. In row 30, we find that 2,719 units of
Ben and Jerry's ice cream were sold in the west store for $9,627.41 in 2007.
PivotTable layouts available in Excel 2007
The PivotTable layout shown in Figure 5 is called the compact form. In the compact form,
the Row fields are shown one on top of another. To change the layout, place your cursor
anywhere within the table, and on the Design tab, in the Layout group, click Report
Layout and choose one of the following: Show In Compact Form (see Figure 5), Show In
Outline Form (see Figure 6 and the Outline Form worksheet), or Show In Tabular Form
(Figure 7 and the Tabular Form worksheet).
Figure 6. The outline form
Cerealcollapse worksheet. Clicking the plus sign in cell A6 will bring back the detailed or
expanded view including all the cereals.
Figure 9. The cereal field collapsed
We can also expand or contract an entire field! To expand or contract an entire field, go to
any row containing a member of that field and select PivotTable Tools Options on the
Ribbon. Then click either the green Expand Entire Field button (labeled with a plus sign) or
the red Contract Entire Field button (labeled with a minus sign) from the Active Field group
on the Ribbon. (See Figure 10.)
Figure 10. The Expand Entire Field and Contract Entire Field buttons
For example, suppose you simply want to see for each year the sales by product group.
Pick any cell containing a group's name (for example, A6), select PivotTable Tools Options
on the Ribbon, and click the Collapse Entire Field button. You will see the result shown in
Figure 11 (Groups worksheet collapsed). Selecting the Expand Entire
Field button would bring us back to our original view.
Figure 11. The Group field collapsed
For another example of filtering, look at the file Ptcustomers.xlsx, shown in Figure 13.
The worksheet data contains for each customer transaction the customer number, amount
paid, and the quarter of the year in which payment was received. After dragging Customer
to the Row Labels box, Quarter to the Column Labels box, and Paid to the Values box, the
PivotTable shown in Figure 14 is displayed (see the Ptable worksheet in the
Ptcustomers.xlsx file).
Figure 13. The Customer PivotTable data
Naturally, we might like to show a list of just our top 10 customers. To obtain this layout,
simply click the Row Labels arrow and select Value Filters. Then choose Top 10 items to
obtain the resulting layout shown in Figure 15. Of course, by selecting Clear Filter, you
can return to the original layout.
Suppose you simply want to see the top customers that generate 50 percent of your
revenue. Select the Row Labels filtering icon, select Value Filters, Top 10, and fill in the
dialog box as shown in Figure 16.
Figure 16. Configuring the Top 10 Filter dialog box to show customers
generating 50 percent of revenue
The resulting PivotTable is in the Top half worksheet. (See Figure 17.) Thus, our top 14
customers generate a little more than half our revenue.
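The two filters just described can be sketched in a few lines of Python. This is only an approximation of the behavior described here, assuming the percentage filter keeps adding the largest customers until the requested share of total revenue is reached. The customer totals and the function names top_items and top_percent are hypothetical.

```python
# Hypothetical customer totals; the real figures are in Ptcustomers.xlsx.
paid = {"Customer 1": 500, "Customer 2": 300, "Customer 3": 150, "Customer 4": 50}


def top_items(totals, n):
    """Return the n customers with the largest totals, largest first."""
    return sorted(totals, key=totals.get, reverse=True)[:n]


def top_percent(totals, percent):
    """Return the fewest top customers whose combined total reaches
    at least `percent` percent of the grand total."""
    target = sum(totals.values()) * percent / 100
    chosen, running = [], 0
    for customer in sorted(totals, key=totals.get, reverse=True):
        if running >= target:
            break  # the requested share has already been reached
        chosen.append(customer)
        running += totals[customer]
    return chosen


print(top_items(paid, 2))
print(top_percent(paid, 60))
```

Because the loop stops only once the threshold is met, the selected customers usually account for slightly more than the requested percentage, which matches the "a little more than half" result above.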
Now let's suppose we want to sort our customers by their Quarter 1 revenue (see the
Sorted q1 worksheet). We right-click anywhere in the Quarter 1 column, point to Sort, and
then click Sort Largest To Smallest. (See Figure 18.) The resulting PivotTable is shown in
Figure 19. Note that Customer 13 paid us the most in Quarter 1, Customer 2 paid us the
second most, and so on.
Figure 18. Sorting on the Quarter 1 column
Excel makes it easy to visually summarize PivotTables by using PivotCharts. The key to
laying out the data the way you want it charted in a PivotChart is to use methods such as
sorting data and collapsing or expanding fields. In our grocery example, suppose we want
to summarize the trend over time of each food group's unit sales. (See the Chart 1
worksheet in the file Groceriespt.xlsx.) To do so, we move the Year field to the Column
Labels zone and remove Revenue from the Values zone. We also need to collapse the entire Group
field in the Row Labels zone. Now we are ready to create our first PivotChart. Simply click
anywhere inside the table and select Options, PivotChart. You can now pick the chart type
you want created. We chose the fourth Line Graph option, which displays the chart in
Figure 20. For example, the chart shows us that milk sales were highest in 2005 and
lowest in 2006.
how gender and age influence a person's travel expenditures? What can be concluded
about the type of person to whom they should mail the brochure?
To understand this data, we need to break it down into the following:
The data is included on the Data worksheet in the file Traveldata.xlsx. A sample of the
data is shown in Figure 25. For example, our first person is a 44-year-old
male who spent $997 on travel.
Figure 25. Travel agency data showing amount spent on travel, age, and gender
Let's first get a breakdown of spending by gender. To obtain this breakdown, we begin by
selecting Insert PivotTable. Excel extracts the range A2:D927. After clicking OK, we put the
cursor in the table so the field list appears. Next, we drag the Gender column to the Row
Labels zone and drag Amount Spent On Travel to the Values zone. This results in the
PivotTable shown in Figure 26.
Figure 26. PivotTable summarizing the total travel expenditures by gender
We can tell from the heading Sum Of Amount Spent On Travel that we are summarizing
the total amount spent on travel, but we actually want the average amount spent on
travel by men and women. To calculate these quantities, we double-click Sum Of Amount
Spent On Travel and then select Average from the Value Field Settings dialog box, shown in
Figure 27.
Figure 27. You can select a different summary function in the Value Field
Settings dialog box.
We find that, on average, people spend $908.13 on travel. Women spend an average of
$901.16, whereas men spend $914.99. This PivotTable indicates that gender has little
influence on the propensity to travel. By clicking the Row Labels arrow, you can show just
male or female results.
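The difference between the Sum and Average summary functions can be illustrated with a short Python sketch. The records below are assumptions for illustration: only the first one (a 44-year-old male who spent $997) comes from the text, and the function name summarize_by_gender is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (gender, age, amount spent). Only the first record comes from the text;
# the others are made up for illustration.
people = [("M", 44, 997), ("F", 30, 800), ("F", 61, 1000), ("M", 28, 1003)]


def summarize_by_gender(records, func):
    """Group amounts by gender and apply a summary function to each group."""
    groups = defaultdict(list)
    for gender, _age, spent in records:
        groups[gender].append(spent)
    return {gender: func(amounts) for gender, amounts in groups.items()}


totals = summarize_by_gender(people, sum)     # like Sum Of Amount Spent On Travel
averages = summarize_by_gender(people, mean)  # like Average Of Amount Spent On Travel
print(totals)
print(averages)
```

Totals depend on how many people fall in each group, so the average is the fairer basis for comparing typical spending by men and women.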
Now we want to see how age influences travel spending. To remove Gender from the
PivotTable, simply click Gender in the Row Labels portion of the PivotTable Field List and
remove it from the Row Labels area. Then, to break down spending by age, drag Age to
the Row Labels area. The PivotTable now appears as shown in Figure 29.
We find that age seems to have little effect on travel expenditures. In fact, this PivotTable
is pretty useless in its present state. We need to group data by age to see any trends. To
group our results by age, right-click anywhere in the Age column and choose Group. In the
Grouping dialog box, you can designate the interval by which to define an age group.
Using 10-year increments, we obtain the PivotTable shown in Figure 30.
Figure 30. Use the Group And Show Detail command to group detailed records.
We now find that 25-34 year olds on average spend $935.84 on travel, 55-64 year olds
spend $903.57 on travel, and so on. This information is more useful, but it still indicates
that people of all ages tend to spend about the same amount on travel. This view of our
data does not help determine to whom we should mail our brochure.
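Grouping by age amounts to assigning each age to a fixed-width interval before averaging. The sketch below assumes groups that start at age 25 and span 10 years, which is consistent with the 25-34 and 55-64 groups mentioned above. The function names and the sample values are hypothetical, apart from the 44-year-old who spent $997.

```python
from collections import defaultdict
from statistics import mean


def age_group(age, start=25, width=10):
    """Label the interval containing age, e.g. 44 -> '35-44'.

    start and width are assumptions that mirror the starting value and
    increment entered in the Grouping dialog box.
    """
    lower = start + (age - start) // width * width
    return f"{lower}-{lower + width - 1}"


def average_by_age_group(records):
    """Average the amount spent within each age group."""
    groups = defaultdict(list)
    for age, spent in records:
        groups[age_group(age)].append(spent)
    return {label: mean(amounts) for label, amounts in groups.items()}


# (age, amount spent); made-up values except the 44-year-old who spent $997.
sample = [(25, 900), (34, 1000), (44, 997), (60, 850)]
averages = average_by_age_group(sample)
print(averages)
```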
Finally, let's get a breakdown of average travel spending by age, for men and women
separately. All we have to do is drag Gender to the Column Labels zone of the Field List,
resulting in the PivotTable shown in Figure 31.
Figure 31. Age/gender breakdown of travel spending
Now we're cooking! We see that as age increases, women spend more on travel and men
spend less. Now we know who should get the brochure: older women and younger men. As
one of my students said, "That would be some kind of cruise!"
A graph provides a nice summary of our analysis. After moving the cursor inside the
PivotTable and choosing PivotChart, we select the fourth option from Column Graphs. The
result is the chart shown in Figure 32. If you want to edit the chart further, select
PivotChart Tools. Then, for example, if you choose Layout, you can add titles to the chart
and axes and make other changes.
Figure 32. PivotChart for the age/gender travel expenditure breakdown
We see that each age group spends approximately the same on travel, but as age
increases, women spend more than men. If you want to use a different type of chart, you
can change the chart type by right-clicking the PivotChart and then choosing Chart Type.
Notice that the bars showing expenditures by males decrease with age, and the bars
representing the amount spent by women increase with age. We can see why the
PivotTables that showed only gender and age data failed to unmask this pattern. Because
half our sample population are men and half are women, we found that the average
amount spent by people does not depend on age. (Notice that the average height of
the two bars for each age is approximately the same.) We also found that the average
amount spent for men and women was approximately the same. We can see this because,
averaged over all ages, the blue and red bars have approximately equal heights. Slicing
and dicing our data simultaneously across age and gender does a much better job of
showing us the real information.
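A small Python sketch shows how one-way breakdowns can hide a pattern that a two-way breakdown reveals. The records are made up so that the effect is exact. They are not taken from Traveldata.xlsx, and the function name average_by is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up records constructed so that the pattern appears only in the
# two-way breakdown; they are not taken from Traveldata.xlsx.
records = [
    {"Gender": "F", "Age": "25-34", "Spent": 700},
    {"Gender": "M", "Age": "25-34", "Spent": 1100},
    {"Gender": "F", "Age": "55-64", "Spent": 1100},
    {"Gender": "M", "Age": "55-64", "Spent": 700},
]


def average_by(records, fields, value="Spent"):
    """Average `value` for each combination of the given fields."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in fields)
        groups[key].append(record[value])
    return {key: mean(values) for key, values in groups.items()}


by_gender = average_by(records, ["Gender"])      # identical averages
by_age = average_by(records, ["Age"])            # identical averages
by_both = average_by(records, ["Age", "Gender"])  # opposite trends appear
print(by_gender, by_age, by_both, sep="\n")
```

In this constructed sample, both one-way averages are 900, yet the two-way table shows women's spending rising with age and men's spending falling.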
I'm doing market research about Volvo Cross Country Wagons. I need to determine what
factors influence the likelihood that a family will purchase a station wagon. From
information in a large sample of families, I know the family size (large or small) and the
family income (high or low). How can I determine how family size and income influence
the likelihood that a family will purchase a station wagon?
In the file Station.xlsx, you can find the following information:
A sample of the data is shown in Figure 33. For example, the first family listed is a small,
high-income family that did not buy a station wagon.
Figure 33. Data collected about income, family size, and the purchase of a
station wagon
We want to determine how family size and income influence the likelihood that a family
will purchase a station wagon. The trick is to look at how income affects purchases for
each family size and how family size affects purchases for each income level.
To begin, we choose Insert PivotTable, and then select our data (the cell range B2:D345).
Using the PivotTable Field List, we drag Family Size and then Salary to the Row Labels area, Station Wagon
to the Column Labels area, and any of the three fields to the Values area. The result is
the PivotTable shown in Figure 34. Notice that Excel has chosen to summarize the data
appropriately by counting the number of observations in each category. For example, 34
high-salary, large families did not buy a station wagon, whereas 100 high-salary, large
families did buy one.
Figure 34. Summary of station wagon ownership by family size and salary
We would like to know, for each row in the PivotTable, the percentage of families that
purchased a station wagon. To display the data in this format, we right-click anywhere in
the PivotTable data and then choose Value Field Settings, which displays the Value Field
Settings dialog box. In the dialog box, click Show Values As, and then select % Of Row in
the Show Values As list. We now obtain the PivotTable shown in Figure 35.
Figure 35. Percentage breakdown of station wagon ownership by income for
large and small families
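The % Of Row setting divides each count by the total of its row. Here is a minimal Python sketch of that calculation. Only the first row of counts (34 and 100) comes from the text; the other rows and the function name percent_of_row are assumptions for illustration.

```python
# Counts of families by (family size, salary) and purchase decision.
# Only the first row (34 and 100) comes from the text; the other rows
# are made up for illustration.
counts = {
    ("Large", "High"): {"No": 34, "Yes": 100},
    ("Large", "Low"): {"No": 20, "Yes": 60},
    ("Small", "High"): {"No": 90, "Yes": 10},
}


def percent_of_row(table):
    """Express every cell as a percentage of its row total."""
    result = {}
    for row_label, cells in table.items():
        row_total = sum(cells.values())
        result[row_label] = {
            column: 100 * count / row_total for column, count in cells.items()
        }
    return result


percentages = percent_of_row(counts)
for row_label, cells in percentages.items():
    print(row_label, {column: round(pct, 1) for column, pct in cells.items()})
```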
From Figure 35, we learn that for both large and small families, income has little effect on
whether the family purchases a station wagon. Now we try to determine how family size
affects the propensity to buy a station wagon for high-income and low-income families. To
do this, we move Salary above Family Size in the Row Labels zone, resulting in the
PivotTable shown in Figure 36.
Figure 36. Breakdown of station wagon ownership by family size for high and
low salaries
From this table, we learn that for high-income families, a large family is much more likely
to buy a station wagon than a small family. Similarly, for low-income families, a large
family is also more likely to purchase a wagon than a small family. The bottom line is that
family size has a much greater effect on the likelihood that a family will purchase a station
wagon than does income.
I work for a manufacturer that sells microchips globally. I'm given monthly actual and
predicted sales for Canada, France, and the United States for Chip 1, Chip 2, and Chip 3.
I'm also given the variance, or difference, between actual and budgeted revenues. For
each month and each combination of country and product, I'd like to display the following
data: actual revenue, budgeted revenue, actual variance, actual revenue as a percentage
of annual revenue, and variance as a percentage of budgeted revenue. How can I display
this information?
In this scenario, you are a finance manager for a microchip manufacturer. You sell your
products in different countries and at different times. PivotTables can help you summarize
your data in a format that's easily understood.
The file Ptableexample.xlsx includes monthly actual and predicted sales during 1997 of
Chip 1, Chip 2, and Chip 3 in Canada, France, and the United States. The file also contains
the variance, or difference, between actual revenues and budgeted revenues. A sample of
the data is shown in Figure 37. For example, in the U.S. in January, sales of Chip 1 totaled
$4,000, although sales of $5,454 were predicted. This yielded a variance of -$1,454.
Figure 37. Chip data from different countries and different months, showing actual, budget,
and variance revenues
For each month and each combination of country and product, we would like to display the
following data:
Actual revenue
Budgeted revenue
Actual variance
Actual revenue as a percentage of annual revenue
Variance as a percentage of budgeted revenue
To begin, select a cell within the range of data we're working with (remember that the first
row must include headings) and then choose Insert PivotTable. Excel automatically
determines that our data is in the range A1:F208.
If we drag Month to the Row Labels area, Country to the Column Labels area, and Revenue
to the Values area, for example, we obtain the total revenue each month by country. A
field you add to the Report Filter area (Product, for example) lets you filter your PivotTable
by using values in that field. By adding Product to the Report Filter area, we can view sales
of only Chip 1 by month for each country. Given that we want to be able to show data for
any combination of country and product, we should add Month to the Row Labels area of
the PivotTable and both Country and Product to the Report Filter area. Next, we drag Var,
Revenue, and Budget to the Values zone. We have now created the PivotTable that is
shown in Figure 38.
Figure 38. Monthly summary of revenue, budget, and variances
For example, in January, total revenue was $87,534 and total budgeted sales were
$91,831, so our actual sales fell $4,297 short of the forecast.
We want to determine the percentage of revenue earned during each month. We again
drag Revenue from the field list to the Values area of the PivotTable. Right-click in this
data column, and then choose Value Field Settings. In the Value Field Settings dialog box,
click Show Values As. In the Show Values As list, select % Of Column and rename this field
as Sum Of Revenue2, as shown in Figure 39.
Figure 39. Creating each month's percentage of annual revenue
We now obtain the PivotTable shown in Figure 40. January sales provided
8.53 percent of revenue. Total revenue for the year was $1,026,278.
Figure 40. Monthly revenue breakdown
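The % Of Column setting divides each month's revenue by the column total. The Python sketch below checks the January figure. The January revenue and the annual total are taken from the text, while lumping the other eleven months into a single entry is an assumption made because their individual values are not listed here.

```python
# January revenue and the annual total come from the text. The remaining
# eleven months are lumped together (an assumption) because their
# individual values are not listed here.
annual_total = 1_026_278
revenue = {"January": 87_534, "February-December": annual_total - 87_534}


def percent_of_column(column):
    """Express every entry as a percentage of the column total."""
    total = sum(column.values())
    return {label: 100 * value / total for label, value in column.items()}


shares = percent_of_column(revenue)
print(round(shares["January"], 2))
```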
button to add a field to the formula. After clicking Add and then OK, we obtain the
PivotTable shown in Figure 42.
Figure 41. Creating a calculated field
Figure 42. The PivotTable with calculated field for variance percentage
Thus, in January, our sales were 4.7 percent lower than budgeted. By displaying the Insert
Calculated Field dialog box again, you can modify or delete a calculated field.
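A calculated field is evaluated on the summed fields, not row by row. The Python sketch below assumes the calculated field is defined as Var divided by Budget. The first row uses the U.S. Chip 1 January figures from the text, and the second row is a made-up remainder chosen so that the sums match the January totals quoted earlier (revenue of $87,534 and budget of $91,831).

```python
# Two illustrative January rows. The first uses the U.S. Chip 1 figures from
# the text; the second is a made-up remainder chosen so that the sums match
# the January totals (revenue 87,534 and budget 91,831).
rows = [
    {"Revenue": 4_000, "Budget": 5_454},
    {"Revenue": 83_534, "Budget": 86_377},
]
for row in rows:
    row["Var"] = row["Revenue"] - row["Budget"]


def calculated_field(rows, numerator, denominator):
    """Divide the sum of one field by the sum of another."""
    return sum(r[numerator] for r in rows) / sum(r[denominator] for r in rows)


var_pct = calculated_field(rows, "Var", "Budget")
mean_of_ratios = sum(r["Var"] / r["Budget"] for r in rows) / len(rows)
print(round(100 * var_pct, 1))
print(round(100 * mean_of_ratios, 1))
```

The two printed values differ because averaging the row-level ratios weights a small row as heavily as a large one, whereas the ratio of sums reflects total variance against total budget.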
Using the Report Filter
To see sales of Chip 2 in France, for example, you can select the appropriate values from
the Product and Country fields in the Report Filter area. With Chip 2 and France selected, we
would see the PivotTable shown in Figure 43.
Figure 43. Sales of Chip 2 in France
Figure 44. Grouping items together for January, February, and March
With numerical values or dates in a row field, you can group by number or dates in
arbitrary intervals. For example, you can create groups for age ranges and then find
the average income for all 25-34 year olds.
"Drilling down" is when you double-click a cell in a PivotTable to display all the detailed
data that's summarized in that field. For example, double-clicking any March entry in the
microchip scenario will display the data that's related to March sales.
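Conceptually, drilling down returns the source rows that contribute to a summarized cell. A minimal Python sketch, with made-up rows and a hypothetical function name drill_down, might look like this:

```python
# Made-up source rows for illustration; the field names follow the
# microchip example, but the values are hypothetical.
rows = [
    {"Month": "March", "Country": "France", "Product": "Chip 1", "Revenue": 100},
    {"Month": "March", "Country": "Canada", "Product": "Chip 2", "Revenue": 250},
    {"Month": "April", "Country": "France", "Product": "Chip 1", "Revenue": 300},
]


def drill_down(rows, **labels):
    """Return every source row whose fields match all the given labels."""
    return [
        row for row in rows
        if all(row[field] == value for field, value in labels.items())
    ]


march_rows = drill_down(rows, Month="March")
print(march_rows)
```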
I often have to use specific data in a PivotTable to determine profit, such as the April sales
of Chip 1 in France. Unfortunately, this data moves around when new fields are added to
my PivotTable. Does Excel have a function that enables me to always pull April's Chip 1
sales in France from the PivotTable?
Yes, there is such a function. The GETPIVOTDATA function fills the bill. Suppose that you
want to extract sales of Chip 1 in France during April from the PivotTable contained in the
file Getpivotdata.xlsx. (See Figure 45.) Entering in cell E2 the formula
=GETPIVOTDATA(A4,"April France Chip 1 Sum of Revenue") yields the correct value
($37,600) even if additional products, countries, and months are added to the PivotTable
later. We can also obtain the resulting revenue by simply pointing to the cell containing
Chip 1 April sales in France (cell D24).
Figure 45. Use the GETPIVOTDATA function to locate April Chip 1 Sales in France.
The first argument for this function is the cell in the upper-left corner of the PivotTable (cell A4).
We enclose in quotation marks (separated by spaces) the PivotTable headings that define
the entry we want. The last entry must specify the data field, but other headings can be
listed in any order. Thus, our formula means "For the PivotTable whose upper-left corner is
in cell A4, find the Sum of Revenue for Chip 1 in France during April." This formula will
return the correct answer even if the sales data for Chip 1 in France in April moves to a
different location in the PivotTable.
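The idea behind GETPIVOTDATA, looking a value up by its labels rather than by its position, can be sketched in Python. This is an analogy under stated assumptions, not Excel's implementation: the dictionary cells and the function get_pivot_data are hypothetical, the 37,600 and 1,026,278 values come from the text, and the Chip 2 entry is made up.

```python
# Cells of a hypothetical PivotTable, keyed by their labels rather than by
# their position. The 37,600 and 1,026,278 values come from the text; the
# Chip 2 entry is made up.
cells = {
    frozenset({"April", "France", "Chip 1"}): 37_600,
    frozenset({"April", "France", "Chip 2"}): 12_000,
    frozenset(): 1_026_278,  # no labels: the grand total
}


def get_pivot_data(cells, *labels):
    """Look up a summarized value by its labels, listed in any order."""
    return cells[frozenset(labels)]


print(get_pivot_data(cells, "France", "Chip 1", "April"))
print(get_pivot_data(cells))
```

Because the lookup depends only on the labels, adding more products, countries, or months to cells does not change the result, which is the property that makes GETPIVOTDATA robust to layout changes.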
If you want to simply return total revenue ($1,026,278), you could enter the formula (see
cell F2) =GETPIVOTDATA(A4,"Sum of Revenue").
Often, the GETPIVOTDATA function is a nuisance. Suppose you want to refer to the PivotTable
data in cells B5:B11 from elsewhere in your workbook. You would probably enter the formula
=B5 and copy it down, expecting the copies to refer to B6, B7, ..., B11.
Unfortunately, if the GETPIVOTDATA option is active, you will get a bunch of
GETPIVOTDATA functions that all refer to the same cell. To turn off GETPIVOTDATA,
click the Microsoft Office Button and click Excel Options. Then select Formulas,
and under Working With Formulas, clear the Use GetPivotData Functions For PivotTable
References check box. This ensures that clicking inside a PivotTable yields a formula like =B5
rather than a GETPIVOTDATA function.
Problems
1. Contoso, Ltd. produces microchips. Five types of defects (labeled 1-5) have been
known to occur. Chips are manufactured by two operators (A and B) using four
machines (1-4). You are given data about a sample of defective chips, including the
type of defect, the operator, machine number, and day of the week the defect
occurred. Use this data to chart a course of action that would lead, as quickly as
possible, to improved product quality. You should use the PivotTable Wizard to
"stratify" the defects with respect to type of defect, day of the week, machine used,
and operator working. You might even want to break down the data by machine,
operator, and so on. Assume that each operator and machine made an equal
number of products. You'll find this data in the file Contoso.xlsx.
2. You own a fast food restaurant and have done some market research in an attempt
to better understand your customers. For a random sample of customers, you are
given the income, gender, and number of days per week that residents go out for
fast food. Use this information to determine how gender and income influence the
frequency with which a person goes out to eat fast food. The data is in the file
Macdonalds.xlsx.
3. The file Makeupdb.xlsx contains information about the sales of makeup products.
For each transaction, you are given the following information:
Name of salesperson
Date of sale
Product sold
Units sold
Transaction revenue
Using your answer to the previous question, create a function that always
yields Jen's lipstick sales.
Total revenue by salesperson and year. (Hint: You will need to group the data
by year.)
4. For the years 1985-1992, you are given monthly interest rates on bonds that pay
money one year after the day they're bought. It's often suggested that interest rates
are more volatile (tend to change more) when interest rates are high. Does the data
in the file Intratevol-volatility.xlsx support this statement? Hint: PivotTables can
display standard deviations.
5. For our grocery example, prepare a chart that summarizes the trend over time of the
sales at each store.
6. For our grocery example, create a calculated field that computes an average per unit
price received for each product.
7. For the grocery example, create a PivotChart that summarizes the sales of each
product at each store for the years 2005 and 2006.
8. In the customer PivotTable example, show the top 15 customers in one table and the
bottom 5 customers in another table.