Bookbinders Book Club Case: Assignment 5: Identifying Target Customers (Individual Assignment)

This document describes a case study involving Bookbinders Book Club (BBB Club) and their efforts to use predictive modeling to identify target customers. BBB Club conducted an experiment sending a brochure for a new art book to 20,000 customers. The data from this experiment contains variables like gender, purchase history, and book categories purchased. BBB Club wants to use logistic regression to predict the probability customers will purchase the art book based on these variables. The document provides the data and modeling instructions. It asks the student to estimate a logistic regression model, interpret the coefficients, apply the model to new data to generate scores and ranks, and create a lift curve to evaluate model performance.


Marketing Research

Professor Thomadsen

Assignment 5: Identifying Target Customers (individual assignment)

Due: online before class

4 page maximum (strict).

Bookbinders Book Club Case


by Gary L. Lilien & Arvind Rangaswamy

Introduction
About 50,000 new titles, including new editions, are published in the United States each year,
giving rise to a $20+ billion book publishing industry. About 10 percent of the books are sold
through mail order. Book retailing in the 1970s was characterized by the growth of chain
bookstore operations in concert with the development of shopping malls. Traffic in bookstores in
the 1980s was enhanced by the spread of discounting. In the 1990s, the superstore concept of
book retailing was responsible for the double-digit growth of the book industry. Generally
situated near large shopping centers, superstores maintain large inventories of anywhere from
30,000 to 80,000 titles. Superstores are putting intense competitive pressure on book clubs, mail-
order firms and retail outlets. Recently, online superstores, such as www.amazon.com, have
emerged, carrying 1–2.5 million titles and further intensifying the pressure on book clubs and
mail-order firms. In response to these pressures, book clubs are starting to look at alternative
business models that will make them more responsive to their customers’ preferences.

Historically, book clubs offered their readers continuity and negative option programs that were
based on an extended contractual relationship between the club and its subscribers. In a
continuity program, popular in such genres as children’s books, a reader signs up for an offer of
several books for a few dollars each (plus shipping and handling on each book) and agrees to
receive a shipment of one or two books each month thereafter. In a negative option program,
subscribers get to choose which and how many additional books they will receive, but the default
option is that the club’s selection will be delivered to them each month. The club informs them
of the monthly selection and they must mark “no” on their order forms if they do not want to
receive it. Some firms are now beginning to offer books on a positive-option basis, but only to
selected segments of their customer lists that they deem receptive to specific offers.

Book clubs are also beginning to use database marketing techniques to work smarter rather than
expand the coverage of their mailings. According to Doubleday president Marcus Willhelm,
“The database is the key to what we are doing…. We have to understand what our customers
want and be more flexible. I doubt book clubs can survive if they offer the same 16 offers, the
same fulfillment to everybody.” Doubleday uses modeling techniques to look at more than 80
variables, including geography and the types of books customers purchase, and selects three to
five variables that are the most influential predictors.

The Bookbinders Book Club


The BBB Club was established in 1986 for the purpose of selling specialty books through direct
marketing. BBBC is strictly a distributor and does not publish any of the books it sells. In
anticipation of using database marketing, BBBC made a strategic decision right from the start to
build and maintain a detailed database about its members containing all the relevant information
about them. Readers fill out an insert and return it to BBBC which then enters the data into the
database. The company currently has a database of 500,000 readers and sends out a mailing
about once a month.

BBBC is exploring whether to use predictive modeling approaches to improve the efficacy of its
direct mail program. For a recent mailing, the company selected 20,000 customers in
Pennsylvania, New York and Ohio from its database and included with their regular mailing a
specially produced brochure for the book The Art History of Florence. BBBC then developed a
database to calibrate a response model to identify the factors that influenced these purchases. For
this case analysis, we will use a subset of the database available to BBBC.

The dependent variable for the analysis is Choice: purchase or no purchase of
The Art History of Florence. BBBC also selected several independent variables
that it thought might explain the observed choice behavior. Below is a
description of the variables used for the analysis:

Choice (y): Whether the customer purchased The Art History of
Florence. 1 corresponds to a purchase and 0 corresponds to a
non-purchase.
Gender: 0 = Female and 1 = Male.
Amount purchased: Total money spent on BBBC books.
Frequency: Total number of purchases in the chosen period (used as
a proxy for frequency).
Last purchase (recency of purchase): Months since last purchase.
First purchase: Months since first purchase.
P_Child: Number of children’s books purchased.
P_Youth: Number of youth books purchased.
P_Cook: Number of cookbooks purchased.
P_DIY: Number of do-it-yourself books purchased.
P_Art: Number of art books purchased.

Data: The Excel data file for this homework is available from the class web site. The data file
has two worksheets, one called “Estimation Sample” and the other called “Validation Sample”.

BBB Club’s management wants to know to whom it should send a specially produced brochure
promoting a new art book, The Art History of Florence. Thus, it runs an experiment, as we
discussed in class. You will work through the analysis.

Question 1: Estimation using the Estimation Sample


Estimate a logistic regression model predicting probability of response as a function of all the
available variables:
• Gender
• Amt_purchased
• Frequency
• Last_Purchase
• First_purchase
• P_Child
• P_Youth
• P_Cook
• P_DIY
• P_Art
To do this, use the method noted in the class notes. If you are having trouble, you can use the
manual estimation on the first page, using Solver to maximize the sum of the log probabilities
(the log-likelihood), which is in cell Q5 of the Excel worksheet. (If any of this is unclear, let me
know. Hint: your coefficients should match those in cells A2:K2 of the ‘Validation Sample’ tab.)
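
For readers who want to see what the Solver step is doing, here is a minimal Python sketch of the same idea: choose the coefficients that maximize the sum of log probabilities. It is not the method from the class notes. The data are synthetic stand-ins (two made-up predictors, not the BBBC variables), and the names fit_logit, log_likelihood and true_beta are hypothetical. The sketch uses Newton-Raphson rather than Excel's Solver.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Sum of log probabilities of the observed choices (the quantity to maximize)."""
    Xd = np.column_stack([np.ones(len(X)), X])   # add an intercept column
    z = Xd @ beta                                # score for each customer
    return float(np.sum(y * z - np.log1p(np.exp(z))))

def fit_logit(X, y, n_iter=25):
    """Maximize the log-likelihood with Newton-Raphson steps (assumed optimizer)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))        # predicted purchase probabilities
        grad = Xd.T @ (y - p)                         # gradient of the log-likelihood
        info = (Xd * (p * (1.0 - p))[:, None]).T @ Xd  # negative Hessian
        beta = beta + np.linalg.solve(info, grad)
    return beta

# Synthetic stand-in data: NOT the BBBC estimation sample.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
true_beta = np.array([-1.0, 0.8, -0.5])   # hypothetical intercept and slopes
p_true = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = (rng.random(n) < p_true).astype(float)

beta_hat = fit_logit(X, y)
print("estimated coefficients:", np.round(beta_hat, 3))
print("log-likelihood at estimate:", round(log_likelihood(beta_hat, X, y), 3))
```

With the real estimation sample, X would hold the ten predictor columns and y the Choice column. The estimates should then agree with the spreadsheet's coefficients up to rounding.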

Look at your logistic regression results. Report only the items that are requested below.
(a) (20 points) Report: the estimated scoring equation (3 decimal places are sufficient for each
parameter). Note: keep all of the estimated coefficients. Do not drop statistically insignificant
coefficients. You may go to https://astatsa.com/Logit_Probit/ and put in the data from the
.csv file, with Choice listed in the first column.
(b) (20 points) Report: Interpret the sign of each coefficient: Specifically, note which variables
increase the probability that the customer makes a purchase and which decrease this
probability. Report: Do the signs on the coefficients match your intuition of which direction
you would expect them to go? Why or why not? Hint: you do not need to compute any
probabilities because higher scores imply higher probabilities.
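
The hint in (b) rests on how a score maps to a probability in the standard logistic model. The notation below is assumed for illustration and is not taken from the class notes: \beta_0 is the intercept and \beta_k is the coefficient on variable x_k.

```latex
\begin{aligned}
\text{score} &= \beta_0 + \sum_{k} \beta_k x_k, \\
P(\text{Choice} = 1 \mid x) &= \frac{1}{1 + e^{-\text{score}}}, \\
\frac{\partial P}{\partial x_k} &= \beta_k \, P \, (1 - P).
\end{aligned}
```

Because P(1 - P) is always positive, the sign of \beta_k alone determines whether an increase in x_k raises or lowers the purchase probability, holding the other variables fixed.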

Question 2: Applying the scoring model to the Validation Sample


These steps follow the same process we used in our in-class American Express exercise. Unless a
particular step says “report”, you do not need to report anything in your write-up. To complete
this question, use the file HW5BookbindersBookClub.xlsx

(c) From the scoring equation obtained above, compute the score for each person in the
Validation Sample. Put this score in column M. Hint: the coefficients are already entered
for you in the yellow area above their respective variables. Note: any small differences in
these parameters and your estimates are from rounding error – the estimates in the
spreadsheet are the right ones.
(d) Sort the 1200 prospects in the Validation Sample in decreasing order of score.
(e) The position of a person on this sorted list will be referred to as his/her Rank. Create a
1,2,3,… column with the rank of each person in column N. Hint: there are many ways to do
this easily. One way is to type 1 into the first row, and then in the second row, put =1+N4.
Then fill down from cell N5. If you are having trouble on this step, ask me for help rather than
manually typing in the numbers.

(f) Calculate the percentage of people targeted in column O, by taking the rank in column N, and
dividing by 1200.
(g) (10 points) Compute the cumulative sum of actual purchases in the sorted list in column P.
Use the same approach as we used in step 3 of the AmEx in-class exercise. Report: How
many total purchases were there in the validation sample?
(h) Compute percent cumulative response by dividing the cumulative sum of actual responses by
the total number of responses in the entire validation sample in Column Q.
(i) (20 points) Create a plot with the percentage of people targeted on the horizontal axis and the
percent cumulative response on the vertical axis. This is the Lift Curve, and it should look
similar to the jagged line on page 5 of the AmEx in-class exercise. Add the 45-degree line
reflecting a random mailing by adding a series into your graph with column O plotted against
itself. Report: The plot. Comment on this plot. Does this scoring rule have meaningful lift?
(j) Assume that the net profit from each successful sale of The Art History of Florence is $10.20.
Assume the solicitation cost is $1, covering mailing and brochure costs. For each rank level,
compute the profit from contacting people up to and including that rank.
(k) (20 points) Create a plot with the Score on the horizontal axis and the profit on the vertical
axis. Report: The plot. What is the cutoff score that maximizes profits? What fraction of
consumers meet this threshold in the Validation sample?
(l) (10 points) Report: Would you mail to this person?
Gender = 0
Amt_Purchased = 200
Frequency = 10
Last_Purchased = 1
First_Purchased = 10
P_Child = P_Youth = P_Cook = P_DIY = 0
P_Art = 1
Why or why not? Show your calculations and support your answer.
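
The spreadsheet steps (c) through (k) can also be sketched in Python, as a cross-check on the column logic rather than a replacement for the Excel work. Everything numeric here is a hypothetical stand-in: the four toy customers, the intercept and the two coefficients are invented for illustration, and the names compute_score and lift_and_profit are not from the assignment. Only the $10.20 margin and $1 solicitation cost come from step (j). The profit formula assumes that every contacted person incurs the $1 cost and that each purchase earns $10.20.

```python
import numpy as np

MARGIN = 10.20  # net profit per successful sale, from step (j)
COST = 1.00     # solicitation cost per person contacted, from step (j)

def compute_score(X, intercept, coefs):
    """Step (c): linear score from the scoring equation."""
    return intercept + X @ coefs

def lift_and_profit(score, purchased):
    order = np.argsort(-score, kind="stable")       # (d) sort by decreasing score
    s, y = score[order], purchased[order]
    n = len(s)
    rank = np.arange(1, n + 1)                      # (e) rank 1, 2, 3, ...
    pct_targeted = rank / n                         # (f) share of list contacted
    cum_purchases = np.cumsum(y)                    # (g) cumulative purchases
    pct_cum_response = cum_purchases / cum_purchases[-1]  # (h) share of all buyers reached
    profit = MARGIN * cum_purchases - COST * rank   # (j) profit from mailing down to each rank
    best = int(np.argmax(profit))                   # (k) rank with the highest profit
    return {
        "pct_targeted": pct_targeted,
        "cum_purchases": cum_purchases,
        "pct_cum_response": pct_cum_response,
        "profit": profit,
        "best_rank": int(rank[best]),
        "cutoff_score": float(s[best]),
        "fraction_mailed": float(pct_targeted[best]),
    }

# Toy data: four hypothetical customers with two hypothetical predictors.
X = np.array([[2.0, 0.0], [8.0, 1.0], [0.0, 0.0], [6.0, 1.0]])
purchased = np.array([0, 1, 0, 1])
score = compute_score(X, intercept=-1.0, coefs=np.array([0.5, -1.0]))

out = lift_and_profit(score, purchased)
print("lift curve points:", list(zip(out["pct_targeted"], out["pct_cum_response"])))
print("profit by rank:", np.round(out["profit"], 2))
print("cutoff score:", out["cutoff_score"], "fraction mailed:", out["fraction_mailed"])
```

Plotting pct_cum_response against pct_targeted gives the lift curve, and plotting pct_targeted against itself gives the 45-degree random-mailing line. A single prospect, such as the one in (l), can be scored with the same compute_score call and compared with the cutoff score.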

To summarize: provide answers to questions (a), (b), (g), (i), (k) and (l) only.
