Confounding and Interaction
in Regression
11-1 Preview
Two different goals of a regression analysis are (1) to predict the dependent variable using a set of
independent variables and (2) to quantify the relationship of one or more independent variables to a
dependent variable. These goals differ because the first focuses on finding a model that fits the observed
data and predicts future data as well as possible, whereas the second pertains to producing accurate
estimates of one or more regression coefficients in the model. The second goal, moreover, is of particular
interest when the research question concerns disease etiology, such as trying to identify one or more
determinants of a disease or other health-related outcome.
Confounding and interaction are two methodological concepts relevant to attaining the second goal.
In this chapter, we describe these concepts using regression terminology. More general discussion of this
subject can be found elsewhere (e.g., Kleinbaum, Kupper, and Morgenstern, 1982) within the context of
epidemiological research, which typically addresses etiologic questions involving the second goal above.
We begin here with a general overview of these concepts, after which we discuss the regression
formulation of each concept separately. In Chapter 15 we shall describe a popular regression procedure,
analysis of covariance (ANACOVA), which may be used to adjust or correct for problems of confounding.
Subsequently, in Chapter 16, we shall briefly describe a strategy for obtaining a "best" regression model
that incorporates the assessment of both confounding and interaction.
11-2 An Overview
Confounding and interaction, though different concepts, both involve the assessment of an
association between two or more variables so that additional variables that may affect this association
are accounted for. The measure of association that is chosen usually depends on the types of variables involved; when the variables are continuous, as in the
classic regression context, the measure of association will typically be a regression coefficient. The
additional variables to be considered are synonymously referred to as extraneous variables, control
variables, or covariates. The essential question concerning these variables is whether and how they
should be incorporated into a model with which the association of interest can be estimated.
In more practical terms, suppose we consider a study to assess whether physical activity level
(PAL) is associated with systolic blood pressure (SBP), accounting (i.e., controlling) for AGE. The
extraneous variable here is AGE. We need to determine whether we can ignore AGE in our analysis and
still correctly assess the PAL—SBP association. In particular, we need to address the following two
questions: (1) Is the estimate of the association between PAL and SBP meaningfully different depending
on whether we ignore AGE? (2) Is the estimate of the association between PAL and SBP meaningfully
different for different values of AGE? The first question is concerned with confounding, the second
question with interaction.
In general, confounding exists if meaningfully different interpretations of the relationship of
interest result when an extraneous variable is ignored or included in the data analysis. In practice, the
assessment of confounding requires a comparison between a crude estimate of an association (which
ignores the extraneous variable(s) of interest) and an adjusted estimate of association (which accounts in
some way for the extraneous variables). If the crude and adjusted estimates are meaningfully different,
then we say that confounding is present and one or more extraneous variables must be included in our
data analysis. Note that this definition does not require a statistical test but rather a comparison of
estimates obtained from the data (see Kleinbaum, Kupper, and Morgenstern, 1982, chap. 13, for further
discussion of this point).
For example, using the above illustration, a crude estimate of the relationship between PAL and SBP
(ignoring AGE) is given by the regression coefficient, say β̂1, of the variable PAL in the straight-line
model that predicts SBP using just PAL. In contrast, an adjusted estimate is given by the regression
coefficient, say β̂1*, of the same variable, PAL, in the multiple regression model that predicts SBP using both
PAL and AGE. In particular, if PAL is defined dichotomously (e.g., PAL = 1 or 0 for high or low physical
activity, respectively), then the crude estimate is simply the crude difference between the mean systolic
blood pressures in each physical activity group, and the adjusted estimate represents an adjusted difference
in these two mean systolic blood pressures that controls for AGE. In general, confounding is present if
there is any meaningful difference between the crude and adjusted estimates.
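To make the crude-versus-adjusted comparison concrete, the following sketch fits both models to a small set of made-up numbers (the data and coefficients are hypothetical, chosen so that AGE confounds the PAL-SBP association):

```python
import numpy as np

# Hypothetical noise-free data: SBP = 120 + 0.5*AGE - 5*PAL, with the
# low-activity (PAL = 0) group deliberately older so that AGE confounds PAL.
PAL = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
AGE = np.array([30.0, 40.0, 50.0, 50.0, 60.0, 70.0])
SBP = 120 + 0.5 * AGE - 5 * PAL

# Crude estimate: straight-line model predicting SBP from PAL alone
X_crude = np.column_stack([np.ones(6), PAL])
b_crude, *_ = np.linalg.lstsq(X_crude, SBP, rcond=None)

# Adjusted estimate: multiple regression model with both PAL and AGE
X_adj = np.column_stack([np.ones(6), PAL, AGE])
b_adj, *_ = np.linalg.lstsq(X_adj, SBP, rcond=None)

crude_effect = b_crude[1]      # crude difference in group mean SBP: -15
adjusted_effect = b_adj[1]     # PAL effect controlling for AGE: -5
```

Here the crude estimate (-15) differs meaningfully from the adjusted estimate (-5), so by the definition above AGE would be labeled a confounder in this fabricated example.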
Interaction is the condition where the relationship of interest is different at different levels (i.e.,
values) of the extraneous variable(s). In contrast to confounding, the assessment of interaction does not
consider either a crude estimate or an (overall) adjusted estimate, but instead focuses on describing the
relationship of interest at different values of the extraneous variables. For example, in assessing
interaction due to AGE in describing the PAL-SBP relationship, the issue is whether some description
(i.e., estimate) of this relationship varies with different values of AGE (e.g., whether the relationship is
strong at older ages and weak at younger ages). If the PAL—SBP relationship does vary with AGE, then
we say that there is an AGE x (read "by") PAL interaction. To assess interaction a statistical test may be
employed in addition to subjective evaluation of the meaningfulness (e.g., clinical importance) of an
estimated interaction effect. Again, for further discussion, see Kleinbaum, Kupper, and Morgenstern
(1982).
When both confounding and interaction are considered for the same data set, the use of an overall
(adjusted) estimate as a summary index of the relationship of interest would tend to mask any (strong)
interaction effects that may be present. For example, if the PAL—SBP association differs meaningfully at
different values of AGE, the use of a single overall estimate, such as the regression coefficient of PAL in
a multiple regression model containing both AGE and PAL, would hide this interaction finding. This
illustrates the following important principle: Interaction should be assessed before confounding is assessed; the
use of a summary (adjusted) estimate that controls for confounding is recommended only when there is no
meaningful interaction (Kleinbaum, Kupper, and Morgenstern, 1982, chap. 13).
Thus, in general, confounding and interaction are different phenomena. A variable may manifest both
confounding and interaction, neither, or only one of the two. Nevertheless, if strong interaction is found, an
adjustment for confounding is inappropriate.
We are now ready to address how these concepts can be employed using regression terminology,
assuming a linear model and a continuous dependent variable. A regression analog for a dichotomous
outcome variable could, for example, involve a logistic rather than a linear model. Logistic modeling is
discussed briefly in Chapter 21; a more detailed discussion in which confounding and interaction are
considered can be found in Kleinbaum, Kupper, and Morgenstern (1982, chaps. 20-24).
11-3 Interaction in Regression
In this section, we shall describe how two independent variables can interact to affect a dependent
variable and how such an interaction can be represented by an appropriate regression model.
11-3-1 An Example
To illustrate the concept of interaction, we shall consider the following simple example. Suppose it is
of interest to determine how two independent variables, temperature (T) and catalyst concentration (C),
jointly affect the growth rate (Y) of organisms in a certain biological system. Further, suppose that two
particular temperature levels (T0 and T1) and two particular levels of catalyst concentration (C0 and C1) are
to be examined, and that an experiment is performed in which an observation on Y is obtained for each of
the four combinations of temperature and catalyst concentration level: (T0, C0), (T0, C1), (T1, C0), and (T1, C1).
(In statistical parlance, this experiment is called a complete factorial experiment, because observations on Y
are obtained for all combinations of settings for the independent variables (or factors). The advantage of a
factorial experiment is that any existing interaction effects can be detected and measured efficiently.)
Now, let us consider two graphs based on two hypothetical data sets for the experiment scheme
described above. Figure 11-1a suggests that the rate of change in the growth rate as a function of
temperature is the same regardless of the level of catalyst concentration; in other words, the
relationship between Y and T does not in any way depend on C.
(For those readers familiar with calculus, the phrase "rate of change" is related to the notion of a derivative
of a function. In particular, Figure 11-1a portrays a situation where the partial derivative with respect to T
of the response function relating the mean of Y to T and C is independent of C.)
It is important to point out that we are not saying that Y and C are unrelated, but that the
relationship between Y and T does not vary as a function of C. When this is the case, we say that T and C
do not interact or, equivalently, that there is no T x C interaction effect. Practically speaking, this means
that we can investigate the effects of T and C on Y independently of one another and that we can
legitimately talk about the separate effects (sometimes called the main effects) of T and C on Y.
One way to quantify the relationship depicted in Figure 11-1a is with a regression model of the form
μY|T,C = β0 + β1T + β2C    (11.1)

Here, the change in the mean of Y for a 1-unit change in T is equal to β1, regardless of the level of
C. In fact, changing the level of C in (11.1) has only the effect of shifting the straight line relating μY|T,C
and T either up or down, without affecting the value of the slope β1, as seen in Figure 11-1a. In particular,

μY|T,C = (β0 + β2C0) + β1T and μY|T,C = (β0 + β2C1) + β1T.
In general, then, one might say that no interaction is synonymous with parallelism in the sense that
the response curves of Y versus T for fixed values of C are parallel; in other words, these response
curves (which may be linear or nonlinear) all have the same general shape, differing from one another
only by additive constants independent of T (e.g., see Figure 11-2).
In contrast, Figure 11-1b depicts the situation where the relationship between Y and T depends on
C; in particular, Y appears to increase with increasing T when C = C0 but to decrease with increasing T
when C = C1. In other words, the behavior of Y as a function of temperature cannot be considered
independently of catalyst concentration. When this is the case, we say that T and C interact or,
equivalently, that there is a T x C interaction effect. Practically speaking, this means that it really does
not make much sense to talk about the separate (or main) effects of T and C on Y, since T and C do not
operate independently of one another in their effects on Y.
One way to represent such an interaction effect mathematically is to consider a regression model of the
form
μY|T,C = β0 + β1T + β2C + β12TC    (11.2)
Here the change in the mean value of Y for a 1-unit change in T is equal to β1 + β12C, which clearly
depends on the level of C. In other words, introducing a product term such as β12TC in a regression
model of the type (11.2) is one way to account for the fact that two factors such as T and C do not operate
independently of one another. For our particular example, when C = C0, model (11.2) can be written as

μY|T,C = (β0 + β2C0) + (β1 + β12C0)T

and when C = C1, model (11.2) becomes

μY|T,C = (β0 + β2C1) + (β1 + β12C1)T
In particular, Figure 11-1b suggests that the interaction effect β12 is negative, with the linear effect (β1 +
β12C0) of T at C0 being positive and the linear effect (β1 + β12C1) of T at C1 being negative. A negative
interaction effect is to be expected here, since Figure 11-1b suggests that the slope of the linear relationship
between Y and T decreases (i.e., goes from positive to negative in sign) as C changes from C0 to C1. Of
course, it is possible for β12 to be positive, in which case the interaction effect would manifest itself as a
larger positive value for the slope when C = C1 than when C = C0.
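These C-specific slopes can be mimicked numerically. The sketch below generates noise-free data exactly from model (11.2) with hypothetical coefficients and recovers the slope β1 + β12C at C = 0 and C = 1 (all numbers are invented for illustration):

```python
import numpy as np

# Noise-free data generated exactly from model (11.2) with hypothetical
# coefficients beta0=2, beta1=3, beta2=-1, beta12=-2, at the four
# factorial settings (T, C) in {0, 1} x {0, 1}.
T = np.array([0.0, 0.0, 1.0, 1.0])
C = np.array([0.0, 1.0, 0.0, 1.0])
Y = 2 + 3 * T - 1 * C - 2 * T * C

# Fit mu(Y | T, C) = b0 + b1*T + b2*C + b12*T*C
X = np.column_stack([np.ones(4), T, C, T * C])
b0, b1, b2, b12 = np.linalg.lstsq(X, Y, rcond=None)[0]

slope_at_C0 = b1 + b12 * 0.0   # slope of Y versus T when C = 0
slope_at_C1 = b1 + b12 * 1.0   # slope of Y versus T when C = 1
```

With four observations and four parameters the fit is exact, and the fitted slope of Y on T drops from 3 at C = 0 to 1 at C = 1, reflecting the negative interaction coefficient β12 = -2.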
11-3-2 Interaction Modeling in General
As the preceding illustration suggests, interaction among independent variables can generally be
described in terms of a regression model that involves product terms. Unfortunately, there are no precise
rules for specifying such terms. For example, if interaction involving three variables X1, X2, and X3 is of
interest, one model to consider is:

Y = β0 + β1X1 + β2X2 + β3X3 + β4X1X2 + β5X1X3 + β6X2X3 + β7X1X2X3 + E    (11.3)
In this model, the two-factor products of the form XiXj are often referred to as first-order
interactions, whereas three-factor products like X1X2X3 are called second-order interactions, and so on
for higher-order products. The higher the order of interaction, the more difficult it becomes to interpret
its meaning.
Model (11.3) is not the most general model possible when considering the three variables X1, X2,
and X3. Additional product terms, such as terms involving powers of the variables (e.g., X1²X2), can also be included.
Nevertheless, there is a limit on the total number of such terms: The model cannot contain more than n
− 1 independent variables, where n is the total number of observations in the data. Moreover, it may not
even be possible to reliably fit a model with fewer than n − 1 variables if some of the variables (e.g.,
higher-order products) are highly correlated with other variables in the model, as would be the case
when the model contains several interaction terms. This problem, called collinearity, is discussed in
Chapter 12.
Model (11.3) may, on the other hand, be considered too general if one is focusing on particular
interactions of interest. For example, if the purpose of one's study is to describe the relationship between X1 and
Y controlling for the possible confounding and/or interaction effects of X2 and X3, the following simpler model
may be of more interest than (11.3):
Y = β0 + β1X1 + β2X2 + β3X3 + β4X1X2 + β5X1X3 + E    (11.4)
The terms X1X2 and X1X3 describe the interactions of X2 and X3, respectively, with X1. In contrast, the term
X2X3, which is not contained in model (11.4), does not concern interaction involving X1.
In using statistical testing to evaluate interaction for a given regression model, a number of options
are available. (A more detailed discussion of how to select variables is given in Chapter 16.) One
approach is to test globally for the presence of any kind of interaction and then, if significant interaction
is found, to identify particular interaction terms of importance by using other tests. For example, in
considering model (11.3), one could first test H0: β4 = β5 = β6 = β7 = 0 using the multiple-partial F
statistic

F(X1X2, X1X3, X2X3, X1X2X3 | X1, X2, X3)

which has an F distribution with 4 and n − 8 degrees of freedom when H0 is true. If this F statistic is found to be significant, individually
important interaction terms might then be identified by using selected partial F tests.
A second way to assess interaction is to test for interaction in a hierarchical sequence, beginning with
the highest-order terms and then proceeding sequentially to lower-order terms if the higher-order terms are not
significant. Using model (11.3), for example, one might first test H0: β7 = 0, which considers the second-order
interaction, and then test H0: β4 = β5 = β6 = 0 in a reduced model (excluding the three-way
product term X1X2X3) if the first test is nonsignificant.
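A minimal numerical sketch of the global multiple-partial F test for model (11.3), computed from residual sums of squares; the data are fabricated for illustration and contain a genuine X1-by-X2 interaction:

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Hypothetical data built around a genuine X1-by-X2 interaction (coefficient 1.5)
# plus a small fixed "noise" pattern; all numbers are invented for illustration.
n = 12
X1 = np.tile([0.0, 1.0, 2.0], 4)
X2 = np.repeat([0.0, 1.0, 2.0, 3.0], 3)
X3 = np.arange(n, dtype=float) % 2
noise = 0.1 * np.array([2.0, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1])
y = 1 + 2 * X1 + X2 + 0.5 * X3 + 1.5 * X1 * X2 + noise

ones = np.ones(n)
X_reduced = np.column_stack([ones, X1, X2, X3])          # main effects only
X_full = np.column_stack([ones, X1, X2, X3, X1 * X2,
                          X1 * X3, X2 * X3, X1 * X2 * X3])  # model (11.3)

# Multiple-partial F statistic for H0: beta4 = beta5 = beta6 = beta7 = 0,
# which has 4 and n - 8 degrees of freedom under H0
k, df_denom = 4, n - 8
F = ((sse(X_reduced, y) - sse(X_full, y)) / k) / (sse(X_full, y) / df_denom)
```

Because the interaction signal here is large relative to the noise, F comes out far above any conventional critical value; in practice the statistic would be compared to an F quantile with 4 and n − 8 degrees of freedom.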
11-3-3 A Second Example
We now consider a study to assess physical activity level (PAL) as a predictor of systolic blood pressure
(SBP), controlling for AGE and SEX. A model that allows for possible interactions of both AGE with PAL and
SEX with PAL is given by

SBP = β0 + β1(PAL) + β2(AGE) + β3(SEX) + β4(PAL × AGE) + β5(PAL × SEX) + E
Note the absence of a term involving AGE x SEX; such a term does not indicate interaction associated with the
study variable of interest (PAL).
To assess interaction for this model, one might first perform a multiple-partial F test of H0: β4 = β5
= 0; if the test is significant, then partial F tests could be conducted to determine whether one or more of
these product terms should be kept in the model. If the first test is found nonsignificant, one would then
simplify the full model by removing these two product terms entirely, giving the reduced model SBP = β0
+ β1(PAL) + β2(AGE) + β3(SEX) + E. At this point the interaction phase of model building would be
complete. The next step would involve the assessment of confounding, which we discuss in the next
section.
11-4 Confounding in Regression
We emphasized earlier (Section 11-2) that the assessment of confounding is questionable in the
presence of interaction. Thus, in our discussion of confounding here, we shall assume throughout that there is no
interaction.¹
11-4-1 Controlling for One Extraneous Variable
Let us suppose that we are interested in describing the relationship between an independent
variable T and a continuous dependent variable Y, taking into account the possible confounding effect of
a third variable C. As described in the previous section, the assessment of confounding requires the
comparison of a crude estimate of the T-Y relationship, which ignores the effect of the control
variable (C), with an estimate of the relationship that accounts (or controls) for this variable. This
comparison can be expressed in terms of the following two regression models:
Y = β0 + β1T + β2C + E    (11.5)
and
Y = β0 + β1T + E    (11.6)
The assumption of no T x C interaction precludes the need to consider a product term of the form TC in these
models.
From model (11.5), the relationship between T and Y adjusted for the variable C can be expressed in
terms of the (partial) regression coefficient (β1) of the T variable. The estimate of β1, which we will denote by
β̂1|C, obtained from least-squares fitting of model (11.5), is an adjusted-effect measure in the sense that it gives
the estimated change in Y per unit change in T after accounting for C (i.e., with C in the model).
A crude estimate of the T-Y relationship is the estimated coefficient of T (namely, β̂1) based on model
(11.6), a model that does not involve the variable C.
Thus, we have the following general rule for assessing the presence of confounding when only one
independent variable is to be controlled: Confounding is present if the estimate of the coefficient (β̂1) of the
study variable T meaningfully changes when the variable C is removed from model (11.5), that is, if

β̂1|C ≠ β̂1    (11.7)

where β̂1|C denotes the (adjusted) estimate of β1 using model (11.5) and β̂1 denotes the (crude) estimate of
β1 using model (11.6).
The ≠ sign in expression (11.7) indicates that a subjective decision is required as to whether the two
estimates are meaningfully different; that is, one needs to determine subjectively whether the two estimates
each describe a different interpretation of the T—Y association in question. A statistical test is neither
required nor appropriate (Kleinbaum, Kupper, and Morgenstern, 1982, chap. 13).
As an example, suppose Y denotes SBP, T denotes PAL, and C denotes AGE. For some set of data,
suppose it was found that
β̂1|AGE = 4.1 and β̂1 = 16.0
Then, it can be concluded that a 1-unit change in PAL yields a 16-unit change in SBP when AGE is
ignored, whereas, when AGE is controlled, a 1-unit change in PAL yields only a 4.1-unit change in SBP:
that is, the association between PAL and SBP is much weaker after controlling for AGE. (As a special
case, if PAL is a 0-1 variable, then β̂1 gives the crude difference in mean systolic blood pressures between
the two PAL groups, and β̂1|AGE gives an adjusted [for AGE] difference in mean blood pressures.) Thus,
AGE would be labeled a confounder and should be controlled in the analysis.
As another example, suppose that
β̂1|AGE = 6.2 and β̂1 = 6.1
Here, we would be inclined to say that AGE is not a confounder because there is no meaningful
difference between the estimates 6.2 and 6.1. Unfortunately, an investigator may have to deal with much
more difficult comparisons, such as β̂1|AGE = 4.1 versus β̂1 = 5.5. When comparing such estimates
numerically, one must also consider the clinical importance of the numerical difference between estimates
based on (a priori) knowledge of the variable(s) involved. For instance, since the coefficients 4.1 and 5.5
estimate, respectively, adjusted and crude differences in mean blood pressures between high and low PAL
groups, it is important to decide whether a mean difference of 5.5 is clinically more important than a
mean difference of 4.1. One approach to this problem is to control for any variable (as a confounder) that
changes the crude effect estimate by some prespecified amount determined by clinical judgment.
(One approach sometimes used to assess confounding is, for example, to conduct a statistical test of
H0: β2 = 0 in model (11.5). Such a test does not address confounding, but rather precision; that is,
such a test evaluates whether significant additional variation in Y is explained by adding C to a model
already containing T. An almost equivalent approach is to determine whether a confidence interval for
β1, the coefficient of T, is considerably narrower when C is in the model than when it is not.
Precision is often an important issue when considering extraneous factors, but it is a different issue
from confounding. In fact, for etiologic questions, confounding, which concerns validity (i.e., do you
have the right answer?), usually takes precedence over precision. Another reason for not focusing on
β2 is that if β̂2 ≠ 0, it does not follow that β̂1|C ≠ β̂1. That is, β̂2 ≠ 0 is not a sufficient condition for
confounding.)²
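Footnote 2's six observations make this last point concrete: the covariate C clearly predicts Y, yet the crude and adjusted coefficients of T are identical. A quick numerical check of the footnote's claims:

```python
import numpy as np

# The six (T, C, Y) observations from footnote 2
T = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
C = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
Y = np.array([4.0, 5.0, 6.0, 1.0, 2.0, 3.0])

# Adjusted model: Y on T and C; the footnote states Yhat = 1 + 3T + C
X_adj = np.column_stack([np.ones(6), T, C])
b_adj, *_ = np.linalg.lstsq(X_adj, Y, rcond=None)

# Crude model: Y on T alone; the footnote states Yhat = 2 + 3T
X_crude = np.column_stack([np.ones(6), T])
b_crude, *_ = np.linalg.lstsq(X_crude, Y, rcond=None)
```

The fitted coefficient of C is 1 (so a test of H0: β2 = 0 would tend to reject), yet the coefficient of T is 3 in both models, so no confounding is present.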
Before turning to criteria for confounding involving several covariates, we comment on the
practical problem of deciding what type of variables (i.e., covariates) should be considered for control as
potential confounders. Although the answer here is somewhat debatable, we take the position that a list
of eligible variables should be constructed based on prior knowledge and/or research about the
relationship of the dependent variable to each covariate under consideration. In particular, we
recommend that only variables known to be reasonably predictive of (i.e., associated with) the
dependent variable should be considered as potential confounders and/or effect modifiers. In
epidemiological terms, such variables are generally referred to as risk factors (Kleinbaum, Kupper, and
Morgenstern, 1982). The idea here is to restrict attention to the control of only those (previously
studied) extraneous variables that the investigator anticipates may account for the hypothesized
relationship between T and Y presently being studied. To develop such a list, the investigators will have
to make a subjective decision.³
11-4-2 Controlling for Several Extraneous Variables
Suppose that we wish to describe the association between T and Y, taking into account several
covariates C1, C2, …, Cp. Analogous to the procedure described for one covariate, we can assess
confounding by comparing a crude estimate of the T-Y relationship to some adjusted estimate. As
before, the crude estimate can be defined in terms of a regression model like (11.6), which describes the
relationship between T and Y ignoring all covariates. To obtain the adjusted estimate, however, we must
now consider an extended model defined as follows:
Y = β0 + β1T + β2C1 + β3C2 + ⋯ + β(p+1)Cp + E    (11.8)
(Like model (11.5), model (11.8) assumes no interaction involving T since no product terms of the form TCi are
included.)
Using this model, we can define confounding involving several variables as follows: Confounding is
present if the estimate of the regression coefficient (β1) of T in a regression model like (11.6), which ignores the
variables C1, C2, …, Cp, is meaningfully different from the corresponding estimate of β1 based on a model like
(11.8), which controls for C1, C2, …, Cp; that is, if
β̂1|C1,C2,…,Cp ≠ β̂1    (11.9)
² Suppose n = 6 and we have the following data for (T, C, Y): (1, 0, 4), (1, 1, 5), (1, 2, 6), (0, 0, 1), (0, 1, 2), and (0, 2,
3). Then unweighted least-squares fitting gives Ŷ = 1 + 3T + C when T and C are predictors, whereas Ŷ = 2 + 3T
when C is ignored. Thus, β̂2 = 1 (≠ 0), yet there is no confounding, since β̂1|C = 3 = β̂1.
³ As a caveat to the above recommendations, certain variables usually referred to as intervening variables should not
be considered as potential confounders (Kleinbaum, Kupper, and Morgenstern, 1982). A variable C is called
intervening between T and Y if T causes C and then C causes Y. Controlling intervening variables may spuriously
reduce or eliminate any manifestation in the data of a true association between T and Y.
where β̂1|C1,C2,…,Cp denotes the (adjusted) estimate of β1 using (11.8) and β̂1 is the (crude) estimate of β1 using
(11.6).
One problem with applying the above definition, however, is that it addresses the question of
whether confounding is present without directly identifying specific variables to be controlled.⁴ In other
words, when confounding is deemed to be present based on (11.9), it
may still be the case that only a subset of C1, C2, …, Cp is required for adequate control.
How does one identify such a subset? More specifically, why bother to identify such a subset
rather than simply control for all variables C1, C2, …, Cp?
The answer to the latter question is that, when addressing the control of covariates, the possible
gains in precision must be considered in addition to the control of confounding. In particular, a subset of
the Ci variables might be preferred to the entire set because the subset may provide equivalent control of
confounding (i.e., may give the same adjusted estimate) while providing greater precision in estimating
the adjusted association of interest. However, there is no guarantee that precision will be increased by
using a subset; in fact, precision may be reduced. In any case, confounding should take precedence over
precision in the sense that no subset should be considered unless it gives the same adjusted-effect
estimate as that obtained when controlling for all Ci's.
To illustrate, suppose p = 5; that is, we consider controlling for C1, C2, …, C5 using model (11.8).
Suppose also that the estimate of β1 takes on the following values, depending on which of C1, C2, …, C5 are
controlled:

β̂1 = 16.0 (no covariates controlled), β̂1|C1,C2 = 4.3, and β̂1|C1,C2,C3,C4,C5 = 4.0
Then, because 16.0 is much different from 4.0, one can argue that confounding is present. Yet since 4.0 is not
meaningfully different from 4.3, it can also be argued that C3, C4, and C5 do not need to be controlled, since
essentially the same (adjusted) estimate is obtained when controlling only for C1 and C2 as when adjusting for all
Ci's.
Thus, for this example, we have identified two sets of Ci variables that we can use for control. Which set
do we choose? The answer depends on an evaluation of precision. One approach is to compare interval
estimates for some parameter of interest, one interval being derived from a model that controls for C1 and C2
only, and the other interval from a model that controls for C1 through C5. The logical parameter for this example
is the population regression coefficient, β1, of the variable T when controlling for a particular set of Ci's. That
is, we may compare an interval estimate for β1 when only C1 and C2 are controlled to a corresponding interval
estimate for β1 when C1 through C5 are controlled. The narrower of the two intervals is then the interval
reflecting the most precision. For example, if the two 95% interval estimates are (2.6, 7.4) for β1|C1,C2 and (1.7,
7.6) for β1|C1,C2,C3,C4,C5, then the former interval is narrower; in this case, some precision is gained by
dropping C3, C4, and C5 from the model.
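The interval comparison can be sketched as follows. The data are simulated, C3 through C5 are pure-noise covariates, and a rough critical value of 2 stands in for the exact t quantile; everything here is a hypothetical illustration, not a fixed recipe:

```python
import numpy as np

def coef_and_se(X, y, j):
    """OLS coefficient j and its estimated standard error."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / (n - p)       # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)         # estimated covariance of beta-hat
    return beta[j], float(np.sqrt(cov[j, j]))

rng = np.random.default_rng(0)
n = 40
T = np.tile([0.0, 1.0], 20)                       # hypothetical study variable
C1, C2 = rng.normal(size=(2, n))                  # covariates that truly affect y
C3, C4, C5 = rng.normal(size=(3, n))              # pure-noise covariates
y = 3 + 4 * T + 2 * C1 - C2 + rng.normal(size=n)

ones = np.ones(n)
X_sub = np.column_stack([ones, T, C1, C2])              # controls C1, C2 only
X_all = np.column_stack([ones, T, C1, C2, C3, C4, C5])  # controls C1 through C5

b_sub, se_sub = coef_and_se(X_sub, y, 1)
b_all, se_all = coef_and_se(X_all, y, 1)

# Rough 95% intervals, using 2 in place of the exact t quantile
ci_sub = (b_sub - 2 * se_sub, b_sub + 2 * se_sub)
ci_all = (b_all - 2 * se_all, b_all + 2 * se_all)
```

Whichever interval is narrower reflects the more precise model; as the text notes, dropping covariates will often, but not always, tighten the interval.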
⁴ Another problem concerns how to assess confounding when there are two or more study variables, say, T1 and T2, of
interest. For this general situation, confounding may be defined to be present if (11.9) is satisfied for the coefficient of
any study variable of interest, given a model containing all such study variables and all control variables.
Unfortunately, this definition has the practical drawback of requiring several subjective decisions, one for each study
variable of interest.
(An alternative, but not exactly equivalent, approach to evaluating precision is to perform a statistical
test for the significance of the addition of C3, C4, and C5 to a model containing T, C1, and C2. The
null hypothesis for this test may be stated as H0: β4 = β5 = β6 = 0 in model (11.8) with p = 5. If
this test is not significant, then it may be argued that retaining C3, C4, and C5 does not provide
additional precision (i.e., explanation of variance). This would indicate that only C1 and C2 should be
controlled for greater precision.
Because this testing approach will not always lead to the same conclusion as the approach
of estimating intervals, the investigator may need to choose between them. In most situations,
however, both approaches will usually lead to similar results.)
Now we shall address the question of identifying which set to control. We have seen, by example,
that we must first identify a baseline adjusted estimate (i.e., a "gold standard") with which we can make
comparisons. The ideal gold standard is the regression coefficient estimate that controls for all Ci's being
considered. Then, any subset of Ci's that gives essentially the same adjusted estimate (i.e., an estimate
that is not meaningfully different from the gold standard when only the Ci's in that subset are controlled)
is a candidate set for control. It is even conceivable that several such candidates are possible
(Kleinbaum, Kupper, and Morgenstern, 1982, chap. 14).
Which set does one finally use? The answer, again, is based on precision: Use that set which gives
the most precision (e.g., the tightest confidence interval for the adjusted effect under study). (For
"political" reasons, that is, to convince people that all variables have been controlled, it might be better to
control for C1, C2, …, Cp unless some subset of the Ci's leads to a large increase in precision.)
To illustrate, suppose that the candidate sets in Table 11-1 can be identified when p = 5 in model
(11.8). All three proper subsets of C1, C2, C3, C4, and C5 may be considered candidates for control, since
they all give adjusted estimates roughly equal to the gold standard of 4.0. Of these candidates, the subset
involving C1, C2, and C4 gives the best precision (narrowest confidence interval); therefore this subset can be
used both to control confounding and to enhance precision.
11-4-3 An Example Revisited
In Section 11-3-3 we considered a hypothetical study to assess the relationship between physical activity
level (PAL) and systolic blood pressure (SBP) while controlling for both AGE and SEX. A model that allows
for possible interactions of AGE and SEX with PAL was considered, and methods for testing for such
interactions were described. Assuming that no significant interaction effects were found, the resulting
reduced model is as follows:
SBP = β0 + β1(PAL) + β2(AGE) + β3(SEX) + E
Given this no-interaction model, the next step is to assess confounding; that is, does the coefficient
of PAL change when AGE and/or SEX are dropped from the model? To answer this, we can examine the
estimate of the coefficient of PAL in four models, namely, one including both AGE and SEX, one
involving either AGE or SEX but not both, and one involving neither. The gold standard model for
comparison is the one (given above) that contains both control variables and PAL. Then, for example, if
the estimate of β1 changes considerably when at least one control variable is dropped from this gold
standard model, we need to control for both AGE and SEX. However, if we obtain essentially the same
estimate of β1 (as obtained using the gold standard model) when only AGE is in the model, we do not
need to retain SEX in the model to control for confounding. However, inclusion of the SEX variable in
addition to AGE may increase or decrease precision. Thus, the decision as to whether to control for just
AGE or for both AGE and SEX would depend, for example, on a comparison of confidence intervals for
β1. If the confidence interval is considerably narrower when only AGE is controlled, then we would not
retain SEX in the model.
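This model-comparison strategy can be sketched with made-up numbers in which AGE confounds PAL but SEX (balanced across activity groups) does not:

```python
import numpy as np

# Hypothetical noise-free data: SBP = 110 + 0.4*AGE + 2*SEX - 3*PAL, with
# AGE confounding PAL (the low-activity group is older) and SEX balanced
# with respect to both PAL and AGE.
PAL = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
SEX = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0])
AGE = np.array([30.0, 40.0, 30.0, 40.0, 60.0, 70.0, 60.0, 70.0])
SBP = 110 + 0.4 * AGE + 2 * SEX - 3 * PAL

def pal_coef(*covariates):
    """Estimated PAL coefficient, controlling for the given covariates."""
    X = np.column_stack([np.ones(len(PAL)), PAL, *covariates])
    return np.linalg.lstsq(X, SBP, rcond=None)[0][1]

crude = pal_coef()            # ignores AGE and SEX
age_only = pal_coef(AGE)      # controls AGE only
gold = pal_coef(AGE, SEX)     # gold standard: controls both
```

Here the crude estimate (-15) differs sharply from the gold standard (-3), so AGE must be controlled; but the AGE-only model reproduces the gold-standard PAL coefficient exactly, so SEX is not needed to control confounding, and retaining it would be a precision question only.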
Finally, once a decision is made about which variables are to be controlled (i.e., which is the best
model for providing a valid and precise estimate of the coefficient of PAL), we then make statistical
inferences about the true PAL-SBP relationship. Given the no-interaction model, this involves testing
H0: β1 = 0 in the best model and then obtaining an interval estimate of β1.
11-5 Summary and Conclusions
Confounding and interaction are two methodological concepts pertaining to the assessment of a
relationship between independent and dependent variables.
Interaction, which takes precedence over confounding, exists when the relationship of interest is
different at different levels of extraneous (control) variables. In linear regression, interaction is evaluated
using statistical tests about product terms involving basic independent variables in the model.
Confounding, which is not evaluated with statistical testing, is present when the effect of interest differs
depending on whether an extraneous variable is ignored or retained in the analysis. In regression terms,
confounding is assessed by comparing crude versus adjusted regression coefficients from different models.
When several potential confounders are being considered, it may be worthwhile to identify
nonconfounders that can be dropped from the model to gain precision; this may not be possible (i.e.,
precision may be lost by dropping variables) in some situations.
When there is strong interaction involving a certain extraneous variable, the assessment of confounding
for that extraneous variable is irrelevant. Moreover, in such a situation, the assessment of confounding involving
other extraneous variables, though possible, is quite complex and extremely subjective. Consequently, the
assessment of confounding is usually not recommended when important interaction effects have been identified.