Models for Polytomous Responses AA 2016-2017

The document discusses polytomous logistic regression, focusing on nominal and ordinal response variables. It explains the baseline-category logit models for nominal responses and introduces various models for analyzing associations in tables with ordered categories. Additionally, it provides examples, including a study on alligator food choices and methods for assessing student perceptions in statistics classes.


Section 9: Polytomous Logistic Regression

Polytomous data

If the response of an individual or item in a study is restricted to one of a fixed set of possible values, we say that the response is polytomous.

The k possible values of Y are called the response categories.

Often the categories can be defined in a qualitative or non-numerical way.

We need to develop satisfactory models that distinguish several types of polytomous response. For instance, if the categories are ordered, there is no compelling reason to treat the extreme categories in the same way as the intermediate ones.

However, if the categories are simply an unstructured collection of labels, there is no reason a priori to select a subset of categories for special treatment.

Nominal response variable: baseline-category logit models

Let Y be a nominal response variable with J categories. Logit models for nominal responses pair each response category with a baseline category. The choice of baseline category is arbitrary.

Given a vector x of explanatory variables, let

$$\pi_j(x) = P(Y = j \mid x), \qquad \sum_{j=1}^{J} \pi_j(x) = 1.$$

If we have n independent observations based on these probabilities, the probability distribution for the number of outcomes that occur for each of the J types is multinomial with probabilities $(\pi_1(x), \ldots, \pi_J(x))$.

This model is basically just an extension of the binary logistic regression model. It gives a simultaneous representation of the odds of being in one category relative to being in another category, for all pairs of categories. Once the model specifies logits for a certain J − 1 pairs of categories, the rest are redundant.

If the last category (J) is the baseline, the baseline-category logit model

$$\log\frac{\pi_j(x)}{\pi_J(x)} = \alpha_j + \beta_j' x, \qquad j = 1, \ldots, J-1$$

describes the effect of x on the J − 1 logits.
Notes

Parameters in the (J − 1) equations determine parameters for logits using all other pairs of response categories. For instance, for an arbitrary pair of categories a and b:

$$\log\frac{\pi_a}{\pi_b} = \log\frac{\pi_a/\pi_J}{\pi_b/\pi_J} = \log\frac{\pi_a}{\pi_J} - \log\frac{\pi_b}{\pi_J} = (\alpha_a + \beta_a x) - (\alpha_b + \beta_b x) = (\alpha_a - \alpha_b) + (\beta_a - \beta_b)x$$
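The identity above can be checked numerically. The following is a minimal sketch in base R, assuming made-up parameter values for J = 3 categories (the numbers are hypothetical and do not come from any data set in these notes).

```r
# Hypothetical parameters for J = 3 categories (category 3 is the baseline)
alpha <- c(0.5, -1.0)   # alpha_1, alpha_2
beta  <- c(0.8,  0.3)   # beta_1, beta_2
x <- 1.7                # a single explanatory variable

# Baseline-category logits and the response probabilities they imply
eta <- alpha + beta * x
p <- c(exp(eta), 1) / (1 + sum(exp(eta)))

# Logit for the non-baseline pair (1, 2), computed two ways
direct  <- log(p[1] / p[2])
derived <- (alpha[1] - alpha[2]) + (beta[1] - beta[2]) * x
all.equal(direct, derived)
```

The comparison returns TRUE for any choice of parameters, because the baseline probability cancels in the ratio.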

Alligator Food Choice Example
The data are taken from a study by the Florida Game and Fresh Water Fish Commission of factors influencing the primary food choice of alligators.
Primary food type has five categories: Fish, Invertebrate, Reptile, Bird and Other.
Explanatory variables are the Lake where the alligators were sampled and the Length of the alligator.
food<-factor(c("fish","invert","rep","bird","other"),
levels=c("fish","invert","rep", "bird","other"))
size<-factor(c("<2.3",">2.3"),levels=c(">2.3","<2.3"))
gender<-factor(c("m","f"),levels=c("m","f"))
lake<-factor(c("hancock","oklawaha","trafford","george"),
levels=c("george","hancock", "oklawaha","trafford"))

table.7.1<-expand.grid(food=food,size=size,
gender=gender,lake=lake)

temp<-c(7,1,0,0,5,4,0,0,1,2,16,3,2,2,3,3,0,1,2,3,2,2,0,0,1,
13,7,6,0,0,3,9,1,0,2,0,1,0,1,0,3,7,1,0,1,8,6,6,3,5,2,4,1,1,
4,0,1,0,0,0,13,10,0,2,2,9,0,0,1,2,3,9,1,0,1,8,1,0,0,1)

table.7.1<-structure(.Data=table.7.1[rep(1:nrow(table.7.1),
temp),], row.names=1:219)
We fit several models:
library(nnet)
fitS<-multinom(food~lake*size*gender,data=table.7.1)
fit0<-multinom(food~1,data=table.7.1) # null
fit1<-multinom(food~gender,data=table.7.1) # G
fit2<-multinom(food~size,data=table.7.1) # S
fit3<-multinom(food~lake,data=table.7.1) # L
fit4<-multinom(food~size+lake,data=table.7.1) # L+S
fit5<-multinom(food~size+lake+gender,data=table.7.1) #L+S+G
The likelihood ratio test for each model:
deviance(fit1)-deviance(fitS)
deviance(fit2)-deviance(fitS)
deviance(fit3)-deviance(fitS)
deviance(fit4)-deviance(fitS)
deviance(fit5)-deviance(fitS)
deviance(fit0)-deviance(fitS)
Collapsing over gender:

fitS<-multinom(food~lake*size,data=table.7.1) # saturated model

fit0<-multinom(food~1,data=table.7.1) # null
fit1<-multinom(food~size,data=table.7.1) # S
fit2<-multinom(food~lake,data=table.7.1) # L
fit3<-multinom(food~size+lake,data=table.7.1) # L + S

deviance(fit1)-deviance(fitS)
deviance(fit2)-deviance(fitS)
deviance(fit3)-deviance(fitS)
deviance(fit0)-deviance(fitS)
According to the AIC, the best model is fit3:
summary(fit3)
In this example the baseline category is the one that crosses “fish”, “> 2.3” and “george”.

Results:

• For small alligators in Lake George, the estimated odds of choosing invertebrates rather than fish are exp(1.46) ≈ 4.3 times the estimated odds for large alligators. So the Length of an alligator plays an important role in determining its primary food choice.

• The estimated odds of choosing invertebrates rather than fish are higher in lakes Trafford and Oklawaha, and lower in Lake Hancock, all compared with Lake George.

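Starting from these results we can evaluate all the redundant odds ratios.

For example, we can evaluate the log odds of choosing “invertebrate” against “other” as:

$$\log\frac{\pi_I}{\pi_O} = \log\frac{\pi_I/\pi_F}{\pi_O/\pi_F} = \log\frac{\pi_I}{\pi_F} - \log\frac{\pi_O}{\pi_F}$$

$$= (-1.55 + 1.465\,Size - 1.66 Z_H + 0.94 Z_O + 1.12 Z_T) - (-1.90 + 0.335\,Size + 0.83 Z_H + 0.01 Z_O + 1.52 Z_T)$$

$$\approx 0.35 + 1.13\,Size - 2.48 Z_H + 0.93 Z_O - 0.39 Z_T$$

The Z_H and Z_T terms differ in the last digit from the differences of the displayed coefficients (−2.49 and −0.40), presumably because they were computed from unrounded estimates.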
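The subtraction of the two fitted logits can be reproduced in a few lines of base R. This sketch uses only the rounded coefficients displayed above, so its results may differ in the last digit from values computed from the unrounded fit; the vector names are chosen here for illustration.

```r
# Coefficients of the two baseline-category logits as displayed above
# (order: intercept, Size, Z_H, Z_O, Z_T)
invert_vs_fish <- c(-1.55, 1.465, -1.66, 0.94, 1.12)
other_vs_fish  <- c(-1.90, 0.335,  0.83, 0.01, 1.52)
names(invert_vs_fish) <- names(other_vs_fish) <-
  c("Intercept", "Size", "ZH", "ZO", "ZT")

# The logit for invertebrate vs other is the difference of the two
invert_vs_other <- invert_vs_fish - other_vs_fish
round(invert_vs_other, 2)
```

With the actual fit, the same subtraction could be applied to two rows of coef(fit3).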

Ordinal response variables: Log-Linear Association models

Many tables are formed by cross-classifying variables with ordered categories. These can be categorical but ordinal, such as Likert scales (for example, Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree), or continuous variables that have been discretized, such as income formed into intervals.

Tables with ordered categories allow for models with different types of association built in, since concepts of direct and inverse relationships make sense.

This permits parsimonious representation of a lack of independence.

Ordinal response variables: 1. Linear by Linear (Uniform) association

Consider a table in which BOTH rows and columns have ordinal categories, and assume that there exist known scores {ui} (for rows) and {vj} (for columns) that represent that ordering.

These scores could be:

• the actual values of a discrete underlying variable

• a score linked to an underlying continuous variable

• an equispaced representation of a non-numerical, but ordinal scale (such as a Likert scale).

Most typically ui = i and vj = j.

The LbyL association model is

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \theta u_i v_j$$

with constraints such as $\lambda_I^X = \lambda_J^Y = 0$.

This can be seen as a special case of the saturated model in which $\lambda_{ij}^{XY} = \theta u_i v_j$.

The uniform association model adds only one parameter θ to the independence model, focusing all possible lack of independence on that one parameter.
Tables with ordered categories: 1. Linear by Linear (Uniform) association

• If θ = 0, independence holds.

• If θ > 0, the model implies that higher expected cell counts occur when ui and vj either go up TOGETHER or go down TOGETHER, so there is a direct association.

• If θ < 0, the model implies that higher expected cell counts occur when ui is high and vj is low, or vice versa, so there is an inverse association.

The θ parameter has a simple interpretation in terms of odds ratios: the log odds ratio is directly proportional to the product of the distance between the rows and the distance between the columns.

So, for example, for the 2 × 2 table formed by the cells intersecting rows a and c with columns b and d:

$$\log\frac{\mu_{ab}\,\mu_{cd}}{\mu_{ad}\,\mu_{cb}} = \theta (u_c - u_a)(v_d - v_b)$$

This log odds ratio is stronger as |θ| increases and for pairs of categories that are farther apart. So, when ui = i and vj = j, the local odds ratios for adjacent rows and adjacent columns have the common value $e^{\theta}$.
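The uniform local odds ratio can be illustrated numerically. The sketch below builds the expected counts of a 4 × 3 table from made-up parameter values (all numbers are hypothetical) and checks that every local odds ratio equals exp(θ) when integer scores are used.

```r
# Hypothetical linear-by-linear model for a 4 x 3 table with scores u_i = i, v_j = j
theta  <- 0.3
lambda <- 2
lamX <- c(0.2, -0.1, 0.4, 0)   # row main effects (last one set to 0)
lamY <- c(-0.3, 0.5, 0)        # column main effects (last one set to 0)
u <- 1:4
v <- 1:3

# Expected counts: log mu_ij = lambda + lamX_i + lamY_j + theta * u_i * v_j
mu <- exp(lambda + outer(lamX, lamY, "+") + theta * outer(u, v))

# Local odds ratios for adjacent rows and adjacent columns:
# mu[i, j] * mu[i + 1, j + 1] / (mu[i, j + 1] * mu[i + 1, j])
local_or <- (mu[-4, -3] * mu[-1, -1]) / (mu[-4, -1] * mu[-1, -3])
local_or
exp(theta)
```

The main effects cancel in each odds ratio, which is why only θ survives.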
Tables with ordered categories: 2. Row and Column Effects Models

The uniform association model assumes prespecified row and column scores. Sometimes either the rows or the columns (but not both) are not ordinal, so such scores don’t exist for the nominal variable.

Another possibility is that equispaced scores are not appropriate for a set of rows or columns, and it is convenient to estimate appropriate scores based on the observed data (for example, for Likert-scaled rows and columns it might be that “Strongly disagree” is closer to “Disagree” than “Disagree” is to “Neutral”).

Models that can fit tables of this type are the row effects and column effects models.

The row effects model R has the form

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \tau_i v_j$$

Constraints are needed, such as $\lambda_I^X = \lambda_J^Y = \tau_I = 0$. The {τi} are called row effects. This model has (I − 1) more parameters than the independence model.

Independence can be seen as the special case in which τ1 = τ2 = . . . = τI.

The row effects model treats the columns as ordinal with known scores and the rows as nominal, since the τi can take on any values (subject to an identifying constraint, for example summing to zero).

For this class of models, for any pair of rows r < s and columns c < d, the log of the odds ratio formed from the 2 × 2 table of those rows and columns is

$$\log\frac{\mu_{rc}\,\mu_{sd}}{\mu_{rd}\,\mu_{sc}} = (\tau_s - \tau_r)(v_d - v_c)$$

The log odds ratio is proportional to the distance between the columns, with the constant of proportionality being τs − τr.

The column effects model C takes the form

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \rho_j u_i$$

where the ρ parameters sum to zero.

This model treats the rows as ordinal with known scores and the columns as nominal. Here the quantity ρd − ρc is a measure of the closeness of columns c and d with respect to the conditional distribution of the rows given the column.

A generalization of the row and column effects models that allows for both row and column effects in the local odds ratio is the row + column effects model (R+C)

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \tau_i v_j + \rho_j u_i$$

The local log odds ratio for unit-spaced row and column scores is

$$(\tau_{i+1} - \tau_i) + (\rho_{j+1} - \rho_j)$$

incorporating row effects and column effects.

L×L model Example
library(gnm)
library(vcdExtra)
data(Mental) #or in the same way
dati<-expand.grid(mental=c("well","mild",
"moderate","impaired"),ses=1:6)
dati$Freq=c(64,94,58,46,57,94,54,40,57,105,65,60,
72,141,77,94,36,97,54,78,21,71,54,71)
Display the frequency table
Mental.tab <- xtabs(Freq ~ mental+ses, data=Mental)
Fit Independence model
indep <- glm(Freq ~ mental+ses,family = poisson, data = Mental)
deviance(indep) #or
o<-glm(Freq~factor(mental)+factor(ses), family=poisson, data=dati)
deviance(o)

Fit a Linear by Linear Model: use integer scores for rows and cols
Cscore <- as.numeric(Mental$ses)
Rscore <- as.numeric(Mental$mental)

linlin <- glm(Freq ~ mental + ses + Rscore:Cscore,

family = poisson, data = Mental)

Or
linlin2<-glm(formula = Freq ~ factor(mental) + factor(ses) +
as.numeric(mental):as.numeric(ses),
family = poisson, data = dati)

Now compare models

anova(indep,linlin)
AIC(indep,linlin)

Row effects model Example
roweff <- glm(Freq ~ mental + ses + mental:Cscore,
family = poisson, data = Mental)

# Equivalent fit on the hand-built data frame dati (column scores taken from ses)
roweff2 <- glm(Freq ~ factor(mental)+factor(ses) + mental:as.numeric(ses),
family = poisson, data = dati)

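Column effects model Example
coleff <- glm(Freq ~ mental + ses + Rscore:ses,
family = poisson, data = Mental)

# Equivalent fit on the hand-built data frame dati; ses is numeric there,
# so it must be converted to a factor inside the interaction
coleff2 <- glm(Freq ~ factor(mental)+factor(ses) + as.numeric(mental):factor(ses),
family = poisson, data = dati)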

Exercise: student perception of statistics class assessment methods

Aim: to study the association between the methods used in class assessment (Structured computer assignments, Open-ended assignments, Article analysis, Annotating output) and the amount students learned (Didn’t learn anything, Learned a little bit, Learned enough to be comfortable with the topic, Learned a great deal).
dati<-expand.grid(response=gl(4,1),
assignments=gl(4,1,labels = c("Structured", "Open","ArtAnaly",
"Annotoutput")))
dati$Freq<-c(0,3,8,3,0,1,7,6,1,6,4,2,0,4,8,2)
# display the frequency table
(assign.tab <- xtabs(Freq ~response+assignments, data=dati))
chisq.test(assign.tab) #test for independence

In this specific case the LbyL model, the R model and the R+C model cannot be applied because the assessment method is a NOMINAL variable.

The only model that makes sense is a column effects model:
Rscore <- as.numeric(dati$response)

coleff <- glm(Freq ~ as.factor(response) + as.factor(assignments)
+ Rscore:assignments,family = poisson, data = dati)

Ordinal response variables: 1. Cumulative Logit Models
The logits of the first J − 1 cumulative probabilities are:

$$\text{logit}[P(Y \le j \mid x)] = \log\frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)} = \log\frac{\pi_1(x) + \pi_2(x) + \ldots + \pi_j(x)}{\pi_{j+1}(x) + \ldots + \pi_J(x)}, \qquad j = 1, \ldots, J-1$$

A model for the j-th cumulative logit looks like an ordinary logit model for a binary response in which categories 1 to j combine to form a single category, and categories j + 1 to J form a second category.
It is possible to consider parsimonious models that handle all the J − 1 cumulative logits in a single model: the Proportional Odds Model.
A Proportional Odds Model assumes the following structure:

$$\text{logit}[P(Y \le j \mid x)] = \alpha_j + \beta^T x, \qquad j = 1, \ldots, J-1$$

It considers:

• a different intercept for each cumulative logit; these intercepts are increasing in j;

• a parameter β describing the effect of X on the log odds of response in category j or below; it assumes an identical effect of X for all J − 1 cumulative logits.

This means that when this model fits well, it requires a single parameter rather than J − 1 parameters to describe the effect of X.

This class of models is called the Proportional Odds Model because it satisfies

$$\text{logit}[P(Y \le j \mid x_1)] - \text{logit}[P(Y \le j \mid x_2)] = \log\frac{P(Y \le j \mid x_1)/P(Y > j \mid x_1)}{P(Y \le j \mid x_2)/P(Y > j \mid x_2)} = \beta^T (x_1 - x_2)$$

In other words, the cumulative log odds ratio is proportional to the distance between x1 and x2: the odds of giving an answer ≤ j when X = x1 are exp[β^T(x1 − x2)] times the odds when X = x2, and this value is the same for all the logits.
Comments:

• When the model holds with β = 0, X and Y are

statistically independent;

• Explanatory variables in cumulative logit models

can be continuous, categorical or of both types.

• The ML fitting process uses an iterative algorithm

simultaneously for all j.

For simplicity, let’s consider only one predictor:

$$\text{logit}[P(Y \le j)] = \alpha_j + \beta x$$

Then the cumulative probabilities are given by:

$$P(Y \le j) = \frac{\exp(\alpha_j + \beta x)}{1 + \exp(\alpha_j + \beta x)}$$

and since β is constant, the curves of cumulative probabilities plotted against x are parallel (they share the same shape and differ only by a horizontal shift).
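To make the single-predictor model concrete, here is a minimal base R sketch with made-up intercepts and slope (hypothetical values, J = 4 categories). It computes the cumulative probabilities, recovers the category probabilities by differencing, and checks that on the logit scale the gap between two curves does not depend on x.

```r
# Hypothetical proportional odds model with J = 4 categories
alpha <- c(-1, 0.5, 2)   # increasing intercepts alpha_1 < alpha_2 < alpha_3
beta  <- 0.8             # common slope
x <- seq(-3, 3, by = 0.5)

# Cumulative probabilities P(Y <= j | x); rows index x, columns index j
cum_p <- sapply(alpha, function(a) plogis(a + beta * x))

# Category probabilities are successive differences of cumulative probabilities
cat_p <- cbind(cum_p, 1) - cbind(0, cum_p)

# On the logit scale, the vertical gap between curves 1 and 2 is
# alpha_2 - alpha_1 at every x
gaps <- qlogis(cum_p[, 2]) - qlogis(cum_p[, 1])
range(gaps)
```

Calling matplot(x, cum_p, type = "l") would draw the three cumulative probability curves.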

Cheese-Tasting Example (McCullagh and Nelder, 1989)

In this example, subjects were randomly assigned to taste one of four different cheeses. Response categories are 1=strong dislike to 9=excellent taste.

By inspection, we can see that D is the most preferable, followed by A, C and B.

Let’s try to model these data by a proportional-odds cumulative-logit model with three dummy codes to distinguish among the four cheeses.

• How many logit equations?
J − 1, where J is the number of response categories; here J = 9, so there are 8 cumulative logits, each with the same K = 3 regressors (the cheese dummies).

• The model will have 8 intercepts (one for each of the logit equations) and 3 slopes, for a total of 11 free parameters.

• By comparison, the saturated model, which fits a separate 9-category multinomial distribution to each of the four cheeses, has 4 × (9 − 1) = 32 free parameters.

• Therefore, the overall goodness-of-fit test will have 32 − 11 = 21 degrees of freedom.

The vglm() function

The VGAM library in R contains the vglm() function, which can fit several models. Possible models include the cumulative logit model (family function cumulative) with proportional odds, partial proportional odds or nonproportional odds, cumulative link models (family function cumulative) with or without common effects for each cutpoint, adjacent-categories logit models (family function acat), and continuation-ratio logit models (family functions cratio and sratio).

The vglm() function takes the response variable in its “cbinded” form (full disjunctive coding), that is, as a matrix with one column of counts per response category; an ordered factor is also accepted.

The syntax of the vglm() function is very similar to the standard glm().
An important difference is that the weights argument, unlike in glm(), is not meant to be a vector of frequencies but a vector of weights defined a priori.

library(VGAM)
cheese <- read.table("cheese.dat.txt",
col.names=c("Cheese", "Response", "N"))
is.factor(cheese$Response)
cheese$Response<-factor(cheese$Response, ordered=TRUE)
# non-proportional odds: separate cheese effects for each cumulative logit
mod.sat<-vglm(Response~Cheese,cumulative,
weights=c(N+0.5),data=cheese)

# proportional odds: common cheese effects across the cumulative logits
mod.podds<-vglm(Response~Cheese,cumulative(parallel=TRUE),
weights=c(N+0.5),data=cheese)

summary(mod.sat)
summary(mod.podds)
# rows 1, 10, 19, 28 pick one row per cheese
matplot(t(mod.podds@predictors[seq(1,36,by=9),]),type="l",
ylab="Cumulative logits",main="Proportional odds model")
# Adding a legend would be useful here
matplot(t((exp(mod.podds@predictors)/(1+exp(mod.podds@predictors)))
[seq(1,36,by=9),]),type="l",ylab="Cumulative Probability Curves",
main="Proportional odds model")

In this case, a positive coefficient β means that increasing the value of X tends to lower the response category (i.e. produce greater dislike).
summary(mod.podds)

Call:
vglm(formula = Response ~ Cheese, family = cumulative(parallel = TRUE),
data = cheese, weights = c(N + 0.5))

Coefficients:
Estimate Std. Error z value
(Intercept):1 -4.84428 0.45697 -10.60089
(Intercept):2 -3.84779 0.37446 -10.27564
(Intercept):3 -2.86231 0.32751 -8.73959
(Intercept):4 -1.91322 0.29232 -6.54497
(Intercept):5 -0.73965 0.25589 -2.89044
(Intercept):6 0.10951 0.24755 0.44237
(Intercept):7 1.44853 0.28180 5.14020
(Intercept):8 2.89229 0.36928 7.83216
CheeseB 2.82260 0.38300 7.36978
CheeseC 1.44005 0.34794 4.13883
CheeseD -1.39122 0.35218 -3.95026

Residual deviance: 817.3119 on 277 degrees of freedom

Log-likelihood: -408.656 on 277 degrees of freedom

CheeseB 2.82260 0.38300 7.36978
CheeseC 1.44005 0.34794 4.13883
CheeseD -1.39122 0.35218 -3.95026

The second part of the output contains the coefficient estimates for the three dummy variables. The estimated slope for the first dummy variable, labeled CheeseB, is 2.82260. This indicates that cheese B does not taste as good as cheese A. Looking at all three coefficients, and noting that cheese A is the reference category, so that β2 compares cheese C to A and β3 cheese D to A, we see that the implied ordering of cheeses in terms of quality is D > A > C > B. Furthermore, D is significantly preferred to A (z = −3.95), and A is significantly preferred to C (z = 4.14).

The first part of the output includes the estimated intercepts. The first intercept is the estimated log-odds of falling into category 1 (strong dislike) versus all other categories when all X-variables are zero. Because X1 = X2 = X3 = 0 when cheese = A, the estimated odds of strong dislike for cheese A are exp(−4.84428) ≈ 0.008. From the above output, the first estimated logit equation is

$$\text{logit}[P(Y \le 1)] = \log\frac{P(Y \le 1)}{P(Y > 1)} = -4.84428 + 2.82260 X_1 + 1.44005 X_2 - 1.39122 X_3$$
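The fitted equations can be turned into probabilities by hand. The sketch below is an illustration in base R that copies the estimates from the summary(mod.podds) output above and does not refit the model; the object names are chosen here for convenience.

```r
# Estimates copied from the summary(mod.podds) output above
alpha <- c(-4.84428, -3.84779, -2.86231, -1.91322,
           -0.73965, 0.10951, 1.44853, 2.89229)
slope <- c(A = 0, B = 2.82260, C = 1.44005, D = -1.39122)

# Cumulative probabilities P(Y <= j) for each cheese (rows: j = 1, ..., 8)
cum_p <- sapply(slope, function(b) plogis(alpha + b))

# Category probabilities P(Y = j), j = 1, ..., 9, by differencing
cat_p <- rbind(cum_p, 1) - rbind(0, cum_p)
round(cat_p, 3)

# Cumulative odds ratio of cheese B versus cheese A (the same for every j)
exp(slope[["B"]])
```

Each column of cat_p sums to one, and the probability of strong dislike (first row) is ordered D < A < C < B, matching the ranking of the cheeses given in the text.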

Ordinal response variables: 2. Adjacent-Category Logits models

Adjacent-Category Logits models can be defined as:

$$\text{logit}[P(Y = j \mid Y = j \text{ or } j+1)] = \log\frac{\pi_j}{\pi_{j+1}}, \qquad j = 1, \ldots, J-1$$

$$\log\frac{\pi_j}{\pi_{j+1}} = \alpha_j + \beta x, \qquad j = 1, \ldots, J-1$$

with a common effect β.

In this case too, a set of J − 1 logits is defined, and from them it is possible to derive the logits for all the $\binom{J}{2}$ pairs of response categories.

An Adjacent-Category Logits model can be seen as a baseline logit model where the baseline changes for each category.

Job Satisfaction Example

The aim of this example is to study the relationship between job satisfaction (Very Dissatisfied, Little Satisfied, Moderately Satisfied, Very Satisfied) and income (< 5.000, 5.000 − 15.000, 15.000 − 25.000, > 25.000) stratified by gender (1 = female, 0 = male), for black Americans.

For simplicity, we use job satisfaction scores and income scores 1, 2, 3, 4.

The fitted model will be

$$\log(\pi_j/\pi_{j+1}) = \alpha_j + \beta_1 x + \beta_2 g, \qquad j = 1, 2, 3$$

It describes the odds of being very dissatisfied instead of a little satisfied, a little instead of moderately satisfied, and moderately instead of very satisfied. This model is equivalent to the baseline-category logit model with reference category 4, that is,

$$\log(\pi_j/\pi_4) = \alpha_j^* + \beta_1 (4 - j) x + \beta_2 (4 - j) g, \qquad j = 1, 2, 3$$
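The equivalence with the baseline-category form follows from summing adjacent logits: log(πj/π4) is the sum of the adjacent-categories logits from j to 3, so α*j = αj + . . . + α3 and each slope is multiplied by (4 − j). A minimal numerical check in base R, assuming made-up parameter values (not the fitted ones):

```r
# Hypothetical adjacent-categories parameters for J = 4
alpha <- c(-0.5, -0.6, 2.0)   # alpha_1, alpha_2, alpha_3
b1 <- -0.4                    # income effect
b2 <- 0.05                    # gender effect
x <- 3                        # income score
g <- 1                        # gender indicator

# Adjacent-categories logits log(pi_j / pi_{j+1}), j = 1, 2, 3
adj <- alpha + b1 * x + b2 * g

# Baseline-category logits log(pi_j / pi_4): sums of adjacent logits j, ..., 3
base <- rev(cumsum(rev(adj)))

# The same logits from the baseline-category form of the model
alpha_star <- rev(cumsum(rev(alpha)))
j <- 1:3
base_formula <- alpha_star + b1 * (4 - j) * x + b2 * (4 - j) * g
all.equal(base, base_formula)
```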

In order to fit an Adjacent-Category Logit model in R we specify the acat family, giving the link function applied to the ratios of the adjacent-category probabilities (loge) and parallel=TRUE, a logical indicating whether the terms in the formula are assumed to have equal coefficients across the logits.

table.7.8<-read.table("jobsat.txt", header=TRUE)
table.7.8$jobsatf<-ordered(table.7.8$jobsat,
labels=c("very diss","little sat","mod sat",
"very sat"))

# one row per income x gender combination, one column of counts per category
table.7.8a<- data.frame(expand.grid(income=1:4,
gender=c(1,0)),unstack(table.7.8,freq~jobsatf))

library(VGAM)

# reverse=TRUE gives log(P[Y=j]/P[Y=j+1]);
# in recent VGAM releases the log link may be named "loglink" instead of "loge"
fit.vglm<-vglm(cbind(very.diss,little.sat,
mod.sat,very.sat)~gender+income,
family= acat(link="loge",parallel=TRUE,reverse=TRUE),
data=table.7.8a)

summary(fit.vglm)

summary(fit.vglm)

Coefficients:
Estimate Std. Error z value
(Intercept):1 -0.550668 0.67945 -0.81046
(Intercept):2 -0.655007 0.52527 -1.24700
(Intercept):3 2.025934 0.57581 3.51842
gender 0.044694 0.31444 0.14214
income -0.388757 0.15465 -2.51372

Number of linear predictors: 3

Names of linear predictors:

log(P[Y=1]/P[Y=2]), log(P[Y=2]/P[Y=3]), log(P[Y=3]/P[Y=4])

Dispersion Parameter for acat family: 1

Residual deviance: 12.55018 on 19 degrees of freedom

The ML fit gives β̂1 = −0.389 (SE = 0.155) and β̂2 = 0.045 (SE = 0.314). For this parameterization, β̂1 < 0 means the odds of lower job satisfaction decrease as income increases. Given gender, the estimated odds of response in the lower of two adjacent categories are multiplied by exp(−0.389) = 0.68 for each category increase in income. The model describes 24 logits (three for each income × gender combination) with five parameters. Its deviance is G² = 12.6 with df = 19. This model, with a linear trend for the income effect and no interaction between income and gender, seems adequate.

Ordinal response variables: 3. Continuation-Ratio Logits

Continuation-ratio logits can be defined as

$$\log\frac{\pi_j}{\pi_{j+1} + \pi_{j+2} + \ldots + \pi_J}, \qquad j = 1, \ldots, J-1$$

or also

$$\log\frac{\pi_{j+1}}{\pi_1 + \pi_2 + \ldots + \pi_j}, \qquad j = 1, \ldots, J-1$$

They are useful when the response variable represents a sequential mechanism, such as survival as a function of age.

Let ωj = P(Y = j | Y ≥ j). Given the vector of explanatory variables x,

$$\omega_j(x) = \frac{\pi_j(x)}{\pi_j(x) + \ldots + \pi_J(x)}, \qquad j = 1, \ldots, J-1$$

and the continuation-ratio logits become ordinary logits $\log\left[\frac{\omega_j(x)}{1 - \omega_j(x)}\right]$.
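The link between the first definition and the conditional probabilities ωj can be verified numerically. A minimal base R sketch, assuming a made-up probability vector for J = 4 categories:

```r
# Hypothetical response probabilities for J = 4 ordered categories
p <- c(0.1, 0.3, 0.4, 0.2)
J <- length(p)

# omega_j = P(Y = j | Y >= j) for j = 1, ..., J - 1
tail_sum <- rev(cumsum(rev(p)))   # pi_j + ... + pi_J
omega <- p[-J] / tail_sum[-J]

# Continuation-ratio logits log(pi_j / (pi_{j+1} + ... + pi_J))
cr_logit <- log(p[-J] / tail_sum[-1])

# They coincide with the ordinary logits of omega_j
all.equal(cr_logit, qlogis(omega))
```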

Example: Streptococcus and tonsil size

The aim of the study is to investigate the relationship between tonsil size (Not enlarged, Enlarged, Greatly Enlarged) and the presence of Streptococcus (1 = yes, 0 = no). Let x be the indicator variable for the presence of Streptococcus pyogenes; then the continuation-ratio logit model is

$$\log\frac{\pi_1}{\pi_2 + \pi_3} = \alpha_1 + \beta x$$

$$\log\frac{\pi_2}{\pi_3} = \alpha_2 + \beta x$$

where the first equation estimates a common value of the cumulative odds ratio, while the second estimates a local odds ratio.

carrier<-c(1,0)
y1<-c(19,497)
y2<-c(29,560)
y3<-c(24,269)
tonsil<-cbind(carrier,y1,y2,y3)
tonsil<-as.data.frame(tonsil)
tonsil$carrier<-as.factor(tonsil$carrier)

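library(VGAM)
# Note: cratio() models P(Y > j | Y >= j), so its slope has the opposite sign
# to beta in the equations above; sratio() models P(Y = j | Y >= j) directly
fit.cratio<-vglm(cbind(y1,y2,y3)~carrier,
family=cratio(reverse=FALSE, parallel=TRUE),
data=tonsil)
summary(fit.cratio)
fitted(fit.cratio)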

The goodness of fit indicates that the fitted model is adequate (deviance 0.01, df = 1); β̂ = −0.528 (SE = 0.197).

For Streptococcus carriers, the estimated odds of having “Enlarged” rather than “Greatly Enlarged” tonsils are exp(−0.528) ≈ 0.59 times the odds for non-carriers.
Section 9b: The Bradley-Terry Model

Consider an experiment consisting of nij judges who compare pairs of items Ti, i = 1, . . . , M + 1. They express their preferences between Ti and Tj. Let $N = \sum\sum_{i<j} n_{ij}$ be the total number of pairwise comparisons, and assume independence for ratings of the same pair by different judges and for ratings of different pairs by the same judge.
A model describing this experiment was proposed by Bradley and Terry (1952) and Zermelo (1929). Let πi be the worth of item Ti,

$$P[T_i > T_j] = p_{i/ij} = \frac{\pi_i}{\pi_i + \pi_j}, \qquad i \ne j,$$

where Ti > Tj means i is preferred over j. Suppose that πi > 0. Let Yij be the number of times that Ti is preferred over Tj in the nij comparisons of the pair. Then $Y_{ij} \sim \text{Bin}(n_{ij}, p_{i/ij})$.

Maximum likelihood estimation of the parameters π1, . . . , πM+1 involves maximizing

$$\prod_{i<j}^{M+1} \binom{n_{ij}}{y_{ij}} \left(\frac{\pi_i}{\pi_i + \pi_j}\right)^{y_{ij}} \left(\frac{\pi_j}{\pi_i + \pi_j}\right)^{n_{ij} - y_{ij}}$$

By default, πM+1 ≡ 1 is used for identifiability; however, this can be changed very easily.
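The likelihood can be written down directly in base R. The following is only an illustrative sketch: the helper names bt_prob and bt_loglik, the win matrix y and the worth vector are all hypothetical, and this is not how VGAM or BradleyTerry2 implement the fit.

```r
# Win probabilities from a vector of worths (last worth fixed at 1)
bt_prob <- function(worth) {
  # entry [i, j] is P[T_i > T_j] = pi_i / (pi_i + pi_j)
  p <- outer(worth, worth, function(a, b) a / (a + b))
  diag(p) <- NA
  p
}

# Binomial log-likelihood; y[i, j] = wins of i over j, n[i, j] = y[i, j] + y[j, i]
bt_loglik <- function(worth, y) {
  p <- bt_prob(worth)
  n <- y + t(y)
  upper <- upper.tri(y)   # the pairs with i < j
  sum(dbinom(y[upper], n[upper], p[upper], log = TRUE))
}

# Hypothetical data for three items (rows are winners, columns are losers)
y <- matrix(c(NA, 6, 8,
              4, NA, 7,
              2, 3, NA), 3, 3, byrow = TRUE)
worth <- c(2.5, 1.6, 1)
bt_prob(worth)
bt_loglik(worth, y)
```

Maximizing bt_loglik over the first two worths, for instance with optim() on the log scale, would give the ML estimates for this toy table.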

Note that one can define linear predictors ηij of the form

$$\text{logit}\left(\frac{\pi_i}{\pi_i + \pi_j}\right) = \log\frac{\pi_i}{\pi_j} = \lambda_i - \lambda_j.$$

The VGAM framework can handle the Bradley-Terry model only for intercept-only models. It has

$$\lambda_j = \eta_j = \log \pi_j = \beta_{(1)j}, \qquad j = 1, \ldots, M.$$

As well as having many applications in the field of preferences, the Bradley-Terry model has many uses in modelling “contests” between teams i and j, where only one of the teams can win in each contest (ties are not allowed under the classical model).

The R package BradleyTerry by D. Firth can fit the Bradley-Terry model; see Firth (2005) for details.

Example: the brat() function in VGAM

Consider the effect of the food-enhancer monosodium glutamate (MSG) on the flavour of apple sauce in the data given in Table 4. Treatments 1, 2 and 3 are increasing amounts of the substance, and Treatment 4 is a control with no MSG. Four independent comparisons were made of each of the six pairs. We apply the VGAM family function brat(), which implements the Bradley-Terry model, to the apple sauce data.
library(VGAM)
amsg = matrix(c(NA, 3, 3, 3, 1, NA, 3, 4, 1, 1, NA, 0,
1, 0, 4, NA), 4, 4, byrow = TRUE)

dimnames(amsg) = list(winner = as.character(1:4),
loser = as.character(1:4))

fit = vglm(Brat(amsg) ~ 1, brat)

summary(fit)

The first argument has to be specified in Brat form: a matrix of counts, which is considered M by M in dimension when there are ties, and M + 1 by M + 1 when there are no ties. The rows are winners and the columns are losers, e.g., the 2-1 element is how many times Competitor 2 has beaten Competitor 1. The matrices are best labelled with the competitors’ names.

Coef(fit)

   alpha1    alpha2    alpha3
3.3576125 2.4456117 0.3693147

By default, the last competitor is the baseline, so that π4 ≡ 1. We have π̂1 ≈ 3.358, π̂2 ≈ 2.446, π̂3 ≈ 0.369; therefore we conclude that Treatment 1 is the most preferred, followed by Treatments 2 and 4, and lastly Treatment 3. It appears that the more MSG, the worse the taste; however, the control treatment tastes the second worst. Finally,
InverseBrat(fitted(fit))

           1         2         3         4
1         NA 0.5785771 0.9009064 0.7705165
2 0.42142293        NA 0.8688013 0.7097758
3 0.09909362 0.1311987        NA 0.2697077
4 0.22948346 0.2902242 0.7302923        NA

gives the estimated probabilities of Treatment i “beating” Treatment j, P̂[i > j].
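The table of fitted probabilities can be reproduced by hand from the reported estimates, which confirms that they are on the worth scale with Treatment 4 fixed at 1. A minimal base R sketch, copying the three estimates from the Coef(fit) output above:

```r
# Worth estimates reported by Coef(fit); Treatment 4 is the baseline (worth 1)
worth <- c("1" = 3.3576125, "2" = 2.4456117, "3" = 0.3693147, "4" = 1)

# P[i beats j] = worth_i / (worth_i + worth_j)
p_hat <- outer(worth, worth, function(a, b) a / (a + b))
diag(p_hat) <- NA
round(p_hat, 4)

# Ranking of the treatments by estimated worth
names(sort(worth, decreasing = TRUE))
```

For example, p_hat["1", "2"] = 3.3576 / (3.3576 + 2.4456), which agrees with the entry 0.5786 in the first row of the table.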

The Bradley-Terry Model: the BradleyTerry2 package

In some application contexts there may be “player-specific” explanatory variables available, and it is then natural to consider model simplification of the form

$$\lambda_i = \sum_{r=1}^{p} \beta_r x_{ir} + U_i$$

in which the ability of each player i is related to explanatory variables xi1, . . . , xip through a linear predictor with coefficients β1, . . . , βp; the {Ui} are independent errors. See, for example, Springall (1973). The difference in the abilities of player i and player j is modelled by

$$\lambda_i - \lambda_j = \sum_{r=1}^{p} \beta_r x_{ir} - \sum_{r=1}^{p} \beta_r x_{jr} + U_i - U_j$$

where Ui ∼ N(0, σ²) for all i. The Bradley-Terry model is then a generalized linear mixed model, which the BTm function currently fits using the penalized quasi-likelihood algorithm of Breslow and Clayton (1993).

The BTm function of the BradleyTerry2 package allows such models to be specified in a natural way using the standard S-language model formulae.
The simplest model, with just one predictor, asks for random effects specification.
Example: the BTm() function in BradleyTerry2

The following comes from page 448 of Agresti (2002), extracted from the larger table of Stigler (1994). The data are counts of citations among four prominent journals of statistics:
> data(citations)
> citations

         winner        loser Freq
1    Biometrika   Biometrika   NA
2  Comm Statist   Biometrika   33
3          JASA   Biometrika  320
4        JRSS-B   Biometrika  284
5    Biometrika Comm Statist  730
6  Comm Statist Comm Statist   NA
7          JASA Comm Statist  813
8        JRSS-B Comm Statist  276
9    Biometrika         JASA  498
10 Comm Statist         JASA   68
11         JASA         JASA   NA
12       JRSS-B         JASA  325
13   Biometrika       JRSS-B  221
14 Comm Statist       JRSS-B   17
15         JASA       JRSS-B  142
16       JRSS-B       JRSS-B   NA

Here ‘winner’ means the cited journal, ‘loser’ the journal in which the citation appears; thus, for example, Biometrika was cited 498 times by papers in JASA during the period under study.

The Bradley-Terry model can now be fitted by using the function BTm from the BradleyTerry2 package. Here we fit the INTERCEPT model and store the result as an object named citeModel:
> library(BradleyTerry2)

Convert frequencies to success/failure data:

> citations.sf <- countsToBinomial(citations)

> names(citations.sf)[1:2] <- c("journal1", "journal2")

> head(citations.sf)
      journal1     journal2 win1 win2
1   Biometrika Comm Statist  730   33
2   Biometrika         JASA  498  320
3   Biometrika       JRSS-B  221  284
4 Comm Statist         JASA   68  813
5 Comm Statist       JRSS-B   17  276
6         JASA       JRSS-B  142  325

Standard Bradley-Terry model fitted to these data

> citeModel <- BTm(cbind(win1, win2), journal1, journal2,

data = citations.sf)
> citeModel

Bradley Terry model fit by glm.fit

Call: BTm(outcome = cbind(win1, win2), player1 = journal1,

player2 = journal2, data = citations.sf)

Coefficients:
..Comm Statist ..JASA ..JRSS-B
-2.9491 -0.4796 0.2690

Degrees of Freedom: 6 Total (i.e. Null); 3 Residual

Null Deviance: 1925
Residual Deviance: 4.293 AIC: 46.39

The coefficients here are maximum likelihood estimates

of λ2 , λ3 , λ4 , with λ1 (the log-ability for Biometrika) set
to zero as an identifying convention.

If a different ‘reference’ journal is required, this can be achieved using the optional refcat argument: for example, making use of update to avoid re-specifying the whole model,

> update(citeModel, refcat = "JASA")

Bradley Terry model fit by glm.fit

Call: BTm(outcome = cbind(win1, win2), player1 = journal1,

player2 = journal2, refcat = "JASA", data = citations.sf)

Coefficients [contrasts: ..=contr.treatment ]:

..Biometrika ..Comm Statist ..JRSS-B
0.4796 -2.4695 0.7485

Degrees of Freedom: 6 Total (i.e. Null); 3 Residual

Null Deviance: 1925
Residual Deviance: 4.293 AIC: 46.39

It is the same model in a different parameterization.
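The equivalence of the two parameterizations can be checked by hand: changing the reference journal only subtracts a constant from every log-ability, so the win probabilities are unchanged. A minimal base R sketch, using the coefficients copied from the first fit above (results agree with the second output up to rounding of the displayed values):

```r
# Log-abilities from the fit with Biometrika as the reference category
lambda <- c("Biometrika" = 0, "Comm Statist" = -2.9491,
            "JASA" = -0.4796, "JRSS-B" = 0.2690)

# Changing the reference category subtracts the new reference's log-ability
lambda_jasa <- lambda - lambda[["JASA"]]
round(lambda_jasa, 4)

# Win probabilities depend only on differences of log-abilities
p_ref_biometrika <- plogis(outer(lambda, lambda, "-"))
p_ref_jasa <- plogis(outer(lambda_jasa, lambda_jasa, "-"))
all.equal(p_ref_biometrika, p_ref_jasa)
```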

BTm(outcome = cbind(win1, win2), player1 = journal1, player2 = journal2,

formula = ~journal, id = "journal", refcat = "JASA", data = citations.sf)

Example by Gioè-Guastella (a.a 2016/2017)

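library(VGAM)
library(BradleyTerry2)  # the football data set ships with this package

data(football)
f <- football[-1]  # drop the first column (season); keep home, away, result

# home x away counts of home wins (result == 1) and away wins (result == -1)
ff1 <- unclass(with(subset(f, result == 1), table(home, away)))
ff2 <- unclass(with(subset(f, result == -1), table(home, away)))

# Brat() expects rows = winners and columns = losers, so the away wins
# are transposed before being added to the home wins
ff4 <- ff1 + t(ff2)
dimnames(ff4) <- list(rownames(ff1), rownames(ff1))
diag(ff4) <- NA

fit <- vglm(Brat(ff4) ~ 1, brat)

# classifica: teams sorted by estimated log-ability
classifica <- sort(coef(fit))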
