
Logistic Regression Using SPSS for Physical Therapy Research

Presented by: I Putu Gde Surya Adhitya, S.Ft., M.Fis., Ph.D.
Department of Physical Therapy, College of Medicine, Universitas Udayana
Overview
• Brief introduction to logistic regression.
• Logistic regression analysis using SPSS for physical therapy research.
Brief Introduction to Logistic Regression

Logistic Regression

- Logistic regression is used to predict a categorical (often
dichotomous) variable from a set of predictor variables.

- The predicted value from a logistic regression is a function of the
probability that a given subject falls into one of the categories.
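That predicted probability can be sketched in a few lines of Python. The intercept and coefficient below are made-up values for illustration only, not estimates from any dataset:

```python
import math

def predicted_probability(intercept, coefs, values):
    """Logistic model: p = 1 / (1 + e^-(b0 + b1*x1 + ... + bk*xk))."""
    logit = intercept + sum(b * x for b, x in zip(coefs, values))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical intercept and single predictor value, for illustration only.
p_injury = predicted_probability(-2.0, [0.1], [25.0])
```

The output is always between 0 and 1, which is what lets it be read as the probability of falling into the "event" category.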
Logistic Regression - Examples

- A researcher wants to understand whether exam performance (passed or
failed) can be predicted based on revision time, test anxiety, and
lecture attendance.

- A researcher wants to understand whether drug use (yes or no) can be
predicted based on prior criminal convictions, drug use among friends,
income, age, and gender.

- A researcher wants to understand whether ACL injury (yes or no) can
be predicted based on quadriceps muscle strength, hamstring muscle
strength, type of sport, age, gender, and body mass index.
Logistic Regression - Assumptions

1. The dependent variable should be measured on a dichotomous scale.

2. There should be one or more independent variables, which can be
either categorical or continuous.

3. The categories of the dependent variable should be mutually
exclusive and exhaustive, and the observations should be independent.

4. Any continuous independent variable must be linearly related to the
logit transformation of the dependent variable => Box-Tidwell test.
Box-Tidwell Test

- We add to the model the interaction between each continuous
predictor and its natural log.

- If an interaction term is statistically significant, the
corresponding continuous independent variable is not linearly related
to the logit of the dependent variable.

- With large samples, a significant interaction may reflect only a
trivial departure from linearity, so it is less of a concern.
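As a sketch, the extra Box-Tidwell covariate for each continuous predictor is just the value multiplied by its own natural log (the values below are illustrative, not from the dataset):

```python
import math

def box_tidwell_terms(x_values):
    """Build the x * ln(x) interaction covariate used in the
    Box-Tidwell test for one continuous predictor."""
    return [x * math.log(x) for x in x_values]

# Illustrative ages only.
ages = [20.0, 35.0, 50.0]
age_ln_age = box_tidwell_terms(ages)
```

This new column is entered into the model alongside the original predictor, and only its significance is inspected.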
Performing the Analysis Using SPSS for Physical Therapy Research

Dataset

Please download the dataset using this link:

https://drive.google.com/file/d/1rXPcbkYIHN5iY0yEgo07DDh-v2xAE8H4/view?usp=sharing

and open it in SPSS.


Dataset
1) The dependent variable, Meniscus, which indicates whether the
participant has a meniscus injury;

2) The independent variable, age, which is the participant's age in
years;

3) The independent variable, body mass index, which is the
participant's weight (kg) divided by height (m)²;

4) The independent variable, sex, which has two categories: "Male" and
"Female";

5) The independent variable, education, which has two categories:
"college or above" and "senior high school";

6) The independent variable, KOOSADLbaseline, which is the
participant's activities of daily living (ADL) score from 0 to 100; a
higher score represents better ADL;

7) The independent variable, KOOSsportbaseline, which is the
participant's sport function score from 0 to 100; a higher score
represents better sport function.
Click Transform > Compute Variable:

- We want to compute the natural log of each continuous independent
variable, in our case: age, BMI, KOOSADLbaseline, and
KOOSsportbaseline.

- For the age variable: type LN_age in Target Variable and LN(age) in
Numeric Expression.

- Repeat the same procedure for the other three variables.
Click Analyze > Regression > Binary Logistic
Univariate
• In univariate analysis, a single variable is examined at a time.
• For example, the analysis might look at the variable "age", or it
might look at "BMI" or "education".
• It does not look at more than one variable at a time; otherwise it
becomes bivariate analysis (or, with three or more variables,
multivariate analysis).
• Variables are usually carried forward into the multivariate analysis
when a p-value < 0.2 is observed in the univariate analysis.
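The p < 0.2 screening rule can be expressed as a simple filter. The p-values below are invented for illustration; only the rule itself comes from the slide:

```python
# Hypothetical univariate p-values for each candidate predictor.
univariate_p = {
    "age": 0.15,
    "BMI": 0.01,
    "sex": 0.18,
    "education": 0.45,
    "KOOSADLbaseline": 0.03,
}

# Carry a variable into the multivariate model only if p < 0.2.
selected = [name for name, p in univariate_p.items() if p < 0.2]
```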
Multivariate

Multivariate analysis is used to study more complex sets of data than
univariate analysis methods can handle.

Multivariate analysis can reduce the likelihood of Type I errors
(false positives).
For the Box-Tidwell test

- Add the interaction term between each continuous IV and its log.
In the Logistic Regression window: click on Categorical

- Transfer the categorical independent variable, gender, from the
Covariates: box to the Categorical Covariates: box, change the
reference category to First, and then click Change.
In the Logistic Regression window: click on Options

- Check the appropriate statistics and plots needed for the analysis.
SPSS output for the Box-Tidwell test

- If none of the interaction terms is significant, redo the analysis
without them:
Redo the analysis: Click Analyze > Regression > Binary Logistic
Remove interaction terms from covariates:
SPSS output
This part of the output tells you about the cases that were included
and excluded from the analysis, the coding of the dependent variable,
and coding of any categorical variables listed on the categorical
subcommand.
SPSS output – Block 0
This part of the output describes the "null model", which is a model
with no predictors, just the intercept. This is why all of the
variables you put into the model appear in the table titled "Variables
not in the Equation".
SPSS output – Block 1
This section contains what is frequently the most interesting part of
the output: the overall test of the model (in the "Omnibus Tests of
Model Coefficients" table) and the coefficients and odds ratios (in
the "Variables in the Equation" table).

The overall model is statistically significant, χ2(5) = 11.35, p = .045.

SPSS output – Block 1

This table contains the Cox & Snell R Square and Nagelkerke R Square
values, which are both methods of estimating the explained variation.

These values are sometimes referred to as pseudo R² values (and tend
to be lower than the R² values seen in multiple regression).

They are interpreted in a similar manner, but with more caution.

Therefore, the explained variation in the dependent variable based on
our model ranges from 8.1% to 10.8%, depending on whether you
reference the Cox & Snell R2 or Nagelkerke R2 methods, respectively.
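Both pseudo R² values are computed from the null and fitted log-likelihoods. The sketch below shows the standard formulas; the log-likelihood values are invented for illustration (chosen only so the model chi-square is near the 11.35 reported here), not taken from the SPSS output:

```python
import math

def pseudo_r2(ll_null, ll_model, n):
    """Cox & Snell and Nagelkerke pseudo R^2 from log-likelihoods."""
    cox_snell = 1.0 - math.exp((2.0 / n) * (ll_null - ll_model))
    max_cox_snell = 1.0 - math.exp((2.0 / n) * ll_null)  # upper bound of Cox & Snell
    nagelkerke = cox_snell / max_cox_snell
    return cox_snell, nagelkerke

# Illustrative log-likelihoods and sample size only.
cs, nk = pseudo_r2(ll_null=-96.0, ll_model=-90.3, n=140)
```

Because Nagelkerke rescales Cox & Snell by its maximum attainable value, it is always the larger of the two, which matches the 8.1% vs. 10.8% ordering above.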
SPSS output – Block 1

The Hosmer-Lemeshow test evaluates the null hypothesis that the
model's predictions fit the observed group memberships. A chi-square
statistic is computed comparing the observed frequencies with those
expected under the logistic model. A nonsignificant chi-square
indicates that the data fit the model well.
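As a sketch of how that chi-square is formed: cases are grouped (typically into deciles of predicted risk), and observed events are compared with expected events in each group; the result is referred to a chi-square distribution with g - 2 degrees of freedom. The groups below are invented for illustration:

```python
def hosmer_lemeshow(groups):
    """Hosmer-Lemeshow chi-square from (n, observed_events,
    mean_predicted_probability) tuples, one per risk group."""
    stat = 0.0
    for n, observed, p in groups:
        expected = n * p
        stat += (observed - expected) ** 2 / (expected * (1.0 - p))
    return stat

# Invented groups where observed exactly matches expected => statistic 0.
perfect_fit = hosmer_lemeshow([(10, 5, 0.5), (10, 2, 0.2)])
```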
SPSS output – Block 1

Logistic regression estimates the probability of an event (in this case, having
meniscus injury) occurring. If the estimated probability of the event occurring is
greater than or equal to 0.5 (better than even chance), SPSS Statistics classifies the
event as occurring (e.g., meniscus injury being present).

If the probability is less than 0.5, SPSS Statistics classifies the event as not occurring
(e.g., no meniscus injury).

It is very common to use binomial logistic regression to predict whether cases can be
correctly classified (i.e., predicted) from the independent variables. Therefore, it
becomes necessary to have a method to assess the effectiveness of the predicted
classification against the actual classification.
SPSS output – Block 1

- With the independent variables added, the model now correctly
classifies 62.9% of cases overall (see the "Overall Percentage" row)
=> percentage accuracy in classification.

- 58.5% of participants who had a meniscus injury were also predicted
by the model to have a meniscus injury (see the "Percentage Correct"
column in the "Yes" row of the observed categories) => sensitivity.

- 66.7% of participants who did not have a meniscus injury were
correctly predicted by the model not to have a meniscus injury (see
the "Percentage Correct" column in the "No" row of the observed
categories) => specificity.
SPSS output – Block 1

- The positive predictive value is the percentage of correctly
predicted cases with the observed characteristic out of the total
number of cases predicted as having the characteristic. In our case,
this is 100 x (50 ÷ (27 + 50)), which is 64.9%. That is, of all cases
predicted as having a meniscus injury, 64.9% were correctly predicted.

- The negative predictive value is the percentage of correctly
predicted cases without the observed characteristic out of the total
number of cases predicted as not having the characteristic. In our
case, this is 100 x (38 ÷ (38 + 25)), which is 60.3%. That is, of all
cases predicted as not having a meniscus injury, 60.3% were correctly
predicted.
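The accuracy, PPV, and NPV all follow from the four cells of the classification table; plugging in the counts quoted on this slide reproduces the percentages:

```python
def classification_metrics(tp, fp, tn, fn):
    """Overall accuracy, positive predictive value, and negative
    predictive value (as percentages) from a 2x2 classification table."""
    total = tp + fp + tn + fn
    accuracy = 100.0 * (tp + tn) / total
    ppv = 100.0 * tp / (tp + fp)
    npv = 100.0 * tn / (tn + fn)
    return accuracy, ppv, npv

# Cell counts from the slide: 50 correct "yes", 27 incorrect "yes",
# 38 correct "no", 25 incorrect "no".
acc, ppv, npv = classification_metrics(tp=50, fp=27, tn=38, fn=25)
```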
SPSS output – Block 1

• From these results you can see that BMI (p = .004) added
significantly to the model/prediction, but age (p = .662), sex (p
= .457), KOOSADLbaseline (p = .343), and KOOSsportbaseline (p = .808)
did not.

• Interpretation: the table shows that the odds of having a meniscus
injury ("yes" category) are 1.163 times greater for every 1-unit
increase in BMI, after adjusting for the other variables (age, sex,
KOOS ADL, and KOOS sport).
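The 1.163 figure is Exp(B), i.e. e raised to the BMI coefficient, so the odds multiplier for a larger BMI change is simply a power of it:

```python
import math

exp_b_bmi = 1.163              # adjusted odds ratio for BMI, Exp(B) from the output
b_bmi = math.log(exp_b_bmi)    # the raw logistic coefficient B
five_unit_or = exp_b_bmi ** 5  # odds multiplier for a 5-unit BMI increase
```

For example, under this model a participant with a BMI 5 units higher has a little over twice the odds of a meniscus injury, holding the other variables constant.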
APA-style write-up

• A logistic regression was performed to ascertain the effects of age,
BMI, sex, KOOSADLbaseline, and KOOSsportbaseline on the likelihood
that participants have a meniscus injury. The logistic regression
model was statistically significant, χ2(5) = 11.35, p = .045.

• Participants were 1.163 times more likely to exhibit a meniscus
injury for every 1-unit increase in BMI, after adjusting for the other
variables (age, sex, KOOS ADL, and KOOS sport).
