
LOGISTIC REGRESSION

• Logistic regression is used to analyze relationships between a dichotomous dependent variable and metric or dichotomous independent variables.

• The variate or value produced by logistic regression is a probability value between 0.0 and 1.0.

• If the probability for group membership in the modeled category is above some cut point (the default is 0.50), the subject is predicted to be a member of the modeled group.
THE “LOGIT” MODEL EQUATION:

• ln[p/(1-p)] = a + BX
• or
• p/(1-p) = e^(a + BX)
• Where:
• “ln” is the natural logarithm, log_e, where e = 2.71828
• “p” is the probability that Y for cases equals 1, p(Y=1)
• “1-p” is the probability that Y for cases equals 0, 1 – p(Y=1)
• “p/(1-p)” is the odds
• ln[p/(1-p)] is the log odds, or “logit”
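The relationship between probability, odds, and log odds, together with the cut-point rule, can be sketched in a few lines of Python. This is a minimal illustration, not SPSS's implementation: the function names and the coefficients a = -1.0 and B = 0.5 are assumptions chosen only for the example.

```python
import math

def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1 - p))

def inverse_logit(log_odds):
    """Recover the probability from the log odds: p = odds / (1 + odds)."""
    odds = math.exp(log_odds)
    return odds / (1 + odds)

def predict_group(p, cut_point=0.50):
    """Return 1 (modeled group) when p is above the cut point, else 0."""
    return 1 if p > cut_point else 0

# Hypothetical coefficients and predictor value, for illustration only.
a, B = -1.0, 0.5
x = 3.0
p = inverse_logit(a + B * x)  # a + BX = 0.5 on the log-odds scale
print(round(p, 3), predict_group(p))
```

Because the logit and its inverse undo each other, any value of a + BX maps to a probability strictly between 0 and 1, which is why the model output can be compared directly with the cut point.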
BEGINNING LOGISTIC REGRESSION MODEL
• The SPSS output for logistic regression begins with output for a model that contains no independent variables. It labels this output "Block 0: Beginning Block" and (if we request the optional iteration history) reports the initial -2 Log Likelihood, which we can think of as a measure of the error associated with trying to predict the dependent variable without using any information from the independent variables.
• The initial -2 log likelihood is 213.891.

We will not routinely request the iteration history because it does not usually give us additional useful information.
ENDING LOGISTIC REGRESSION MODEL

• After the independent variables are entered in Block 1, the -2 log likelihood is again measured (180.267 in this problem).
• The difference between ending and beginning -2
log likelihood is the model chi-square that is
used in the test of overall statistical significance.
• In this problem, the model chi-square is 33.625 (213.891 – 180.267, allowing for rounding), which is statistically significant at p < 0.001.
CASE- ANIMAL RESEARCH
GENDER AS THE PREDICTOR VARIABLE
We can now use this model to predict the odds that a subject of a given gender will decide to continue the research. The odds prediction equation is odds = e^(a + B × gender). If our subject is a woman (gender = 0), then the odds are .429. That is, a woman is only .429 times as likely to decide to continue the research as she is to decide to stop the research.
If our subject is a man (gender = 1), then the odds are 1.448. That is, a man is 1.448 times as likely to decide to continue the research as to decide to stop the research.
The ratio of the two odds tells us that the model predicts that the odds of deciding to continue the research are 3.376 times as high for men as they are for women.
We can easily convert odds to probabilities:
• P = odds/(1 + odds)
For women, P = .429/1.429 = 0.30. That is, our model predicts that 30% of women will decide to continue the research.
For men, P = 1.448/2.448 = 0.59. That is, our model predicts that 59% of men will decide to continue the research.
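The conversion from odds to probabilities can be checked with a short Python sketch. The odds values .429 and 1.448 come from the slides; the function and variable names are illustrative assumptions.

```python
def odds_to_probability(odds):
    """Convert odds to a probability: P = odds / (1 + odds)."""
    return odds / (1 + odds)

odds_women = 0.429  # predicted odds of continuing, gender = 0
odds_men = 1.448    # predicted odds of continuing, gender = 1

p_women = odds_to_probability(odds_women)
p_men = odds_to_probability(odds_men)
odds_ratio = odds_men / odds_women

print(round(p_women, 2), round(p_men, 2), round(odds_ratio, 3))
```

Dividing the two rounded odds gives a ratio of about 3.375, which differs slightly from the 3.376 quoted on the slide, presumably because that figure was computed from unrounded coefficients.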
INTERPRETATION
• Our model leads to the prediction that the probability of
deciding to continue the research is 30% for women and 59%
for men.
SENSITIVITY
• Sensitivity: It refers to the proportion of true positives or the proportion
of cases correctly identified by the test as meeting a certain condition.
• P (correct prediction | event did occur), i.e., the percentage of
occurrences correctly predicted
• P (predict Continue | subject voted to Continue)
• Of all those who voted to continue the research, for how many did we correctly predict that?
• Sensitivity = 68 / (68 + 60) = 68 / 128 = 53%
SPECIFICITY
• Specificity: It refers to the proportion of true negatives or the proportion of
cases correctly identified by the test as not meeting a certain condition.
• P (correct prediction | event did not occur)
• P (predict Stop | subject voted to Stop)
• Of all those who voted to stop the research, for how many did we correctly predict that?
• Specificity = 140 / (140 + 47) = 140 / 187 = 75%
MULTIPLE PREDICTORS
• Here we will also add the two emotional factors idealism and relativism.
• Persons who score high on the relativism dimension of this instrument reject the notion of universal moral principles, preferring personal and situational analysis of behavior.
• Persons who score high on the idealism dimension believe that ethical behaviour will always lead only to good consequences, never to bad consequences, and never to a mixture of good and bad consequences.
• With gender as the only predictor of the decision variable, we had -2LL = 399.913.
• After adding idealism and relativism, the -2LL has dropped to 346.503, a drop of 53.41.
• χ²(2) = 399.913 – 346.503 = 53.41
To compare each of the cosmetic, theory, meat, and veterinary groups with the medical group, set up a dummy variable for each of the groups except the medical group.

• THE BLOCK 0 “VARIABLES NOT IN THE EQUATION” SHOW HOW MUCH THE -2LL WOULD DROP
IF A SINGLE PREDICTOR WERE ADDED TO THE MODEL (WHICH ALREADY HAS THE INTERCEPT)
DECISION =
IDEALISM, RELATIVISM, GENDER, PURPOSE

• Need 4 dummy variables to code the five purposes.


• Consider the Medical group a reference group.
• Dummy variables are: Cosmetic, Theory, Meat, Veterin.
• 0 = not in this group, 1 = in this group.
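The dummy coding scheme can be sketched in Python as follows. The dummy variable names come from the slide; the lowercase purpose labels used as input are hypothetical, since the slides do not show how purpose is stored in the data file.

```python
# Dummy variable names as listed on the slide (Medical is the reference group).
DUMMY_NAMES = ["Cosmetic", "Theory", "Meat", "Veterin"]

# Hypothetical purpose labels mapped to their dummy variable.
PURPOSE_TO_DUMMY = {
    "cosmetic": "Cosmetic",
    "theory": "Theory",
    "meat": "Meat",
    "veterinary": "Veterin",
    "medical": None,  # reference group: all four dummies are 0
}

def dummy_code(purpose):
    """Return the four 0/1 dummy variables for one subject's purpose."""
    target = PURPOSE_TO_DUMMY[purpose]
    return {name: int(name == target) for name in DUMMY_NAMES}

print(dummy_code("meat"))
print(dummy_code("medical"))
```

Because the medical group scores 0 on every dummy, each dummy's coefficient compares its group against the medical group.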
In the Classification Table, we see a small increase in our overall success rate, from 71%
to 72%.
• Sensitivity = 74/128 = 58%
• Specificity = 152/187 = 81%
• False Positive Rate = 35/109 = 32%
• False Negative Rate = 54/206 = 26%
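The four rates can be recomputed from the underlying counts, which the fractions above imply: 74 correct and 54 incorrect predictions among the 128 who voted to continue, and 152 correct and 35 incorrect among the 187 who voted to stop. Note that the false positive and false negative rates here are conditioned on the prediction (35 of 109 predicted Continue, 54 of 206 predicted Stop), which differs from the definition FP/(FP + TN) used in some other texts. The function name is an assumption for illustration.

```python
def classification_rates(tp, fn, tn, fp):
    """Rates as defined on the slides; tp/fn/tn/fp are cell counts."""
    return {
        "sensitivity": tp / (tp + fn),          # correct among actual Continue
        "specificity": tn / (tn + fp),          # correct among actual Stop
        "false_positive_rate": fp / (tp + fp),  # wrong among predicted Continue
        "false_negative_rate": fn / (tn + fn),  # wrong among predicted Stop
        "overall_correct": (tp + tn) / (tp + fn + tn + fp),
    }

rates = classification_rates(tp=74, fn=54, tn=152, fp=35)
for name, value in rates.items():
    print(f"{name}: {value:.0%}")
```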
CLASSIFICATION DECISION RULE
• Analyze, Regression, Binary Logistic
• Options
• Classification Cutoff = .4, Continue, OK
Value                  Cutoff = .5   Cutoff = .4
Sensitivity            58%           74%
Specificity            81%           71%
False Positive Rate    32%           36%
False Negative Rate    26%           19%
Overall % Correct      72%           73%
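Why lowering the cutoff trades specificity for sensitivity can be seen in a small Python sketch. The predicted probabilities and outcomes below are made up for illustration and are not the study data; only the two cutoff values come from the slide.

```python
def sens_spec(probs, actual, cutoff):
    """Classify with p > cutoff and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, actual):
        predicted = 1 if p > cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predicted probabilities and actual decisions (1 = Continue).
probs = [0.90, 0.70, 0.55, 0.45, 0.42, 0.30, 0.60, 0.45, 0.35, 0.20, 0.10, 0.05]
actual = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

sens_50, spec_50 = sens_spec(probs, actual, cutoff=0.5)
sens_40, spec_40 = sens_spec(probs, actual, cutoff=0.4)
print(f"cutoff .5: sensitivity {sens_50:.0%}, specificity {spec_50:.0%}")
print(f"cutoff .4: sensitivity {sens_40:.0%}, specificity {spec_40:.0%}")
```

With the lower cutoff, more subjects are predicted to continue, so more true continuers are caught (higher sensitivity) at the cost of more subjects who voted to stop being misclassified (lower specificity). This matches the direction of change in the table.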
