Thursday, 5 June 2014

Logistic regression in R

> library("MASS")
> data(menarche)

> str(menarche)
'data.frame':   25 obs. of  3 variables:
 $ Age     : num  9.21 10.21 10.58 10.83 11.08 ...
 $ Total   : num  376 200 93 120 90 88 105 111 100 93 ...
 $ Menarche: num  0 0 0 2 2 5 10 17 16 29 ...


> summary(menarche)
      Age            Total           Menarche    
 Min.   : 9.21   Min.   :  88.0   Min.   :   0.00
 1st Qu.:11.58   1st Qu.:  98.0   1st Qu.:  10.00
 Median :13.08   Median : 105.0   Median :  51.00
 Mean   :13.10   Mean   : 156.7   Mean   :  92.32
 3rd Qu.:14.58   3rd Qu.: 117.0   3rd Qu.:  92.00
 Max.   :17.58   Max.   :1049.0   Max.   :1049.00

> plot(Menarche/Total ~ Age, data=menarche)


> glm.out = glm(cbind(Menarche, Total - Menarche) ~ Age, family = binomial(logit), data = menarche)
> glm.out

Call:  glm(formula = cbind(Menarche, Total - Menarche) ~ Age, family = binomial(logit),
    data = menarche)

Coefficients:
(Intercept)          Age
    -21.226        1.632

Degrees of Freedom: 24 Total (i.e. Null);  23 Residual
Null Deviance:      3694
Residual Deviance: 26.7         AIC: 114.8
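
The two coefficients are on the log-odds scale, so they are easier to read after a small transformation. The following is a minimal sketch, not part of the original session: the names `fit`, `odds_ratio` and `age50` are made up for illustration. It refits the same model, exponentiates the Age slope to get the multiplicative change in the odds per additional year, and solves for the age at which the fitted probability is 0.5.

```r
library(MASS)   # provides the menarche data set
data(menarche)

# Same model as above, stored under a hypothetical name
fit <- glm(cbind(Menarche, Total - Menarche) ~ Age,
           family = binomial(logit), data = menarche)

b <- coef(fit)

# Multiplicative change in the odds of menarche per extra year of age
odds_ratio <- exp(b[["Age"]])

# Age at which the fitted probability equals 0.5:
# logit(0.5) = 0, so intercept + slope * age = 0
age50 <- -b[["(Intercept)"]] / b[["Age"]]

print(odds_ratio)
print(age50)
```

With the estimates printed above (-21.226 and 1.632), the odds ratio comes out at roughly 5.1 per year and the 50% point at about 13.0 years, which agrees with the median of the Age column.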


> plot(Menarche/Total ~ Age, data=menarche)
> lines(menarche$Age, fitted(glm.out), col="red")
> title(main="Menarche Data with Fitted Logistic Regression Line")
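
The red line above joins the fitted values at the 25 observed ages, which works here because the rows are already sorted by Age. A possible alternative, sketched below under the assumption that a smoother curve is wanted, is to predict on a fine grid of ages with `predict(..., type = "response")`. The names `fit`, `age_grid` and `p_grid` are hypothetical and not from the original session.

```r
library(MASS)
data(menarche)

fit <- glm(cbind(Menarche, Total - Menarche) ~ Age,
           family = binomial(logit), data = menarche)

# Fine, evenly spaced grid spanning the observed ages
age_grid <- seq(min(menarche$Age), max(menarche$Age), length.out = 200)

# type = "response" returns probabilities rather than log-odds
p_grid <- predict(fit, newdata = data.frame(Age = age_grid), type = "response")

plot(Menarche / Total ~ Age, data = menarche)
lines(age_grid, p_grid, col = "red")
title(main = "Menarche Data with Fitted Logistic Regression Line")
```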






> summary(glm.out)

Call:
glm(formula = cbind(Menarche, Total - Menarche) ~ Age, family = binomial(logit),
    data = menarche)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.0363  -0.9953  -0.4900   0.7780   1.3675

Coefficients:
             Estimate Std. Error z value Pr(>|z|)  
(Intercept) -21.22639    0.77068  -27.54   <2e-16 ***
Age           1.63197    0.05895   27.68   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3693.884  on 24  degrees of freedom
Residual deviance:   26.703  on 23  degrees of freedom
AIC: 114.76

Number of Fisher Scoring iterations: 4
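
For grouped binomial data like these, the residual deviance can be used as a rough goodness-of-fit check by comparing it with a chi-squared distribution on the residual degrees of freedom. The sketch below is an addition to the original session, and the names `fit`, `p_gof` and `dispersion` are made up for illustration. The chi-squared approximation is only a guide, particularly for age groups with very few or very many events.

```r
library(MASS)
data(menarche)

fit <- glm(cbind(Menarche, Total - Menarche) ~ Age,
           family = binomial(logit), data = menarche)

# Upper-tail probability of the residual deviance (26.703 on 23 df above)
p_gof <- pchisq(deviance(fit), df.residual(fit), lower.tail = FALSE)

# Ratio of residual deviance to its degrees of freedom;
# values far above 1 would hint at overdispersion
dispersion <- deviance(fit) / df.residual(fit)

print(p_gof)
print(dispersion)
```

With the deviance reported above, the p-value is roughly 0.27 and the ratio is about 1.16, so there is no strong evidence of lack of fit.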
