Daren Tan | 20 Aug 08:34

Understanding output of summary(glm(...))


Simple example of 5 groups of 4 replicates.

>set.seed(5)

>tmp <- rnorm(20)

>gp <- as.factor(rep(1:5,each=4))

>summary(glm(tmp ~ -1 + gp, data=data.frame(tmp, gp)))$coefficients
         Estimate Std. Error       t value  Pr(>|t|)
gp1 -0.1604613084  0.4899868 -0.3274809061 0.7478301
gp2  0.0002487984  0.4899868  0.0005077655 0.9996016
gp3  0.0695463698  0.4899868  0.1419352018 0.8890200
gp4 -0.6121682841  0.4899868 -1.2493567852 0.2306791
gp5 -0.6999545014  0.4899868 -1.4285171713 0.1736348

>m <- data.frame(tmp, gp)
>sapply(gp, function(x) sd(m[m[,"gp"]==x,1]))
 [1] 1.169284 1.169284 1.169284 1.169284 1.142974 1.142974 1.142974 1.142974
 [9] 0.862423 0.862423 0.862423 0.862423 0.535740 0.535740 0.535740 0.535740
[17] 1.047538 1.047538 1.047538 1.047538

Why doesn't the standard deviation of each group correlate with Pr? For example, gp = 4 has the smallest sd
(0.535740), but its Pr is not the lowest (0.23 vs. 0.1736 for gp = 5).

Another example with new tmp1

>tmp1
 [1]  9.577969  9.310792  9.666767  9.610164 10.181692 10.155899 10.025943
 [8]  9.971243 10.177766  9.265793  9.415818 10.099874 10.238829  9.575591
[15]  9.560879  9.617891  9.617891 10.158160 10.592377 10.068443

>summary(glm(tmp1 ~ -1 + age, data=data.frame(as.vector(as.matrix(tmp1)), age)))$coefficients
      Estimate Std. Error  t value     Pr(>|t|)
age1  9.541423  0.1611603 59.20456 3.380085e-19
age2 10.083694  0.1611603 62.56935 1.479781e-19
age3  9.739813  0.1611603 60.43557 2.485380e-19
age4  9.748297

Bill.Venables | 20 Aug 09:03

Re: Understanding output of summary(glm(...))

The 'Std. Error' values listed in the coefficients table of the summary
have nothing to do with the sub-class standard deviations.  They are the
standard errors associated with the estimates of the class means (the
way you have fitted the model) and as the design has equal replication
and the estimated standard errors are based on the pooled estimate of
variance from all samples, they are equal.  That's why.
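
The point about the pooled estimate can be checked by hand. The following is a minimal sketch, assuming the same seed and grouping as in the original post; the names group_sd, pooled_var, se_pooled and se_model are illustrative and not from the thread. With equal replication the pooled variance is simply the mean of the per-group variances, and the standard error of each class mean is the square root of that value divided by the group size.

```r
set.seed(5)
tmp <- rnorm(20)
gp <- factor(rep(1:5, each = 4))

# Per-group standard deviations: these differ from group to group
group_sd <- tapply(tmp, gp, sd)

# Pooled variance: with equal replication, the mean of the group variances
n_per_group <- 4
pooled_var <- mean(group_sd^2)

# Standard error of a class mean, based on the pooled variance
se_pooled <- sqrt(pooled_var / n_per_group)

# Standard errors reported by the fitted model
fit <- glm(tmp ~ -1 + gp)
se_model <- coef(summary(fit))[, "Std. Error"]

print(group_sd)
print(se_pooled)
print(se_model)

# The model's standard errors all equal the pooled value
print(all.equal(unname(se_model), rep(se_pooled, 5)))
```

Because every coefficient shares the same standard error, the t values and hence Pr(>|t|) differ only through the size of each estimate, not through the spread within that particular group.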

Your second 'example' was incomplete and I couldn't follow it, but the
answer is almost certainly "hell no!".

Finally, a question for you.  Why do you use glm(...) when all you are
doing is fitting linear models?  Either lm(...) or aov(...) would have
been much more sensible.  
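
As a rough illustration of that remark, the sketch below fits the same model with glm(...) and lm(...) and compares the coefficient tables. It assumes the data from the first example; the names dat, fit_glm and fit_lm are illustrative. With the default gaussian family, glm(...) is expected to reproduce the lm(...) results here.

```r
set.seed(5)
tmp <- rnorm(20)
gp <- factor(rep(1:5, each = 4))
dat <- data.frame(tmp, gp)

# glm() defaults to the gaussian family with identity link
fit_glm <- glm(tmp ~ -1 + gp, data = dat)

# The same model fitted directly as a linear model
fit_lm <- lm(tmp ~ -1 + gp, data = dat)

# Coefficient tables: Estimate, Std. Error, t value, Pr(>|t|)
tab_glm <- coef(summary(fit_glm))
tab_lm <- coef(summary(fit_lm))

print(tab_lm)
print(all.equal(tab_glm, tab_lm, check.attributes = FALSE))

# aov() gives the usual one-way analysis of variance table
print(summary(aov(tmp ~ gp, data = dat)))
```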

Bill Venables
http://www.cmis.csiro.au/bill.venables/ 

-----Original Message-----
From: r-help-bounces <at> r-project.org [mailto:r-help-bounces <at> r-project.org]
On Behalf Of Daren Tan
Sent: Wednesday, 20 August 2008 4:37 PM
To: r-help <at> stat.math.ethz.ch
Subject: [R] Understanding output of summary(glm(...))

Simple example of 5 groups of 4 replicates.

>set.seed(5)

>tmp <- rnorm(20)
