Comments and errata list for Bingham and Fry (2010): Regression
This page is for use with the course TMA4267 Linear Statistical Models at NTNU.
Some of the comments below concern my view on how the topic should be organized. In particular, I find Chapter 3 hard to lecture, and have chosen to lecture 4.3-4.4 before Chapter 3. In addition, I have restructured 3.4-3.6 in my lectures.
If you have comments or want to report errors, please email the lecturer at Mette.Langaas@math.ntnu.no.
General remarks
- Note that a full stop is used as the product symbol, so that "A*B" would read "A.B". This may at first look a bit puzzling, in particular when used in a product of two fractions.
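  For example, with this convention a product of two fractions is written
  \[ \frac{a}{b}.\frac{c}{d} \quad \text{meaning} \quad \frac{a}{b}\cdot\frac{c}{d} = \frac{ac}{bd}. \]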
- As an estimator for the variance, "S^2" is used, with the definition "S^2 = (1/n) S_XX". That is, not the unbiased version with the factor 1/(n-1).
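  Written out, assuming S_XX denotes the sum of squared deviations:
  \[ S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad \text{as opposed to the unbiased} \quad \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2. \]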
Chapter 1: Linear Regression
- Page 5: "Thus the slope b is given by the sample correlation coefficient" is not very precise. The slope b is a scaled version of the sample correlation coefficient. The sample correlation coefficient is defined on page 7.
- Page 5: Note 1.1. Very strange notation for the confidence intervals for the intercept and slope. Suggestion: remove the "a=" and the "b=" at the start of the two equations, so that only the limits of the confidence intervals are given.
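  A sketch of the intended form (standard; with SE denoting the relevant standard errors from the book):
  \[ b \pm t_{n-2}(\alpha/2)\,\mathrm{SE}(b) \quad \text{and} \quad a \pm t_{n-2}(\alpha/2)\,\mathrm{SE}(a). \]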
- Page 20: a little more than halfway down, in the expression after "so M(t1,t2) is equal to" there is a factor rho missing from the second term. Reads "-t2*(sigma2/sigma1)*mu1" but should read "-t2*(sigma2/sigma1)*rho*mu1".
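  As a check, the derivation should end at the standard bivariate normal MGF:
  \[ M(t_1,t_2) = \exp\!\Big(t_1\mu_1 + t_2\mu_2 + \tfrac{1}{2}\big(\sigma_1^2 t_1^2 + 2\rho\sigma_1\sigma_2 t_1 t_2 + \sigma_2^2 t_2^2\big)\Big). \]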
Chapter 2: The Analysis of Variance
- Page 37: f(x) for the Fisher distribution. The Beta function arguments should be "B(m/2,n/2)" and not "B(n/2,n/2)".
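  For reference, the standard form of the density of the Fisher F(m,n) distribution is
  \[ f(x) = \frac{(m/n)^{m/2}\,x^{m/2-1}}{B(m/2,n/2)\,\big(1+\tfrac{m}{n}x\big)^{(m+n)/2}}, \qquad x>0. \]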
- Page 38: Theorem 2.2 works with N(0,sigma^2) data, but the proof below is for N(0,1) data. Just make the f(x) in the proof into a N(0,sigma^2) density by adding a factor 1/sigma in the normalizing constant and a factor 1/sigma^2 in the exponent; this then also carries over to f(y).
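  Concretely, the corrected joint density for N(0,sigma^2) data is
  \[ f(\mathbf{x}) = \Big(\frac{1}{\sigma\sqrt{2\pi}}\Big)^{n}\exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}x_i^2\Big). \]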
- Page 38: I found the first four lines of the proof difficult to read because of the mix of notation. The determinant is previously denoted det(A), while |A| is used for the absolute value of A, but here |A| is probably used for the determinant. There is also an inconsistency with respect to the absolute value. Suggestion for an alternative writing of line 3 of the proof: 1=det(A)*det(A^T)=[det(A)]^2, so that det(A)=det(A^T)=+-1, and abs(det(A))=1.
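  Typeset, the suggested line reads:
  \[ 1 = \det(A)\det(A^{T}) = [\det(A)]^{2}, \quad \text{so} \quad \det(A)=\det(A^{T})=\pm 1 \quad \text{and} \quad |\det(A)|=1. \]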
- Page 40: The equation marked (*) should read "Z^TAZ+Z^TBZ" and not "Z^TAZ+Z^TBX". It is possible to make the proof a bit simpler, not using the A and B notation. Come to class if you would like to see the simplified version.
- Page 41: Theorem 2.6 ii should read S^2/sigma^2 (the sigma^2 is missing); this is apparent from the proof below.
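  Presumably the intended statement is the standard result (with S^2 = (1/n) S_XX as in the General remarks above):
  \[ \frac{nS^2}{\sigma^2} \sim \chi^2_{n-1}. \]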
- Page 43: In the H0 equation mu_i=mu the parameter mu is not previously defined. The population grand mean mu is the weighted average of the population group means mu_i, mu=sum(n_i*mu_i)/n.
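  Typeset, with k groups of sizes n_1, ..., n_k and n = n_1 + ... + n_k:
  \[ \mu = \frac{1}{n}\sum_{i=1}^{k} n_i\,\mu_i. \]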
Chapter 3: Multiple Regression
Multiple linear regression is covered, but in many places information from 4.3-4.4 is needed to perform the calculations or to understand the implications of Chapter 3. I would suggest that you read 4.3-4.4 before Chapter 3.
In addition, please observe that in TMA4267 we use Cov(Y) to mean the variance-covariance matrix of the random vector Y, while the book uses var(Y) for the same matrix.
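That is, for a random vector Y with mean vector mu,
\[ \mathrm{Cov}(Y) = \mathrm{E}\big[(Y-\mu)(Y-\mu)^{T}\big]. \]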
- Page 61: In the second last equation the epsilon_i should be epsilon, since index i is not used.
- Page 70: You need to know Proposition 4.4 on page 106 to find E(betahat) and Cov(betahat).
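  For reference, these are the standard results the calculation gives, assuming Y = X*beta + epsilon with E(epsilon) = 0, Cov(epsilon) = sigma^2*I, and betahat = (X^T X)^{-1} X^T Y:
  \[ \mathrm{E}(\hat{\beta}) = \beta, \qquad \mathrm{Cov}(\hat{\beta}) = \sigma^{2}(X^{T}X)^{-1}. \]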
- Page 86-88: here we need to have read 4.3 and 4.4 to know about the multivariate normal distribution and the property that, for the multivariate normal, zero covariance implies independence.
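  The property in question: if X_1 and X_2 are jointly multivariate normal and Cov(X_1, X_2) = 0, then X_1 and X_2 are independent. Note that joint normality is essential here; zero covariance does not imply independence in general.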
Chapter 4:
- Page 111: the notation Sigma>0 means that Sigma is positive definite.
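  That is, \( x^{T}\Sigma x > 0 \) for all vectors \( x \neq 0 \).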
- Page 115: Last line of proof of 4.18: last dx should be dy.
- Page 116: in Example 4.19 we will use that the diagonal element of Sigma is sigma_11 and not sigma_11^2, and further that sigma_1^2=sigma_11. This is common notation in other textbooks.
- Page 119: In the proof of 4.26 there is an equation starting with E(BX)=B(EX); the (mu_1 mu_2) vector should have a transpose, (mu_1 mu_2)^T, to make it a column vector of dimension p times 1.