Exercise 1

filepath <- "https://www.math.ntnu.no/emner/TMA4315/2018h/bonedata.txt"
bone <- read.table(filepath, header = TRUE)

library(lme4)
fit2 <- lmer(bone ~ redage + (1 + redage | boy), data = bone)
summary(fit2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: bone ~ redage + (1 + redage | boy)
##    Data: bone
## 
## REML criterion at convergence: 231
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.80183 -0.30917 -0.09912  0.39174  2.07317 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  boy      (Intercept) 6.2267   2.4953       
##           redage      1.2011   1.0960   0.13
##  Residual             0.1794   0.4236       
## Number of obs: 80, groups:  boy, 20
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  50.0713     0.5600  89.416
## redage        1.8630     0.2593   7.185
## 
## Correlation of Fixed Effects:
##        (Intr)
## redage 0.122

a)

Define

\[ \mathbf{y}_i = \begin{pmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \end{pmatrix} = \begin{pmatrix} \texttt{bone}_{i1} \\ \texttt{bone}_{i2} \\ \texttt{bone}_{i3} \\ \texttt{bone}_{i4} \\ \end{pmatrix}, \ X_i = \begin{pmatrix} 1 & \texttt{redage}_{i1} \\ 1 & \texttt{redage}_{i2} \\ 1 & \texttt{redage}_{i3} \\ 1 & \texttt{redage}_{i4} \\ \end{pmatrix}, \ \pmb{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \ U_i = X_i, \ \pmb{\gamma}_i = \begin{pmatrix} \gamma_{0i} \\ \gamma_{1i} \end{pmatrix}, \ \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \\ \end{pmatrix} \]

\(\mathbf{y}_i\) is the response for person \(i\), \(X_i\) is the design matrix for the fixed part, \(U_i\) for the random part. We get the model on matrix form:

\[\mathbf{y}_i = X_i \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + U_i \begin{pmatrix} \gamma_{0i} \\ \gamma_{1i} \end{pmatrix} + \varepsilon_i\]

The fitted values are in this model 20 non-parallel lines (as we have both random intercept and random slope).

library(reshape2); library(ggplot2)

fit2df <- data.frame(boy = paste0("boy", 1:20),
                     low = coef(fit2)$boy[,1] + coef(fit2)$boy[,2]*min(bone$redage),
                     high = coef(fit2)$boy[,1] + coef(fit2)$boy[,2]*max(bone$redage))
fit2df <- melt(fit2df, id.vars = "boy", value.name = "bone", variable.name = "redage")
fit2df$redage <- (as.numeric(fit2df$redage)-1)*(max(bone$redage)-min(bone$redage)) + min(bone$redage)
ggplot(data = fit2df, mapping = aes(x = redage, y = bone, colour = boy)) + geom_line()

Such model is appropriate when it is the distribution of the intercepts and slopes which is of primary interest, not the intercept and slope for particular units.

The model assumptions are that the random vectors \(\pmb{\gamma}_i\) and \(\varepsilon_i\) are independent and multivariate normal with expectation zero:

\[ \pmb{\gamma}_i \sim N_q(0, \mathbf{Q}) = N_q \left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_0^2 & \tau_{01} \\ \tau_{10} & \tau_1^2 \end{pmatrix} \right) \]

\[\varepsilon_i \sim N(0, \Sigma_i) \]

where \(\Sigma_i\) can be general, but is often on the form \(\sigma^2I\) where \(I\) is the identity matrix.

b)

Since \(\mathbf{b}_i\) and \(\varepsilon_i\) are independent and both are multivariate normal, \(\mathbf{y}_i\) is also multivariate normal.

The expected value of \(\mathbf{y}_i\) is E\((X_i\beta + U_i \pmb{\gamma}_i + \varepsilon_i) =\) E\((X_i\beta) + Z_i \cdot 0 + 0 = X_i\beta\). \(X_i\) is the design matrix for person \(i\).

As \(\pmb{\gamma}_i\) and \(\varepsilon_i\) are independent, the covariance matrix of the response is

\[V_i = Cov(\mathbf{y}_i) = Cov(U_i \pmb{\gamma}_i) + Cov(\varepsilon_i) = U_i Cov(\pmb{\gamma}_i)U_i^T + Cov(\varepsilon_i) = U_i\mathbf{Q}U_i^T + \Sigma_i\]

\(U_i = X_i\) is already defined and known, the values of \(\hat{\beta}\) are known, \(\hat{\Sigma_i}\) is \(I_4\) times the residual variance (0.1794), and \(\hat{\mathbf{Q}}\) is found by using the correlation of the random effects and their standard deviations under “Random effects” in the summary.

X_i <- U_i <- matrix(c(1, 1, 1, 1, -0.75, -0.25, 0.25, 0.75), ncol = 2)
beta <- matrix(fixef(fit2), ncol = 1)
Q <- matrix(summary(fit2)$varcor$boy, ncol = 2)
sigma_i <- sigma(fit2)^2 * diag(4)

y_mean <- X_i %*% beta

y_cov <- U_i %*% Q %*% t(U_i) + sigma_i

c)

Since \(\mathbf{y}_1, \dots, \mathbf{y}_{20}\) are independent and only \(\mathbf{y}_i\) is correlated with \(\pmb{\gamma}_i\), E\((\pmb{\gamma}_i | \mathbf{y}_1, \dots, \mathbf{y}_{20}) =\) E\((\pmb{\gamma}_i | \mathbf{y}_i)\).

But \((\pmb{\gamma}_i, \mathbf{y}_i)^T\) is multivariate normal with expectation \((0, X_i\beta)^T\) and covariance matrix

\[\begin{pmatrix} Q & QU_i^T \\ U_iQ & U_iQU_i^T + \Sigma_i \end{pmatrix}\]

Hence

\[E(\pmb{\gamma}_i | \mathbf{y}_i) = 0 + QU_i^T(U_iQU_i^T + \Sigma_i)^{-1}(\mathbf{Y}_i - X_i\beta) \]

and thus \(\pmb{\gamma}_i\) is estimated by plugging the REML estimates into

\[\hat{Q}U_i^T(U_i\hat{Q}U_i^T + \hat{\Sigma}_i)^{-1}(\mathbf{Y}_i - X_i\hat{\beta})\]

# choosing i = 2
i <- 2
Y_i <- as.matrix(subset(bone, boy == i)$bone)


Q %*% t(U_i) %*% solve(U_i %*% Q %*% t(U_i) + sigma_i) %*% (Y_i - X_i %*% 
    beta)
##            [,1]
## [1,] -2.6033335
## [2,] -0.5358597
# can check and see if the estimate coincides with the estimate from
# lmer
ranef(fit2)$boy[which(row.names(ranef(fit2)$boy) == i), ]  # and it does
##   (Intercept)     redage
## 2   -2.603334 -0.5358597
library(mvtnorm)
x1 <- seq(-6, 6, length.out = 100)
x2 <- seq(-3, 3, length.out = 100)
x <- expand.grid(x1, x2)
y <- dmvnorm(x, mean = c(0, 0), sigma = Q)

df <- data.frame(x1 = x[, 1], x2 = x[, 2], y = y)

ggplot(data = df, aes(x = x1, y = x2)) + geom_raster(aes(fill = y)) + 
    scale_fill_gradientn(colours = terrain.colors(10)) + geom_contour(aes(z = y), 
    color = "black") + geom_point(aes(x = 0, y = 0), pch = 4, col = "red", 
    size = 3)

Exercise 2

## Linear mixed model fit by REML ['lmerMod']
## Formula: ccpd ~ YEAR + AVETD + (1 + YEAR | fstate)
##    Data: medicare
## 
## REML criterion at convergence: 5185
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.1929 -0.6034  0.0180  0.6183  3.5134 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  fstate   (Intercept) 5811948  2410.8       
##           YEAR          69022   262.7   0.42
##  Residual              184566   429.6       
## Number of obs: 324, groups:  fstate, 54
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  7419.85     386.05  19.220
## YEAR          706.04      39.55  17.850
## AVETD2        567.72     183.92   3.087
## AVETD3       1008.34     244.25   4.128
## 
## Correlation of Fixed Effects:
##        (Intr) YEAR   AVETD2
## YEAR    0.170              
## AVETD2 -0.488  0.168       
## AVETD3 -0.468  0.239  0.781

a)

\[y_i = X_i\beta + U_i\gamma_i + \varepsilon_i, \ i = 1, \dots, 54 \]

where

\[X_i = \begin{pmatrix} 1 & 1 & \mathbf{1}_{AVETD \in \{7, 8, 9\}} & \mathbf{1}_{AVETD \in \{10, 11, \dots\}} \\ 1 & 2 & \mathbf{1}_{AVETD \in \{7, 8, 9\}} & \mathbf{1}_{AVETD \in \{10, 11, \dots\}} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 6 & \mathbf{1}_{AVETD \in \{7, 8, 9\}} & \mathbf{1}_{AVETD \in \{10, 11, \dots\}} \\ \end{pmatrix}\]

and

\[U_i = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ \vdots & \vdots \\ 1 & 6 \end{pmatrix}\]

and has dimenstions \(6 \times 4\) and \(6 \times 2\), respectively. The indicator function \(\mathbf{1}_A\) is defines as 1 when \(A\) is true, and 0 else. The fixed effects parameters are collected in the \(4 \times 1\) vector \(\beta = (\beta_0, \beta_1, \beta_2, \beta_3)^T\), and the random effects in the \(2 \times 1\) vector \(\gamma_i = (\gamma_{0i}, \gamma_{1i})^T\), \(i = 1, \dots, 54\). \(\gamma_i\) is multinormal (2-dimensional) with expectation \((0,0)^T\) and covariance matrix \(Q\), and is independent of the errors \(\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{i6}\) where all elements are \(N(0, \sigma^2)\).

b)

\[ \frac{\hat{\beta}_1-\beta_1}{\widehat{SD(\hat{\beta_1})}} \approx N(0, 1) \]

Hence, we get \(706.04 \pm 1.96 \cdot 39.55 = [628.522, 783.558]\).

c)

A model not containing the random effect YEAR is a simplification of the covariance structure. This can be performed by fitting models containing YEAR and not containing YEAR and comparing the values of \(-2 \log(L)\). But the approximating distribution is a linear combination of \(\chi^2\)-distributions, in this case \(\frac{1}{2}\chi_1^2 + \frac{1}{2}\chi_2^2\). We can calculate the p-values for each distribution individually and then sum them and divide by 2. Note that with very small p-values we do not have these problems with incorrect p-values; we reject \(H_0\) either way. NOTE: We ask about this so you are aware that we cannot blindly use p-values like in linear models, but the details with the \(\chi^2\)-distributions are outside the scope of the course.

# In R:
fit1h0 <- lmer(ccpd ~ YEAR + AVETD + (1 | fstate), data = medicare)
fit1h1 <- lmer(ccpd ~ YEAR + AVETD + (1 + YEAR | fstate), data = medicare)

cat("H0: ", -2 * c(logLik(fit1h0)), ", H1: ", -2 * c(logLik(fit1h1)), 
    sep = "")

statistic <- -2 * c(logLik(fit1h0)) + 2 * c(logLik(fit1h1))

cat("Statistic = ", statistic, ", p-value = 0.5*", pchisq(statistic, 
    1, lower.tail = FALSE), " + 0.5*", pchisq(statistic, 2, lower.tail = FALSE), 
    " = ", 0.5 * pchisq(statistic, 1, lower.tail = FALSE) + 0.5 * pchisq(statistic, 
        2, lower.tail = FALSE), sep = "")
## H0: 5325.549, H1: 5184.98Statistic = 140.5687, p-value = 0.5*1.9992e-32 + 0.5*2.991555e-31 = 1.595738e-31

d)

The covariance matrix of \(y_i\) is Cov\((U_i\gamma_i + \varepsilon_i) = U_i Cov(\gamma_i)U_i^T + \sigma^2 I_6 = U_iQU_i^T + \sigma^2I_6\) which becomes

\[ \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ \vdots & \vdots \\ 1 & 6 \\ \end{pmatrix} \begin{pmatrix} \tau_0^2 & \tau_{01} \\ \tau_{10} & \tau_1^2 \end{pmatrix} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 6 \end{pmatrix} + \sigma^2I_6 \]

e)

The hypothesis implies a simplification of the fixed effect structure. This can be performed by fitting the model from part a) by maximum likelihood, and also the simplified model

\[ y_{ij} = \beta_0 + j \times \beta_1 + \beta_2(\texttt{AVETD2} + 2\texttt{AVETD3}) + \gamma_{0i} + j \times \gamma_{1i} + \varepsilon_{ij}, \]

\[ j = 1, \dots, 324, \ i = 1, \dots, 54 \]

also by maximum likelihood. Then you compare the values of \(-2 \log(L)\). The approximating distribution is a \(\chi_1^2\)-distribution, since the hypothesis represents one restriction (i.e., restrictions on only one parameter, in this case \(\beta_3\) (or \(\beta_2\))).

A Wald test is also possible, with \(d = 0\), \(\beta = (\beta_1, \beta_2)^T\) and \(C = (-2, 1)\), and the statistic is assumed \(\chi_1^2\)-distributed.

medicare$AVETD_H0 <- with(medicare, as.numeric(AVETD == 2) + 2 * as.numeric(AVETD == 
    3))
fit1h0 <- lmer(ccpd ~ YEAR + AVETD_H0 + (1 + YEAR | fstate), data = medicare)
fit1h1 <- lmer(ccpd ~ YEAR + AVETD + (1 + YEAR | fstate), data = medicare)

anova(fit1h0, fit1h1)  # default is to refit model(s) using ML, not REML
## Data: medicare
## Models:
## fit1h0: ccpd ~ YEAR + AVETD_H0 + (1 + YEAR | fstate)
## fit1h1: ccpd ~ YEAR + AVETD + (1 + YEAR | fstate)
##        Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
## fit1h0  7 5245.9 5272.3 -2615.9   5231.9                         
## fit1h1  8 5247.6 5277.8 -2615.8   5231.6 0.2967      1      0.586
d <- 0
C <- c(-2, 1)
beta <- fixef(fit1)[3:4]
Sigma <- vcov(fit1)[3:4, 3:4]

wald <- as.numeric(t(C %*% beta - d) %*% solve(C %*% Sigma %*% C) %*% 
    (C %*% beta - d))
pval <- pchisq(wald, 1, lower.tail = FALSE)

cat("Wald = ", wald, ", p-value = ", pval, sep = "")
## Wald = 0.2956192, p-value = 0.5866419

Both tests give p-values indicating that we keep the null-hypothesis, i.e., replace \(\beta_3\) by \(2\beta_2\).