Imputation by a two-level normal model using pan

Imputes univariate missing data using a two-level normal model with homogeneous within-group variances. Aggregated group effects (i.e., group means) can be created automatically and included as predictors in the two-level regression (see argument type). This function requires the pan package.

Usage

mice.impute.2l.pan(
  y,
  ry,
  x,
  type,
  intercept = TRUE,
  paniter = 500,
  groupcenter.slope = FALSE,
  ...
)

Arguments

y: Incomplete data vector of length n
ry: Vector of missing data pattern (FALSE=missing, TRUE=observed)
x: Matrix (n x p) of complete covariates.
type: Vector of length ncol(x) identifying random and class variables. Random effects are identified by a '2'. The group variable (only one is allowed) is coded as '-2'. A variable with a random effect also enters the model as a fixed effect. To have the group means of a covariate X1 calculated and included as additional fixed effects, code it as '3'. Code '4' does the same as '3' and additionally includes random effects of X1.
intercept: Logical determining whether the intercept is automatically added.
paniter: Number of iterations in pan. Default is 500.
groupcenter.slope: If TRUE and group means are requested (type is '3' or '4'), these predictors are group-mean centered before the imputations are generated. Default is FALSE.
...: Other named arguments.
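
The following base R sketch illustrates what the aggregated group effects of types '3' and '4' and the centering requested by groupcenter.slope amount to. It is not the function's internal code; the toy data and the names group, x1, x1_mean and x1_centered are hypothetical.

```r
# Hypothetical toy data: 3 groups with 2 observations each
group <- rep(1:3, each = 2)
x1 <- c(1, 3, 2, 6, 5, 5)

# Group mean of x1, repeated on every row of its group
# (the extra fixed-effect predictor described for types '3' and '4')
x1_mean <- ave(x1, group)

# Group-mean centered x1 (the transformation described for
# groupcenter.slope = TRUE)
x1_centered <- x1 - x1_mean

# A matching type vector for x = cbind(group, x1) would be, for example:
type <- c(group = -2, x1 = 3)

cbind(group, x1, x1_mean, x1_centered)
```

With type '3' the group mean enters only as a fixed effect; with type '4' the covariate additionally receives a random slope.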

Value

A vector of length nmis with imputations.

Details

Implements the Gibbs sampler for the linear two-level model with homogeneous within-group variances, which is a special case of a multivariate linear mixed-effects model (Schafer & Yucel, 2002). For two-level imputation with heterogeneous within-group variances, see mice.impute.2l.norm.
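
For orientation, the univariate two-level model can be written as follows. The notation is an assumption made here for illustration and is not taken from the package documentation: i indexes observations within group j, x_ij holds the fixed-effect predictors, and z_ij holds the predictors with random effects.

```latex
y_{ij} = x_{ij}^{\top}\beta + z_{ij}^{\top} b_j + \varepsilon_{ij},
\qquad b_j \sim N(0, \Psi),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)
```

"Homogeneous within-group variances" means that the residual variance \sigma^2 is the same in every group; a heterogeneous model would replace it with a group-specific \sigma_j^2.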

Note

This function does not implement the where functionality. It always produces nmis imputations, irrespective of the where argument of the mice function.
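
As a minimal illustration, assuming a hypothetical incomplete vector y whose response indicator ry simply flags the observed entries, the number of imputations returned follows from ry alone:

```r
# Hypothetical incomplete vector
y <- c(1.2, NA, 0.7, NA, NA, 2.1)

ry <- !is.na(y)    # TRUE = observed, FALSE = missing
nmis <- sum(!ry)   # length of the imputation vector the function returns
nmis
```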

References

Schafer, J. L., Yucel, R. M. (2002). Computational strategies for multivariate linear mixed-effects models with missing values. Journal of Computational and Graphical Statistics, 11, 437-457.

Van Buuren, S., Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. doi:10.18637/jss.v045.i03

Author

Alexander Robitzsch (IPN - Leibniz Institute for Science and Mathematics Education, Kiel, Germany), robitzsch@ipn.uni-kiel.de

Examples

library(mice)

# simulate some data
# two-level regression model with fixed slope

# number of groups
G <- 250
# number of persons
n <- 20
# regression parameter
beta <- .3
# intraclass correlation
rho <- .30
# correlation with missing response
rho.miss <- .10
# missing proportion
missrate <- .50
y1 <- rep(rnorm(G, sd = sqrt(rho)), each = n) + rnorm(G * n, sd = sqrt(1 - rho))
x <- rnorm(G * n)
y <- y1 + beta * x
dfr0 <- dfr <- data.frame("group" = rep(1:G, each = n), "x" = x, "y" = y)
dfr[rho.miss * x + rnorm(G * n, sd = sqrt(1 - rho.miss)) < qnorm(missrate), "y"] <- NA

# empty imputation in mice
imp0 <- mice(as.matrix(dfr), maxit = 0)
predM <- imp0$predictorMatrix
impM <- imp0$method

# specify predictor matrix and method
predM1 <- predM
predM1["y", "group"] <- -2
predM1["y", "x"] <- 1 # fixed x effects imputation
impM1 <- impM
impM1["y"] <- "2l.pan"

# multilevel imputation
imp1 <- mice(as.matrix(dfr),
  m = 1, predictorMatrix = predM1,
  method = impM1, maxit = 1
)
#> 
#>  iter imp variable
#>   1   1  y

# multilevel analysis
library(lme4)
#> Loading required package: Matrix
#> 
#> Attaching package: ‘Matrix’
#> The following objects are masked from ‘package:tidyr’:
#> 
#>     expand, pack, unpack
mod <- lmer(y ~ (1 + x | group) + x, data = complete(imp1))
#> boundary (singular) fit: see help('isSingular')
summary(mod)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y ~ (1 + x | group) + x
#>    Data: complete(imp1)
#> 
#> REML criterion at convergence: 13017.8
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -3.4070 -0.6606 -0.0021  0.6676  3.8857 
#> 
#> Random effects:
#>  Groups   Name        Variance  Std.Dev. Corr
#>  group    (Intercept) 0.3033140 0.55074      
#>           x           0.0001242 0.01115  1.00
#>  Residual             0.7051156 0.83971      
#> Number of obs: 5000, groups:  group, 250
#> 
#> Fixed effects:
#>             Estimate Std. Error t value
#> (Intercept) -0.02473    0.03680  -0.672
#> x            0.29373    0.01208  24.315
#> 
#> Correlation of Fixed Effects:
#>   (Intr)
#> x 0.057 
#> optimizer (nloptwrap) convergence code: 0 (OK)
#> boundary (singular) fit: see help('isSingular')
#> 

# Examples of predictorMatrix specification

# random x effects
# predM1["y","x"] <- 2

# fixed x effects and group mean of x
# predM1["y","x"] <- 3

# random x effects and group mean of x
# predM1["y","x"] <- 4
