Calculates imputations for univariate missing data by Bayesian linear regression, also known as the normal model.

Usage

mice.impute.norm(y, ry, x, wy = NULL, ...)

Arguments

y

Vector to be imputed

ry

Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. ry generally distinguishes the observed (TRUE) from the missing (FALSE) values in y.

x

Numeric design matrix with length(y) rows containing the predictors for y. Matrix x must not contain missing values.

wy

Logical vector of length length(y). A TRUE value indicates a location in y for which an imputation is created. If NULL (the default), imputations are created for the entries where ry is FALSE.

...

Other named arguments.

Value

Vector with imputed data, same type as y, and of length sum(wy)
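As an illustration of how the arguments fit together, the following sketch calls the function directly on simulated data. The variable names, sample size, and coefficients are assumptions made for this example only; in ordinary use the function is invoked through mice() rather than called by hand.

```r
library(mice)

set.seed(123)
n <- 50

# Hypothetical complete predictors (x must not contain missing values)
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))

# Hypothetical outcome with 10 values set to missing
y <- 1 + 2 * x[, "x1"] - x[, "x2"] + rnorm(n)
y[sample(n, 10)] <- NA

# ry flags the observed entries; wy is left at its default (NULL)
ry <- !is.na(y)

# One draw of imputed values for the missing entries of y
imp <- mice.impute.norm(y, ry, x)

# Fill the imputations into a copy of y
y_completed <- y
y_completed[!ry] <- imp
```

Because the imputations are random draws, repeating the call without resetting the seed gives different values.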

Details

Imputes y under the normal model, using the method defined by Rubin (1987, p. 167). The procedure is as follows:

  1. Calculate the cross-product matrix \(S=X_{obs}'X_{obs}\).

  2. Calculate \(V = (S+\mathrm{diag}(S)\kappa)^{-1}\), with some small ridge parameter \(\kappa\).

  3. Calculate regression weights \(\hat\beta = VX_{obs}'y_{obs}.\)

  4. Draw a random variable \(\dot g \sim \chi^2_\nu\) with \(\nu=n_1 - q\), where \(n_1\) is the number of observed values and \(q\) is the number of columns of \(X_{obs}\).

  5. Calculate \(\dot\sigma^2 = (y_{obs} - X_{obs}\hat\beta)'(y_{obs} - X_{obs}\hat\beta)/\dot g.\)

  6. Draw \(q\) independent \(N(0,1)\) variates in vector \(\dot z_1\).

  7. Calculate \(V^{1/2}\) by Cholesky decomposition.

  8. Calculate \(\dot\beta = \hat\beta + \dot\sigma\dot z_1 V^{1/2}\).

  9. Draw \(n_0\) independent \(N(0,1)\) variates in vector \(\dot z_2\), where \(n_0\) is the number of values to impute.

  10. Calculate the \(n_0\) values \(y_{imp} = X_{mis}\dot\beta + \dot z_2\dot\sigma\).
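The ten steps above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the function name impute_norm_sketch, the default value of kappa, and the simulated data are assumptions, and the design matrix is expected to already contain an intercept column if one is wanted.

```r
# Illustrative sketch of steps 1-10; impute_norm_sketch and the default
# kappa are hypothetical and not part of the mice package.
impute_norm_sketch <- function(y, ry, x, wy = !ry, kappa = 1e-5) {
  x <- as.matrix(x)
  x_obs <- x[ry, , drop = FALSE]
  y_obs <- y[ry]
  n1 <- sum(ry)      # number of observed values
  n0 <- sum(wy)      # number of values to impute
  q <- ncol(x_obs)   # number of columns in the design matrix

  s <- crossprod(x_obs)                              # step 1: S = X'X
  v <- solve(s + diag(diag(s), nrow = q) * kappa)    # step 2: ridge-stabilised inverse
  beta_hat <- v %*% crossprod(x_obs, y_obs)          # step 3: regression weights
  g <- rchisq(1, df = n1 - q)                        # step 4: chi-square draw
  res <- y_obs - x_obs %*% beta_hat
  sigma_dot <- sqrt(sum(res^2) / g)                  # step 5: drawn residual sd
  z1 <- rnorm(q)                                     # step 6
  v_half <- chol((v + t(v)) / 2)                     # step 7: upper factor R, R'R = V
  beta_dot <- beta_hat + sigma_dot * t(v_half) %*% z1  # step 8: drawn coefficients
  z2 <- rnorm(n0)                                    # step 9
  as.vector(x[wy, , drop = FALSE] %*% beta_dot + z2 * sigma_dot)  # step 10
}

# Usage on simulated data (all values here are made up for illustration)
set.seed(1)
n <- 200
x <- cbind(1, rnorm(n))                # intercept column plus one predictor
y <- 2 + 3 * x[, 2] + rnorm(n)
y[1:40] <- NA
ry <- !is.na(y)
y_imp <- impute_norm_sketch(y, ry, x)
```

Drawing both \(\dot\sigma\) and \(\dot\beta\) before generating the imputations is what carries the uncertainty about the regression parameters into the imputed values, in addition to the residual noise added in step 10.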

Using mice.impute.norm for all columns emulates Schafer's NORM method (Schafer, 1997).
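A short sketch of this use, assuming the nhanes example data shipped with mice; the number of imputations and the seed are arbitrary choices made for illustration.

```r
library(mice)

# Apply the "norm" method to every incomplete column of nhanes
imp <- mice(nhanes, method = "norm", m = 5, seed = 1, printFlag = FALSE)

# Extract the first completed data set
completed <- complete(imp, 1)
```

Since the normal model treats every variable as continuous, imputed values may fall outside the range or the set of values observed in the data.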

References

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons.

Schafer, J.L. (1997). Analysis of Incomplete Multivariate Data. London: Chapman & Hall.

Author

Stef van Buuren, Karin Groothuis-Oudshoorn