Skip to contents

Imputes univariate missing data using linear discriminant analysis

Usage

mice.impute.lda(y, ry, x, wy = NULL, ...)

Arguments

y

Vector to be imputed

ry

Logical vector of length length(y) indicating the the subset y[ry] of elements in y to which the imputation model is fitted. The ry generally distinguishes the observed (TRUE) and missing values (FALSE) in y.

x

Numeric design matrix with length(y) rows with predictors for y. Matrix x may have no missing values.

wy

Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.

...

Other named arguments. Not used.

Value

Vector with imputed data, of type factor, and of length sum(wy)

Details

Imputation of categorical response variables by linear discriminant analysis. This function uses the Venables/Ripley functions lda() and predict.lda() to compute posterior probabilities for each incomplete case, and draws the imputations from this posterior.

This function can be called from within the Gibbs sampler by specifying "lda" in the method argument of mice(). This method is usually faster and uses fewer resources than calling the function, but the statistical properties may not be as good (Brand, 1999). mice.impute.polyreg.

Warning

The function does not incorporate the variability of the discriminant weight, so it is not 'proper' in the sense of Rubin. For small samples and rare categories in the y, variability of the imputed data could therefore be underestimated.

Added: SvB June 2009 Tried to include bootstrap, but disabled since bootstrapping may easily lead to constant variables within groups.

References

Van Buuren, S., Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. doi:10.18637/jss.v045.i03

Brand, J.P.L. (1999). Development, Implementation and Evaluation of Multiple Imputation Strategies for the Statistical Analysis of Incomplete Data Sets. Ph.D. Thesis, TNO Prevention and Health/Erasmus University Rotterdam. ISBN 90-74479-08-1.

Venables, W.N. & Ripley, B.D. (1997). Modern applied statistics with S-PLUS (2nd ed). Springer, Berlin.

Author

Stef van Buuren, Karin Groothuis-Oudshoorn, 2000