Imputes missing data in a categorical variable using polytomous regression

Usage

mice.impute.polr(
  y,
  ry,
  x,
  wy = NULL,
  nnet.maxit = 100,
  nnet.trace = FALSE,
  nnet.MaxNWts = 1500,
  polr.to.loggedEvents = FALSE,
  ...
)

Arguments

y

Vector to be imputed

ry

Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. The ry generally distinguishes the observed (TRUE) and missing values (FALSE) in y.

x

Numeric design matrix with length(y) rows containing the predictors for y. Matrix x must not contain missing values.

wy

Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.

nnet.maxit

Tuning parameter for nnet(): the maximum number of iterations (maxit). Used only when the imputation falls back to multinom().

nnet.trace

Tuning parameter for nnet(): whether to trace the optimization (trace). Used only when the imputation falls back to multinom().

nnet.MaxNWts

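Tuning parameter for nnet(): the maximum allowable number of weights (MaxNWts). Used only when the imputation falls back to multinom().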

polr.to.loggedEvents

A logical indicating whether each fallback to the multinom() function should be written to loggedEvents. The default is FALSE.

...

Other named arguments.

Value

Vector with imputed data, same type as y, and of length sum(wy)
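The arguments above can be illustrated with a direct call on toy data. This is a minimal sketch, assuming the mice package is installed; the objects x, latent, y and imp are made up for illustration and are not part of the package.

```r
# Sketch: call mice.impute.polr() directly on simulated data.
# All object names below are illustrative assumptions.
if (requireNamespace("mice", quietly = TRUE)) {
  set.seed(1)
  n <- 200
  x <- cbind(x1 = rnorm(n), x2 = rnorm(n))   # complete numeric predictors
  latent <- x[, "x1"] + rlogis(n)            # latent score behind the categories
  y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
           labels = c("low", "mid", "high"), ordered_result = TRUE)
  y[sample(n, 30)] <- NA                     # make 30 values missing

  ry <- !is.na(y)                            # TRUE where y is observed
  imp <- mice::mice.impute.polr(y, ry, x)    # wy = NULL: impute where ry is FALSE

  length(imp) == sum(!ry)                    # one imputation per missing value
}
```

In normal use the function is not called directly; mice() calls it once per iteration for each variable whose method is "polr".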

Details

The function mice.impute.polr() imputes ordered categorical response variables using the proportional odds logistic regression (polr) model. The model can be viewed as a series of logistic regressions on the successive splits of the ordered categories that share a common set of regression coefficients. The model is also known as the cumulative link model.
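For reference, the proportional odds model can be written as follows. The notation is an assumption that follows the parameterisation documented for MASS::polr(): K is the number of ordered categories, \zeta_k are the cut-points and \beta is the coefficient vector shared by all splits.

```latex
\operatorname{logit} P(Y \le k \mid x) = \zeta_k - x^\top \beta,
\qquad k = 1, \dots, K - 1,
\qquad \zeta_1 < \zeta_2 < \dots < \zeta_{K-1}
```

Each value of k corresponds to one split of the categories into "at most k" versus "above k"; only the intercept \zeta_k changes between splits.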

By default, ordered factors with more than two levels are imputed by mice.impute.polr.
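The default assignment can be inspected with make.method(). The sketch below assumes the mice package is installed and uses a made-up data frame dat; with the package's standard default methods, the incomplete ordered factor should be assigned "polr".

```r
# Sketch: check which method mice assigns to an incomplete ordered factor.
# The data frame `dat` is an illustrative assumption.
if (requireNamespace("mice", quietly = TRUE)) {
  set.seed(2)
  lev <- c("low", "mid", "high")
  dat <- data.frame(
    grade = factor(sample(lev, 100, replace = TRUE), levels = lev, ordered = TRUE),
    score = rnorm(100)
  )
  dat$grade[sample(100, 20)] <- NA           # 20 missing values in the ordered factor

  meth <- mice::make.method(dat)             # default method per column
  print(meth)

  # The method can also be requested explicitly:
  imp <- mice::mice(dat, method = meth, m = 2, maxit = 2, printFlag = FALSE)
}
```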

The algorithm of mice.impute.polr uses the function polr() from the MASS package.

In order to avoid bias due to perfect prediction, the algorithm augments the data according to the method of White, Daniel and Royston (2010).

The call to polr might fail, usually because the data are very sparse. In that case, multinom is tried as a fallback. If the local flag polr.to.loggedEvents is set to TRUE, a record is written to the loggedEvents component of the mids object. Use mice(data, polr.to.loggedEvents = TRUE) to set the flag.
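The fallback logic can be sketched as follows. This is not the package's internal code: fit_ordinal() is a hypothetical helper that only illustrates the pattern of trying MASS::polr() first and switching to nnet::multinom() on error, with argument names chosen to mirror the nnet.* arguments above.

```r
library(MASS)   # polr()
library(nnet)   # multinom()

# Hypothetical helper, not part of mice: try polr(), fall back to multinom().
fit_ordinal <- function(formula, data, maxit = 100, trace = FALSE, MaxNWts = 1500) {
  fit <- try(suppressWarnings(MASS::polr(formula, data = data)), silent = TRUE)
  if (inherits(fit, "try-error")) {
    # polr() failed, e.g. on very sparse data: use the multinomial model instead
    fit <- nnet::multinom(formula, data = data,
                          maxit = maxit, trace = trace, MaxNWts = MaxNWts)
  }
  fit
}

# Toy data (illustrative assumption)
set.seed(3)
d <- data.frame(x = rnorm(150))
d$y <- cut(d$x + rlogis(150), breaks = c(-Inf, -1, 1, Inf),
           labels = c("a", "b", "c"), ordered_result = TRUE)

fit <- fit_ordinal(y ~ x, d)
# Both model classes support predicted category probabilities
probs <- predict(fit, newdata = d[1:5, , drop = FALSE], type = "probs")
```

Because both fitted objects return a matrix of category probabilities from predict(..., type = "probs"), the code that draws imputations does not need to know which model was used.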

Note

In December 2019 Simon White reported that the call to polr() could always fail silently. I can confirm this behaviour for versions mice 3.0.0 - mice 3.6.6, so any requests for method polr in these versions were in fact handled by multinom. See https://github.com/amices/mice/issues/206 for details.
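To check whether a given mice version falls in the range named above, a small helper such as the following may be useful. The function polr_silently_replaced() is hypothetical and not part of mice, and it assumes both ends of the range are inclusive.

```r
# Hypothetical helper, not part of mice: TRUE if `version` lies in 3.0.0 - 3.6.6.
# Assumption: both endpoints of the range are treated as inclusive.
polr_silently_replaced <- function(version) {
  v <- package_version(version)
  v >= "3.0.0" && v <= "3.6.6"
}

polr_silently_replaced("3.5.0")

# Check the installed version, if mice is available
if (requireNamespace("mice", quietly = TRUE)) {
  polr_silently_replaced(as.character(utils::packageVersion("mice")))
}
```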

References

Van Buuren, S., Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. doi:10.18637/jss.v045.i03

Brand, J.P.L. (1999). Development, implementation and evaluation of multiple imputation strategies for the statistical analysis of incomplete data sets. Dissertation. Rotterdam: Erasmus University.

White, I.R., Daniel, R., Royston, P. (2010). Avoiding bias due to perfect prediction in multiple imputation of incomplete categorical variables. Computational Statistics and Data Analysis, 54, 2267-2275.

Venables, W.N. & Ripley, B.D. (2002). Modern applied statistics with S-Plus (4th ed). Springer, Berlin.

Author

Stef van Buuren, Karin Groothuis-Oudshoorn, 2000-2010