Imputes univariate missing data using random forests.
Usage
mice.impute.rf(
  y,
  ry,
  x,
  wy = NULL,
  ntree = 10,
  rfPackage = c("ranger", "randomForest", "literanger"),
  ...
)
Arguments
- y
Vector to be imputed
- ry
Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. ry generally distinguishes the observed (TRUE) and missing (FALSE) values in y.
- x
Numeric design matrix with length(y) rows containing the predictors for y. Matrix x must not contain missing values.
- wy
Logical vector of length length(y). A TRUE value indicates a location in y for which an imputation is created. If NULL (the default), wy is set to !ry.
- ntree
The number of trees to grow. The default is 10.
- rfPackage
A single string specifying the backend for estimating the random forest. The default backend is the ranger package. An alternative is literanger, which predicts faster but does not support all forest types and split rules from ranger. Also implemented as an alternative is the randomForest package, which was the default in mice 3.13.10 and earlier.
- ...
Other named arguments passed down to mice:::install.on.demand(), randomForest::randomForest(), randomForest:::randomForest.default(), ranger::ranger(), and literanger::train().
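The arguments above can be illustrated by calling the function directly on simulated data. This is a minimal sketch, assuming the mice package and its default ranger backend are installed; the toy data and the names n, x1, x2 and imp are made up for illustration.

```r
library(mice)

set.seed(123)

# Hypothetical toy data: two numeric predictors and one numeric outcome.
n <- 60
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- 2 * x[, "x1"] - x[, "x2"] + rnorm(n)

# Make the first 15 outcome values missing.
y[1:15] <- NA

# ry marks the observed entries; wy marks the entries to impute.
ry <- !is.na(y)
wy <- !ry

# One imputed value is returned per TRUE element of wy.
imp <- mice.impute.rf(y, ry, x, wy = wy, ntree = 10)

length(imp) == sum(wy)
```

Because x must be complete, any predictor with missing values has to be filled in before a direct call like this; inside mice() that bookkeeping is handled by the sampler.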
Details
Imputation of y by random forests. With rfPackage = "randomForest", the method calls randomForest(), which implements Breiman's random forest algorithm (based on Breiman and Cutler's original Fortran code) for classification and regression. See Appendix A.1 of Doove et al. (2014) for the definition of the algorithm used.
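The donor-based idea behind the method can be sketched roughly as follows. This is a simplified illustration in the spirit of Appendix A.1 of Doove et al. (2014), not the code that mice runs: it assumes the randomForest package is installed and a numeric y, and the helper names one_tree_donors and rf_donor_impute are hypothetical.

```r
library(randomForest)

# Hypothetical helper: grow a single tree and, for each row of xmis, return
# the observed y values that end up in the same terminal node.
one_tree_donors <- function(xobs, xmis, yobs) {
  fit <- randomForest(x = xobs, y = yobs, ntree = 1)
  leaf_obs <- as.vector(attr(predict(fit, newdata = xobs, nodes = TRUE), "nodes"))
  leaf_mis <- as.vector(attr(predict(fit, newdata = xmis, nodes = TRUE), "nodes"))
  lapply(leaf_mis, function(node) yobs[leaf_obs == node])
}

# Hypothetical driver: pool the donors over ntree trees, then draw one donor
# at random for each case with missing y.
rf_donor_impute <- function(y, ry, x, ntree = 10) {
  xobs <- x[ry, , drop = FALSE]
  xmis <- x[!ry, , drop = FALSE]
  yobs <- y[ry]
  donors <- replicate(ntree, one_tree_donors(xobs, xmis, yobs), simplify = FALSE)
  vapply(seq_len(nrow(xmis)), function(i) {
    pool <- unlist(lapply(donors, function(d) d[[i]]))
    pool[sample.int(length(pool), 1)]
  }, numeric(1))
}

# Usage on made-up data.
set.seed(1)
x <- cbind(x1 = rnorm(40), x2 = rnorm(40))
y <- x[, "x1"] + rnorm(40)
y[1:8] <- NA
imp <- rf_donor_impute(y, !is.na(y), x)
```

Drawing a donor from the pooled terminal nodes, rather than averaging the tree predictions, keeps every imputed value among the observed values of y and carries some between-tree variability into the imputations.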
Note
An alternative implementation was independently developed by Shah et al. (2014). It was available as the functions CALIBERrfimpute::mice.impute.rfcat and CALIBERrfimpute::mice.impute.rfcont (now archived). Simulations by Shah (Feb 13, 2014) suggested that the quality of the imputation for 10 and 100 trees was identical, so mice 2.22 changed the default number of trees from ntree = 100 to ntree = 10.
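In everyday use the method is selected through mice() rather than called directly, and extra arguments such as ntree are forwarded to the imputation function. A brief sketch, assuming the mice package with its bundled nhanes data and the ranger backend are available; the object names imp10 and imp100 are illustrative.

```r
library(mice)

# Current default forest size for this method (ntree = 10).
imp10 <- mice(nhanes, method = "rf", m = 5, seed = 1, printFlag = FALSE)

# Same setup with the earlier default of 100 trees.
imp100 <- mice(nhanes, method = "rf", ntree = 100, m = 5, seed = 1,
               printFlag = FALSE)

# Compare the distributions of the imputed bmi values across the two runs.
summary(unlist(imp10$imp$bmi))
summary(unlist(imp100$imp$bmi))
```

A comparison like this on one small data set is only a quick check; it does not replace the simulation evidence cited above.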
References
Doove, L.L., van Buuren, S., Dusseldorp, E. (2014), Recursive partitioning for missing data imputation in the presence of interaction effects. Computational Statistics & Data Analysis, 72, 92-104.
Shah, A.D., Bartlett, J.W., Carpenter, J., Nicholas, O., Hemingway, H. (2014), Comparison of random forest and parametric imputation models for imputing missing data using MICE: A CALIBER study. American Journal of Epidemiology, doi:10.1093/aje/kwt312.
Van Buuren, S. (2018). Flexible Imputation of Missing Data. Second Edition. Chapman & Hall/CRC. Boca Raton, FL.
See also
mice, mice.impute.cart, randomForest, ranger, train
Other univariate imputation functions:
mice.impute.cart(), mice.impute.lasso.logreg(), mice.impute.lasso.norm(), mice.impute.lasso.select.logreg(), mice.impute.lasso.select.norm(), mice.impute.lda(), mice.impute.logreg(), mice.impute.logreg.boot(), mice.impute.mean(), mice.impute.midastouch(), mice.impute.mnar.logreg(), mice.impute.mpmm(), mice.impute.norm(), mice.impute.norm.boot(), mice.impute.norm.nob(), mice.impute.norm.predict(), mice.impute.pmm(), mice.impute.polr(), mice.impute.polyreg(), mice.impute.quadratic(), mice.impute.ri()