This function converts imputed data stored in long format into an object of class mids. The original incomplete dataset needs to be available so that we know where the missing data are. The function is useful for converting the results of operations applied to the imputed data back into a mids object. It may also be used to store multiply imputed data sets from other software in the format used by mice.
Arguments
- long
A multiply imputed data set in long format, for example produced by a call to complete(..., action = 'long', include = TRUE), or by other software.
- where
A data frame or matrix of logicals with the same dimensions as data, indicating where in the data the imputations should be created. The default, where = is.na(data), specifies that the missing data should be imputed. The where argument may be used to overimpute observed data, or to skip imputations for selected missing values. Note: Imputation methods that generate imputations outside of mice, like mice.impute.panImpute(), may depend on a complete predictor space. In that case, a custom where matrix cannot be specified.
- .imp
An optional column number or column name in long, indicating the imputation index. The values are assumed to be consecutive integers between 0 and m. Values 1 through m correspond to the imputation index; value 0 indicates the original data (with missings). By default, the procedure will search for a variable named ".imp".
- .id
An optional column number or column name in long, indicating the subject identification. If not specified, the function searches for a variable named ".id". If this variable is found, the values in the column will define the row names in the data element of the resulting mids object.
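As a minimal sketch of the layout described for long, .imp and .id, the code below stacks the original data (.imp = 0) on top of m completed data sets (.imp = 1 through m) and converts the result. The completed data sets are generated with mice itself purely as a stand-in for output from other software; the names imp0, completed, blocks and long_data are illustrative and not part of the package.

```r
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  # Stand-in for m = 3 completed data sets delivered by other software
  # (assumption: here they are simply produced by mice for illustration).
  imp0 <- mice(nhanes, m = 3, print = FALSE, seed = 123)
  completed <- lapply(1:3, function(i) complete(imp0, i))

  # Block 1 is the original incomplete data, blocks 2..4 are the imputations.
  blocks <- c(list(nhanes), completed)

  # Stack the blocks, adding .imp (0, 1, ..., m) and .id (row position).
  long_data <- do.call(rbind, lapply(seq_along(blocks), function(k) {
    cbind(blocks[[k]], .imp = k - 1L, .id = seq_len(nrow(nhanes)))
  }))
  rownames(long_data) <- NULL

  # Convert to a mids object; .imp and .id are found by their default names.
  imp <- as.mids(long_data)
}
```

Because the blocks are stacked in order of .imp and each block keeps the row order of nhanes, the result already has the ordering that the function expects.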
Note
The function expects the input data long to be sorted by imputation number (variable ".imp" by default), and in the same sequence within each imputation block.
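If the long data arrive in arbitrary row order, one way to meet this requirement is to sort on the imputation index first and on the subject identifier second before converting. The sketch below assumes the default column names ".imp" and ".id"; the shuffling step only simulates unordered input, and the names shuffled, sorted and restored are illustrative.

```r
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  imp <- mice(nhanes, print = FALSE, seed = 1)
  X <- complete(imp, action = "long", include = TRUE)

  # Simulate input whose rows are in arbitrary order.
  set.seed(1)
  shuffled <- X[sample(nrow(X)), ]

  # Sort by imputation number, then by subject id within each block,
  # so every imputation block lists the subjects in the same sequence.
  sorted <- shuffled[order(shuffled$.imp, shuffled$.id), ]
  rownames(sorted) <- NULL

  restored <- as.mids(sorted)
}
```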
Examples
# impute the nhanes dataset
imp <- mice(nhanes, print = FALSE)
# extract the data in long format
X <- complete(imp, action = "long", include = TRUE)
# create dataset with .imp variable as numeric
X2 <- X
# nhanes example without specifying .id
test1 <- as.mids(X)
is.mids(test1)
#> [1] TRUE
identical(complete(test1, action = "long", include = TRUE), X)
#> [1] TRUE
# nhanes example without specifying .id, where .imp is numeric
test2 <- as.mids(X2)
is.mids(test2)
#> [1] TRUE
identical(complete(test2, action = "long", include = TRUE), X)
#> [1] TRUE
# nhanes example, where we explicitly specify .id by column name
test3 <- as.mids(X, .id = ".id")
is.mids(test3)
#> [1] TRUE
identical(complete(test3, action = "long", include = TRUE), X)
#> [1] TRUE
# nhanes example with .id where .imp is numeric
test4 <- as.mids(X2, .id = 6)
is.mids(test4)
#> [1] TRUE
identical(complete(test4, action = "long", include = TRUE), X)
#> [1] TRUE
# example without an .id variable
# variable .id not preserved
X3 <- X[, -6]
test5 <- as.mids(X3)
is.mids(test5)
#> [1] TRUE
identical(complete(test5, action = "long", include = TRUE)[, -6], X[, -6])
#> [1] TRUE
# as() syntax has fewer options
test7 <- as(X, "mids")
test8 <- as(X2, "mids")
test9 <- as(X2[, -6], "mids")
rev <- ncol(X):1
test10 <- as(X[, rev], "mids")
# the where argument also copies observed data into the $imp element
where <- matrix(TRUE, nrow = nrow(nhanes), ncol = ncol(nhanes))
colnames(where) <- colnames(nhanes)
test11 <- as.mids(X, where = where)
identical(complete(test11, action = "long", include = TRUE), X)
#> [1] TRUE
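To see what an all-TRUE where matrix changes, the number of rows stored per variable in the $imp element can be compared between the default and the custom call. This is a sketch under the assumption that $imp holds one row per cell flagged in where; the names where_all, default_mids and full_mids are illustrative.

```r
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  imp <- mice(nhanes, print = FALSE, seed = 1)
  X <- complete(imp, action = "long", include = TRUE)

  # Flag every cell, observed or missing.
  where_all <- matrix(TRUE, nrow = nrow(nhanes), ncol = ncol(nhanes))
  colnames(where_all) <- colnames(nhanes)

  default_mids <- as.mids(X)                   # where = is.na(data)
  full_mids <- as.mids(X, where = where_all)   # every cell is stored

  # Rows stored for bmi: missing cells only vs. all cells.
  n_default <- nrow(default_mids$imp$bmi)
  n_full <- nrow(full_mids$imp$bmi)
}
```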