Handling Missing Data in R with MICE
This site contains materials for the Biostatistics Workshop "Handling missing data in R with mice" at the 45th Annual Meeting of the Statistical Society of Canada, on Sunday, June 11, 2017, in Winnipeg, E3 - 270 (EITC).
Nearly all data analytic procedures in R are designed for complete data, and many will fail if the data contain missing values. Typically, procedures simply ignore any incomplete rows in the data, or use ad-hoc procedures like replacing missing values with some sort of “best value”. However, such fixes may introduce biases in the ensuing statistical analysis.
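As a small illustration of the behaviour described above, the following sketch uses a made-up data frame (the values and the names `df`, `x`, `y` and `y_filled` are assumptions for illustration, not workshop data). It shows a missing value propagating through `mean()`, `lm()` dropping incomplete rows under R's default `na.action`, and mean substitution as one example of an ad-hoc "best value" fix.

```r
# Toy data with two missing values in y (made up for illustration).
df <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2.1, NA, 6.2, 7.9, NA, 12.1)
)

# Missing values propagate unless they are removed explicitly.
mean(df$y)                # NA
mean(df$y, na.rm = TRUE)  # mean of the four observed values

# With the default na.action (na.omit), lm() silently drops incomplete rows.
fit <- lm(y ~ x, data = df)
nobs(fit)                 # number of rows actually used in the fit

# Ad-hoc fix: replace each missing value by the observed mean.
y_filled <- ifelse(is.na(df$y), mean(df$y, na.rm = TRUE), df$y)

# The filled-in variable has a smaller variance than the observed values.
var(df$y, na.rm = TRUE)
var(y_filled)
```

In this toy case the regression uses only four of the six rows, and mean substitution shrinks the variance of `y`. The shrunken variance is one way such a fix can distort later analyses.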
Multiple imputation is a principled solution for this problem. The aim of this workshop is to enable participants to perform and evaluate multiple imputation using the R package mice.
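To give a rough idea of what performing multiple imputation with mice looks like, here is a minimal sketch of the usual impute, analyse, pool sequence. It assumes the mice package is installed and uses the small `nhanes` example data set that ships with it. The regression model, the number of imputations and the seed are arbitrary choices for illustration, not the workshop's own analysis.

```r
library(mice)

# Impute: create m = 5 completed versions of the nhanes example data.
# seed makes the run reproducible; printFlag = FALSE silences the iteration log.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Analyse: fit the same (arbitrarily chosen) model to each completed data set.
fit <- with(imp, lm(bmi ~ age + chl))

# Pool: combine the five sets of estimates into one using Rubin's rules.
pooled <- pool(fit)
summary(pooled)
```

The pooled summary reports one set of estimates and standard errors, which reflect the extra uncertainty caused by the missing values.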
The workshop will consist of five sessions, each of which comprises a lecture followed by a computer practical using R.
Please remember to bring your own laptop computer and make sure that you have write access to that machine (some corporate computers do not allow write access) or that you have the following software and packages pre-installed:
R, from the R-Project website.
RStudio Desktop (Free License), from RStudio’s website. This is not necessary, per se, but it is highly recommended as RStudio delivers a tremendous improvement to the user experience of base R.
The R packages mice and lattice.
The packages can be installed in either of two ways:
From within RStudio, by navigating to Tools > Install Packages in the upper menu and entering mice, lattice into the Packages field. Make sure that the Install dependencies option is selected. Once done, click Install and you’re all set.
From within R or RStudio, by copying and pasting the following code into the console window and pressing Enter (by default the bottom-left window in RStudio / the only window in R):
install.packages("mice")
install.packages("lattice")
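After installing, it can be useful to check that both packages are actually available before the workshop starts. The sketch below is one way to do that. The helper `missing_packages()` is a hypothetical name introduced here for illustration and is not part of R or mice. It relies only on base R's `requireNamespace()`.

```r
# Packages the workshop asks for.
required <- c("mice", "lattice")

# Hypothetical helper: return the names in `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Still to install: ", paste(to_install, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

If the message lists any package names, rerun the installation step for those packages.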