Scatterplot of observed and imputed data
Plotting methods for imputed data using lattice.
xyplot() produces conditional scatterplots. The function
automatically separates the observed (blue) and imputed (red) data. The
function extends the usual features of lattice.
The data argument is a formula that selects the data to be plotted. It follows the lattice rules for formulas, describing the primary variables (used for the per-panel display) and the optional conditioning variables (which define the subsets plotted in different panels).
The formula is evaluated on the complete data set in long form.
Legal variable names for the formula include names(x$data) plus
the two administrative factors .imp and .id.
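As a minimal sketch of what the long form looks like, the following builds it directly with complete(). It assumes the boys data shipped with mice; the object names imp and long, and the settings m = 2 and seed = 1, are illustrative choices and not part of the documentation above.

```r
library(mice)

# Impute with few iterations to keep the example fast (illustrative settings)
imp <- mice(boys, m = 2, maxit = 1, printFlag = FALSE, seed = 1)

# Stack the original data (.imp == 0) and the m completed data sets
long <- complete(imp, action = "long", include = TRUE)

# Variable names that may appear in the plotting formula
print(names(long))

# One block of rows for the original data plus one per imputation
print(table(long$.imp))
```

The two administrative factors appear as ordinary columns of the long data, which is why they can be used as conditioning variables in the formula.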
Extended formula interface: The primary variable terms (both the LHS y and the RHS x) may consist of multiple terms separated by a ‘+’ sign, e.g., y1 + y2 ~ x | a * b. This formula is taken to mean that the user wants to plot both y1 ~ x | a * b and y2 ~ x | a * b, but with y1 ~ x and y2 ~ x in separate panels. This behavior differs from standard lattice. Only combine terms of the same type, i.e. only factors or only numerical variables. Mixing numerical and categorical data occasionally produces odd labeling of the vertical axis.
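The extended formula interface might be used as in the following sketch, which places two numeric responses on the LHS. It assumes the boys data from mice; the imputation settings and the object names imp and p are illustrative only.

```r
library(mice)
library(lattice)

# Illustrative imputation: two imputations, one iteration
imp <- mice(boys, m = 2, maxit = 1, printFlag = FALSE, seed = 1)

# Two numeric responses on the LHS: hgt ~ age and wgt ~ age are
# drawn in separate panels, for each imputation number
p <- xyplot(imp, hgt + wgt ~ age | .imp)
print(p)
```

Both hgt and wgt are numeric, in line with the advice to combine only terms of the same type.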
The na.groups argument is an expression evaluating to a logical vector that indicates which two groups are distinguished (e.g. using different colors) in the display. The expression is evaluated in the response indicator is.na(x$data).
The default na.groups = NULL contrasts the observed and missing data in the LHS y variable of the display, i.e. groups created by is.na(y). The expression y creates the groups according to is.na(y). The expression y1 & y2 creates groups by is.na(y1) & is.na(y2), y1 | y2 creates groups by is.na(y1) | is.na(y2), and so on.
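To make the evaluation rule concrete, here is a small base R sketch of how an expression behaves when it is evaluated in a response indicator. The data frame dat and its values are hypothetical; the code only mimics the evaluation rule and is not the internal implementation of the plotting method.

```r
# Hypothetical toy data with missing values
dat <- data.frame(
  y1 = c(1.2, NA, 3.1, NA),
  y2 = c(NA, NA, 0.5, 2.0)
)

# Response indicator: TRUE where a value is missing
r <- as.data.frame(is.na(dat))

# Inside the response indicator, the name y1 stands for is.na(dat$y1)
both   <- with(r, y1 & y2)  # missing in both variables
either <- with(r, y1 | y2)  # missing in at least one variable

print(both)
print(either)
```

Here both is TRUE only for the second row, while either is TRUE for every row except the third.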
The groups argument is the usual groups argument in lattice. It differs from na.groups because it evaluates in the completed data data.frame(complete(x, "long", include = TRUE)) (as usual), whereas na.groups evaluates in the response indicator. See xyplot for more details. When both na.groups and groups are specified, na.groups takes precedence, and groups is ignored.
The theme argument is a named list containing the graphical parameters. The default function mice.theme produces a short list of default colors, line widths, and so on. The extensive list may be obtained from trellis.par.get(). Global graphical parameters like col or cex in high-level calls are still honored, so first experiment with the global parameters. Many settings consist of a pair. For example, mice.theme defines two symbol colors: the first is for the observed data, the second for the imputed data. The theme settings only exist during the call and do not affect the trellis graphical parameters.
Further arguments (...), usually not directly processed by the high-level functions documented here, but instead passed on to other functions.
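The high-level functions documented here, as well as other high-level
lattice functions, return an object of class "trellis". The
update method can be used to subsequently update components of the
object, and the print method (usually called by default) will plot it
on an appropriate plotting device.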
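A short sketch of working with the returned object follows. It assumes the boys data from mice; the object names and the title and axis label texts are illustrative.

```r
library(mice)
library(lattice)

# Illustrative imputation: two imputations, one iteration
imp <- mice(boys, m = 2, maxit = 1, printFlag = FALSE, seed = 1)

# Store the plot instead of printing it
p <- xyplot(imp, hgt ~ age | .imp)
print(class(p))

# Modify components of the stored object, then draw it explicitly
p2 <- update(p, main = "Height by age", xlab = "Age")
print(p2)
```

Storing the object is useful inside functions or loops, where the plot is not printed automatically.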
The argument na.groups may be used to specify (combinations of)
missingness in any of the variables. The argument groups can be used
to specify groups based on the variable values themselves. Only one of
the two may be active at the same time. When both are specified,
na.groups takes precedence over groups.
Use subset and na.groups together to plot parts of the data. For
example, select the first imputed data set by subset = .imp == 1.
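Combining a subset with na.groups might look like the sketch below. It assumes the boys data from mice; the imputation settings and object names are illustrative only.

```r
library(mice)
library(lattice)

# Illustrative imputation: two imputations, one iteration
imp <- mice(boys, m = 2, maxit = 1, printFlag = FALSE, seed = 1)

# Plot only the first imputed data set, and color the points
# by the missingness of wgt instead of the missingness of hgt
p <- xyplot(imp, hgt ~ age, subset = .imp == 1, na.groups = wgt)
print(p)
```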
Graphical parameters like col, pch and cex can be specified in the
arguments list to alter the plotting symbols. If length(col)==2, the
color specification defines the observed and missing groups: col[1] is
the color of the 'observed' data, col[2] is the color of the missing or
imputed data. A convenient color choice is col=mdc(1:2), a transparent
blue color for the observed data and a transparent red color for the
imputed data. A good choice is col=mdc(1:2), pch=20, cex=1.5. These
choices can be set for the duration of the session by running
mice.theme().
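The suggested symbol settings can be tried as in this sketch. It assumes the boys data from mice; the imputation settings and object names are illustrative.

```r
library(mice)
library(lattice)

# Illustrative imputation: two imputations, one iteration
imp <- mice(boys, m = 2, maxit = 1, printFlag = FALSE, seed = 1)

# Two colors: the first for observed data, the second for imputed data
cols <- mdc(1:2)
print(cols)

# Solid, slightly enlarged symbols in the two colors
p <- xyplot(imp, hgt ~ age | .imp, col = cols, pch = 20, cex = 1.5)
print(p)
```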
The first two arguments (x and data) are reversed
compared to the standard Trellis syntax implemented in lattice. This
reversal was necessary in order to benefit from automatic method dispatch.
In mice the argument
x is always a
mids object, whereas
in lattice the argument
x is always a formula.
In mice the argument
data is always a formula object, whereas in
lattice the argument
data is usually a data frame.
All other arguments have identical interpretation.
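The difference in argument order can be seen by placing the two calls side by side. This sketch assumes the boys data from mice; the imputation settings and the object names p_lattice and p_mice are illustrative.

```r
library(mice)
library(lattice)

# Illustrative imputation: two imputations, one iteration
imp <- mice(boys, m = 2, maxit = 1, printFlag = FALSE, seed = 1)

# Standard lattice: formula first, data frame second
p_lattice <- xyplot(hgt ~ age, data = boys)

# mice method: mids object first, formula second
p_mice <- xyplot(imp, hgt ~ age)

print(p_lattice)
print(p_mice)
```

Because dispatch happens on the class of the first argument, passing the mids object first is what selects the mice method.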
Sarkar, Deepayan (2008) Lattice: Multivariate Data Visualization with R, Springer.
van Buuren S and Groothuis-Oudshoorn K (2011) mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. doi:10.18637/jss.v045.i03
library(mice)

imp <- mice(boys, maxit = 1)
#>
#>  iter imp variable
#>   1   1  hgt  wgt  bmi  hc  gen  phb  tv  reg
#>   1   2  hgt  wgt  bmi  hc  gen  phb  tv  reg
#>   1   3  hgt  wgt  bmi  hc  gen  phb  tv  reg
#>   1   4  hgt  wgt  bmi  hc  gen  phb  tv  reg
#>   1   5  hgt  wgt  bmi  hc  gen  phb  tv  reg

# xyplot: scatterplot by imputation number
# observe the erroneous outlying imputed values
# (caused by imputing hgt from bmi)
xyplot(imp, hgt ~ age | .imp, pch = c(1, 20), cex = c(1, 1.5))

# same, but label with missingness of wgt (four cases)
xyplot(imp, hgt ~ age | .imp, na.groups = wgt, pch = c(1, 20), cex = c(1, 1.5))