Find index of matched donor units

## Usage

``matchindex(d, t, k = 5L)``

## Arguments

d

Numeric vector with values from donor cases.

t

Numeric vector with values from target cases.

k

Integer, number of unique donors from which a random draw is made. For `k = 1` the function returns the index in `d` corresponding to the closest unit. For multiple imputation, the advice is to set values in the range of `k = 5` to `k = 10`.

## Value

An integer vector with `length(t)` elements. Each element is an index into the vector `d`.

## Details

For each element in `t`, the method finds the `k` nearest neighbours in `d`, randomly draws one of these neighbours, and returns its position in vector `d`.

Fast predictive mean matching algorithm in seven steps:

1. Shuffle records to remove effects of ties

2. Obtain sorting order on shuffled data

3. Calculate index on input data and sort it

4. Pre-sample vector `h` with values between 1 and `k`

For each of the `n0` elements in `t`:

5. find the two adjacent neighbours

6. find the `h_i`'th nearest neighbour

7. store the index of that neighbour

Return vector of `n0` positions in `d`.
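
The steps above can be approximated with a brute-force sketch in base R. This is not the package's implementation: `matchindex_sketch()` is a hypothetical name, it computes every distance instead of using the sort-and-search approach, and its random tie-breaking is an assumption about the intended behaviour.

```r
# Brute-force illustration of k-nearest-donor matching.
# Hypothetical helper, not the routine used by mice.
matchindex_sketch <- function(d, t, k = 5L) {
  stopifnot(is.numeric(d), is.numeric(t), length(d) >= 1L, k >= 1L)
  k <- min(as.integer(k), length(d))             # cannot draw from more donors than exist
  h <- sample.int(k, length(t), replace = TRUE)  # pre-sampled neighbour rank per target
  vapply(seq_along(t), function(i) {
    # A random secondary sort key breaks ties among equally distant donors
    ord <- order(abs(d - t[i]), runif(length(d)))
    ord[h[i]]                                    # position in d of the h_i-th nearest donor
  }, integer(1))
}

set.seed(1)
d <- c(-5, 5, 0, 10, 12)
t <- c(-6, -4, 0, 2, 4, -2, 6)
matchindex_sketch(d, t, k = 1L)  # closest donor for each target
#> [1] 1 1 3 3 2 3 2
```

With `k = 1` the result is deterministic for these inputs because no target has two equally close nearest donors. For larger `k` the result depends on the random seed.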

We may use the function to perform predictive mean matching under a given predictive model. To do so, specify both `d` and `t` as predictions from the same model. Suppose that `y` contains the observed outcomes of the donor cases (in the same sequence as `d`), then `y[matchindex(d, t)]` returns one matched outcome for every target case.
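
As a small illustration of this use, the sketch below fits a linear model to the built-in `mtcars` data, matches on model predictions, and borrows observed outcomes. To stay self-contained it relies on a hypothetical helper, `closest_donor()`, that mimics the `k = 1` case in base R. With `mice` loaded, `matchindex(d, t)` would take its place.

```r
# Hypothetical stand-in for matchindex(d, t, k = 1): position of the closest donor.
# which.min() returns the first minimum, so ties are not randomised here.
closest_donor <- function(d, t) {
  vapply(t, function(x) which.min(abs(d - x)), integer(1), USE.NAMES = FALSE)
}

donors  <- mtcars[1:20, ]
targets <- mtcars[21:32, ]
fit <- lm(mpg ~ disp + cyl, data = donors)

d <- unname(fitted(fit))                      # predictions for donor cases
t <- unname(predict(fit, newdata = targets))  # predictions for target cases
y <- donors$mpg                               # observed outcomes, same order as d

idx <- closest_donor(d, t)
matched <- y[idx]  # one observed mpg value borrowed for every target case
```

Every element of `matched` is an observed `mpg` value from the donor rows, which is the defining property of predictive mean matching: imputed values are always values that actually occurred.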

See https://github.com/amices/mice/issues/236. This function replaces the `matcher()` function, which has been the default in `mice` since version `2.22` (June 2014).

## Author

Stef van Buuren, Nasinski Maciej, Alexander Robitzsch

## Examples

```r
set.seed(1)

# Inputs need not be sorted
d <- c(-5, 5, 0, 10, 12)
t <- c(-6, -4, 0, 2, 4, -2, 6)

# Index (in vector d) of closest match
idx <- matchindex(d, t, 1)
idx
#> [1] 1 1 3 3 2 3 2

# To check: show values of closest match
d[idx]
#> [1] -5 -5  0  0  5  0  5

# Random draw among indices of the 5 closest predictors
matchindex(d, t)
#> [1] 3 1 5 5 2 3 1

# Example: predictive mean matching with a linear model
train <- mtcars[1:20, ]
test <- mtcars[21:32, ]
fit <- lm(mpg ~ disp + cyl, data = train)
d <- fitted.values(fit)
t <- predict(fit, newdata = test)  # note: not using mpg
idx <- matchindex(d, t)

# Borrow values from train to produce 12 synthetic values for mpg in test.
# Synthetic values are plausible values that could have been observed if