Find index of matched donor units

## Arguments

- d
Numeric vector with values from donor cases.

- t
Numeric vector with values from target cases.

- k
Integer, number of unique donors from which a random draw is made. For `k = 1` the function returns the index in `d` corresponding to the closest unit. For multiple imputation, the advice is to set values in the range of `k = 5` to `k = 10`.

## Details

For each element in `t`, the method finds the `k` nearest neighbours in `d`, randomly draws one of these neighbours, and returns its position in vector `d`.
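The matching rule can be illustrated with a brute-force version in base R. This is a minimal sketch, not the package's implementation: `naive_matchindex()` is a hypothetical helper that computes all distances for every target, and it breaks ties by position in `d` rather than by shuffling.

```r
# Hypothetical helper, not part of mice: brute-force version of the rule
# "find the k nearest donors, draw one at random, return its position in d".
naive_matchindex <- function(d, t, k = 5L) {
  k <- min(k, length(d))  # cannot draw from more donors than exist
  vapply(t, function(ti) {
    # Positions in d of the k donors closest to this target value
    nearest <- order(abs(d - ti))[seq_len(k)]
    # Draw a position within `nearest`; sample(nearest, 1) would instead
    # sample from 1:nearest when k = 1
    nearest[sample.int(k, 1L)]
  }, integer(1))
}

d <- c(-5, 5, 0, 10, 12)
t <- c(-6, -4, 0, 2, 4, -2, 6)
naive_matchindex(d, t, k = 1L)  # closest donor only: 1 1 3 3 2 3 2
naive_matchindex(d, t, k = 3L)  # random draw among the 3 closest donors
```

The sketch costs a full sort of `d` per target, which is why a sorted, pre-sampled approach is preferable for large inputs.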

Fast predictive mean matching algorithm in seven steps:

1. Shuffle records to remove effects of ties

2. Obtain sorting order on shuffled data

3. Calculate index on input data and sort it

4. Pre-sample vector `h` with values between 1 and `k`

For each of the `n0` elements in `t`:

5. find the two adjacent neighbours

6. find the `h_i`'th nearest neighbour

7. store the index of that neighbour

Return vector of `n0` positions in `d`.
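The seven steps can be sketched in base R as follows. This is an illustrative reading of the list, not the package source: `sketch_matchindex()` is a hypothetical name, the adjacent neighbours are located with `findInterval()`, and equidistant neighbours are resolved in favour of the lower one, which is an assumption of this sketch.

```r
# Hypothetical sketch, not the mice source: sorted search with a pre-sampled h.
sketch_matchindex <- function(d, t, k = 5L) {
  n1 <- length(d)
  k <- min(k, n1)
  # Steps 1-3: shuffle donors, sort the shuffled values, and keep the map
  # from sorted position back to the position in the original d
  shuffle <- sample.int(n1)
  ord <- order(d[shuffle])
  pos <- shuffle[ord]  # pos[j] = index in d of the j-th smallest donor
  ds <- d[pos]         # donor values in ascending order
  # Step 4: pre-sample which of the k nearest neighbours to take per target
  h <- sample.int(k, length(t), replace = TRUE)
  out <- integer(length(t))
  for (i in seq_along(t)) {
    # Step 5: adjacent neighbours, ds[lo] <= t[i] < ds[hi]
    lo <- findInterval(t[i], ds)  # 0 when t[i] lies below all donors
    hi <- lo + 1L                 # n1 + 1 when no donor lies above t[i]
    # Step 6: move outward h[i] times, each time taking the closer side
    for (step in seq_len(h[i])) {
      take_lo <- hi > n1 || (lo >= 1L && t[i] - ds[lo] <= ds[hi] - t[i])
      if (take_lo) {
        pick <- lo
        lo <- lo - 1L
      } else {
        pick <- hi
        hi <- hi + 1L
      }
    }
    # Step 7: translate the sorted position back to a position in d
    out[i] <- pos[pick]
  }
  out
}

d <- c(-5, 5, 0, 10, 12)
t <- c(-6, -4, 0, 2, 4, -2, 6)
sketch_matchindex(d, t, k = 1L)  # closest donor only: 1 1 3 3 2 3 2
```

Sorting the donors once and pre-sampling `h` means each target needs only a binary search plus at most `k` comparisons, instead of a full distance calculation against every donor.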

We may use the function to perform predictive mean matching under a given predictive model. To do so, specify both `d` and `t` as predictions from the same model. Suppose that `y` contains the observed outcomes of the donor cases (in the same sequence as `d`), then `y[matchindex(d, t)]` returns one matched outcome for every target case.

See https://github.com/amices/mice/issues/236.
This function is a replacement for the `matcher()` function that has been the default in `mice` since version `2.22` (June 2014).

## Examples

```
set.seed(1)
# Inputs need not be sorted
d <- c(-5, 5, 0, 10, 12)
t <- c(-6, -4, 0, 2, 4, -2, 6)
# Index (in vector d) of closest match
idx <- matchindex(d, t, 1)
idx
#> [1] 1 1 3 3 2 3 2
# To check: show values of closest match
d[idx]
#> [1] -5 -5  0  0  5  0  5
# Random draw among indices of the 5 closest predictors
matchindex(d, t)
#> [1] 3 1 5 5 2 3 1
# An example
train <- mtcars[1:20, ]
test <- mtcars[21:32, ]
fit <- lm(mpg ~ disp + cyl, data = train)
d <- fitted.values(fit)
t <- predict(fit, newdata = test) # note: not using mpg
idx <- matchindex(d, t)
# Borrow values from train to produce 12 synthetic values for mpg in test.
# Synthetic values are plausible values that could have been observed if
# they had been measured.
train$mpg[idx]
#> [1] 22.8 15.2 16.4 18.7 14.3 30.4 22.8 22.8 18.7 21.0 17.3 24.4
# Exercise: Create a distribution of 1000 plausible values for each of the
# twelve mpg entries in test, and count how many times the true value
# (which we know here) is located within the inter-quartile range of each
# distribution. Is your count anywhere close to 500? Why? Why not?
```