Pool lme4::lmer model output across imputed datasets

This function combines estimates, standard errors and p-values across imputed datasets, for a single model (i.e., one vertex). The code was largely adapted from mice::pool() and mice::summary.mipo(). It averages the estimates of the complete-data model and computes the relevant statistics following Rubin's rules (Rubin, 1987, p. 76).
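As a rough illustration of Rubin's rules for a single fixed-effect term, the sketch below pools made-up estimates and standard errors from three imputed datasets. The input values and variable names (est, se, q_bar, u_bar, b, t_var) are assumptions chosen for the example; this is not the internal code of vw_pool.

```r
# Hypothetical estimates and SEs of one term, one value per imputed dataset
est <- c(0.50, 0.55, 0.45)
se  <- c(0.10, 0.12, 0.11)
m   <- length(est)  # number of imputed datasets

q_bar <- mean(est)                  # pooled estimate: average across imputations
u_bar <- mean(se^2)                 # within-imputation variance
b     <- var(est)                   # between-imputation variance
t_var <- u_bar + (1 + 1 / m) * b    # total variance (Rubin's rules)

pooled_se <- sqrt(t_var)            # pooled standard error
```

The pooled standard error is larger than the simple average of the per-dataset SEs whenever the estimates differ across imputations, because the between-imputation variance is added on top.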

P-values are currently estimated using the *t-as-z* approach, i.e., the pooled t-statistic is compared against a standard normal distribution. This is known to be anti-conservative for small sample sizes. However, we preferred a relatively lenient (and computationally inexpensive) solution at this stage. Type I error will be addressed more strictly at the cluster-forming stage.
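A minimal sketch of the t-as-z idea, using made-up numbers: the t-statistic is referred to a standard normal rather than a t distribution. The comparison with 10 degrees of freedom is only an assumption to show why the approach is lenient in small samples.

```r
# Hypothetical pooled estimate and standard error for one term
estimate <- 0.30
std_err  <- 0.15

t_value <- estimate / std_err

# t-as-z: two-sided p-value from the standard normal distribution
p_z <- 2 * pnorm(abs(t_value), lower.tail = FALSE)

# For comparison only: the same statistic against a t distribution
# with an assumed 10 degrees of freedom gives a larger p-value
p_t <- 2 * pt(abs(t_value), df = 10, lower.tail = FALSE)
```

Because the normal distribution has lighter tails than the t distribution, p_z is always smaller than p_t for the same statistic, and the gap shrinks as the degrees of freedom grow.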

The residuals of the model are currently simply averaged across imputed datasets, for lack of a better idea of how to combine them.
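Element-wise averaging of residuals can be sketched as follows. The residual vectors and the name resid_list are made up for illustration, and the sketch assumes that every imputed dataset yields a residual vector of the same length and observation order.

```r
# Hypothetical residual vectors, one per imputed dataset
resid_list <- list(
  c(0.1, -0.2, 0.3),
  c(0.0, -0.1, 0.2),
  c(0.2, -0.3, 0.1)
)

# Element-wise mean across imputed datasets
pooled_resid <- Reduce(`+`, resid_list) / length(resid_list)
```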

Usage

vw_pool(out_stats, m)

Arguments

out_stats

: Output of single_lmm, i.e., a list with two elements:

"stats": a dataframe with estimates, SEs and p-values for each fixed effect term;
"resid": a vector of residuals for the given model.

m

: Number of imputed datasets (passed in to avoid recomputing it)

Value

A list containing the pooled coefficients, SEs, t-values and p-values.

Note

Used inside run_vw_lmm.

References

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.

Author

Serena Defina, 2024.