Skip to contents

Core computational workhorse called by run_vw_fed_aggr for each chunk of vertices. Given the pre-computed static lambda profile lambda_prof and the chunk-specific sufficient statistics (\(X^\top Y\), \(\mathbf{1}^\top Y\), \(Y^\top Y\)), this function selects the optimal \(\lambda\) per vertex and returns fixed-effect estimates, standard errors, p-values, and variance components.

Usage

chunk_dlmm(
  lambda_prof,
  X1,
  XtY,
  sumY,
  YtY,
  n_terms,
  chunk_size,
  n_sites,
  REML,
  df
)

Arguments

lambda_prof

List. Output of the static lambda profile loop in run_vw_fed_aggr. Each element is a named list with fields lambda, w (site weights), R (Cholesky factor of \(A\)), logdetA, and logdetV.

X1

List of length \(K\). Each element is the column vector of row-sums of the design matrix for site \(k\) (\(X_k^\top \mathbf{1}_{n_k}\)).

XtY

List of length \(K\). Each element is a \(p \times C\) matrix \(X_k^\top Y_k\) for the current chunk of \(C\) vertices.

sumY

List of length \(K\). Each element is a length-\(C\) vector \(\mathbf{1}^\top Y_k\) (column sums of the outcome chunk).

YtY

List of length \(K\). Each element is a length-\(C\) vector of element-wise squared column sums \(Y_k^\top Y_k\).

n_terms

Integer. Number of fixed-effect terms \(p\).

chunk_size

Integer. Number of vertices in the current chunk \(C\).

n_sites

Integer. Number of sites \(K\).

REML

Logical. Use REML (TRUE) or ML (FALSE) likelihood.

df

Integer. Residual degrees of freedom (\(N - p\) for REML, \(N\) for ML).

Value

A named list with four matrices, all with \(C\) columns:

coef

\(p \times C\) fixed-effect estimates.

se

\(p \times C\) standard errors.

pval

\(p \times C\) two-sided p-values.

bw_var

\(2 \times C\) variance components; row 1 = \(\hat\sigma^2_\varepsilon\), row 2 = \(\hat\tau^2\).

Details

For each candidate \(\lambda\) the dynamic quantities are: $$B(\lambda) = X^\top V^{-1} Y = \sum_k \bigl(X_k^\top Y_k - w_k \mathbf{x}_{k1} \mathbf{1}_k^\top Y_k \bigr)$$ $$C(\lambda) = Y^\top V^{-1} Y = \sum_k \bigl(Y_k^\top Y_k - w_k (\mathbf{1}_k^\top Y_k)^2 \bigr)$$ where \(w_k = \lambda / (1 + n_k \lambda)\) is the site weight. The profile (RE)ML log-likelihood is evaluated at each candidate \(\lambda\) and the maximiser selected per vertex via max.col.

Standard errors are derived from \(\text{Var}(\hat\beta) = \hat\sigma^2 A^{-1}\), with \(A^{-1}\) obtained via chol2inv from the pre-computed Cholesky factor.