Algorithms¶
This section summarizes the core algorithms implemented in
uniPairs.estimator.
Summary¶
We describe three procedures:
TripletScan: interaction screening via Triplet regressions
uniPairs-2stage: implemented in UniPairsTwoStage
uniPairs: implemented in UniPairsOneStage
TripletScan¶
Algorithm: TripletScan
Input
Standardized design matrix \(X \in \mathbb{R}^{n \times p}\)
Response vector \(Y \in \mathbb{R}^n\)
Pair index set \(\mathcal{P}\)
Procedure
For each \((j,k) \in \mathcal{P}\):
Fit the local regression
\[Y = \beta_{0,jk} + \beta_{j,jk} X_j + \beta_{k,jk} X_k + \beta_{jk,jk} (X_j \odot X_k) + \varepsilon\]
Record the two-sided t-test p-value \(p_{jk}\) for \(\beta_{jk,jk}\).
Sort the \(M = |\mathcal{P}|\) p-values in increasing order, \(p_{(1)} \le \dots \le p_{(M)}\), and define \(\ell_r = \log p_{(r)}\).
Apply the largest log-gap rule:
\[\widehat r = \arg\max_{1 \le r < M} \left( \ell_{r+1} - \ell_r \right), \quad \widehat{\Gamma} = \{ (j,k) \in \mathcal{P} : p_{jk} \le p_{(\widehat r)} \}\]
Output
Selected interaction set \(\widehat{\Gamma}\)
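The steps above can be sketched in a few lines of NumPy and SciPy. This is a minimal illustration, not the code in uniPairs.estimator: the function names `interaction_pvalue`, `largest_log_gap`, and `triplet_scan` are hypothetical, the local regression is fitted by ordinary least squares, and p-values are clipped away from zero before taking logs (an assumption made here to keep the log-gap finite).

```python
import numpy as np
from scipy import stats


def interaction_pvalue(xj, xk, y):
    """Two-sided t-test p-value for the product term in y ~ 1 + xj + xk + xj*xk."""
    n = y.shape[0]
    design = np.column_stack([np.ones(n), xj, xk, xj * xk])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = n - design.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)   # OLS coefficient covariance
    t_stat = coef[3] / np.sqrt(cov[3, 3])
    return 2.0 * stats.t.sf(abs(t_stat), dof)


def largest_log_gap(pvalues):
    """Return r_hat (1-based) maximizing log p_(r+1) - log p_(r)."""
    if len(pvalues) < 2:
        raise ValueError("the log-gap rule needs at least two p-values")
    logs = np.log(np.sort(pvalues))
    return int(np.argmax(np.diff(logs))) + 1


def triplet_scan(X, y, pairs):
    """Hypothetical sketch: keep pairs whose p-value is at most p_(r_hat)."""
    pvals = np.array([interaction_pvalue(X[:, j], X[:, k], y) for j, k in pairs])
    pvals = np.maximum(pvals, np.finfo(float).tiny)   # assumption: avoid log(0)
    r_hat = largest_log_gap(pvals)
    cutoff = np.sort(pvals)[r_hat - 1]
    return [pair for pair, p in zip(pairs, pvals) if p <= cutoff]
```

As written, the rule always selects at least one pair and, barring ties, at most \(M - 1\) of them, since \(\widehat r\) ranges over \(1 \le r < M\).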
uniPairs-2stage¶
Algorithm: uniPairs-2stage
Input
Design matrix \(X \in \mathbb{R}^{n \times p}\)
Response \(Y \in \mathbb{R}^n\)
Hierarchy level \(h \in \{\text{strong}, \text{weak}, \text{none}\}\)
Procedure
Standardize each column of \(X\).
Fit UniLasso on \((X, Y)\) to obtain the main-effects active set \(\widehat S_M\) and the prevalidated predictions \(\widehat Y^{(1)}_{\mathrm{PV}}\).
Run TripletScan on \((X, Y)\) to obtain \(\widehat{\Gamma}\).
Restrict eligible interaction pairs \(\mathcal{E}\) according to hierarchy level \(h\) and \(\widehat S_M\).
Compute residuals:
\[R = Y - \widehat Y^{(1)}_{\mathrm{PV}} .\]
Fit a Lasso of \(R\) on the selected interactions
\[\{ X_j \odot X_k : (j,k) \in \widehat{\Gamma} \cap \mathcal{E} \}.\]
Recover coefficients on the original scale and obtain the final active sets \(\widehat S_M^{\text{final}}\) and \(\widehat S_I^{\text{final}}\).
Output
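The predictive function \(\widehat f\) determined by the recovered coefficients on the final active sets \(\widehat S_M^{\text{final}}\) and \(\widehat S_I^{\text{final}}\).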
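The hierarchy restriction in the procedure can be illustrated with a short pure-Python sketch. It assumes the usual convention that strong hierarchy admits a pair only when both main effects are in \(\widehat S_M\), weak hierarchy when at least one is, and "none" admits every pair; the exact rule used by UniPairsTwoStage may differ. The names `eligible_pairs` and `candidate_interactions` are hypothetical.

```python
from itertools import combinations


def eligible_pairs(p, main_active, hierarchy):
    """Build the eligible set E of index pairs (j, k), j < k, among p features."""
    active = set(main_active)
    all_pairs = list(combinations(range(p), 2))
    if hierarchy == "strong":
        # assumption: both parent main effects must be active
        return {(j, k) for j, k in all_pairs if j in active and k in active}
    if hierarchy == "weak":
        # assumption: at least one parent main effect must be active
        return {(j, k) for j, k in all_pairs if j in active or k in active}
    if hierarchy == "none":
        return set(all_pairs)
    raise ValueError(f"unknown hierarchy level: {hierarchy!r}")


def candidate_interactions(gamma, p, main_active, hierarchy):
    """Pairs passed to the second-stage fit: the intersection of Gamma and E."""
    return sorted(set(gamma) & eligible_pairs(p, main_active, hierarchy))
```

For example, with four features and main-effects active set {0, 2}, strong hierarchy leaves only the pair (0, 2) eligible, while weak hierarchy excludes only (1, 3).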
uniPairs (one-stage)¶
Algorithm: uniPairs
Input
Design matrix \(X \in \mathbb{R}^{n \times p}\)
Response \(Y \in \mathbb{R}^n\)
Procedure
Standardize each column of \(X\).
Run TripletScan on \((X, Y)\) to obtain \(\widehat{\Gamma}\).
Form the augmented design matrix
\[\widetilde X = [X, X_{\widehat{\Gamma}}], \quad X_{\widehat{\Gamma}} = \{ X_j \odot X_k : (j,k) \in \widehat{\Gamma} \}\]
Fit UniLasso on \((\widetilde X, Y)\).
Recover coefficients on the original scale and obtain active sets \(\widehat S_M\) and \(\widehat S_I\).
Output
The predictive function
\[\widehat f(x) = \widehat\alpha_0 + \sum_{j \in \widehat S_M} \widehat\alpha_j x_j + \sum_{(j,k) \in \widehat S_I} \widehat\alpha_{jk} x_j x_k .\]
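As a rough illustration of the augmented design matrix and the predictive function, the following NumPy sketch builds \(\widetilde X\) from a list of pairs and evaluates \(\widehat f\) row by row. The helper names `augment` and `predict` and the dictionary layout for the coefficients are assumptions made for this example, not the interface of UniPairsOneStage.

```python
import numpy as np


def augment(X, pairs):
    """Append one elementwise-product column X_j * X_k per pair to X."""
    if not pairs:
        return X.copy()
    products = np.column_stack([X[:, j] * X[:, k] for j, k in pairs])
    return np.hstack([X, products])


def predict(X, intercept, main_coefs, inter_coefs):
    """Evaluate f(x) = a0 + sum_j a_j x_j + sum_{(j,k)} a_jk x_j x_k for each row.

    main_coefs maps j -> a_j; inter_coefs maps (j, k) -> a_jk (hypothetical layout).
    """
    out = np.full(X.shape[0], float(intercept))
    for j, a in main_coefs.items():
        out += a * X[:, j]
    for (j, k), a in inter_coefs.items():
        out += a * X[:, j] * X[:, k]
    return out
```

Only the features and pairs in the active sets need entries in the two dictionaries; every other coefficient is implicitly zero.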
Practical Recommendation¶
In practice, uniPairs-2stage is recommended as the default when main effects are believed to be present and a strong or weak hierarchy assumption is appropriate.
The uniPairs one-stage procedure provides a flexible alternative when departures from hierarchy are expected.