| Title: | Rashomon Set of Optimal Trees |
|---|---|
| Description: | Implements a general framework for globally optimizing user-specified objective functionals over interpretable binary weight functions represented as sparse decision trees, called ROOT (Rashomon Set of Optimal Trees). It searches over candidate trees to construct a Rashomon set of near-optimal solutions and derives a summary tree highlighting stable patterns in the optimized weights. ROOT includes a built-in generalizability mode for identifying subgroups in trial settings for transportability analyses (Parikh et al. (2025) <doi:10.1080/01621459.2025.2495319>). |
| Authors: | Yiren Hou [aut] (ORCID: <https://orcid.org/0009-0005-0422-4268>, Equal contribution), Peter Liu [aut, cre] (ORCID: <https://orcid.org/0009-0000-2691-5637>, Equal contribution), Sean McGrath [aut] (ORCID: <https://orcid.org/0000-0002-7281-3516>), Harsh Parikh [aut] (ORCID: <https://orcid.org/0000-0003-1927-8646>) |
| Maintainer: | Peter Liu <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-25 06:16:48 UTC |
| Source: | https://github.com/peterliu599/root |
A high-level wrapper around ROOT() for identifying and
characterizing subgroups that are insufficiently represented in a
randomized controlled trial (RCT) relative to a target population. The
function returns an interpretable decision tree describing which subgroups
should be included (w = 1) or excluded (w = 0) from the
analysis, along with the corresponding target average treatment effect estimates.
characterizing_underrep(
  data,
  global_objective_fn = NULL,
  generalizability_path = FALSE,
  leaf_proba = 0.25,
  seed = 123,
  num_trees = 10,
  vote_threshold = 2/3,
  explore_proba = 0.05,
  feature_est = "Ridge",
  feature_est_args = list(),
  top_k_trees = FALSE,
  k = 10,
  cutoff = "baseline",
  max_depth = 8L,
  min_leaf_n = 2L,
  max_rejects_per_node = 10L,
  verbose = FALSE,
  positivity_trim = NULL,
  split_strategy = "midpoint",
  quantile_range = c(0.1, 0.9)
)
data |
A |
global_objective_fn |
A function with signature
|
generalizability_path |
Logical. If |
leaf_proba |
A |
seed |
Random seed for reproducibility. |
num_trees |
Number of trees to grow in the ROOT forest. More trees explore the tree space more thoroughly but increase computation time. |
vote_threshold |
Controls how Rashomon-set tree votes are aggregated
into |
explore_proba |
Exploration probability in tree growth. Controls the
explore-exploit trade-off: with probability |
feature_est |
Either |
feature_est_args |
List of extra arguments passed to
|
top_k_trees |
Logical; if |
k |
Number of trees to retain when |
cutoff |
Numeric or |
max_depth |
Maximum depth of each tree grown during the forest
construction stage. A node at |
min_leaf_n |
Minimum number of observations required in a node for
splitting to be attempted. If a node contains fewer than
|
max_rejects_per_node |
Maximum number of consecutive rejected splits
(splits that do not improve the objective) allowed at a single node
before the node is forced to become a leaf. This prevents infinite
recursion in pathological cases. Default 10L. |
verbose |
Logical; if |
positivity_trim |
Controls pre-trimming of target observations with
near-zero trial selection probability. Only used when
|
split_strategy |
Character; either |
quantile_range |
Numeric vector of length 2 giving the lower and upper
bounds of the quantile range when
|
A characterizing_underrep S3 object (a list) with:
root |
The |
combined |
The input |
leaf_summary |
A |
In the context of generalizing treatment effects from a trial to a target population, a subgroup is considered underrepresented (or insufficiently represented) when it occupies a region of the covariate space that both (a) has limited overlap between the trial and the target population, and (b) exhibits heterogeneous treatment effects.
Formally, the contribution of a unit with covariates x to the
variance of the target average treatment effect (TATE) estimator depends on
both the selection ratio at x
and the conditional average treatment effect at x. Subgroups where
the selection ratio is small and the conditional average treatment effect deviates from the
overall TATE contribute disproportionately to estimator variance. These are
the subgroups that characterizing_underrep() identifies and
characterizes. The sample average treatment effect (SATE) is the
finite-sample analogue of the TATE.
When generalizability_path = TRUE, this function implements the
two-stage approach of Parikh et al. (2025):
Design stage: ROOT learns binary weights w(x) in {0,1} that
minimize the variance of the weighted target average treatment effect
(WTATE) estimator, subject to interpretability constraints (tree
structure). The resulting decision tree characterizes which subgroups
are well-represented (w = 1) and which are underrepresented
(w = 0).
Analysis stage: The WTATE is estimated on the refined target population that excludes the underrepresented subgroups. This estimand trades some generality for greater precision and credibility.
The key estimands are:
SATE (Sample Average Treatment Effect): the treatment effect for the full target population based on the trial sample, which may be imprecise if certain subgroups are underrepresented. It is the finite-sample analogue of the TATE.
WTATE (Weighted Target Average Treatment Effect): the treatment effect restricted to the sufficiently represented subpopulation, estimated with lower variance.
When generalizability_path = FALSE, this function behaves as a
convenience wrapper around ROOT() for arbitrary binary weight
optimization. The user can supply a custom objective function via
global_objective_fn; ROOT will learn an interpretable tree-based
weight function minimizing that objective. See
vignette("optimization_path_example") for an example.
When generalizability_path = TRUE, data must contain the
following standardized columns:
Y: numeric outcome variable.
Tr: binary treatment indicator (0 = control, 1 = treated).
S: binary sample indicator (1 = trial/RCT, 0 = target
population).
All remaining columns are treated as pretreatment covariates
available for splitting.
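Before calling the function in generalizability mode, the column requirements above can be checked in base R. The following is a minimal sketch only: the helper name check_generalizability_data and the toy data (including the covariate names age45 and male) are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of ROOT): verifies the standardized columns
# described above and returns the names of the remaining covariate columns.
check_generalizability_data <- function(data) {
  required <- c("Y", "Tr", "S")
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  # S must be a complete 0/1 indicator (1 = trial, 0 = target).
  stopifnot(all(data$S %in% c(0, 1)))
  # Tr must be 0/1 wherever it is observed.
  stopifnot(all(data$Tr[!is.na(data$Tr)] %in% c(0, 1)))
  # Y must be numeric.
  stopifnot(is.numeric(data$Y))
  setdiff(names(data), required)
}

# Toy data for illustration only.
set.seed(1)
n <- 8
toy <- data.frame(
  Y     = rnorm(n),
  Tr    = rep(c(0, 1), times = n / 2),
  S     = rep(c(1, 0), each = n / 2),
  age45 = rbinom(n, 1, 0.5),
  male  = rbinom(n, 1, 0.5)
)

covariates <- check_generalizability_data(toy)
print(covariates)  # "age45" "male"
```

Every column other than Y, Tr, and S is returned as a candidate splitting covariate, which mirrors the rule stated above.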
Parikh H, Ross RK, Stuart E, Rudolph KE (2025). "Who Are We Missing?: A Principled Approach to Characterizing the Underrepresented Population." Journal of the American Statistical Association. doi:10.1080/01621459.2025.2495319
ROOT for the underlying optimization engine;
vignette("generalizability_path_example") for a detailed worked
example of the generalizability workflow;
vignette("optimization_path_example") for general optimization mode.
# --- Generalizability analysis ---
# diabetes_data has columns Y, Tr, S, and covariates
data(diabetes_data, package = "ROOT")

char_fit <- characterizing_underrep(
  data = diabetes_data,
  generalizability_path = TRUE,
  num_trees = 20,
  top_k_trees = TRUE,
  k = 10,
  seed = 123
)

# View the characterization tree
plot(char_fit)

# Inspect which subgroups are underrepresented
char_fit$leaf_summary

# Treatment effect estimates (SATE and WTATE)
char_fit$root$estimate
A toy dataset for illustrating ROOT examples and tests.
data(diabetes_data)
A data.frame with one row per individual and the columns:
Indicator in 0/1: age >= 45.
Indicator in 0/1: on a diet program.
Indicator in 0/1: race is Black.
S: Sample indicator in 0/1: 1 means RCT or source, 0 means target.
Indicator in 0/1: male.
Tr: Treatment assignment in 0/1.
Y: Observed outcome (numeric or 0/1).
Visualizes the decision tree derived from the ROOT analysis. Highlights
which subgroups are represented where w = 1 versus underrepresented
where w = 0 in generalizability mode, or simply w(x) in {0,1}
in general optimization mode.
## S3 method for class 'characterizing_underrep'
plot(
  x,
  main = "Final Characterized Tree from Rashomon Set",
  cex.main = 1.2,
  ...
)
x |
A |
main |
Character string for the plot title. Default is
"Final Characterized Tree from Rashomon Set". |
cex.main |
Numeric scaling factor for the title text size. Default is 1.2. |
... |
Additional arguments passed to |
NULL. The plot is drawn to the active graphics device.
char.output <- characterizing_underrep(diabetes_data, generalizability_path = TRUE, seed = 123)
plot(char.output)
Visualizes the decision tree that characterizes the weighted subgroup
(the weight function in {0,1}) identified by ROOT(),
using rpart.plot::prp().
## S3 method for class 'ROOT'
plot(x, ...)
x |
A |
... |
Additional arguments passed to |
No return value; the plot is drawn to the active graphics device.
ROOT.output <- ROOT(diabetes_data, generalizability_path = TRUE, seed = 123)
plot(ROOT.output)
Prints the ROOT summary, which includes unweighted and (when in
generalizability mode) weighted estimates with standard errors, as reported by
summary.ROOT().
## S3 method for class 'characterizing_underrep'
print(x, ...)
x |
A |
... |
Currently unused. Included for S3 compatibility. |
Delegates core statistics and estimands to print(x$root).
The object is returned invisibly. The printed output is a brief, readable summary.
ATE: Average treatment effect.
RCT: Randomized controlled trial.
SE: Standard error.
TATE: Transported average treatment effect.
WTATE: Weighted transported average treatment effect.
SATE: Sample average treatment effect.
char.output <- characterizing_underrep(diabetes_data, generalizability_path = TRUE, seed = 123)
print(char.output)
Provides a human-readable brief summary of a ROOT object, including:
the summary characterization tree f,
in generalizability mode (generalizability_path = TRUE), the
unweighted and weighted estimands with their standard errors
and an explanatory note for the weighted standard error (SE).
## S3 method for class 'ROOT'
print(x, ...)
x |
A |
... |
Currently unused and included for S3 compatibility. |
The object is returned invisibly. The printed output is for inspection.
ATE: Average treatment effect.
RCT: Randomized controlled trial.
SE: Standard error.
TATE: Transported average treatment effect.
WTATE: Weighted transported average treatment effect.
SATE: Sample average treatment effect.
When generalizability_path = TRUE, the unweighted estimand corresponds
to a SATE-type quantity and the weighted estimand to a WTATE-type
quantity for the transported target population. When generalizability_path = FALSE,
ROOT is used for general functional optimization and no causal labels
are imposed.
ROOT.output <- ROOT(diabetes_data, generalizability_path = TRUE, seed = 123)
print(ROOT.output)
ROOT (Rashomon Set of Optimal Trees) is a general-purpose functional
optimization algorithm that learns interpretable, tree-structured binary
weight functions w(x) in {0,1}. Given a dataset and a
global objective function, ROOT searches over the space of
decision trees to find weight assignments that minimize the objective function.
ROOT(
  data,
  global_objective_fn = NULL,
  generalizability_path = FALSE,
  leaf_proba = 0.25,
  seed = NULL,
  num_trees = 10,
  vote_threshold = 2/3,
  explore_proba = 0.05,
  feature_est = "Ridge",
  feature_est_args = list(),
  top_k_trees = FALSE,
  k = 10,
  cutoff = "baseline",
  max_depth = 8L,
  min_leaf_n = 2L,
  max_rejects_per_node = 10L,
  verbose = FALSE,
  positivity_trim = NULL,
  split_strategy = "midpoint",
  quantile_range = c(0.1, 0.9)
)
data |
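A data.frame containing the dataset. In general optimization mode (generalizability_path = FALSE), any data.frame may be supplied. In generalizability mode (generalizability_path = TRUE), it must contain the columns Y, Tr, and S. |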
global_objective_fn |
A function with signature
|
generalizability_path |
|
leaf_proba |
A numeric tuning parameter that increases the chance
a node stops splitting by selecting a synthetic |
seed |
An optional numeric seed for reproducibility. |
num_trees |
An integer number of trees to grow. More trees explore the solution space more thoroughly. Default 10. |
vote_threshold |
Controls how per-observation votes from the Rashomon
set trees are aggregated into the final binary weight
|
explore_proba |
A numeric giving the exploration probability at
leaves. With probability |
feature_est |
Either |
feature_est_args |
A list of additional arguments passed to a
user-supplied |
top_k_trees |
|
k |
An integer giving the number of top trees when
|
cutoff |
A numeric or |
max_depth |
Maximum depth of each tree grown during the forest
construction stage. A node at |
min_leaf_n |
Minimum number of observations required in a node for
splitting to be attempted. If a node contains fewer than
|
max_rejects_per_node |
Maximum number of consecutive rejected splits
(splits that do not improve the objective) allowed at a single node
before the node is forced to become a leaf. This prevents infinite
recursion in pathological cases. Default 10L. |
verbose |
|
positivity_trim |
Controls pre-trimming of target observations with
near-zero trial selection probability
Note that trimming changes the estimand: the WTATE is now estimated over
the subpopulation |
split_strategy |
Character; either |
quantile_range |
Numeric vector of length 2 giving the lower and upper
bounds of the quantile range when
|
An object of class "ROOT" (a list) with elements:
D_rash |
Data frame containing the Rashomon-set tree votes and the
final aggregated weight |
D_forest |
Data frame with all forest-level working columns. |
w_forest |
List of per-tree results from the tree-building routine. |
rashomon_set |
Integer vector of indices identifying which trees were selected into the Rashomon set. |
global_objective_fn |
The objective function used. |
f |
The characteristic (summary) tree fitted to |
testing_data |
Data frame of observations used for optimization
(trial units when |
estimate |
(Only when |
generalizability_path |
Logical flag echoing the input. |
positivity_trim_info |
(Only when |
ROOT solves the functional optimization problem:
w* in argmin over w of L(w, D),
where w: R^p -> {0,1} maps a p-dimensional
covariate vector to a binary include/exclude decision. The key challenge is
that, unlike standard tree algorithms, the global loss L(w, D)
is not decomposable as a sum of losses over
independent subsets of the data. This means conventional greedy,
divide-and-conquer tree-building strategies do not apply. ROOT addresses
this through a randomization-based tree construction with an
explore-exploit strategy.
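The non-decomposability can be illustrated with the variance-style objective used in the examples of this manual. The sketch below is illustrative only and assumes the objective receives a data.frame with a weight column w and a variance proxy column vsq; the region labels A, B, and C and the helper eval_weights are hypothetical. Whether including region B lowers the loss depends on whether region C is already included, so the best weight for one leaf cannot be chosen in isolation.

```r
# Variance-style global objective, as in the manual's example.
objective <- function(D) {
  w <- D$w
  if (sum(w) == 0) return(Inf)
  sqrt(sum(w * D$vsq) / sum(w)^2)
}

# Toy data: three hypothetical regions (think of them as tree leaves).
D <- data.frame(
  region = c("A", "C", "C", "C", "B", "B", "B", "B"),
  vsq    = c(1, 1, 1, 1, 4, 4, 4, 4)
)

# Evaluate the objective when exactly the listed regions receive w = 1.
eval_weights <- function(include) {
  D$w <- as.integer(D$region %in% include)  # modifies a local copy of D
  objective(D)
}

# Reduction in loss from adding region B, under two different contexts.
gain_B_without_C <- eval_weights("A") - eval_weights(c("A", "B"))
gain_B_with_C    <- eval_weights(c("A", "C")) - eval_weights(c("A", "B", "C"))

gain_B_without_C  # positive: adding B helps when C is excluded
gain_B_with_C     # negative: adding B hurts when C is included
```

Because the sign of the gain flips with the context, a leaf-by-leaf greedy rule would have no fixed answer for region B, which is the difficulty the randomized construction is meant to address.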
The algorithm proceeds in several stages:
Feature importance estimation: Split probabilities are estimated using Ridge regression, Gradient Boosting Machine (GBM), or a user-supplied function, biasing the search toward covariates likely to be informative.
Stochastic tree construction: num_trees trees are
grown. At each internal node, a feature is drawn according to the
estimated split probabilities (or a "leaf" token is drawn, terminating
the branch). Splits are made at the midpoint of the selected feature's
empirical distribution. An explore-exploit strategy assigns leaf
weights: with probability explore_proba a random weight is
chosen; otherwise the greedy optimal weight (reducing the global
objective) is used.
Rashomon set selection: Trees are ranked by their global
objective values. The top-k trees (or all trees below a cutoff)
form the Rashomon set: a collection of near-optimal but
potentially different models, each providing a characterization
of the optimal weight function.
Aggregation: Per-observation votes from the Rashomon set
are combined (by default, majority vote) to produce the final weight
vector w_opt.
Characteristic tree: A single summary decision tree is
fitted to the aggregated w_opt assignments, providing a concise,
interpretable description of the weight function.
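The Rashomon set selection and aggregation stages above can be sketched in base R. This is a toy illustration under stated assumptions, not the package's internal code: the per-tree weight matrix tree_weights and the per-tree losses tree_objective are hypothetical stand-ins, and a simple majority rule (share of votes above 0.5) is used for aggregation; the package's vote_threshold argument may aggregate differently.

```r
# Hypothetical per-tree 0/1 weights: one row per observation, one column per tree.
tree_weights <- cbind(
  t1 = c(1, 1, 0, 0, 1, 0),
  t2 = c(1, 1, 1, 0, 0, 0),
  t3 = c(0, 0, 0, 1, 1, 1),
  t4 = c(1, 1, 1, 0, 1, 0),
  t5 = c(1, 0, 1, 0, 0, 0)
)

# Hypothetical global objective value attained by each tree (lower is better).
tree_objective <- c(t1 = 0.30, t2 = 0.10, t3 = 0.45, t4 = 0.12, t5 = 0.20)

# Rashomon set selection: keep the k trees with the smallest objective.
k <- 3
rashomon_set <- order(tree_objective)[seq_len(k)]

# Aggregation: share of Rashomon-set trees voting w = 1 for each observation.
votes <- rowMeans(tree_weights[, rashomon_set, drop = FALSE])

# Simple majority vote (assumption for this sketch).
w_opt <- as.integer(votes > 0.5)

rashomon_set  # 2 4 5
w_opt         # 1 1 1 0 0 0
```

A summary tree could then be fitted to w_opt against the covariates to obtain a single interpretable description, as in the final stage listed above.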
When generalizability_path = TRUE, ROOT implements the methodology
of Parikh et al. (2025) for characterizing underrepresented subgroups in
trial-to-target generalizability analyses. In this mode:
data must contain columns Y (outcome), Tr
(treatment, 0/1), and S (sample indicator, 1 = trial, 0 =
target).
ROOT internally computes transportability scores based on
inverse-probability weighting (IPW), estimates the selection model
P(S = 1 | X), and constructs Horvitz-Thompson-style
influence scores.
The default objective minimizes the variance of the weighted target average treatment effect (WTATE) estimator. This objective accounts for both the selection odds (trial participation probability) and treatment effect heterogeneity, so that subgroups are flagged as underrepresented only when they both lack trial representation and exhibit effect modification.
The output includes the unweighted sample average treatment effect (SATE) and the WTATE with standard errors.
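To make the role of the selection model concrete, the sketch below estimates trial participation probabilities with a logistic regression and converts them to odds. It is an illustration under assumptions, not a description of how ROOT computes its scores internally: the simulated covariates x1 and x2, the use of glm(), and the definition of the selection odds as P(S = 1 | X) / P(S = 0 | X) are choices made for this example.

```r
set.seed(2025)
n <- 400

# Toy covariates and a toy selection mechanism: units with large x1
# are less likely to be in the trial (S = 1).
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.5)
S  <- rbinom(n, 1, plogis(0.5 - 1.2 * x1))
toy <- data.frame(S = S, x1 = x1, x2 = x2)

# Selection model: estimated P(S = 1 | X) from a logistic regression.
sel_fit <- glm(S ~ x1 + x2, data = toy, family = binomial())
p_hat <- predict(sel_fit, type = "response")

# Estimated selection odds (assumed definition for this sketch).
# Small values flag covariate regions with little trial representation.
sel_odds <- p_hat / (1 - p_hat)

# Average odds in the two halves of the x1 range.
low_x1  <- mean(sel_odds[toy$x1 < 0])
high_x1 <- mean(sel_odds[toy$x1 >= 0])
c(low_x1 = low_x1, high_x1 = high_x1)
```

In this toy setup the region with x1 >= 0 has the lower average odds, so it would be a candidate for being flagged as underrepresented if treatment effects also vary there.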
See characterizing_underrep for a higher-level wrapper that
additionally produces a leaf-level summary table, and
vignette("generalizability_path_example") for a worked example.
When generalizability_path = FALSE, ROOT operates as a general
functional optimizer. The user supplies any data.frame and
(optionally) a custom global_objective_fn. If no objective is
supplied, ROOT uses a default variance-based loss operating on the
vsq column (per-unit variance proxy). See
vignette("optimization_path_example") for an example.
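A custom objective can encode trade-offs beyond the variance proxy. The sketch below is a hypothetical example, not a function shipped with the package: it assumes, as in the manual's own example, that the objective receives a data.frame D with columns w and vsq, and it adds an illustrative penalty (controlled by the made-up parameter lambda) on the share of excluded units.

```r
# Hypothetical objective: variance-style loss plus a penalty on exclusions.
penalized_objective <- function(D, lambda = 0.1) {
  w <- D$w
  if (sum(w) == 0) return(Inf)
  base_loss <- sqrt(sum(w * D$vsq) / sum(w)^2)
  base_loss + lambda * mean(w == 0)
}

# Toy data: three low-variance units and one high-variance unit.
D <- data.frame(vsq = c(0.5, 0.6, 0.4, 5.0), w = c(1, 1, 1, 1))

loss_all <- penalized_objective(D)    # keep every unit

D$w <- c(1, 1, 1, 0)                  # drop the high-variance unit
loss_trim <- penalized_objective(D)   # mild penalty: trimming still pays off
loss_trim_strict <- penalized_objective(D, lambda = 1)  # heavy penalty

c(loss_all = loss_all, loss_trim = loss_trim, loss_trim_strict = loss_trim_strict)
```

With the default penalty the trimmed weights give a lower loss than keeping all units, while with lambda = 1 the exclusion penalty outweighs the variance reduction. Because lambda has a default value, the function can still be called with a single argument.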
Parikh H, Ross RK, Stuart E, Rudolph KE (2025). "Who Are We Missing?: A Principled Approach to Characterizing the Underrepresented Population." Journal of the American Statistical Association. doi:10.1080/01621459.2025.2495319
Crump RK, Hotz VJ, Imbens GW, Mitnik OA (2009). "Dealing with Limited Overlap in Estimation of Average Treatment Effects." Biometrika, 96(1), 187–199.
characterizing_underrep for a higher-level wrapper with
leaf-summary output; vignette("generalizability_path_example") for
the generalizability workflow;
vignette("optimization_path_example") for general optimization.
# --- Generalizability mode ---
data(diabetes_data, package = "ROOT")

root_fit <- ROOT(
  data = diabetes_data,
  generalizability_path = TRUE,
  num_trees = 20,
  top_k_trees = TRUE,
  k = 10,
  seed = 123
)

# --- General optimization mode (custom objective) ---
my_objective <- function(D) {
  w <- D$w
  if (sum(w) == 0) return(Inf)
  sqrt(sum(w * D$vsq) / sum(w)^2)
}

set.seed(123)
n_assets <- 100

# Asset features
volatility <- runif(n_assets, 0.05, 0.40)  # annualised volatility
beta <- runif(n_assets, 0.5, 1.8)          # market beta
sector <- sample(c("Tech", "Finance", "Energy", "Health"), n_assets, replace = TRUE)

# Simulate returns: r_i = beta_i * r_market + epsilon_i
market <- rnorm(1000, 0.0005, 0.01)
returns_mat <- sapply(seq_len(n_assets), function(i)
  beta[i] * market + rnorm(1000, 0, volatility[i] / sqrt(252))
)

# Per-asset return variance (the objective proxy ROOT will minimize)
vsq <- apply(returns_mat, 2, var)

my_data <- data.frame(
  vsq = vsq,
  vol = volatility,
  beta = beta,
  sector = as.integer(factor(sector))
)

opt_fit <- ROOT(
  data = my_data,
  global_objective_fn = my_objective,
  num_trees = 20,
  seed = 42
)
Summarizes a characterizing_underrep fit. The output includes unweighted and (when in
generalizability mode) weighted estimates with standard errors, as reported by
summary.ROOT(), and a brief overview of terminal rules from the
annotated summary tree when available.
## S3 method for class 'characterizing_underrep'
summary(object, ...)
object |
A |
... |
Currently unused. Included for S3 compatibility. |
Delegates core statistics and estimands to summary(object$root).
Previews up to ten terminal rules when a summary tree exists.
The object is returned invisibly. The printed output is a readable summary.
ATE: Average treatment effect.
RCT: Randomized controlled trial.
SE: Standard error.
TATE: Transported average treatment effect.
WTATE: Weighted transported average treatment effect.
SATE: Sample average treatment effect.
char.output <- characterizing_underrep(diabetes_data, generalizability_path = TRUE, seed = 123)
summary(char.output)
Provides a readable summary of a ROOT object, including:
the summary characterization tree f,
whether the user supplied a custom global_objective_fn (Yes/No), and
in generalizability mode (generalizability_path = TRUE), the
unweighted and weighted estimands with their standard errors.
## S3 method for class 'ROOT'
summary(object, ...)
object |
A |
... |
Currently unused and included for S3 compatibility. |
The object is returned invisibly. The printed output is for inspection.
ATE: Average treatment effect.
RCT: Randomized controlled trial.
SE: Standard error.
TATE: Transported average treatment effect.
WTATE: Weighted transported average treatment effect.
SATE: Sample average treatment effect.
When generalizability_path = TRUE, the unweighted estimand corresponds
to a SATE-type quantity and the weighted estimand to a WTATE-type
quantity for the transported target population. When generalizability_path = FALSE,
ROOT is used for general functional optimization and no causal labels
are imposed; the summary focuses on the tree and diagnostics.
The summary also reports:
the number of trees grown,
the size of the Rashomon set,
the percentage of observations with ensemble vote w_opt == 1.
ROOT.output <- ROOT(diabetes_data, generalizability_path = TRUE, seed = 123)
summary(ROOT.output)