A question about this setup recently came up. In the single-armed setting, https://arxiv.org/abs/2103.11066 shows that you can do better by estimating the cost-benefit ratio directly when costs are uncertain.
However, the plug-in approach can also do well. maq can estimate Qini curves for these multi-armed "plug-in" policies out of the box: estimate the reward of the policy ($V_\rho$ in the paper) and the cost of the policy ($G_\rho$ in the paper) separately, using the same predicted rewards and costs to define the policy in both cases. The two estimates can then be stitched together into a single curve of estimated gain against estimated spend.
Here's a crude example.
library(maq)
# A toy RCT with realized outcomes and cost
n <- 3000
p <- 5
X <- matrix(runif(n * p), n, p)
W <- as.factor(sample(c("0", "1", "2"), n, replace = TRUE))
Y <- X[, 1] + X[, 2] * (W == "1") + 1.5 * X[, 3] * (W == "2") + rnorm(n)
C <- 2 * X[, 2] * (W == "1") + 2.5 * X[, 3] * (W == "2") + runif(n)
train <- sample(1:n, n/2)
# Fit a CATE forest and a cost forest on the training set
tau.forest <- grf::multi_arm_causal_forest(X[train, ], Y[train], W[train])
cost.forest <- grf::multi_arm_causal_forest(X[train, ], C[train], W[train])
# Predict CATE and cost on the held-out evaluation data.
test <- -train
tau.hat <- predict(tau.forest, X[test, ], drop = TRUE)$predictions
cost.hat <- predict(cost.forest, X[test, ], drop = TRUE)$predictions
# Truncate non-positive cost predictions (maq expects strictly positive costs)
cost.hat[cost.hat <= 0] <- 1e-7
# Get estimates of V_\rho
IPW.scores <- get_ipw_scores(Y[test], W[test])
qini.gain <- maq(tau.hat, cost.hat, IPW.scores)
# Get estimates of G_\rho
IPW.cost.scores <- get_ipw_scores(C[test], W[test])
qini.cost <- maq(tau.hat, cost.hat, IPW.cost.scores)
# Calling summary() on a qini curve gives the full underlying path:
head(summary(qini.gain))
# Combine: replace the nominal spend column with the estimated cost of the policy
df <- cbind(spend = summary(qini.cost)$gain, summary(qini.gain)[, -1])
head(df)
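The stitching step can be sketched on its own, without fitting any forests. This is a minimal illustration in base R. The two data frames are hypothetical stand-ins for `summary(qini.gain)` and `summary(qini.cost)`, with made-up numbers, and the column names `spend`, `gain`, and `std.err` are assumed to match what `summary()` returns. The check on the spend grid reflects the assumption that both curves were built from the same `tau.hat` and `cost.hat`, so their rows line up.

```r
# Hypothetical stand-ins for summary(qini.gain) and summary(qini.cost).
# The numbers are made up purely to illustrate the reshaping step.
gain.path <- data.frame(
  spend = c(0.1, 0.2, 0.3),
  gain = c(0.05, 0.09, 0.12),
  std.err = c(0.010, 0.012, 0.015)
)
cost.path <- data.frame(
  spend = c(0.1, 0.2, 0.3),
  gain = c(0.12, 0.21, 0.33),
  std.err = c(0.008, 0.010, 0.013)
)

# Both paths are assumed to come from the same (tau.hat, cost.hat) policy,
# so they should share the same nominal spend grid. Fail loudly otherwise.
stopifnot(isTRUE(all.equal(gain.path$spend, cost.path$spend)))

# Keep the nominal spend for reference, and use the estimated policy cost
# (the "gain" column of the cost curve) as the actual x-axis.
stitched <- data.frame(
  nominal.spend = gain.path$spend,
  spend = cost.path$gain,
  spend.std.err = cost.path$std.err,
  gain = gain.path$gain,
  gain.std.err = gain.path$std.err
)
print(stitched)
```

Keeping both standard errors makes it explicit that the x-axis is itself an estimate. Calling `plot(stitched$spend, stitched$gain, type = "l")` would then draw estimated gain against estimated spend.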