Session title: Causal Inference and Machine Learning
Organizer: Ryan Tibshirani (CMU)
Chair: Vincent Joseph Dorie (Columbia)
Time: June 5th, 8:30am – 10:00am
Location: VEC 1402
Speech 1: Nonparametric causal effects based on incremental propensity score interventions
Speaker: Edward Kennedy (CMU)
Most work in causal inference considers deterministic interventions that set each unit’s treatment to some fixed value. However, under positivity violations these interventions can lead to non-identification, inefficiency, and effects with little practical relevance. Further, corresponding effects in longitudinal studies are highly sensitive to the curse of dimensionality, resulting in widespread use of unrealistic parametric models. We propose a novel solution to these problems: incremental interventions that shift propensity score values rather than set treatments to fixed values. Incremental interventions have several crucial advantages. First, they avoid positivity assumptions entirely. Second, they require no parametric assumptions and yet still admit a simple characterization of longitudinal effects, independent of the number of timepoints. For example, they allow longitudinal effects to be visualized with a single curve instead of lists of coefficients. After characterizing these incremental interventions and giving identifying conditions for corresponding effects, we also develop general efficiency theory, propose efficient nonparametric estimators that can attain fast convergence rates even when incorporating flexible machine learning, and propose a bootstrap-based confidence band and simultaneous test of no treatment effect. Finally we explore finite-sample performance via simulation, and apply the methods to study time-varying sociological effects of incarceration on entry into marriage.
Speech 2: Quasi-Oracle Estimation of Heterogeneous Causal Effects
Speaker: Stefan Wager (Stanford)
Abstract: Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this paper, we develop a class of two-step algorithms for heterogeneous treatment effect estimation in observational studies. We first estimate marginal effects and treatment propensities to form an objective function that isolates the heterogeneous treatment effects, and then optimize the learned objective. This approach has several advantages over existing methods. From a practical perspective, our method is very flexible and easy to use: In both steps, we can use any method of our choice, e.g., penalized regression, a deep net, or boosting; moreover, these methods can be fine-tuned by cross-validating on the learned objective. Meanwhile, in the case of penalized kernel regression, we show that our method has a quasi-oracle property, whereby even if our pilot estimates for marginal effects and treatment propensities are not particularly accurate, we achieve the same regret bounds as an oracle who has a-priori knowledge of these nuisance components. We implement variants of our method based on both penalized regression and convolutional neural networks, and find promising performance relative to existing baselines.
Speech 3: Off-policy Learning in Theory and in the Wild
Speaker: Yu-Xiang Wang (Amazon/UCSB)
Abstract: The talk considers the problem of offline policy learning for automated decision systems under the contextual bandits model, where we aim at evaluating the performance of a given policy (a decision algorithm) and also learning a better policy using logged historical data consisting of context, actions, rewards and probabilities of the actions taken. This is a generalization of the Average Treatment Effect (ATE) estimation problem and has some interesting new set of desiderata to consider.
In the first part of the talk, I will compare and contrast off-policy evaluation and ATE estimation and clarify how different assumptions change the corresponding minimax risk in estimating the “causal effect”. In addition, I will talk about how one can achieve significantly better finite sample performance than asymptotically optimal estimators through the SWITCH estimator.
In the second part of the talk, I will talk about off-policy evaluation and learning in a real industry environment. I will highlight several intersting challenges there including partially logged probilities, unobserved decision variables (Simpson’s paradox), effect of model bias and so on. We then propose and recommend practical ways to deal with these challenges under different circumstances.