Robust and Agnostic Learning of Conditional Distributional Treatment Effects (2205.11486v2)
Abstract: The conditional average treatment effect (CATE) is the best measure of individual causal effects given baseline covariates. However, the CATE only captures the (conditional) average, and can overlook risks and tail events, which are important to treatment choice. In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE), such as differences in quantiles or tail expectations between treatment groups. Hypothetically, one can similarly fit conditional quantile regressions in each treatment group and take their difference, but this would not be robust to misspecification or provide agnostic best-in-class predictions. We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a class of problems that includes conditional quantile treatment effects, conditional super-quantile treatment effects, and conditional treatment effects on coherent risk measures given by $f$-divergences. Our method is based on constructing a special pseudo-outcome and regressing it on covariates using any regression learner. Our method is model-agnostic in that it can provide the best projection of CDTE onto the regression model class. Our method is robust in that even if we learn these nuisances nonparametrically at very slow rates, we can still learn CDTEs at rates that depend on the class complexity and even conduct inferences on linear projections of CDTEs. We investigate the behavior of our proposal in simulations, as well as in a case study of 401(k) eligibility effects on wealth.
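The pseudo-outcome recipe described in the abstract can be illustrated with a minimal sketch for the conditional quantile treatment effect. The abstract does not state the paper's exact pseudo-outcome, so the form used here is an assumption: the plug-in quantile difference plus the standard inverse-propensity-weighted influence-function correction for a quantile, with nuisances fit on one fold and pseudo-outcomes computed on the other. The simulation design, the binary covariate, the known propensity of 0.5, and all function names (`simulate`, `fit_nuisances`, `pseudo_outcome`, `fit_cqte`) are hypothetical choices made only for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

TAU = 0.1          # quantile level (illustrative choice)
PROPENSITY = 0.5   # assumed known: treatment is randomized in this toy simulation


def simulate(n, rng):
    """Toy data: the CATE is zero everywhere, but treatment widens the spread when x == 1."""
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        a = 1 if rng.random() < PROPENSITY else 0
        y = rng.gauss(0.0, 1.0 + a * x)
        rows.append((x, a, y))
    return rows


def empirical_quantile(values, tau):
    """Order-statistic estimate of the tau-quantile."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(tau * len(ordered)) - 1)]


def kernel_density_at(values, point):
    """Gaussian kernel density estimate at `point`, normal-reference bandwidth."""
    n = len(values)
    h = 1.06 * stdev(values) * n ** (-0.2)
    std_normal = NormalDist()
    return sum(std_normal.pdf((point - v) / h) for v in values) / (n * h)


def fit_nuisances(rows):
    """Per (x, a) cell: the tau-quantile and the outcome density at that quantile."""
    nuisances = {}
    for x in (0, 1):
        for a in (0, 1):
            ys = [y for (xi, ai, y) in rows if xi == x and ai == a]
            q = empirical_quantile(ys, TAU)
            nuisances[(x, a)] = (q, kernel_density_at(ys, q))
    return nuisances


def pseudo_outcome(x, a, y, nuisances):
    """Plug-in quantile difference plus a weighted influence-function correction (assumed form)."""
    q1, f1 = nuisances[(x, 1)]
    q0, f0 = nuisances[(x, 0)]
    corr1 = (a / PROPENSITY) * (TAU - (y <= q1)) / f1
    corr0 = ((1 - a) / (1 - PROPENSITY)) * (TAU - (y <= q0)) / f0
    return (q1 + corr1) - (q0 + corr0)


def fit_cqte(rows, seed=0):
    """Two-fold cross-fitting, then regress pseudo-outcomes on x."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    folds = (shuffled[:half], shuffled[half:])
    pseudo = {0: [], 1: []}
    for k in (0, 1):
        nuisances = fit_nuisances(folds[1 - k])  # nuisances come from the other fold
        for x, a, y in folds[k]:
            pseudo[x].append(pseudo_outcome(x, a, y, nuisances))
    # With a binary covariate, least squares over all functions of x reduces to group means.
    return {x: fmean(vals) for x, vals in pseudo.items()}


rows = simulate(20000, random.Random(0))
cqte_hat = fit_cqte(rows)
# True effect on the tau-quantile: 0 when x == 0, and (2 - 1) * z_tau when x == 1.
cqte_true = {0: 0.0, 1: NormalDist().inv_cdf(TAU)}
print("estimated:", cqte_hat)
print("true:     ", cqte_true)
```

In this toy design the difference in conditional means is zero for every covariate value, so a CATE analysis would report no effect, while the 10% quantile shifts down by roughly 1.28 when `x == 1`. With richer covariates, the group-mean step would be replaced by any regression learner fit to the pseudo-outcomes.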
Authors: Nathan Kallus, Miruna Oprescu