Adversarial Distribution Balancing for Counterfactual Reasoning (2311.16616v1)

Published 28 Nov 2023 in cs.LG

Abstract: The development of causal prediction models is challenged by the fact that the outcome is observable only for the applied (factual) intervention and not for its alternatives, the so-called counterfactuals; in medicine, for example, we only know patients' survival under the administered drug and not under other therapeutic options. Machine learning approaches for counterfactual reasoning must therefore deal with both unobserved outcomes and distributional differences caused by non-random treatment administration. Unsupervised domain adaptation (UDA) addresses similar issues: one has to deal with unobserved outcomes -- the labels of the target domain -- and with distributional differences between source and target domain. We propose Adversarial Distribution Balancing for Counterfactual Reasoning (ADBCR), which directly uses potential outcome estimates of the counterfactuals to remove spurious causal relations. We show that ADBCR outperforms state-of-the-art methods on three benchmark datasets, and demonstrate that its performance can be further improved if unlabeled validation data are included in the training procedure to better adapt the model to the validation domain.
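The abstract describes learning a representation of the covariates whose treated and control distributions are balanced, while outcome heads fit the observed (factual) outcomes; evaluating both heads on every unit then yields factual and counterfactual potential-outcome estimates. The following is a minimal NumPy sketch of that general recipe only, not the authors' adversarial implementation: it uses a linear representation, a simple mean-discrepancy balancing penalty in place of an adversarial discriminator, and a toy confounded dataset. All variable names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy confounded data: treatment probability depends on the first covariate,
# so a naive treated-vs-control mean difference over-estimates the true effect of 2.0.
n, d = 200, 5
X = rng.normal(size=(n, d))
T = (rng.uniform(size=n) < 1 / (1 + np.exp(-X[:, 0]))).astype(float)
Y = X[:, 0] + 2.0 * T + 0.1 * rng.normal(size=n)

Xb = np.hstack([X, np.ones((n, 1))])      # add a bias feature
W = np.eye(d + 1)                         # shared (linear) representation Phi(x) = x @ W
h0 = np.zeros(d + 1)                      # outcome head for control
h1 = np.zeros(d + 1)                      # outcome head for treatment
m = Xb[T == 1].mean(axis=0) - Xb[T == 0].mean(axis=0)  # covariate mean gap

lr, alpha = 0.05, 0.1                     # alpha weights the balancing penalty
loss_history = []
for _ in range(500):
    Phi = Xb @ W
    pred = np.where(T == 1, Phi @ h1, Phi @ h0)    # factual predictions only
    err = pred - Y
    loss_history.append(float((err ** 2).mean()))
    H = np.where(T[:, None] == 1, h1, h0)          # head used for each unit
    W -= lr * (Xb.T @ (err[:, None] * H) / n       # factual-loss gradient
               + alpha * 2 * np.outer(m, m @ W))   # balancing: shrink mean gap of Phi
    h1 -= lr * Phi[T == 1].T @ err[T == 1] / n
    h0 -= lr * Phi[T == 0].T @ err[T == 0] / n

# Both heads evaluated on every unit give factual and counterfactual estimates;
# their difference is the estimated individual treatment effect.
Phi = Xb @ W
ate = float((Phi @ (h1 - h0)).mean())
```

The squared distance between treated and control representation means stands in here for the adversarially trained balancing objective of ADBCR, but the interface is the same: a factual regression loss plus a distribution-balancing term, both acting on a shared representation.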
