
C-XGBoost: A tree boosting model for causal effect estimation (2404.00751v1)

Published 31 Mar 2024 in stat.ML, cs.LG, and stat.ME

Abstract: Causal effect estimation aims to estimate the Average Treatment Effect (ATE) and the Conditional Average Treatment Effect (CATE) of a treatment on an outcome from the available data. This knowledge is important in many safety-critical domains, where it often must be extracted from observational data. In this work, we propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes. Our approach is motivated by the goal of combining the strength of tree-based models on tabular data with the notable property of neural network-based causal inference models: learning representations that are useful for estimating the outcome under both the treatment and non-treatment cases. The proposed model also inherits the considerable advantages of XGBoost, such as efficiently handling features with missing values with minimal preprocessing effort, and it is equipped with regularization techniques to avoid overfitting and bias. Furthermore, we propose a new loss function for efficiently training the proposed causal inference model. The experimental analysis, based on the performance profiles of Dolan and Moré as well as on post-hoc and non-parametric statistical tests, provides strong evidence for the effectiveness of the proposed approach.
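The abstract describes predicting both potential outcomes (treated and untreated) with a tree-boosting model and taking their difference to estimate treatment effects. The paper's C-XGBoost architecture and its custom loss function are not reproduced here; the following is only a minimal S-learner-style sketch of the general idea, using scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost and synthetic data with a known constant effect:

```python
# Hypothetical sketch of tree-boosting causal effect estimation
# (S-learner baseline, NOT the paper's C-XGBoost model or loss).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))          # covariates
T = rng.integers(0, 2, size=n)       # binary treatment indicator
# Synthetic outcome with a known constant treatment effect of 2.0
Y = X[:, 0] + 2.0 * T + rng.normal(scale=0.1, size=n)

# S-learner: fit one boosted model on covariates plus the treatment flag
model = GradientBoostingRegressor(random_state=0)
model.fit(np.column_stack([X, T]), Y)

# Predict both potential outcomes for every unit by toggling T
mu1 = model.predict(np.column_stack([X, np.ones(n)]))   # outcome if treated
mu0 = model.predict(np.column_stack([X, np.zeros(n)]))  # outcome if untreated

cate = mu1 - mu0                     # per-unit (conditional) effect estimates
ate = cate.mean()                    # Average Treatment Effect
print(round(ate, 2))
```

On this synthetic data the estimated ATE should land near the true effect of 2.0; the paper's contribution is a dedicated two-output architecture and loss rather than this naive toggle-the-flag baseline.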

References (29)
  1. Generalization bounds and representation learning for estimation of potential outcomes and causal effects. The Journal of Machine Learning Research, 23(1):7489–7538, 2022.
  2. Donald B. Rubin. Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100(469):322–331, 2005.
  3. Judea Pearl. Causality. Cambridge University Press, 2009.
  4. A survey on deep learning: Algorithms, techniques, and applications. ACM Computing Surveys (CSUR), 51(5):1–36, 2018.
  5. A review of machine learning and deep learning applications. In 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), pages 1–6. IEEE, 2018.
  6. Adapting neural networks for the estimation of treatment effects. Advances in Neural Information Processing Systems, 32, 2019.
  7. Estimating individual treatment effect: generalization bounds and algorithms. In International Conference on Machine Learning, pages 3076–3085. PMLR, 2017.
  8. An improved neural network model for treatment effect estimation. In IFIP International Conference on Artificial Intelligence Applications and Innovations, pages 147–158. Springer, 2022a.
  9. Integrating nearest neighbors with neural network models for treatment effect estimation. International Journal of Neural Systems, 2023.
  10. Why do tree-based models still outperform deep learning on typical tabular data? Advances in Neural Information Processing Systems, 35:507–520, 2022.
  11. Supervised feature selection with neuron evolution in sparse neural networks. Transactions on Machine Learning Research, 2023.
  12. An evaluation framework for comparing causal inference models. In Proceedings of the 12th Hellenic Conference on Artificial Intelligence, pages 1–9, 2022b.
  13. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10):4156–4165, 2019.
  14. Leo Breiman. Random forests. Machine Learning, 45:5–32, 2001.
  15. Limits of estimating heterogeneous treatment effects: Guidelines for practical algorithm design. In International Conference on Machine Learning, pages 129–138. PMLR, 2018.
  16. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228–1242, 2018.
  17. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
  18. Cycle-balanced representation learning for counterfactual inference. In Proceedings of the 2022 SIAM International Conference on Data Mining (SDM), pages 442–450. SIAM, 2022.
  19. Causal effect inference with deep latent-variable models. Advances in Neural Information Processing Systems, 30, 2017.
  20. Benchmarking framework for performance-evaluation of causal inference analysis. arXiv preprint arXiv:1802.05046, 2018.
  21. Infant mortality statistics from the 1996 period linked birth/infant death data set. Monthly Vital Statistics Report, 46(12), 1998.
  22. Learning representations for counterfactual inference. In International Conference on Machine Learning, pages 3020–3029. PMLR, 2016.
  23. Benchmarking optimization software with performance profiles. Mathematical Programming, 91(2):201–213, 2002.
  24. A new class of spectral conjugate gradient methods based on a modified secant equation for unconstrained optimization. Journal of Computational and Applied Mathematics, 239:396–405, 2013.
  25. Smoothing and stationarity enforcement framework for deep learning time-series forecasting. Neural Computing and Applications, 33(20):14021–14035, 2021.
  26. J. L. Hodges and Erich L. Lehmann. Rank methods for combination of independent experiments in analysis of variance. In Selected Works of E. L. Lehmann, pages 403–418. Springer, 2012.
  27. Helmut Finner. On a monotonicity problem in step-down multiple test procedures. Journal of the American Statistical Association, 88(423):920–923, 1993.
  28. Mutual information-based neighbor selection method for causal effect estimation. Neural Computing and Applications, pages 1–15, 2024.
  29. An advanced explainable and interpretable ML-based framework for educational data mining. In International Conference in Methodologies and Intelligent Systems for Technology Enhanced Learning, pages 87–96. Springer, 2023.
