
C-XGBoost: A tree boosting model for causal effect estimation (2404.00751v1)

Published 31 Mar 2024 in stat.ML, cs.LG, and stat.ME

Abstract: Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment on an outcome from the available data. This knowledge is important in many safety-critical domains, where it often needs to be extracted from observational data. In this work, we propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes. The motivation of our approach is to combine the strength of tree-based models at handling tabular data with the notable property of neural network-based causal inference models of learning representations that are useful for estimating the outcome in both the treatment and non-treatment cases. The proposed model also inherits the considerable advantages of XGBoost, such as efficiently handling features with missing values with minimal preprocessing effort, and it is equipped with regularization techniques to avoid overfitting/bias. Furthermore, we propose a new loss function for efficiently training the proposed causal inference model. The experimental analysis, based on the performance profiles of Dolan and Moré as well as on post-hoc and non-parametric statistical tests, provides strong evidence of the effectiveness of the proposed approach.
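The abstract's core idea — fitting models that predict the outcome under both the treatment and non-treatment cases, then deriving CATE and ATE from the two predictions — can be illustrated with a deliberately simplified sketch. This is not the paper's C-XGBoost architecture or loss: in place of XGBoost boosters it uses trivial per-arm mean predictors (a T-learner-style simplification), and the function names (`fit_mean`, `estimate_effects`) are illustrative, not from the paper.

```python
def fit_mean(xs, ys):
    """Trivial stand-in for a per-arm regressor: ignores the covariates
    and predicts the arm's mean outcome. A real implementation would fit
    an XGBoost model here instead."""
    m = sum(ys) / len(ys)
    return lambda x: m

def estimate_effects(X, t, y, fit=fit_mean):
    """Fit one outcome model per treatment arm, then estimate
    CATE(x) = mu1(x) - mu0(x) and ATE = mean of the CATEs."""
    treated = [(x, yi) for x, ti, yi in zip(X, t, y) if ti == 1]
    control = [(x, yi) for x, ti, yi in zip(X, t, y) if ti == 0]
    mu1 = fit([x for x, _ in treated], [yi for _, yi in treated])
    mu0 = fit([x for x, _ in control], [yi for _, yi in control])
    cate = [mu1(x) - mu0(x) for x in X]
    ate = sum(cate) / len(cate)
    return ate, cate
```

Swapping `fit_mean` for a gradient-boosted regressor per arm recovers the familiar two-model baseline; the paper's contribution is to train the two heads jointly with a dedicated loss rather than independently.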
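The evaluation methodology the abstract names, the performance profiles of Dolan and Moré, is also easy to sketch: for each problem, every solver's cost is divided by the best cost any solver achieved on that problem, and a solver's profile rho(tau) is the fraction of problems on which its ratio is at most tau. The function below is a minimal illustration, not code from the paper.

```python
def performance_profile(times, taus):
    """Dolan-More performance profiles.

    times: dict mapping solver name -> list of costs, one per problem
           (all lists aligned and the same length, costs > 0).
    taus:  thresholds at which to evaluate each profile.
    Returns: dict mapping solver name -> [rho(tau) for tau in taus].
    """
    solvers = list(times)
    n = len(next(iter(times.values())))
    # Best cost achieved on each problem by any solver.
    best = [min(times[s][p] for s in solvers) for p in range(n)]
    # Performance ratio of each solver on each problem.
    ratios = {s: [times[s][p] / best[p] for p in range(n)] for s in solvers}
    # rho_s(tau): fraction of problems with ratio <= tau.
    return {s: [sum(r <= tau for r in ratios[s]) / n for tau in taus]
            for s in solvers}
```

Reading the result: rho(1) is the fraction of problems on which a solver was (tied for) best, and rho(tau) for large tau approaches the fraction of problems it solved at all.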

References (29)
  1. Generalization bounds and representation learning for estimation of potential outcomes and causal effects. The Journal of Machine Learning Research, 23(1):7489–7538, 2022.
  2. Donald B Rubin. Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100(469):322–331, 2005.
  3. Judea Pearl. Causality. Cambridge University Press, 2009.
  4. A survey on deep learning: Algorithms, techniques, and applications. ACM Computing Surveys (CSUR), 51(5):1–36, 2018.
  5. A review of machine learning and deep learning applications. In 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), pages 1–6. IEEE, 2018.
  6. Adapting neural networks for the estimation of treatment effects. Advances in Neural Information Processing Systems, 32, 2019.
  7. Estimating individual treatment effect: generalization bounds and algorithms. In International Conference on Machine Learning, pages 3076–3085. PMLR, 2017.
  8. An improved neural network model for treatment effect estimation. In IFIP International Conference on Artificial Intelligence Applications and Innovations, pages 147–158. Springer, 2022a.
  9. Integrating nearest neighbors with neural network models for treatment effect estimation. International Journal of Neural Systems, 2023.
  10. Why do tree-based models still outperform deep learning on typical tabular data? Advances in Neural Information Processing Systems, 35:507–520, 2022.
  11. Supervised feature selection with neuron evolution in sparse neural networks. Transactions on Machine Learning Research, 2023, 2023.
  12. An evaluation framework for comparing causal inference models. In Proceedings of the 12th Hellenic Conference on Artificial Intelligence, pages 1–9, 2022b.
  13. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10):4156–4165, 2019.
  14. Leo Breiman. Random forests. Machine Learning, 45:5–32, 2001.
  15. Limits of estimating heterogeneous treatment effects: Guidelines for practical algorithm design. In International Conference on Machine Learning, pages 129–138. PMLR, 2018.
  16. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228–1242, 2018.
  17. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
  18. Cycle-balanced representation learning for counterfactual inference. In Proceedings of the 2022 SIAM International Conference on Data Mining (SDM), pages 442–450. SIAM, 2022.
  19. Causal effect inference with deep latent-variable models. Advances in Neural Information Processing Systems, 30, 2017.
  20. Benchmarking framework for performance-evaluation of causal inference analysis. arXiv preprint arXiv:1802.05046, 2018.
  21. Infant mortality statistics from the 1996 period linked birth/infant death data set. Monthly Vital Statistics Report, 46(12), 1998.
  22. Learning representations for counterfactual inference. In International Conference on Machine Learning, pages 3020–3029. PMLR, 2016.
  23. Benchmarking optimization software with performance profiles. Mathematical Programming, 91(2):201–213, 2002.
  24. A new class of spectral conjugate gradient methods based on a modified secant equation for unconstrained optimization. Journal of Computational and Applied Mathematics, 239:396–405, 2013.
  25. Smoothing and stationarity enforcement framework for deep learning time-series forecasting. Neural Computing and Applications, 33(20):14021–14035, 2021.
  26. JL Hodges and Erich L Lehmann. Rank methods for combination of independent experiments in analysis of variance. In Selected Works of EL Lehmann, pages 403–418. Springer, 2012.
  27. Helmut Finner. On a monotonicity problem in step-down multiple test procedures. Journal of the American Statistical Association, 88(423):920–923, 1993.
  28. Mutual information-based neighbor selection method for causal effect estimation. Neural Computing and Applications, pages 1–15, 2024.
  29. An advanced explainable and interpretable ML-based framework for educational data mining. In International Conference in Methodologies and Intelligent Systems for Technology Enhanced Learning, pages 87–96. Springer, 2023.
