Effective Bayesian Causal Inference via Structural Marginalisation and Autoregressive Orders (2402.14781v3)
Abstract: The traditional two-stage approach to causal inference first identifies a single causal model (or equivalence class of models), which is then used to answer causal queries. However, this neglects any epistemic model uncertainty. In contrast, Bayesian causal inference incorporates epistemic uncertainty into query estimates via Bayesian marginalisation (posterior averaging) over all causal models. While principled, this marginalisation over entire causal models, i.e., both causal structures (graphs) and mechanisms, poses a tremendous computational challenge. In this work, we address this challenge by decomposing structure marginalisation into the marginalisation over (i) causal orders and (ii) directed acyclic graphs (DAGs) given an order. We can marginalise the latter in closed form by limiting the number of parents per variable and utilising Gaussian processes to model mechanisms. To marginalise over orders, we use a sampling-based approximation, for which we devise a novel autoregressive distribution over causal orders (ARCO). Our method outperforms the state of the art in structure learning on simulated non-linear additive noise benchmarks and yields competitive results on real-world data. Furthermore, we can accurately infer interventional distributions and average causal effects.
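The two-level decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `sample_order`, `parent_sets`, `edge_probs_given_order`, `average_edge_probs`, `logits_fn` and `log_score` are hypothetical names introduced here. The sketch assumes that, given an order, the score of a DAG factorises over variables, so the sum over DAGs reduces to a per-variable sum over bounded parent sets. `log_score` is a placeholder for a per-variable log prior plus log marginal likelihood (the paper uses Gaussian processes for the mechanisms), and `logits_fn` stands in for a learned autoregressive model over orders. Averaging over sampled orders is only meaningful to the extent that the sampler approximates the posterior over orders.

```python
import itertools
import math
import random


def sample_order(logits_fn, n_vars, rng):
    """Sample a permutation one variable at a time (autoregressively).

    logits_fn(prefix, remaining) returns one unnormalised log-weight per
    remaining variable; it is a stand-in for a learned model.
    Returns the order and its log-probability under the sampler.
    """
    remaining = list(range(n_vars))
    order, log_prob = [], 0.0
    while remaining:
        logits = logits_fn(order, remaining)
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]  # stable softmax weights
        idx = rng.choices(range(len(remaining)), weights=weights)[0]
        log_prob += math.log(weights[idx] / sum(weights))
        order.append(remaining.pop(idx))
    return order, log_prob


def parent_sets(order, max_parents):
    """All parent sets per variable that respect the order and the size limit."""
    result = {}
    for pos, child in enumerate(order):
        preds = order[:pos]
        sets = []
        for k in range(min(max_parents, len(preds)) + 1):
            sets.extend(itertools.combinations(preds, k))
        result[child] = sets
    return result


def edge_probs_given_order(order, max_parents, log_score):
    """Exact edge marginals given an order, by summing over parent sets."""
    probs = {}
    for child, sets in parent_sets(order, max_parents).items():
        logs = [log_score(child, s) for s in sets]
        top = max(logs)
        weights = [math.exp(l - top) for l in logs]
        norm = sum(weights)
        for parent in order[:order.index(child)]:
            mass = sum(w for w, s in zip(weights, sets) if parent in s)
            probs[(parent, child)] = mass / norm
    return probs


def average_edge_probs(n_vars, max_parents, n_samples, logits_fn, log_score, rng):
    """Monte Carlo average of the per-order edge marginals over sampled orders."""
    totals = {}
    for _ in range(n_samples):
        order, _ = sample_order(logits_fn, n_vars, rng)
        per_order = edge_probs_given_order(order, max_parents, log_score)
        for edge, p in per_order.items():
            totals[edge] = totals.get(edge, 0.0) + p
    return {edge: total / n_samples for edge, total in totals.items()}


if __name__ == "__main__":
    # Toy placeholders: uniform order logits and a score that penalises parents.
    uniform_logits = lambda prefix, remaining: [0.0] * len(remaining)
    sparse_score = lambda child, parents: -1.0 * len(parents)
    marginals = average_edge_probs(4, 2, 200, uniform_logits, sparse_score,
                                   random.Random(0))
    for edge in sorted(marginals):
        print(edge, round(marginals[edge], 3))
```

Limiting the number of parents is what keeps the inner sum tractable in this sketch: for d variables and at most K parents, each variable has on the order of d^K candidate parent sets rather than 2^(d-1).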