Emergent Mind

A fast score-based search algorithm for maximal ancestral graphs using entropy

(arXiv:2402.04777)
Published Feb 7, 2024 in stat.ML, cs.LG, math.ST, and stat.TH

Abstract

Maximal ancestral graphs (MAGs) are a class of graphical models that extend the well-known directed acyclic graphs to settings with latent confounders. Most score-based approaches to learning an unknown MAG from empirical data rely on the BIC score, which suffers from instability and heavy computation. We propose to use the framework of imsets (Studený, 2006) to score MAGs using empirical entropy estimation and the newly proposed refined Markov property (Hu, 2023). Our graphical search procedure is similar to that of Claassen (2022) but improved by our theoretical results. We show that our search algorithm is polynomial in the number of nodes when the degree, maximal head size, and number of discriminating paths are bounded. In simulated experiments, our algorithm shows superior performance compared to other state-of-the-art MAG learning algorithms.

Figure: Comparison of algorithm accuracy using imset-based scoring techniques.

Overview

  • The paper introduces a novel entropy-based algorithm for learning maximal ancestral graphs (MAGs) that is faster and more stable than previous methods.

  • It proposes using empirical entropy instead of the Bayesian information criterion (BIC) for scoring, simplifying computations for non-DAG models.

  • Empirical tests show that the new algorithm outperforms state-of-the-art MAG learning algorithms in terms of accuracy and computational efficiency.

  • The paper suggests potential for future optimization and real-world applications in various fields requiring causal inference.

Introduction

Maximal ancestral graphs (MAGs) play a critical role in representing causal structures, particularly in scenarios involving hidden variables. Unlike directed acyclic graphs (DAGs), MAGs can represent the conditional independence structure over the observed variables that remains after marginalizing out unobserved confounders. Recent developments have expanded upon constraint-based and hybrid learning methods, relying on enhancements to the classic PC and FCI algorithms. Score-based alternatives, despite their accuracy, struggle with computational intensity and stability issues. This paper introduces a score-based search algorithm that scores MAGs via entropy estimation within the imset (integer-valued multiset) framework.

Theoretical Contributions

Central to the proposed algorithm is the adoption of empirical entropy to assess model fit, replacing the Bayesian information criterion (BIC) used in previous score-based methods. The paper argues that BIC can be suboptimal due to its complex computations, especially when the model is not a DAG. The introduced approach, built on the refined Markov property, aims to simplify score computation and factorize distributions in MAG models more effectively. The authors further establish that the algorithm runs in polynomial time under sparsity conditions on the graph, namely bounded node degree, maximal head size, and number of discriminating paths.
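To make the entropy-based scoring idea concrete, here is a minimal sketch (the function names and the demo imset are illustrative, not the paper's actual implementation). An imset assigns integer weights to subsets of variables, and pairing it with the empirical entropy function gives a score of the form ⟨u, h⟩ = Σ_S u(S)·H(S). For the elementary imset encoding a conditional independence A ⊥ B | C, this inner product equals −I(A; B | C), which is ≤ 0 and equals zero (up to sampling noise) exactly when the independence holds in the data:

```python
import math
from collections import Counter

def empirical_entropy(samples, cols):
    """Plug-in entropy (in nats) of the joint distribution of the
    variables indexed by `cols`, estimated from `samples`, a list of
    tuples of discrete values.  H(empty set) = 0 by convention."""
    if not cols:
        return 0.0
    counts = Counter(tuple(row[i] for i in cols) for row in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def imset_entropy_score(samples, imset):
    """Compute <u, h> = sum over subsets S of u(S) * H(S), where
    `imset` maps frozensets of column indices to integer weights."""
    return sum(w * empirical_entropy(samples, sorted(S))
               for S, w in imset.items())

# Elementary imset for X0 _||_ X1 (empty conditioning set):
# u = d_{0,1} + d_{} - d_{0} - d_{1}, so <u, h> = -I(X0; X1).
u = {frozenset({0, 1}): 1, frozenset(): 1,
     frozenset({0}): -1, frozenset({1}): -1}

# Two exactly independent fair bits: score ~ 0 (independence holds).
print(imset_entropy_score([(0, 0), (0, 1), (1, 0), (1, 1)], u))

# Perfectly correlated bits: score = -I(X0; X1) = -log 2 < 0.
print(imset_entropy_score([(0, 0), (1, 1)], u))
```

Scoring a candidate MAG then amounts to evaluating such signed sums of subset entropies against the graph's imset, rather than fitting parameters as BIC requires; this is what makes the entropy score cheap to update during a greedy search.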

Empirical Findings

Simulations form a significant component of the paper's empirical validation, where the algorithm demonstrates superior performance over state-of-the-art MAG learning algorithms. The data from these simulations substantiate the claim that using entropy-based scoring enhances stability and the handling of complex structures with numerous nodes. Results indicate that the proposed method consistently outperforms existing techniques in both accuracy and computational efficiency.

Conclusion and Outlook

In conclusion, the paper's entropy-based, score-driven approach presents a promising direction for causal discovery research involving MAGs. Future work can explore further optimizations, particularly in pruning the search space and speeding up entropy computation. The validation of the algorithm in simulated environments paves the way for its application to real-world datasets, with potential impact in fields such as epidemiology, economics, and the social sciences, where causal inference is paramount.
