A Theory of Interpretable Approximations (2406.10529v1)

Published 15 Jun 2024 in cs.LG, cs.AI, and stat.ML

Abstract: Can a deep neural network be approximated by a small decision tree based on simple features? This question and its variants are behind the growing demand for machine learning models that are interpretable by humans. In this work we study such questions by introducing interpretable approximations, a notion that captures the idea of approximating a target concept $c$ by a small aggregation of concepts from some base class $\mathcal{H}$. In particular, we consider the approximation of a binary concept $c$ by decision trees based on a simple class $\mathcal{H}$ (e.g., of bounded VC dimension), and use the tree depth as a measure of complexity. Our primary contribution is the following remarkable trichotomy. For any given pair of $\mathcal{H}$ and $c$, exactly one of these cases holds: (i) $c$ cannot be approximated by $\mathcal{H}$ with arbitrary accuracy; (ii) $c$ can be approximated by $\mathcal{H}$ with arbitrary accuracy, but there exists no universal rate that bounds the complexity of the approximations as a function of the accuracy; or (iii) there exists a constant $\kappa$ that depends only on $\mathcal{H}$ and $c$ such that, for any data distribution and any desired accuracy level, $c$ can be approximated by $\mathcal{H}$ with a complexity not exceeding $\kappa$. This taxonomy stands in stark contrast to the landscape of supervised classification, which offers a complex array of distribution-free and universally learnable scenarios. We show that, in the case of interpretable approximations, even a slightly nontrivial a-priori guarantee on the complexity of approximations implies approximations with constant (distribution-free and accuracy-free) complexity. We extend our trichotomy to classes $\mathcal{H}$ of unbounded VC dimension and give characterizations of interpretability based on the algebra generated by $\mathcal{H}$.

Summary

The paper presents a trichotomy that classifies the approximability of binary concepts using decision trees into non-approximability, approximability without a universal rate, and uniform interpretability.
It establishes that when the base class has bounded VC dimension, a concept that is interpretable at all is uniformly interpretable with constant complexity, and it relates this result to standard notions such as PAC learnability.
The framework guides the design of transparent models in high-stakes domains and suggests potential algorithmic implementations leveraging boosting methods.

A Theory of Interpretable Approximations

The paper presents a theoretical framework for interpretable approximations in machine learning, motivated by the question of whether a deep neural network can be approximated by a small decision tree built on simple features. The question reflects a growing demand for interpretable models, especially in high-stakes domains like healthcare and law enforcement.

Core Contributions

The main contribution of the paper is a trichotomy for approximating a binary concept $c$ by decision trees built on a base class $\mathcal{H}$, with tree depth as the measure of complexity. For any given pair of $c$ and $\mathcal{H}$, exactly one of three scenarios holds:

Non-Approximability: The concept $c$ cannot be approximated with arbitrary accuracy using $\mathcal{H}$.
Approximability without Universal Rate: The concept $c$ can be approximated with arbitrary accuracy using $\mathcal{H}$, but no universal rate bounds the complexity of the approximations as a function of the accuracy.
Uniform Interpretability: There exists a constant $\kappa$, depending only on $\mathcal{H}$ and $c$, such that $c$ can be approximated by $\mathcal{H}$ with complexity not exceeding $\kappa$, for any data distribution and any desired accuracy level.
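
One way to write these cases symbolically is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $\mathcal{T}_d(\mathcal{H})$ denotes the decision trees of depth at most $d$ whose internal nodes test concepts from $\mathcal{H}$, $P$ ranges over data distributions, and $L_P(T) = \Pr_{x \sim P}[T(x) \neq c(x)]$ is the error of a tree $T$ under $P$.

```latex
\begin{align*}
\text{Approximable:}\quad
  & \forall P,\ \forall \varepsilon > 0,\ \exists d \in \mathbb{N},\ \exists T \in \mathcal{T}_d(\mathcal{H}) :\ L_P(T) \le \varepsilon \\
\text{Non-approximability:}\quad
  & \exists P,\ \exists \varepsilon > 0,\ \forall d \in \mathbb{N},\ \forall T \in \mathcal{T}_d(\mathcal{H}) :\ L_P(T) > \varepsilon \\
\text{Uniform interpretability:}\quad
  & \exists \kappa \in \mathbb{N},\ \forall P,\ \forall \varepsilon > 0,\ \exists T \in \mathcal{T}_\kappa(\mathcal{H}) :\ L_P(T) \le \varepsilon
\end{align*}
```

In this sketch, non-approximability is the exact negation of the first condition, and uniform interpretability strengthens it by moving the depth bound in front of the quantifiers over $P$ and $\varepsilon$. The middle case of the trichotomy is the one where the first condition holds but the required depth cannot be bounded by any universal function of the accuracy; it is left in words here because this summary does not state the precise definition of a universal rate.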

The authors extend this trichotomy to classes $\mathcal{H}$ of unbounded VC dimension and characterize interpretability in terms of the algebra generated by $\mathcal{H}$. The range of possible behaviors turns out to be surprisingly narrow: even a slightly nontrivial a priori guarantee on the complexity of approximations implies approximations with constant complexity, independent of both the distribution and the accuracy.

Theoretical Implications

From a theoretical standpoint, the trichotomy sharply delineates the possible behaviors of interpretable approximation. When $\mathcal{H}$ is a VC class, any target concept that is interpretable can be approximated with complexity bounded by a constant, which supports the practicality of interpretable models.

A key insight is the collapse of the interpretability hierarchy: if a concept is interpretable at all, it is uniformly interpretable, with constant complexity when $\mathcal{H}$ is a VC class and logarithmic complexity otherwise. The result connects to standard learning-theoretic notions such as PAC learnability and invites further study of the relationship between approximation and interpretability.
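
To make uniform interpretability concrete, here is a toy sketch that does not come from the paper. It assumes the base class is the set of thresholds on the real line (VC dimension 1) and the target is a hypothetical interval $[A, B)$. A depth-2 tree over two thresholds equals the target pointwise, so its error is zero under every distribution, which matches the uniform case with $\kappa = 2$. The names `threshold`, `depth2_tree`, and `best_depth1_error` are illustrative.

```python
import random

A, B = 0.3, 0.7  # hypothetical interval endpoints, chosen for illustration


def threshold(t):
    """Base concept h_t(x) = 1 if x >= t else 0."""
    return lambda x: int(x >= t)


def target(x):
    """Target concept c: indicator of the interval [A, B)."""
    return int(A <= x < B)


def depth2_tree(x):
    """Depth-2 tree whose two internal nodes query h_A and h_B."""
    if threshold(A)(x) == 0:   # x < A
        return 0
    if threshold(B)(x) == 1:   # x >= B
        return 0
    return 1                   # A <= x < B


def error(tree, sample):
    """Fraction of sample points on which the tree disagrees with the target."""
    return sum(tree(x) != target(x) for x in sample) / len(sample)


def best_depth1_error(sample, grid):
    """Smallest error of any depth-1 tree: one threshold query, two constant leaves."""
    best = 1.0
    for t in grid:
        h = threshold(t)
        for left in (0, 1):
            for right in (0, 1):
                tree = lambda x, h=h, l=left, r=right: r if h(x) else l
                best = min(best, error(tree, sample))
    return best


rng = random.Random(0)
# Three different data distributions, each represented by a sample.
samples = {
    "uniform": [rng.random() for _ in range(2000)],
    "gaussian": [rng.gauss(0.5, 0.2) for _ in range(2000)],
    "boundary": [rng.choice([0.29, 0.3, 0.69, 0.7]) for _ in range(2000)],
}
grid = [i / 20 for i in range(21)]  # candidate thresholds for depth-1 trees

depth2_errors = {name: error(depth2_tree, s) for name, s in samples.items()}
depth1_errors = {name: best_depth1_error(s, grid) for name, s in samples.items()}
print(depth2_errors)
print(depth1_errors)
```

The depth-2 errors are zero on all three samples because the tree and the target agree at every point, whereas every depth-1 tree over the same thresholds leaves a visible error. This only illustrates the definition on an easy case; it says nothing about how the paper proves the trichotomy.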

Practical Implications and Future Directions

The development of this theory presents multiple implications for the practical deployment of machine learning models. By characterizing conditions for approximability and interpretability, the findings guide the design of models that balance complexity with transparency, crucial in areas demanding accountable algorithmic decision-making.

While the paper is primarily theoretical, its connections to known algorithmic frameworks such as boosting suggest possible routes from the theoretical guarantees to practical algorithms. Future work might focus on implementations that exploit the theoretical bounds to construct interpretable approximations efficiently.
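
As a rough illustration of how boosting aggregates simple base concepts, the sketch below runs textbook AdaBoost over threshold stumps on a hypothetical interval target. It is a generic example and not the paper's construction: the output is a weighted vote rather than a decision tree, and the names `stump`, `adaboost`, and `vote`, the interval endpoints, and the candidate thresholds are all assumptions made for the example.

```python
import math
import random

A, B = 0.3, 0.7  # hypothetical interval endpoints


def stump(t, s):
    """Base concept with values in {-1, +1}: s if x >= t, else -s."""
    return lambda x: s if x >= t else -s


def label(x):
    """Target concept in {-1, +1}: +1 exactly on [A, B)."""
    return 1 if A <= x < B else -1


def adaboost(xs, ys, candidates, rounds):
    """Generic AdaBoost; returns a list of (weight, base concept) pairs."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the base concept with the smallest weighted error.
        best_h, best_err = None, float("inf")
        for h in candidates:
            err = sum(wi for wi, x, y in zip(w, xs, ys) if h(x) != y)
            if err < best_err:
                best_h, best_err = h, err
        if best_err >= 0.5:              # no base concept beats random guessing
            break
        best_err = max(best_err, 1e-12)  # guard against division by zero
        alpha = 0.5 * math.log((1 - best_err) / best_err)
        ensemble.append((alpha, best_h))
        # Increase the weight of misclassified points, then renormalize.
        w = [wi * math.exp(-alpha * y * best_h(x)) for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def vote(ensemble, x):
    """Weighted majority vote of the selected base concepts."""
    return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1


rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [label(x) for x in xs]
candidates = [stump(t, s) for t in (0.0, 0.1, A, 0.5, B, 0.9) for s in (1, -1)]
ensemble = adaboost(xs, ys, candidates, rounds=200)
train_error = sum(vote(ensemble, x) != y for x, y in zip(xs, ys)) / len(xs)
print(len(ensemble), train_error)
```

On this toy problem the target is itself a vote of three stumps (one at `A`, one at `B`, and a constant), so under any reweighting of the sample some stump has weighted error at most 1/3 and the training error of the vote is driven to zero. Turning such a vote into a shallow tree with depth guarantees is exactly the kind of step the summary leaves to future work.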

The paper also leaves open questions about complexity rates for non-VC classes and about algorithms that attain them. Future studies could establish more refined complexity relationships or explore complexity measures other than tree depth, such as circuit size.

Conclusion

The framework established by the authors provides a foundation for understanding the limits and capabilities of interpretability in machine learning models. By defining a clear taxonomy of behaviors, it adds to the theoretical tools available to researchers working on model transparency. Although the results are theoretical, they clarify when model accuracy and interpretability can be reconciled, which matters most in sensitive or regulated domains.
