
Reframing the Expected Free Energy: Four Formulations and a Unification (2402.14460v1)

Published 22 Feb 2024 in cs.AI

Abstract: Active inference is a leading theory of perception, learning and decision making, which can be applied to neuroscience, robotics, psychology, and machine learning. Active inference is based on the expected free energy, which is mostly justified by the intuitive plausibility of its formulations, e.g., the risk plus ambiguity and information gain / pragmatic value formulations. This paper seeks to formalize the problem of deriving these formulations from a single root expected free energy definition, i.e., the unification problem. Then, we study two settings, each one having its own root expected free energy definition. In the first setting, no justification for the expected free energy has been proposed to date, but all the formulations can be recovered from it. However, in this setting, the agent cannot have arbitrary prior preferences over observations. Indeed, only a limited class of prior preferences over observations is compatible with the likelihood mapping of the generative model. In the second setting, a justification of the root expected free energy definition is known, but this setting only accounts for two formulations, i.e., the risk over states plus ambiguity and entropy plus expected energy formulations.


Summary

  • The paper introduces a unified framework that reconciles four prevalent expected free energy formulations in active inference.
  • It demonstrates how aligning forecast and target distributions integrates risk, ambiguity, and information gain perspectives.
  • The findings clarify decision-making processes and establish a basis for advancing theoretical and empirical research in active inference.

Reframing the Expected Free Energy: Four Formulations and a Unification

The paper "Reframing the Expected Free Energy: Four Formulations and a Unification" examines a core issue within active inference: the varying formulations of expected free energy (EFE) and the need for a unified theoretical grounding across these definitions. Active inference, a computational framework that elucidates decision-making and learning mechanisms under uncertainty, relies heavily on EFE as a means for agents to make probabilistic predictions about the consequences of their actions. This paper formalizes the difficulty of defining the EFE and explores how its various conceptualizations might be unified.

Overview of Active Inference and Expected Free Energy

Active inference operates on the premise that agents minimize EFE to maintain preferred states or observations. This construct, though central, lacks a definitive formulation akin to that of variational free energy (VFE). Traditionally, EFE has been approached through perspectives such as risk plus ambiguity and information gain/pragmatic value. These formulations, however, often emerge from heuristic motivations rather than from a foundational definition. The paper addresses this gap by posing a unification problem and examining two settings, each with its own root EFE definition: the first lacks a first-principles justification but recovers all four formulations, at the cost of restricting the agent's prior preferences over observations to those compatible with the likelihood mapping; the second is justified from first principles but accounts for only two of the formulations.

The Unification Problem

Fundamentally, the unification problem targets the derivation of the EFE formulations from a single root definition. The authors frame the problem as a 4-tuple consisting of a forecast distribution, a target distribution, a root EFE definition, and a set of formulations to recover, namely: risk over states plus ambiguity, risk over observations plus ambiguity, information gain plus pragmatic value, and entropy plus expected energy. The aim is to determine whether one root definition can yield all four formulations through suitable choices of the forecast and target distributions.
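In standard discrete-state active-inference notation (the symbols below are my gloss, not necessarily the paper's exact notation), the four formulations can be sketched as follows, with forecast $Q$, likelihood $P(o\mid s)$, prior preferences $C$, and target joint $\tilde{P}(o,s)$:

```latex
\begin{align*}
% risk over states plus ambiguity
G &= D_{\mathrm{KL}}\!\left[Q(s)\,\|\,C(s)\right]
   + \mathbb{E}_{Q(s)}\!\left[\mathrm{H}\!\left[P(o\mid s)\right]\right] \\
% risk over observations plus ambiguity
G &= D_{\mathrm{KL}}\!\left[Q(o)\,\|\,C(o)\right]
   + \mathbb{E}_{Q(s)}\!\left[\mathrm{H}\!\left[P(o\mid s)\right]\right] \\
% information gain plus pragmatic value (both negated)
G &= -\,\mathbb{E}_{Q(o)}\!\left[D_{\mathrm{KL}}\!\left[Q(s\mid o)\,\|\,Q(s)\right]\right]
   - \mathbb{E}_{Q(o)}\!\left[\ln C(o)\right] \\
% entropy plus expected energy
G &= -\,\mathrm{H}\!\left[Q(s)\right]
   - \mathbb{E}_{Q(o,s)}\!\left[\ln \tilde{P}(o,s)\right]
\end{align*}
```

The unification problem then asks which choices of forecast $Q$ and target $\tilde{P}$ make these four expressions instances of one root definition.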

Implications and Findings

The paper's central claims specify which assumptions, such as aligning components of the forecast and target distributions, suffice to unify the formulations. Notably, when the forecast and target share the same likelihood mapping, the risk-over-observations plus ambiguity formulation coincides with the information gain plus pragmatic value formulation. These conditions in turn constrain the prior preferences over observations that remain compatible with the likelihood mapping of the generative model.
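A quick numerical check illustrates the kind of equivalence at stake (a sketch in my own notation, not code from the paper): when a single likelihood `A` is used both to predict observations and to form the Bayesian posterior, the risk-over-observations plus ambiguity formulation agrees exactly with the negated information gain plus pragmatic value formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs, n_states = 4, 3
A = rng.random((n_obs, n_states)); A /= A.sum(axis=0)  # likelihood P(o|s), columns normalized
Qs = rng.random(n_states); Qs /= Qs.sum()              # forecast over states Q(s)
C = rng.random(n_obs); C /= C.sum()                    # prior preferences over observations C(o)

Qo = A @ Qs                                            # predicted observations Q(o)

# Formulation 1: risk over observations plus ambiguity
risk = np.sum(Qo * (np.log(Qo) - np.log(C)))           # KL[Q(o) || C(o)]
ambiguity = -np.sum(Qs * np.sum(A * np.log(A), axis=0))  # E_Q(s)[ H[P(o|s)] ]
G_risk_amb = risk + ambiguity

# Formulation 2: negated information gain plus pragmatic value
Q_s_given_o = (A * Qs) / Qo[:, None]                   # Bayes posterior Q(s|o)
info_gain = np.sum(Qo * np.sum(
    Q_s_given_o * (np.log(Q_s_given_o) - np.log(Qs)), axis=1))  # E_Q(o)[ KL[Q(s|o) || Q(s)] ]
pragmatic = np.sum(Qo * np.log(C))                     # E_Q(o)[ ln C(o) ]
G_ig_pv = -info_gain - pragmatic

assert np.isclose(G_risk_amb, G_ig_pv)
```

Breaking either assumption, e.g. computing the posterior with a likelihood different from the one used to form `Qo`, breaks the equality, which is the sort of alignment condition the paper formalizes.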

Discussion of Limitations and Future Directions

A notable limitation identified by the authors stems from the assumptions aligning the forecast and target distributions, which restrict the preferences an agent can hold. This constraint requires careful consideration of which prior preferences over observations are admissible given the generative model's likelihood mapping.
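One simple way to see the flavor of this restriction (an illustration under my own assumptions, not the paper's exact construction): a preference distribution over observations that is compatible with the likelihood mapping can be obtained as the push-forward of state preferences through the likelihood, rather than being chosen freely.

```python
import numpy as np

A = np.array([[0.9, 0.1],
              [0.1, 0.9]])        # likelihood P(o|s), columns normalized
C_states = np.array([0.8, 0.2])   # hypothetical prior preferences over states

# A likelihood-compatible preference over observations is the push-forward
# of the state preferences through the likelihood mapping:
C_obs = A @ C_states
print(C_obs)   # [0.74 0.26]
```

An arbitrary `C_obs` that is not expressible this way would fall outside the admissible class, which is why the agent in the first setting cannot hold arbitrary observation preferences.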

Despite proposing a promising unification approach, the authors concede the absence of explicit justification from first principles for certain EFE formulations when defined as risk over observations plus ambiguity. This highlights existing theoretical gaps and calls for further inquiry to substantiate all EFE formulations on a firm theoretical basis.

Conclusion

This paper marks a significant advance in reconciling the various formulations of expected free energy within the broader ambit of active inference. Beyond presenting a theoretical framework, it lays the groundwork for further exploration of EFE's foundational underpinnings, especially in high-dimensional settings such as deep active inference. Future research directions include refining the derivations under alternative constraints and pursuing empirical validation of the theoretical conjectures posited here.

By exploring the intricacies of EFE formulation and proposing a unification approach, this work seeks not only to streamline the conceptual landscape of active inference but also to inspire novel methodologies and applications in complex decision-making systems.