
Reframing the Expected Free Energy: Four Formulations and a Unification (2402.14460v1)

Published 22 Feb 2024 in cs.AI

Abstract: Active inference is a leading theory of perception, learning and decision making, which can be applied to neuroscience, robotics, psychology, and machine learning. Active inference is based on the expected free energy, which is mostly justified by the intuitive plausibility of its formulations, e.g., the risk plus ambiguity and information gain / pragmatic value formulations. This paper seeks to formalize the problem of deriving these formulations from a single root expected free energy definition, i.e., the unification problem. Then, we study two settings, each one having its own root expected free energy definition. In the first setting, no justification for the expected free energy has been proposed to date, but all the formulations can be recovered from it. However, in this setting, the agent cannot have arbitrary prior preferences over observations. Indeed, only a limited class of prior preferences over observations is compatible with the likelihood mapping of the generative model. In the second setting, a justification of the root expected free energy definition is known, but this setting only accounts for two formulations, i.e., the risk over states plus ambiguity and entropy plus expected energy formulations.


Summary

  • The paper introduces a unified framework that reconciles four prevalent expected free energy formulations in active inference.
  • It demonstrates how aligning forecast and target distributions integrates risk, ambiguity, and information gain perspectives.
  • The findings clarify decision-making processes and establish a basis for advancing theoretical and empirical research in active inference.

Reframing the Expected Free Energy: Four Formulations and a Unification

The paper "Reframing the Expected Free Energy: Four Formulations and a Unification" examines a core issue within active inference: the varying formulations of expected free energy (EFE) and the need for a unified theoretical grounding across these definitions. Active inference, a computational framework that elucidates decision-making and learning mechanisms under uncertainty, relies heavily on EFE as a means for agents to make probabilistic predictions about the consequences of their actions. This paper formalizes the difficulty of defining the EFE and explores how its various conceptualizations might be unified.

Overview of Active Inference and Expected Free Energy

Active inference operates on the premise that agents minimize EFE to maintain preferred states or observations. This construct, though central, lacks a definitive formulation akin to that of variational free energy (VFE). Traditionally, EFE has been approached through perspectives such as risk plus ambiguity and information gain/pragmatic value. These formulations, however, often emerge from heuristic motivations rather than from a foundational definition. The paper addresses this gap by posing a unification problem and examining two settings, each with its own root EFE definition: the first lacks a first-principles justification but recovers all four formulations, at the cost of restricting the agent's prior preferences over observations to those compatible with the likelihood mapping; the second is justified from first principles but accounts for only two of the formulations.

The Unification Problem

Fundamentally, the unification problem targets the derivation of the EFE formulations from a single root definition. The authors frame the problem as a 4-tuple consisting of a forecast distribution, a target distribution, a root EFE definition, and a set of formulations to recover, namely: risk over states plus ambiguity, risk over observations plus ambiguity, information gain plus pragmatic value, and entropy plus expected energy. The aim is to determine whether one root definition can yield all four formulations through suitable choices of the forecast and target distributions.
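In standard discrete-state active-inference notation (the symbols below are my gloss, not necessarily the paper's exact notation), the four formulations can be sketched as follows, with forecast $Q$, likelihood $P(o\mid s)$, prior preferences $C$, and target joint $\tilde{P}(o,s)$:

```latex
\begin{align*}
% risk over states plus ambiguity
G &= D_{\mathrm{KL}}\!\left[Q(s)\,\|\,C(s)\right]
   + \mathbb{E}_{Q(s)}\!\left[\mathrm{H}\!\left[P(o\mid s)\right]\right] \\
% risk over observations plus ambiguity
G &= D_{\mathrm{KL}}\!\left[Q(o)\,\|\,C(o)\right]
   + \mathbb{E}_{Q(s)}\!\left[\mathrm{H}\!\left[P(o\mid s)\right]\right] \\
% information gain plus pragmatic value (both negated)
G &= -\,\mathbb{E}_{Q(o)}\!\left[D_{\mathrm{KL}}\!\left[Q(s\mid o)\,\|\,Q(s)\right]\right]
   - \mathbb{E}_{Q(o)}\!\left[\ln C(o)\right] \\
% entropy plus expected energy
G &= -\,\mathrm{H}\!\left[Q(s)\right]
   - \mathbb{E}_{Q(o,s)}\!\left[\ln \tilde{P}(o,s)\right]
\end{align*}
```

The unification problem then asks which choices of forecast $Q$ and target $\tilde{P}$ make these four expressions instances of one root definition.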

Implications and Findings

The paper's central claims specify which assumptions, such as aligning components of the forecast and target distributions, suffice to unify the formulations. Notably, when the forecast and target share the same likelihood mapping, the risk-over-observations plus ambiguity formulation coincides with the information gain plus pragmatic value formulation. These conditions in turn constrain the prior preferences over observations that remain compatible with the likelihood mapping of the generative model.
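A quick numerical check illustrates the kind of equivalence at stake (a sketch in my own notation, not code from the paper): when a single likelihood `A` is used both to predict observations and to form the Bayesian posterior, the risk-over-observations plus ambiguity formulation agrees exactly with the negated information gain plus pragmatic value formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs, n_states = 4, 3
A = rng.random((n_obs, n_states)); A /= A.sum(axis=0)  # likelihood P(o|s), columns normalized
Qs = rng.random(n_states); Qs /= Qs.sum()              # forecast over states Q(s)
C = rng.random(n_obs); C /= C.sum()                    # prior preferences over observations C(o)

Qo = A @ Qs                                            # predicted observations Q(o)

# Formulation 1: risk over observations plus ambiguity
risk = np.sum(Qo * (np.log(Qo) - np.log(C)))           # KL[Q(o) || C(o)]
ambiguity = -np.sum(Qs * np.sum(A * np.log(A), axis=0))  # E_Q(s)[ H[P(o|s)] ]
G_risk_amb = risk + ambiguity

# Formulation 2: negated information gain plus pragmatic value
Q_s_given_o = (A * Qs) / Qo[:, None]                   # Bayes posterior Q(s|o)
info_gain = np.sum(Qo * np.sum(
    Q_s_given_o * (np.log(Q_s_given_o) - np.log(Qs)), axis=1))  # E_Q(o)[ KL[Q(s|o) || Q(s)] ]
pragmatic = np.sum(Qo * np.log(C))                     # E_Q(o)[ ln C(o) ]
G_ig_pv = -info_gain - pragmatic

assert np.isclose(G_risk_amb, G_ig_pv)
```

Breaking either assumption, e.g. computing the posterior with a likelihood different from the one used to form `Qo`, breaks the equality, which is the sort of alignment condition the paper formalizes.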

Discussion of Limitations and Future Directions

A notable limitation identified by the authors stems from the assumptions aligning the forecast and target distributions, which restrict the preferences an agent can hold. This constraint requires careful consideration of which prior preferences over observations are admissible given the generative model's likelihood mapping.
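One simple way to see the flavor of this restriction (an illustration under my own assumptions, not the paper's exact construction): a preference distribution over observations that is compatible with the likelihood mapping can be obtained as the push-forward of state preferences through the likelihood, rather than being chosen freely.

```python
import numpy as np

A = np.array([[0.9, 0.1],
              [0.1, 0.9]])        # likelihood P(o|s), columns normalized
C_states = np.array([0.8, 0.2])   # hypothetical prior preferences over states

# A likelihood-compatible preference over observations is the push-forward
# of the state preferences through the likelihood mapping:
C_obs = A @ C_states
print(C_obs)   # [0.74 0.26]
```

An arbitrary `C_obs` that is not expressible this way would fall outside the admissible class, which is why the agent in the first setting cannot hold arbitrary observation preferences.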

Despite proposing a promising unification approach, the authors concede the absence of explicit justification from first principles for certain EFE formulations when defined as risk over observations plus ambiguity. This highlights existing theoretical gaps and calls for further inquiry to substantiate all EFE formulations on a firm theoretical basis.

Conclusion

This paper marks a significant advance in reconciling the various formulations of expected free energy within the broader ambit of active inference. Beyond presenting a theoretical framework, it lays the groundwork for further exploration of EFE's foundational underpinnings, especially in high-dimensional settings such as deep active inference. Future research directions include refining the derivations under alternative constraints and pursuing empirical validation of the theoretical conjectures posited here.

By exploring the intricacies of EFE formulation and proposing a unification approach, this work seeks not only to streamline the conceptual landscape of active inference but also to inspire novel methodologies and applications in complex decision-making systems.