- The paper introduces neural autoregressive flows that replace affine transformations with neural networks to capture complex, multimodal distributions.
- It proves that NAF is a universal approximator of continuous probability distributions and introduces two concrete architectures, DSF and DDSF, which achieve state-of-the-art performance on standard benchmarks.
- The method enhances density estimation and variational inference, paving the way for improved anomaly detection, data synthesis, and Bayesian inference.
Overview of "Neural Autoregressive Flows"
The paper "Neural Autoregressive Flows" presents a significant advancement in the area of normalizing flows, a tool integral to many applications in machine learning, including deep generative models and variational inference. Traditional methods such as Masked Autoregressive Flows (MAF) and Inverse Autoregressive Flows (IAF) transform each variable with an affine function whose parameters are conditioned on the preceding variables, reshaping a simple base density into the target distribution. The primary innovation of this paper is the introduction of Neural Autoregressive Flows (NAF), which replace these affine transformations with monotonic non-linear transformations parameterized by neural networks. This shift enhances the expressivity of the model, allowing it to capture more complex, multimodal distributions.
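To make the baseline concrete, the affine step used by IAF/MAF can be sketched as follows. This is a minimal illustration, not the paper's code: in a real model the parameters `mu` and `log_sigma` are produced by an autoregressive conditioner network from the preceding dimensions, whereas here they are fixed arrays.

```python
import numpy as np

def affine_autoregressive(x, mu, log_sigma):
    """One affine autoregressive step (IAF/MAF style):
        y_i = mu_i + exp(log_sigma_i) * x_i,
    where (mu_i, log_sigma_i) depend only on x_{<i} (held fixed here).
    The log-determinant of the Jacobian is simply sum(log_sigma)."""
    y = mu + np.exp(log_sigma) * x
    log_det = np.sum(log_sigma)
    return y, log_det

def affine_inverse(y, mu, log_sigma):
    """The affine map is invertible in closed form."""
    return (y - mu) * np.exp(-log_sigma)
```

Because each dimension is transformed by a single scale-and-shift, the conditional density over each dimension stays unimodal; this is exactly the limitation NAF removes.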
Technical Contributions
The paper's primary contribution lies in extending the flexibility and capacity of normalizing flows through the use of neural networks. Unlike traditional affine transformations, the proposed neural transformations can flexibly adjust to complex data distributions, thanks to their non-linear nature and neural network-backed parameterization. The authors prove that NAF is a universal approximator for continuous probability distributions, a guarantee that affine autoregressive flows do not offer.
Two neural architectures are introduced for these transformations: Deep Sigmoidal Flows (DSF) and Deep Dense Sigmoidal Flows (DDSF). These architectures, particularly DSF, are designed to efficiently model the inverse CDFs of complex distributions, making them suitable for challenging density estimation tasks.
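A single DSF transformer applies an inverse sigmoid to a convex combination of sigmoids, which resembles the inverse CDF of a mixture of logistic distributions. The sketch below follows the paper's DSF formula, y = σ⁻¹(wᵀσ(a·x + b)); the raw-parameter names and the specific reparameterizations (softplus for positivity, softmax for the simplex constraint) are illustrative choices, and in the full model these parameters would come from an autoregressive conditioner.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Inverse sigmoid."""
    return np.log(p) - np.log1p(-p)

def dsf(x, a_raw, b, w_raw):
    """One Deep Sigmoidal Flow transformer:
        y = sigmoid^{-1}( w^T sigmoid(a * x + b) ).
    Strict monotonicity in x (hence invertibility) is enforced by
    constraining the slopes a > 0 and the weights w to the simplex."""
    a = np.log1p(np.exp(a_raw))                 # softplus -> positive slopes
    w = np.exp(w_raw) / np.sum(np.exp(w_raw))   # softmax -> simplex weights
    return logit(np.dot(w, sigmoid(a * x + b)))
```

Each sigmoid unit contributes one "step" to the transformation, so stacking enough units lets the flow carve a unimodal base density into a multimodal one.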
Experimental Results
Extensive experiments reveal that NAF achieves state-of-the-art performance across various benchmarks. In density estimation experiments on standard UCI datasets and BSDS300, NAF outperforms both IAF and MAF, showcasing its enhanced capability for capturing multimodality in distributions. For variational inference tasks, employing NAF improves the evidence lower bound (ELBO) and test log-likelihood on the binarized MNIST dataset compared to baseline models. These results underscore the practical viability of NAF in real-world density estimation and variational inference workflows.
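The density-estimation objective in these experiments rests on the change-of-variables formula: a monotonic flow f maps data x to a base variable z = f(x), and log p(x) = log p_base(f(x)) + log |f'(x)|. The sketch below is a one-dimensional illustration with a standard-normal base; the central-difference derivative is for demonstration only (in practice the Jacobian term is computed analytically), and the function name is ours, not the paper's.

```python
import numpy as np

def log_density_via_flow(x, f, eps=1e-5):
    """Log-density of x under a 1-D monotonic flow f with a
    standard-normal base distribution:
        log p(x) = log N(f(x); 0, 1) + log |f'(x)|.
    f'(x) is approximated by central differences for illustration."""
    z = f(x)
    dfdx = (f(x + eps) - f(x - eps)) / (2 * eps)
    log_base = -0.5 * (z ** 2 + np.log(2 * np.pi))
    return log_base + np.log(np.abs(dfdx))
```

Training maximizes this quantity over the data; a more expressive f (e.g., a DSF transformer instead of an affine map) can push the log-likelihood higher on multimodal data.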
Implications and Future Directions
The implications of these contributions are notable for both theoretical and practical applications. Theoretically, the notion of using neural networks as universal approximators within the context of normalizing flows suggests new research directions in the design of flows that balance tractability and flexibility. Practically, the enhanced ability of NAF to model complex distributions opens up avenues for improved performance in applications requiring precise density estimates, such as anomaly detection, data synthesis, and Bayesian inference.
Looking ahead, the techniques and architectures introduced in this paper lay the groundwork for further exploration into more efficient and expressive flow models. Future work may integrate these ideas with other state-of-the-art neural architectures or explore novel applications where such expressive modeling is advantageous. The application of NAFs to generative modeling tasks, where the quality of generated samples is paramount, is a particularly promising direction.
In conclusion, the Neural Autoregressive Flows framework provides a more expressive means to model and approximate complex probability distributions, which can significantly enhance the performance of methods relying on accurate density estimation and variational inference. This work not only achieves state-of-the-art results but also encourages further research into generalizing the capabilities of normalizing flows through neural network parameterization.