- The paper introduces B-NAF, reducing parameter overgrowth by integrating transformation parametrization directly within an autoregressive network.
- B-NAF achieves competitive density estimation performance on both toy and real-world datasets while using significantly fewer parameters.
- Its block matrix design enables efficient variational inference and offers a new perspective on constructing invertible neural architectures.
An Overview of Block Neural Autoregressive Flow
The paper "Block Neural Autoregressive Flow" by Nicola De Cao, Wilker Aziz, and Ivan Titov introduces Block Neural Autoregressive Flow (B-NAF), a streamlined and efficient normalizing flow model designed for density estimation and variational inference. This work presents a significant enhancement over existing normalizing flow models, specifically targeting the inefficiencies related to parameter overgrowth that plague other models such as Neural Autoregressive Flows (NAFs).
Context and Motivation
Normalizing flows (NFs) have emerged as a powerful tool for modeling complex probability distributions by transforming simpler distributions into more complex ones through an invertible mapping with a tractable Jacobian determinant. This property makes them valuable for tasks like density estimation and variational inference in latent variable models. The primary goal in developing NFs is to create expressive models that remain computationally feasible.
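Concretely, this is the standard change-of-variables identity (written here in generic notation, not copied from the paper): if $y = f(z)$ for an invertible map $f$ and $z$ has density $p_Z$, then

$$p_Y(y) = p_Z\big(f^{-1}(y)\big)\,\left|\det \frac{\partial f^{-1}(y)}{\partial y}\right|,$$

so evaluating the model density only requires the base density and the Jacobian determinant of the (inverse) transformation, which is why flows are designed so that this determinant is cheap to compute.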
The bottleneck in many existing NF models, particularly NAFs, lies in their parameter count: in NAF, a conditioner network must output all of the weights of a separate transformer network, so the number of parameters grows quadratically with the size of that network. This paper addresses the limitation by introducing B-NAF, a model that retains the expressive power of NAFs while radically reducing the number of parameters needed.
Block Neural Autoregressive Flow (B-NAF)
B-NAF is designed as a universal approximator of density functions with a key focus on compactness. Unlike NAF, which relies on a separate conditioner network to output the parameters of the transformation, B-NAF integrates the transformation parametrization directly into a single autoregressive feed-forward network. The core innovation lies in structuring the weight matrices of the dense layers as block matrices: blocks above the diagonal are masked to zero, which yields the autoregressive structure, while the diagonal blocks are constrained to be strictly positive, which (together with strictly increasing activations such as tanh) makes the transformation strictly monotonic. This guarantees that the flow is invertible without requiring a separate conditioner network.
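To make the block structure concrete, below is a minimal PyTorch sketch of such a masked, block-structured affine layer. This is not the authors' code: the class name `BlockMaskedLinear` and its arguments are illustrative, and the numerically stable log-determinant computation derived in the paper is omitted (autograd is used here only to check the Jacobian's structure).

```python
# Minimal sketch of a B-NAF-style masked block linear layer (illustrative,
# not the reference implementation).
import torch
import torch.nn as nn

class BlockMaskedLinear(nn.Module):
    """Affine layer whose weight matrix is a dim x dim grid of blocks:
    blocks above the diagonal are zero (autoregressive structure), and
    diagonal blocks are made strictly positive via exp (monotonicity)."""
    def __init__(self, dim, block_in, block_out):
        super().__init__()
        rows, cols = dim * block_out, dim * block_in
        self.weight = nn.Parameter(0.1 * torch.randn(rows, cols))
        self.bias = nn.Parameter(torch.zeros(rows))
        diag_mask, lower_mask = torch.zeros(rows, cols), torch.zeros(rows, cols)
        for i in range(dim):              # output block row
            for j in range(dim):          # input block column
                r = slice(i * block_out, (i + 1) * block_out)
                c = slice(j * block_in, (j + 1) * block_in)
                if i == j:
                    diag_mask[r, c] = 1.0   # diagonal block: forced positive
                elif i > j:
                    lower_mask[r, c] = 1.0  # strictly lower block: free weights
                # blocks with i < j stay zero, giving the autoregressive pattern
        self.register_buffer("diag_mask", diag_mask)
        self.register_buffer("lower_mask", lower_mask)

    def forward(self, x):
        # exp(.) keeps diagonal blocks strictly positive; upper blocks stay zero.
        w = torch.exp(self.weight) * self.diag_mask + self.weight * self.lower_mask
        return x @ w.t() + self.bias

# Two masked layers with a strictly increasing activation in between form an
# autoregressive, strictly monotone map on R^3 (each dimension gets 4 hidden units).
flow = nn.Sequential(
    BlockMaskedLinear(dim=3, block_in=1, block_out=4),
    nn.Tanh(),
    BlockMaskedLinear(dim=3, block_in=4, block_out=1),
)
x = torch.randn(3)
J = torch.autograd.functional.jacobian(flow, x)
print(torch.allclose(J, J.tril()))           # True: Jacobian is lower triangular
print(bool((torch.diagonal(J) > 0).all()))   # True: positive diagonal, hence invertible
```

Because the resulting Jacobian is lower triangular, its log-determinant reduces to a sum of log-derivatives along the diagonal, which is what keeps exact density evaluation cheap.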
Experimentation and Results
The experimental evaluation of B-NAF shows competitive performance with substantially fewer parameters than existing state-of-the-art models. Specifically, the authors demonstrate B-NAF's capabilities on 2D toy densities, real-world tabular datasets from the UCI repository, and image datasets such as MNIST and Omniglot.
- Density Estimation: B-NAF achieves log-likelihoods comparable to other leading models, including NAF, Real NVP, and FFJORD, across these datasets, while using significantly fewer parameters, often by orders of magnitude on higher-dimensional data.
- Variational Inference: In the context of Variational Autoencoders (VAEs), B-NAF is used to enrich the approximate posterior distribution (a generic sketch of this setup follows the list). Results indicate that B-NAF outperforms planar flows and IAF; although it slightly underperforms Sylvester flows, it does so with far fewer trainable parameters and fewer amortized parameters, underscoring its efficiency benefits.
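For context on the VAE experiments, here is a rough, generic sketch of how a flow-based posterior enters a single-sample ELBO estimate. It is not the paper's code: `encoder`, `flow`, `decoder_log_likelihood`, and `prior_log_prob` are placeholders, and the flow is assumed to return both the transformed sample and its log-Jacobian-determinant (which autoregressive flows such as B-NAF can provide).

```python
# Generic single-sample ELBO estimate with a flow-enriched posterior (sketch).
import torch

def elbo_estimate(x, encoder, decoder_log_likelihood, flow, prior_log_prob):
    mu, log_var = encoder(x)                     # amortized base-posterior parameters
    std = torch.exp(0.5 * log_var)
    z0 = mu + std * torch.randn_like(std)        # reparameterized sample from q0
    log_q0 = torch.distributions.Normal(mu, std).log_prob(z0).sum(-1)
    z_k, log_det = flow(z0)                      # push the sample through the flow
    log_qk = log_q0 - log_det                    # change of variables for the density
    return decoder_log_likelihood(x, z_k) + prior_log_prob(z_k) - log_qk
```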
Implications and Future Directions
The implications of B-NAF are twofold. Practically, the reduced parameter footprint makes it easier to integrate powerful normalizing flows into larger systems, such as deep learning applications with tight memory budgets. Theoretically, the block matrix parametrization offers a fresh perspective on constructing invertible neural architectures, potentially inspiring further innovations in neural network design.
Looking towards future work, the authors highlight two directions: deriving analytic inverses for B-NAFs and integrating these flows into deep generative models with powerful decoders, particularly in domains such as natural language processing. Such extensions would further broaden B-NAF's applicability to large-scale generative modeling and inference.