
Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

(2310.02279)
Published Oct 1, 2023 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract

Consistency Models (CM) (Song et al., 2023) accelerate score-based diffusion model sampling at the cost of sample quality but lack a natural way to trade off quality for speed. To address this limitation, we propose Consistency Trajectory Model (CTM), a generalization encompassing CM and score-based models as special cases. CTM trains a single neural network that can -- in a single forward pass -- output scores (i.e., gradients of log-density) and enables unrestricted traversal between any initial and final time along the Probability Flow Ordinary Differential Equation (ODE) in a diffusion process. CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance and achieves new state-of-the-art FIDs for single-step diffusion model sampling on CIFAR-10 (FID 1.73) and ImageNet at 64x64 resolution (FID 1.92). CTM also enables a new family of sampling schemes, both deterministic and stochastic, involving long jumps along the ODE solution trajectories. It consistently improves sample quality as computational budgets increase, avoiding the degradation seen in CM. Furthermore, unlike CM, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods from the diffusion community. This access also enables the computation of likelihood. The code is available at https://github.com/sony/ctm.

Figure: comparison of model training and error sources in score-based and distillation models, and CTM's strategy for mitigating them.

Overview

  • Generative models, particularly Diffusion Models (DMs), can now be accelerated through Consistency Trajectory Models (CTM), a new approach that unifies score-based and distillation models in a single framework to improve sampling speed and computational efficiency.

  • CTM introduces a novel parameterization of the probability flow Ordinary Differential Equation (ODE) of the diffusion process, letting a single network represent both infinitesimal moves (i.e., scores) and long jumps between arbitrary times along a trajectory, which yields state-of-the-art single-step FID on CIFAR-10 and ImageNet 64x64.

  • CTM's training strategy combines denoising score matching with reconstruction and adversarial losses in a single objective, and its framework gives rise to γ-sampling, a family of deterministic and stochastic sampling schemes with controllable variance.

  • By providing a unified perspective on generative modeling and achieving strong results in image generation, CTM points to broad applications across domains and opens clear directions for future research in generative AI.

Consistency Trajectory Models: Bridging Score-based and Distillation Models for Efficient Diffusion

Introduction

Generative models, especially Diffusion Models (DMs), have achieved remarkable success in generating high-quality samples, but they remain slow and computationally expensive at sampling time. Distillation approaches such as Consistency Models accelerate sampling to one or a few steps, yet they sacrifice sample quality and offer no natural way to trade quality for speed. Consistency Trajectory Models (CTM) address this by unifying score-based and distillation models under a common framework, combining the precise generative control of score-based models with the sampling efficiency of distillation models.

CTM Framework

Unification of Models

CTM introduces a novel parameterization that models both infinitesimal changes (scores) and long jumps between arbitrary times along the probability flow Ordinary Differential Equation (ODE) of the diffusion process. A single network therefore exposes detailed gradient information while also being able to traverse large portions of a trajectory in one forward pass, giving flexibility across applications. Notably, CTM achieves state-of-the-art (SOTA) FID for single-step sampling on CIFAR-10 (1.73) and ImageNet 64x64 (1.92), demonstrating its effectiveness in practice.
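To make the trajectory parameterization concrete, here is a minimal sketch of an anchored jump function in the spirit of the paper: the network output is blended with the input so that jumping from time t back to t is exactly the identity, while jumping to s = 0 reduces to a direct prediction of the clean sample. The helper names (ctm_jump, g_theta) and the EDM-style noise-scale convention are our assumptions for illustration, not the repository's API.

    import torch

    def ctm_jump(g_theta, x_t, t, s):
        """Sketch of an anchored trajectory parameterization (our notation):

            G_theta(x_t, t, s) = (s / t) * x_t + (1 - s / t) * g_theta(x_t, t, s)

        so G_theta(x_t, t, t) = x_t exactly, and s -> 0 returns the network's
        direct prediction of the clean sample. g_theta is any time-conditioned
        network; t and s are per-sample noise scales with s <= t and t > 0.
        """
        ratio = (s / t).view(-1, 1, 1, 1)            # broadcast over image dims
        return ratio * x_t + (1.0 - ratio) * g_theta(x_t, t, s)

    # Toy usage with a stand-in "network" (illustrative only).
    g_theta = lambda x, t, s: torch.zeros_like(x)    # pretend denoiser
    x_t = torch.randn(4, 3, 32, 32)                  # batch of noisy images
    t = torch.full((4,), 80.0)                       # current noise scale
    s = torch.full((4,), 0.5)                        # target noise scale
    x_s = ctm_jump(g_theta, x_t, t, s)               # one-call jump t -> s

Anchoring the output this way satisfies the boundary condition at s = t by construction rather than by learning, which is what lets the same model handle both small, score-like moves and long jumps toward data.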

Training Approach

CTM's training strategy integrates denoising score matching with reconstruction and adversarial losses in a single objective. Because the parameterization operates on whole trajectory segments, the model learns jumps between arbitrary times directly, which improves sample quality while keeping training computationally efficient. The framework also yields a new sampling method, termed γ-sampling, which supports both deterministic and stochastic generation paths with controlled variance, as sketched below.
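As a rough illustration of how γ-sampling interpolates between deterministic and stochastic generation, the sketch below alternates a deterministic jump with a partial re-noising step, reusing the hypothetical ctm_jump helper from the previous snippet. The schedule handling and noise-scale convention are our assumptions; γ = 0 gives fully deterministic long jumps along the ODE trajectory, while γ = 1 denoises all the way to zero and re-noises, recovering Consistency-Model-style multistep sampling.

    import math
    import torch

    def gamma_sampler(g_theta, x_init, times, gamma):
        """Sketch of gamma-sampling (our notation, EDM-style noise scales).

        times: decreasing schedule [T = t_0 > t_1 > ... > t_N = 0].
        Each step jumps deterministically from t_cur down to
        sqrt(1 - gamma^2) * t_next, then adds fresh noise of scale
        gamma * t_next, so gamma controls how stochastic the path is.
        """
        x = x_init
        for t_cur, t_next in zip(times[:-1], times[1:]):
            b = x.shape[0]
            t_mid = math.sqrt(1.0 - gamma ** 2) * t_next
            x = ctm_jump(g_theta, x,
                         torch.full((b,), t_cur),
                         torch.full((b,), t_mid))
            if t_next > 0 and gamma > 0:             # re-noise except at the end
                x = x + gamma * t_next * torch.randn_like(x)
        return x

    # Example: four long jumps from pure noise at T = 80 with mild stochasticity
    # (assumes ctm_jump and g_theta from the previous sketch are in scope).
    times = [80.0, 20.0, 5.0, 1.0, 0.0]
    x_T = 80.0 * torch.randn(4, 3, 32, 32)
    sample = gamma_sampler(g_theta, x_T, times, gamma=0.3)

Because each step is a single network call, adding steps trades computation for quality smoothly, which is the behavior the paper highlights in contrast to CM's degradation with more steps.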

Performance and Implications

The introduction of CTM marks a significant advancement in generative modeling, particularly for DMs. With its dual modeling capability, CTM not only improves sampling efficiency but also provides a unified perspective on score-based and distillation models. The state-of-the-art single-step FIDs on CIFAR-10 and ImageNet 64x64, together with retained access to the score function (and hence to likelihood computation and established controllable/conditional generation techniques), underscore the model's potential impact. CTM paves the way for future research in generative AI, suggesting paths for further optimization and new model architectures. Moreover, the applicability of CTM across various domains hints at broad implications, from enhancing generative quality in AI art to accelerating simulation in scientific research.

Future Directions

Looking ahead, several areas warrant further exploration. The flexibility of CTM suggests applications beyond image and media generation, including language models and time-series prediction. The blending of adversarial training within the CTM framework also invites integrating other deep learning methodologies to expand the model's capabilities. As the field of AI continues to evolve, CTM represents a significant step forward, offering a robust template for the next generation of generative models.

CTM represents a promising avenue in the evolution of generative models, demonstrating both efficiency in training and flexibility in application. As the field advances, exploring CTM's limits and potential applications will likely yield further insights and breakthroughs in artificial intelligence.
