Emergent Mind

Abstract

Existing score distillation methods are sensitive to the classifier-free guidance (CFG) scale: they exhibit over-smoothness or instability at small CFG scales and over-saturation at large ones. To explain and analyze these issues, we revisit the derivation of Score Distillation Sampling (SDS) and decipher existing score distillation with the Wasserstein Generative Adversarial Network (WGAN) paradigm. With the WGAN paradigm, we find that existing score distillation either employs a fixed sub-optimal discriminator or conducts incomplete discriminator optimization, resulting in the scale-sensitive issue. We propose Adversarial Score Distillation (ASD), which maintains an optimizable discriminator and updates it using the complete optimization objective. Experiments show that the proposed ASD performs favorably in 2D distillation and text-to-3D tasks against existing methods. Furthermore, to explore the generalization ability of our WGAN paradigm, we extend ASD to the image editing task, where it achieves competitive results. The project page and code are at https://github.com/2y7c3/ASD.

Overview

  • Adversarial Score Distillation (ASD) combines score distillation with Generative Adversarial Networks (GANs) to improve the stability and quality of diffusion-based generation.

  • ASD employs an optimizable discriminator with WGAN optimization to address issues present in traditional score distillation methods.

  • Experiments show ASD outperforms previous methods in 2D distillation and text-to-3D tasks, enhancing quality, stability, and diversity.

  • ASD is applied to image editing with high fidelity, showcasing its robustness and versatility in content creation.

  • The study highlights the potential for using pretrained diffusion models for various tasks and suggests future work should focus on improving computational efficiency.

Understanding Adversarial Score Distillation

The Interplay of Distillation and GANs

Recent developments in AI have introduced a method known as Adversarial Score Distillation (ASD), which integrates score distillation with Generative Adversarial Networks (GANs). Traditional score distillation techniques are sensitive to the classifier-free guidance (CFG) scale, leading to issues like over-smoothness or instability at small scales and over-saturation at larger ones. ASD addresses these scale-sensitive issues by employing an optimizable discriminator updated using a complete WGAN optimization objective.
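For context on the quantity being tuned, the SDS gradient that these methods build on (as derived in the DreamFusion line of work; the notation below is the standard one and not taken from this summary) depends directly on the CFG scale $s$:

```latex
\nabla_\theta \mathcal{L}_{\text{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\big(\hat\epsilon_\phi(x_t; y, t) - \epsilon\big)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad
\hat\epsilon_\phi(x_t; y, t)
  = \epsilon_\phi(x_t; t)
  + s\,\big(\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; t)\big).
```

Here $\epsilon_\phi$ is the pretrained diffusion model's noise prediction, $y$ the text condition, and $w(t)$ a timestep weighting; because $s$ rescales the guidance term inside the gradient, the behavior of the distilled sample changes markedly with the chosen CFG scale.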

The Innovations of ASD

ASD proposes a framework in which the discriminator is implemented by combining a pretrained diffusion model with a textual-inversion embedding (or other learnable parameters), making it optimizable. This discriminator is optimized with losses derived from WGAN, a significant improvement over past methods that employed a fixed sub-optimal discriminator or an incomplete optimization objective.
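The alternating generator/discriminator structure can be illustrated with a deliberately simplified toy sketch. This is an assumption-laden stand-in, not the paper's code: the pretrained diffusion model is replaced by the exact score of a toy Gaussian, the "discriminator" is a learned score of the generator's own distribution parameterized by a single mean vector, and the generator is a point sample optimized directly.

```python
import numpy as np

# Toy sketch (hypothetical stand-ins, NOT the ASD implementation):
# score distillation viewed as a two-player game. The frozen "data score"
# stands in for the pretrained diffusion model; the learnable mean mu
# stands in for the optimizable discriminator (in ASD, a diffusion model
# plus textual-inversion embedding).

target = np.array([2.0, -1.0])      # mode of the toy "data" distribution

def data_score(x):
    # score of N(target, I); stand-in for the frozen pretrained model
    return target - x

theta = np.array([-3.0, 4.0])       # generator sample (optimized directly)
mu = np.zeros(2)                    # discriminator parameter (learned score mean)
lr_g, lr_d = 0.1, 0.5

for _ in range(300):
    # Discriminator step: fit the learned score to the current generator
    # samples (for a Gaussian score mu - x, the optimum is mu = E[x]).
    mu += lr_d * (theta - mu)
    # Generator step: follow the difference between the data score and the
    # learned "fake" score, as in VSD/ASD-style distillation.
    fake_score = mu - theta
    theta += lr_g * (data_score(theta) - fake_score)

print(np.round(theta, 2))           # theta is driven to the data mode
```

The point of the sketch is only the alternation: if the discriminator is frozen (skip the `mu` update), the generator's gradient degenerates, mirroring the fixed sub-optimal discriminator the paper criticizes; keeping it optimizable lets the fake score track the generator and yields a stable update direction.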

The experiments conducted demonstrate that ASD yields favorable outcomes in tasks like 2D distillation and text-to-3D tasks, showing improvements over existing score distillation methods in terms of quality, stability, and diversity.

Exploring Image Editing

Further exploration of ASD's capability is evident in its application to image editing tasks. The approach produces competitive results, carrying out intricate edits such as simplifying images, refining details, and replacing objects within an image contextually and with high fidelity. This extension of ASD to image editing illustrates the versatility and robustness of the underlying WGAN paradigm.

Contributions and Implications

This advancement in score distillation methodology has broader implications, shedding light on the fundamental connections between score distillation and GANs. It opens up avenues for leveraging pretrained diffusion models for diverse downstream tasks without the necessity for task-specific model redesigns or extensive fine-tuning datasets.

Future Directions

While ASD shows promise, it currently has a computational demand similar to previous variants such as Variational Score Distillation (VSD). Future work may enhance its efficiency through advances in optimization strategies, possibly leading to faster and more refined distillation.

In summary, ASD marks a significant step in the realm of AI-driven content creation, potentially leading to better quality, consistency, and versatility in future applications.
