DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

Published 26 Jun 2023 in cs.CV and cs.AI | (2306.14685v4)

Abstract: Even though trained mainly on images, we discover that pretrained diffusion models show impressive power in guiding sketch synthesis. In this paper, we present DiffSketcher, an innovative algorithm that creates vectorized free-hand sketches using natural language input. DiffSketcher is developed based on a pre-trained text-to-image diffusion model. It performs the task by directly optimizing a set of Bézier curves with an extended version of the score distillation sampling (SDS) loss, which allows us to use a raster-level diffusion model as a prior for optimizing a parametric vectorized sketch generator. Furthermore, we explore attention maps embedded in the diffusion model for effective stroke initialization to speed up the generation process. The generated sketches demonstrate multiple levels of abstraction while maintaining recognizability, underlying structure, and essential visual details of the subject drawn. Our experiments show that DiffSketcher achieves greater quality than prior work. The code and demo of DiffSketcher can be found at https://ximinng.github.io/DiffSketcher-project/.

Summary

The paper introduces DiffSketcher, which leverages latent diffusion models to synthesize vector sketches from text, eliminating the need for supervised sketch datasets.
It employs an extended score distillation loss combined with attention-based stroke initialization to optimize Bézier curve parameters.
Experimental results demonstrate that DiffSketcher outperforms previous methods in visual coherence and semantic alignment for text-to-sketch transformation.

The paper introduces DiffSketcher, a method for creating vectorized sketches from natural language descriptions. Unlike approaches that require sketch datasets and supervised training, DiffSketcher uses a pre-trained text-to-image latent diffusion model as a prior, so no text-sketch training pairs are needed. It directly optimizes the parameters of a set of Bézier curves to produce abstract yet recognizable vector sketches.
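To make the idea of Bézier curve parameters concrete, here is a minimal sketch of how one stroke could be evaluated from its control points. It assumes cubic curves with four 2-D control points in normalized canvas coordinates; the function name `cubic_bezier` and the example values are illustrative and not taken from the paper's code.

```python
import numpy as np

def cubic_bezier(control_points, num_samples=50):
    """Evaluate a cubic Bézier curve at evenly spaced parameter values.

    control_points: four (x, y) pairs; these are the quantities an
    optimizer would adjust (an assumption about the stroke parameterization).
    """
    p = np.asarray(control_points, dtype=float)
    if p.shape != (4, 2):
        raise ValueError("expected four 2-D control points")
    # Column vector of parameter values so it broadcasts against (x, y).
    t = np.linspace(0.0, 1.0, num_samples)[:, None]
    # Weighted sum of the control points using the degree-3 Bernstein basis.
    return ((1 - t) ** 3 * p[0]
            + 3 * (1 - t) ** 2 * t * p[1]
            + 3 * (1 - t) * t ** 2 * p[2]
            + t ** 3 * p[3])

# Hypothetical stroke in normalized [0, 1] canvas coordinates.
stroke = [(0.1, 0.1), (0.3, 0.8), (0.7, 0.8), (0.9, 0.1)]
points = cubic_bezier(stroke, num_samples=5)
```

The curve starts at the first control point and ends at the last, while the two middle points pull the curve toward them without lying on it. Because every output point is a smooth function of the control points, gradients can flow back to them.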

DiffSketcher adapts the score distillation sampling (SDS) loss so that a raster-level diffusion model can serve as a prior for optimizing a parametric vector sketch. Attention maps from the diffusion model are used to initialize stroke positions, which speeds up generation and improves sketch quality. The resulting sketches stay consistent with the semantics of the input text while supporting several levels of abstraction.
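For reference, the standard SDS gradient, as introduced in DreamFusion, is commonly written as below when applied to a latent diffusion model. This is the baseline form only; the extended loss used in the paper differs in details not given in this summary. The symbols are assumptions for illustration: θ denotes the stroke parameters, R a differentiable rasterizer, 𝓔 the latent encoder, y the text prompt, ε̂_φ the pre-trained noise predictor, and w(t) a timestep weighting.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(z_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial z}{\partial \theta}
    \right],
\qquad
z = \mathcal{E}\bigl(R(\theta)\bigr),
\quad
z_t = \alpha_t z + \sigma_t \epsilon,
\quad
\epsilon \sim \mathcal{N}(0, I)
```

The gradient skips the Jacobian of the noise predictor, so the diffusion model acts as a frozen critic and only the rasterizer and encoder are differentiated.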

Key Contributions and Methodology

Latent Diffusion Model Utilization: DiffSketcher capitalizes on pre-existing text-to-image diffusion models to generate sketches without requiring direct sketch datasets. It employs a differentiable rasterizer for optimizing curve parameters, effectively transferring image synthesis knowledge to the sketch generation domain.
Extended SDS Loss: Building on the SDS framework, the paper introduces an enhanced version that integrates with CLIP and LPIPS losses, enabling diverse and controlled vector sketch synthesis. This modification supports greater fidelity to textual prompts.
Attention-based Stroke Initialization: By exploiting attention maps within the diffusion model, the research presents a refined initialization strategy for stroke placement. This is critical for non-convex optimization landscapes, enhancing both convergence speed and final sketch quality.
Opacity and Stylistic Variability: The integration of opacity controls within the optimization process adds stylistic depth, mimicking human sketch styles by varying brushstroke weights, thus achieving more natural sketches.
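The attention-based initialization above can be illustrated with a minimal sketch that treats an attention map as a probability distribution and samples stroke start positions from it. It assumes a single non-negative 2-D attention map is already available; how the paper extracts and combines attention maps is not described in this summary, and the function name `sample_stroke_starts` is hypothetical.

```python
import numpy as np

def sample_stroke_starts(attention_map, num_strokes, rng=None):
    """Sample (x, y) pixel positions with probability proportional to attention.

    attention_map: 2-D array of non-negative weights (assumed input).
    """
    rng = np.random.default_rng() if rng is None else rng
    attn = np.asarray(attention_map, dtype=float)
    if attn.ndim != 2 or attn.min() < 0 or attn.sum() <= 0:
        raise ValueError("attention_map must be 2-D, non-negative, and non-zero")
    # Normalize so the flattened map sums to 1 and can be used as probabilities.
    probs = (attn / attn.sum()).ravel()
    # Sample without replacement so no two strokes start at the same pixel;
    # this requires at least num_strokes pixels with non-zero attention.
    flat_idx = rng.choice(attn.size, size=num_strokes, replace=False, p=probs)
    rows, cols = np.unravel_index(flat_idx, attn.shape)
    return np.stack([cols, rows], axis=1)  # (x, y) order

# Toy map: all attention mass sits in one 8x8 block of a 32x32 grid.
attn = np.zeros((32, 32))
attn[8:16, 8:16] = 1.0
starts = sample_stroke_starts(attn, num_strokes=16, rng=np.random.default_rng(0))
```

In this toy case every sampled position falls inside the high-attention block, which is the intended effect: strokes begin where the model attends to the subject rather than at uniformly random locations.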

Experimental Results and Implications

The experimental evaluations illustrate that DiffSketcher surpasses existing methods in generating high-quality and diverse sketches from textual descriptions. The comparisons with methods like CLIPasso reveal significant improvements in visual coherence and semantic alignment. These advancements underscore DiffSketcher's ability to translate textual abstractions into visually compelling sketches.

The implications of this research are manifold. On a practical level, it provides a tool for designers and artists to swiftly generate conceptual sketches from textual ideas, reducing manual effort and time. Theoretically, it demonstrates the potential of diffusion models in domains beyond traditional image synthesis, bridging the gap between natural language processing and computer graphics.

Future Directions

While the introduction of DiffSketcher marks a significant stride in text-to-sketch synthesis, the paper identifies several avenues for future research. Enhancing the capability to control the abstraction level directly through textual prompts could offer more personalized sketch generation. Additionally, extending the model's capacity to incorporate stylistic variations and multi-object scenes could further expand its applicability. Investigating integration with advanced neural architectures or alternative diffusion frameworks might yield further improvements in efficiency and quality.

In sum, DiffSketcher represents a promising convergence of language understanding and visual synthesis, presenting a robust method for automatic sketch generation that can be a foundational tool in creative and design-oriented AI applications.
