LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer (2407.15233v3)

Published 21 Jul 2024 in cs.CV

Abstract: Layout generation is a foundational task in graphic design that requires integrating visual aesthetics with a harmonious expression of content. However, existing methods still struggle to generate precise and visually appealing layouts, with problems such as blocking, overlap, undersized elements, and spatial misalignment. We find that these methods overlook the crucial balance between learning content-aware and graphic-aware features. This oversight limits their ability to model the graphic structure of layouts and to generate reasonable layout arrangements. To address these challenges, we introduce LayoutDiT, an effective framework that balances content and graphic features to generate high-quality, visually appealing layouts. First, we design an adaptive factor that optimizes the model's awareness of the layout generation space, balancing its performance in both content and graphic aspects. Second, we introduce a graphic condition, the saliency bounding box, to bridge the modality difference between images in the visual domain and layouts in the geometric parameter domain. In addition, we adapt a diffusion transformer model as the backbone, whose powerful generative capability ensures the quality of layout generation. Benefiting from the properties of diffusion models, our method excels in constrained settings without introducing additional constraint modules. Extensive experimental results demonstrate that our method achieves superior performance in both constrained and unconstrained settings, significantly outperforming existing methods.

Authors (9)

Yu Li
Yifan Chen
Gongye Liu
Jie Wu
Yujiu Yang
Fei Yin
Qingyan Bai
Hongfa Wang
Ruihang Chu
