Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References (2404.12900v2)

Published 19 Apr 2024 in cs.CV, cs.AI, and cs.MM

Abstract: Painterly image harmonization aims at seamlessly blending disparate visual elements within a single image. However, previous approaches often struggle due to limitations in training data or reliance on additional prompts, leading to inharmonious and content-disrupted output. To surmount these hurdles, we design a Training-and-prompt-Free General Painterly Harmonization method (TF-GPH). TF-GPH incorporates a novel Similarity Disentangle Mask'', which disentangles the foreground content and background image by redirecting their attention to corresponding reference images, enhancing the attention mechanism for multi-image inputs. Additionally, we propose aSimilarity Reweighting'' mechanism to balance harmonization between stylization and content preservation. This mechanism minimizes content disruption by prioritizing the content-similar features within the given background style reference. Finally, we address the deficiencies in existing benchmarks by proposing novel range-based evaluation metrics and a new benchmark to better reflect real-world applications. Extensive experiments demonstrate the efficacy of our method in all benchmarks. More detailed in https://github.com/BlueDyee/TF-GPH.

References (64)