Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts (2305.08850v2)

Published 15 May 2023 in cs.CV

Abstract: Text-driven image and video diffusion models have achieved unprecedented success in generating realistic and diverse content. Recently, editing and varying existing images and videos with diffusion-based generative models has garnered significant attention. However, previous works are limited to editing content with text or providing coarse personalization using a single visual clue, rendering them unsuitable for indescribable content that requires fine-grained and detailed control. In this regard, we propose a generic video editing framework called Make-A-Protagonist, which utilizes textual and visual clues to edit videos with the goal of empowering individuals to become the protagonists. Specifically, we leverage multiple experts to parse the source video and the target visual and textual clues, and propose a visual-textual-based video generation model that employs mask-guided denoising sampling to generate the desired output. Extensive results demonstrate the versatile and remarkable editing capabilities of Make-A-Protagonist.

Authors (5)
  1. Yuyang Zhao (24 papers)
  2. Enze Xie (84 papers)
  3. Lanqing Hong (72 papers)
  4. Zhenguo Li (195 papers)
  5. Gim Hee Lee (136 papers)
Citations (26)
