Emergent Mind

One-Shot Video Inpainting

(2302.14362)
Published Feb 28, 2023 in cs.CV

Abstract

Recently, removing objects from videos and filling in the erased regions using deep video inpainting (VI) algorithms has attracted considerable attention. Usually, a video sequence and object segmentation masks for all frames are required as the input for this task. However, in real-world applications, providing segmentation masks for all frames is quite difficult and inefficient. Therefore, we deal with VI in a one-shot manner, which only takes the initial frame's object mask as its input. Although we can achieve that using naive combinations of video object segmentation (VOS) and VI methods, they are sub-optimal and generally cause critical errors. To address that, we propose a unified pipeline for one-shot video inpainting (OSVI). By jointly learning mask prediction and video completion in an end-to-end manner, the results can be optimal for the entire task instead of each separate module. Additionally, unlike the two stage methods that use the predicted masks as ground truth cues, our method is more reliable because the predicted masks can be used as the network's internal guidance. On the synthesized datasets for OSVI, our proposed method outperforms all others both quantitatively and qualitatively.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.