
Generating Long Videos of Dynamic Scenes (2206.03429v2)

Published 7 Jun 2022 in cs.CV, cs.AI, cs.LG, and cs.NE

Abstract: We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. Existing video generation methods often fail to produce new content as a function of time while maintaining consistencies expected in real environments, such as plausible dynamics and object persistence. A common failure case is for content to never change due to over-reliance on inductive biases to provide temporal consistency, such as a single latent code that dictates content for the entire video. On the other extreme, without long-term consistency, generated videos may morph unrealistically between different scenes. To address these limitations, we prioritize the time axis by redesigning the temporal latent representation and learning long-term consistency from data by training on longer videos. To this end, we leverage a two-phase training strategy, where we separately train using longer videos at a low resolution and shorter videos at a high resolution. To evaluate the capabilities of our model, we introduce two new benchmark datasets with explicit focus on long-term temporal dynamics.
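The abstract's two-phase training strategy (long clips at low resolution to learn long-term dynamics, then short clips at high resolution to learn fine detail) can be sketched schematically. The function names, clip lengths, and resolutions below are illustrative assumptions, not the paper's actual architecture or hyperparameters:

```python
# Hypothetical sketch of the two-phase training schedule described in the
# abstract. All names and numbers here are illustrative assumptions.

def sample_clip(num_frames, resolution):
    """Stand-in for a dataset sampler: returns clip metadata only."""
    return {"frames": num_frames, "resolution": resolution}

def train_step(model_state, clip):
    """Stand-in for one optimization step; here it just records clip shapes."""
    model_state["steps"].append((clip["frames"], clip["resolution"]))
    return model_state

def two_phase_training(phase1_steps=3, phase2_steps=3):
    state = {"steps": []}
    # Phase 1: long clips at low resolution -> prioritize the time axis and
    # learn long-term consistency from data.
    for _ in range(phase1_steps):
        state = train_step(state, sample_clip(num_frames=128, resolution=64))
    # Phase 2: short clips at high resolution -> learn spatial detail.
    for _ in range(phase2_steps):
        state = train_step(state, sample_clip(num_frames=16, resolution=256))
    return state

state = two_phase_training()
print(state["steps"])
```

Splitting the two objectives this way keeps memory costs manageable: the model never has to process clips that are simultaneously long and high-resolution during training.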

Authors (9)
  1. Tim Brooks (10 papers)
  2. Janne Hellsten (6 papers)
  3. Miika Aittala (22 papers)
  4. Ting-Chun Wang (26 papers)
  5. Timo Aila (23 papers)
  6. Jaakko Lehtinen (23 papers)
  7. Ming-Yu Liu (87 papers)
  8. Tero Karras (26 papers)
  9. Alexei A. Efros (100 papers)
Citations (83)
