- The paper introduces a novel Story-to-Motion framework that converts detailed narratives into continuous, controllable character animations.
- It employs a three-module approach combining text-driven scheduling, motion retrieval, and neural motion blending for seamless transitions.
- Experimental results show improved trajectory following, temporal action composition, and motion blending compared to state-of-the-art methods.
Introduction
The art of crafting virtual worlds and characters that move in harmony with compelling narratives is a challenge that spans the animation, gaming, and film industries. At the forefront of this field is the novel task known as Story-to-Motion, which aims to synthesize character animations of effectively unbounded length that stay closely aligned with textual descriptions.
Overview of Story-to-Motion
The Story-to-Motion process begins by taking a detailed textual narrative (a "story") and transforming it into a continuous sequence of character motions. What sets Story-to-Motion apart is its holistic approach: it attends to both low-level kinematics (specifically, the trajectories characters follow) and the broader semantic meaning of the actions described in the text. Prior approaches fall short on one side or the other: trajectory-conditioned methods follow paths closely but ignore the text, while text-to-motion methods generate brief, semantically faithful clips and cannot sustain long, trajectory-controlled animations. The proposed system aims to transcend both limitations.
Methodology
The system proposed in the paper comprises three interconnected modules:
- Text-driven Motion Scheduler: Using a large language model (LLM), this module parses the input story into a list of character actions, locations, and temporal spans. Given knowledge of the 3D scene, the named locations are then translated into continuous trajectories by a path-finding algorithm (a toy sketch of this step appears after this list).
- Text-based Motion Retrieval: This module queries a motion database with an auto-regressive retrieval function to find clips that match the text while remaining plausible continuations of the motion so far. The retrieval score combines kinematic and semantic features so that both motion fidelity and narrative fidelity are preserved (see the second sketch below).
- Neural Motion Blending: This final module stitches the retrieved clips into a seamless, natural motion sequence. To address common blending issues such as jarring transitions or mismatched motion styles, the authors develop a progressive mask transformer that infills the transition frames (see the third sketch below).
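To make the scheduling step concrete, here is a minimal Python sketch. The `parse_story` stub stands in for the LLM call (the paper does not specify the prompt or output format, so this structure is an assumption), and `find_path` is an ordinary A* search over a 2D occupancy grid, standing in for whatever path-finding the scene representation supports.

```python
# Sketch of the scheduling step: story -> (action, location, duration) list,
# then location -> waypoint trajectory via A* on an occupancy grid.
import heapq

def parse_story(story: str):
    """Placeholder for the LLM parse. A real implementation would prompt an
    LLM to extract (action, location, duration_seconds) tuples; these values
    are hard-coded for illustration."""
    return [
        ("walk", "kitchen", 6.0),
        ("pick up cup", "kitchen", 2.0),
        ("walk", "sofa", 8.0),
        ("sit down", "sofa", 3.0),
    ]

def find_path(grid, start, goal):
    """A* over a 4-connected grid; grid[y][x] == 1 marks an obstacle."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        x, y = node
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= ny < len(grid) and 0 <= nx < len(grid[0]) and not grid[ny][nx]:
                heapq.heappush(frontier,
                               (cost + 1 + h((nx, ny)), cost + 1,
                                (nx, ny), path + [(nx, ny)]))
    return None

# Usage: named locations map to grid coordinates (the "scene knowledge").
landmarks = {"start": (0, 0), "kitchen": (4, 1), "sofa": (1, 4)}
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1  # an obstacle the path must route around
position = landmarks["start"]
for action, location, duration in parse_story("..."):
    waypoints = find_path(grid, position, landmarks[location])
    position = landmarks[location]
    print(action, duration, waypoints)
```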
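The auto-regressive retrieval can be illustrated in the same spirit. In this second sketch, `embed` is a stand-in text encoder, and the database entries, pose dimension, and weight `w_kin` are invented for illustration rather than taken from the paper. Each step scores candidate clips by semantic similarity to the action text plus a kinematic continuity term against the end of the previously selected clip, which is what makes the retrieval auto-regressive.

```python
# Sketch of auto-regressive retrieval combining semantic and kinematic scores.
import hashlib
import numpy as np

rng = np.random.default_rng(0)

def embed(text: str) -> np.ndarray:
    """Stand-in for a real text encoder; hash-seeded so it is deterministic."""
    seed = int(hashlib.md5(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).standard_normal(16)
    return v / np.linalg.norm(v)

# Toy database: each clip has a label, a semantic feature, and first/last
# poses (here 63-dim, e.g. 21 joints x 3 coordinates; an assumption).
labels = ["walk forward", "sit down", "pick up object", "turn left"]
database = [
    {"label": l, "sem": embed(l),
     "first_pose": rng.standard_normal(63),
     "last_pose": rng.standard_normal(63)}
    for l in labels
]

def retrieve_sequence(actions, w_kin=0.1):
    chosen, prev_end = [], None
    for action in actions:
        q = embed(action)
        best, best_score = None, -np.inf
        for clip in database:
            sem = float(q @ clip["sem"])  # cosine similarity (unit vectors)
            # Kinematic continuity: penalize a jump between the previous
            # clip's last pose and this candidate's first pose.
            kin = 0.0 if prev_end is None else \
                -float(np.linalg.norm(prev_end - clip["first_pose"]))
            score = sem + w_kin * kin
            if score > best_score:
                best, best_score = clip, score
        chosen.append(best)
        prev_end = best["last_pose"]
    return chosen

for clip in retrieve_sequence(["walk forward", "pick up object", "sit down"]):
    print(clip["label"])
```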
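Finally, a rough, untrained sketch of mask-based transition infilling. The layer sizes, the edges-inward unmasking schedule, and the pose dimension are illustrative assumptions, not the paper's actual progressive mask transformer. The point is the mechanism: transition frames start masked, and each pass commits the predictions closest to known context before re-running the model, so the gap is filled progressively rather than in one shot.

```python
# Sketch of progressive masked infilling for motion blending (PyTorch).
import torch
import torch.nn as nn

class MaskedBlender(nn.Module):
    def __init__(self, pose_dim=63, d_model=128, nhead=4, nlayers=2, max_len=512):
        super().__init__()
        self.inp = nn.Linear(pose_dim, d_model)
        self.out = nn.Linear(d_model, pose_dim)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)

    def forward(self, poses, mask):
        # poses: (B, T, pose_dim); mask: (B, T) bool, True = frame unknown.
        x = self.inp(poses)
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        x = x + self.pos[:, : x.size(1)]
        return self.out(self.encoder(x))

@torch.no_grad()
def blend(model, clip_a, clip_b, gap=16, passes=4):
    """Fill `gap` masked frames between two clips, unmasking edges inward."""
    seq = torch.cat([clip_a, torch.zeros(gap, clip_a.size(1)), clip_b])
    mask = torch.zeros(seq.size(0), dtype=torch.bool)
    mask[clip_a.size(0): clip_a.size(0) + gap] = True
    per_pass = max(1, gap // (2 * passes))  # frames committed per side per pass
    for _ in range(passes):
        pred = model(seq.unsqueeze(0), mask.unsqueeze(0)).squeeze(0)
        idx = mask.nonzero().squeeze(-1)
        keep = torch.cat([idx[:per_pass], idx[-per_pass:]])  # edge frames first
        seq[keep] = pred[keep]
        mask[keep] = False
        if not mask.any():
            break
    return seq

model = MaskedBlender()
a, b = torch.randn(30, 63), torch.randn(30, 63)
print(blend(model, a, b).shape)  # torch.Size([76, 63])
```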
Experimental Results and Contributions
Evaluations show that the system outperforms existing state-of-the-art methods on three key sub-tasks: trajectory following, temporal action composition, and motion blending. It improves motion quality not only on short sequences but also in the synthesis of extended ones, evidence of its broad applicability.
The contributions of this work are threefold: a comprehensive solution for generating infinite, lifelike motion tied to a textual narrative; a text-driven, controllable system for long human animations; and empirical evidence of superior performance on standard benchmarks.
Final Thoughts
Story-to-Motion represents a significant leap towards more natural and infinite character animation in response to narrative text. With the promise of enhancing the animation pipeline and offering new creative tools to filmmakers and game developers, it opens the door to an era where dynamic character animations are no longer bound by the limits of pre-scripted motion paths, but flow as freely as the stories they are born from.