Controllable Motion Diffusion Model

Anonymous Authors

Image 1

Random Sampling

Controllable Synthesis with Motion-Inpainting

Hand following a sine function on the z-axis during random play.

Head respecting a height limit using a simple feedback rule.

Heading held at a fixed angle while the hip follows a sine function on the z-axis.

Controllable Synthesis with a Policy Network

Target Reaching

Joystick

Path Following

Motion-Inpainting While Following a Policy

Hand kept at a fixed height with motion-inpainting while executing a joystick command.

Abstract

Generating realistic and controllable motions for virtual characters is a challenging task in computer animation, with applications in games, simulation, and virtual reality. Recent studies, inspired by the success of diffusion models in image generation, have demonstrated their potential for this task. However, most of these studies are limited to offline applications and sequence-level generation, producing all frames simultaneously. To enable real-time motion synthesis with diffusion models in response to time-varying control signals, we propose the Controllable Motion Diffusion Model (COMODO). Our framework is built on an auto-regressive motion diffusion model (A-MDM) that generates motion sequences frame by frame. Using only the standard DDPM algorithm, without additional complexity, A-MDM generates high-fidelity motion sequences over extended periods in real time under different types of control signals. On top of A-MDM, we propose reinforcement-learning-based controllers and control strategies that steer the synthesis process across multiple tasks, including target reaching, joystick-based control, goal-oriented control, and trajectory following. The resulting framework generates diverse motions in real time that react adaptively to user commands on the fly, enhancing the overall user experience. Moreover, it is compatible with inpainting-based editing methods and can produce far more diverse motions without additional fine-tuning of the base motion generation model.
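To make the auto-regressive formulation concrete, below is a minimal sketch of frame-by-frame DDPM sampling in the spirit of A-MDM. The denoiser interface (`denoiser(x, t, prev_frame)`), the number of diffusion steps, and the noise schedule are illustrative assumptions rather than the exact configuration used in the paper.

```python
# Minimal sketch of auto-regressive, frame-by-frame DDPM sampling (A-MDM style).
# The denoiser signature, step count, and schedule are assumptions for illustration.
import torch

T = 50                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)    # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample_next_frame(denoiser, prev_frame):
    """Run a full reverse DDPM chain conditioned on the previous frame."""
    x = torch.randn_like(prev_frame)             # start from Gaussian noise
    for t in reversed(range(T)):
        eps = denoiser(x, t, prev_frame)         # predicted noise at step t
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x                                     # denoised next frame x_0

@torch.no_grad()
def rollout(denoiser, init_frame, num_frames):
    """Generate a long sequence one frame at a time (random sampling)."""
    frames, prev = [init_frame], init_frame
    for _ in range(num_frames):
        prev = sample_next_frame(denoiser, prev)
        frames.append(prev)
    return torch.stack(frames)
```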

Pipeline Overview

Image 2
A-MDM: an auto-regressive motion diffusion model. COMODO: a pipeline for controllable motion synthesis that supports RL-based controllers and inpainting-based editing simultaneously.
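As a rough illustration of the inpainting path of the pipeline, the sketch below constrains one channel of the generated frame (a hypothetical hand-height index, following a sine target as in the demos above) by merging a forward-noised copy of the target into each reverse step, in the style of standard diffusion inpainting. The denoiser signature, schedule, and channel index are assumptions for illustration, not the released interface.

```python
# Hedged sketch of inpainting-style control: constrained channels are overwritten
# with a forward-noised target at every reverse step; the rest is denoised freely.
import math
import torch

T = 50
betas = torch.linspace(1e-4, 0.02, T)        # assumed linear schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)
HAND_Z = 3                                   # hypothetical feature index for hand height

@torch.no_grad()
def inpaint_next_frame(denoiser, prev_frame, frame_idx):
    target = torch.zeros_like(prev_frame)
    mask = torch.zeros_like(prev_frame)
    target[..., HAND_Z] = 0.9 + 0.1 * math.sin(0.1 * frame_idx)   # sine height target
    mask[..., HAND_Z] = 1.0

    x = torch.randn_like(prev_frame)
    for t in reversed(range(T)):
        # forward-diffuse the target to the current noise level and paste it in
        noised_target = (torch.sqrt(alpha_bars[t]) * target
                         + torch.sqrt(1 - alpha_bars[t]) * torch.randn_like(target))
        x = mask * noised_target + (1 - mask) * x

        # one standard DDPM reverse step on the merged frame
        eps = denoiser(x, t, prev_frame)
        mean = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return mask * target + (1 - mask) * x    # hard-enforce the constraint at the end
```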



Image 3
The controllers are trained with RL to perform motion-inpainting on the base model. For a motion frame x, a controller intervenes in the standard DDPM denoising process by predicting a residual dx and setting x_0 = x_0 + dx at a few pre-determined denoising steps.
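A minimal sketch of this intervention is shown below, assuming an epsilon-prediction DDPM and a hypothetical policy `policy(x0, prev_frame, goal)` that outputs the residual dx; the set of intervention steps and all names are illustrative assumptions, not the exact implementation.

```python
# Hedged sketch of the controller intervention described in the caption above:
# at a few pre-determined denoising steps, an RL policy predicts a residual dx
# that is added to the current clean-frame estimate x_0 before re-noising.
import torch

T = 50
betas = torch.linspace(1e-4, 0.02, T)        # assumed linear schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def controlled_next_frame(denoiser, policy, prev_frame, goal,
                          intervene_steps=(10, 5, 0)):
    x = torch.randn_like(prev_frame)
    for t in reversed(range(T)):
        eps = denoiser(x, t, prev_frame)
        # clean-frame estimate implied by the noise prediction
        x0 = (x - torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alpha_bars[t])
        if t in intervene_steps:
            x0 = x0 + policy(x0, prev_frame, goal)    # x_0 <- x_0 + dx
        if t > 0:
            # DDPM posterior q(x_{t-1} | x_t, x_0) around the (edited) x_0
            ab_prev = alpha_bars[t - 1]
            c0 = torch.sqrt(ab_prev) * betas[t] / (1 - alpha_bars[t])
            ct = torch.sqrt(alphas[t]) * (1 - ab_prev) / (1 - alpha_bars[t])
            x = c0 * x0 + ct * x + torch.sqrt(betas[t]) * torch.randn_like(x)
        else:
            x = x0
    return x
```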

Conclusion

In this work, we present an auto-regressive diffusion model for kinematics-based motion synthesis. We demonstrate that a first-order auto-regressive motion synthesis model can generate character motion with excellent stability and diversity using only a small number of training clips. We also examine different control strategies for synthesizing controllable motion sequences, and we propose an efficient RL-based control methodology that trains policy networks for locomotion tasks and generates high-quality motions.