Synthesizing Physically Plausible Human Motions
in 3D Scenes

Liang Pan¹   Jingbo Wang²   Buzhen Huang¹   Junyu Zhang¹
Haofan Wang³   Xu Tang³   Yangang Wang¹

¹Southeast University   ²Shanghai AI Lab   ³Xiaohongshu Inc.

3DV 2024

[Paper]   [Code]

Synthesizing physically plausible human motions in 3D scenes is a challenging problem. Kinematics-based methods cannot avoid inherent artifacts (e.g., penetration and foot skating) due to the lack of physical constraints. Meanwhile, existing physics-based methods cannot generalize to multi-object scenarios, since a policy trained with reinforcement learning has limited modeling capacity. In this work, we present a framework that enables physically simulated characters to perform long-term interaction tasks in diverse, cluttered, and unseen scenes. The key idea is to decompose human-scene interactions into two fundamental processes, Interacting and Navigating, which motivates us to construct two reusable controllers, i.e., InterCon and NavCon. Specifically, InterCon contains two complementary policies that enable characters to enter and leave the interacting state (e.g., sitting on a chair and getting up). To generate interactions with objects at different places, we further design NavCon, a trajectory-following policy, to keep the character's locomotion within the free space of 3D scenes. Benefiting from this divide-and-conquer strategy, we can train the policies in simple environments and generalize to complex multi-object scenes. Experimental results demonstrate that our framework can synthesize physically plausible long-term human motions in complex 3D scenes.


Previous works [1, 2]


Existing physics-based frameworks cannot generalize to multi-object scenarios due to the lack of two important abilities:
(1-top) continuous interaction, (2-bottom) obstacle avoidance.


The interaction controller (InterCon) consists of two separate control policies that provide two interaction skills, i.e., sitting and getting up. The navigation controller (NavCon) employs a trajectory-following policy that steers the character's movements along a specified path. The two reusable controllers are then combined to synthesize human motions in complex 3D scenes without additional training: a finite state machine receives user instructions and switches between the controllers, enabling the simulated character to perform long-term interaction tasks.
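The composition described above can be sketched as a small finite state machine. This is a minimal illustration, not the authors' implementation: the state names, method signatures, and the `near_target` signal are all hypothetical, standing in for NavCon's trajectory following and InterCon's paired sit / get-up policies.

```python
# Hypothetical sketch of the finite state machine that sequences the
# pretrained controllers. NavCon follows a planned trajectory toward the
# target object; InterCon's two policies enter (sit) and leave (get up)
# the interacting state. All names here are illustrative assumptions.
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    NAVIGATE = auto()   # NavCon: follow trajectory toward the target object
    SIT = auto()        # InterCon: enter the interacting state
    GET_UP = auto()     # InterCon: leave the interacting state

class FiniteStateMachine:
    def __init__(self):
        self.state = State.IDLE

    def step(self, instruction, near_target):
        """Advance one transition from a user instruction.

        instruction: 'sit' or 'get_up'
        near_target: True once the character has reached the object
        """
        if instruction == "sit":
            # First navigate to the object, then hand over to InterCon.
            if self.state in (State.IDLE, State.GET_UP):
                self.state = State.NAVIGATE
            if self.state is State.NAVIGATE and near_target:
                self.state = State.SIT
        elif instruction == "get_up":
            if self.state is State.SIT:
                self.state = State.GET_UP
        return self.state
```

A "sit" instruction thus first activates NavCon until the character is near the object, then switches to InterCon's sitting policy; "get up" triggers the complementary policy, after which the character is free to navigate again.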

Synthesized Results in Diverse 3D Scenes

Scalability of Our Framework

By training an additional interaction controller, our framework can be extended to new actions (e.g., lying down), enabling physically simulated characters to interact with objects in more diverse ways.


Citation

@inproceedings{pan2024synthesizing,
    title={Synthesizing Physically Plausible Human Motions in 3D Scenes},
    author={Liang Pan and Jingbo Wang and Buzhen Huang and Junyu Zhang and Haofan Wang and Xu Tang and Yangang Wang},
    booktitle={International Conference on 3D Vision (3DV)},
    year={2024}
}
Related Projects

We sincerely thank HuMoR for its awesome renderer. This project page template is based on this page.