Physics-informed Modeling of Dynamic Humans and their Interactions

Speaker

Shashank Tripathi is a final-year PhD student at the Max Planck Institute for Intelligent Systems (MPI-IS), where he is advised by Prof. Michael Black. Previously, he worked as an Applied Scientist at Amazon. Shashank earned his Master's degree from the Robotics Institute at Carnegie Mellon University, under the supervision of Prof. Kris Kitani. His research broadly lies at the intersection of computer vision, machine learning, and computer graphics, with a particular focus on 3D modeling of human bodies, modeling human-object interactions, and physics-inspired human motion understanding.

Abstract

Humans constantly interact with the physical world, and such interactions are governed by the laws of physics. To understand humans and their actions, computers need automatic methods to reconstruct and model the body in 3D. State-of-the-art (SOTA) 3D human pose estimation methods have made rapid progress, estimating 3D humans that align well with image features in the camera view. Similarly, there has been rapid progress in training models that generate human motions either unconditionally or conditioned on text or previous motions. However, most methods ignore the fact that people move in a scene, interact with it, and receive physical support by contacting it. This is a deal-breaker for inherently 3D applications such as biomechanics, augmented/virtual reality (AR/VR), interactive entertainment, and gaming. In this talk, I will discuss our work towards integrating differentiable physics and biomechanics into data-driven training. Specifically, I will introduce IPMAN, a 3D human pose estimation method that leverages novel intuitive-physics terms to estimate physically plausible 3D bodies from a color image in a “stable” configuration. Next, I will discuss our recent work on shape-conditioned human motion generation, which extends IPMAN’s differentiable physics terms to dynamic humans. Lastly, I’ll present DECO, a novel method for accurately estimating 3D human-scene and human-object contact in complex scenarios depicted in real-world images. Trained on the DAMON dataset, DECO leverages detailed vertex-level annotations to effectively model physical interactions. I’ll also discuss how we curated and scaled the 3D contact annotations in the DAMON dataset, which are tailored for real-world images.

Video

Coming soon. Stay tuned. :-)