SLoMo: A General System for Legged Robot Motion Imitation from Casual Videos

John Zhang, Shuo Yang, Gengshan Yang, Arun Bishop, Swaminathan Gurumurthy, Deva Ramanan, Zachary Manchester

A collection of video-to-robot motion transfer demonstrations on the Unitree Go1 quadrupedal robot in hardware (left three) and the Atlas humanoid robot in simulation (right). From left to right: a dog reaching for a water feeder with one of its front feet, a house cat pacing gracefully across the living room floor, a trained dog performing a cardiopulmonary resuscitation (CPR) exercise on its human partner during a competition routine, and a human stretching his body and limbs.

Abstract

We present SLoMo: a first-of-its-kind framework for transferring skilled motions from casually captured video footage of humans and animals to legged robots. SLoMo works in three stages: 1) synthesize a physically plausible reconstructed key-point trajectory from monocular videos; 2) optimize, offline, a dynamically feasible reference trajectory for the robot that closely tracks the key points and includes body motion, foot motion, and contact sequences; 3) track the reference trajectory online using a general-purpose model-predictive controller on robot hardware. Traditional motion imitation for legged motor skills often requires expert animators, collaborative demonstrations, and/or expensive motion capture equipment, all of which limit scalability. Instead, SLoMo relies only on easy-to-obtain monocular video footage, readily available in online repositories such as YouTube. It converts videos into motion primitives that can be executed reliably by real-world robots. We demonstrate our approach by transferring the motions of cats, dogs, and humans to example robots including a quadruped (on hardware) and a humanoid (in simulation). To the best of the authors' knowledge, this is the first attempt at a general-purpose motion transfer framework that imitates animal and human motions on legged robots directly from casual videos without artificial markers or labels.
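The data flow between the three stages can be illustrated with a toy sketch. This is not the SLoMo implementation: every function name, parameter, and number below is hypothetical, the "video" is a short list of one-dimensional key-point heights, a simple rate limiter stands in for the offline trajectory optimization, and a proportional feedback loop stands in for the model-predictive controller. It only shows how each stage consumes the output of the previous one.

```python
from typing import List, Tuple


def reconstruct_keypoints(video_frames: List[float]) -> List[float]:
    """Stage 1 stand-in (hypothetical): assume each frame already yields
    one key-point height in meters. Real reconstruction from monocular
    video is far more involved and is not attempted here."""
    return list(video_frames)


def optimize_reference(keypoints: List[float], max_step: float) -> List[float]:
    """Stage 2 stand-in (hypothetical): a rate limiter, NOT trajectory
    optimization. It bounds the change between consecutive samples so the
    reference respects a toy velocity limit while following the key points."""
    reference = [keypoints[0]]
    for target in keypoints[1:]:
        delta = target - reference[-1]
        delta = max(-max_step, min(max_step, delta))  # clamp to the limit
        reference.append(reference[-1] + delta)
    return reference


def track_online(reference: List[float], gain: float, start: float) -> List[float]:
    """Stage 3 stand-in (hypothetical): proportional feedback on a toy
    first-order system, NOT a model-predictive controller."""
    state = start
    states = []
    for ref in reference:
        state += gain * (ref - state)  # move a fraction of the way to the reference
        states.append(state)
    return states


def run_pipeline(
    video_frames: List[float], max_step: float = 0.05, gain: float = 0.5
) -> Tuple[List[float], List[float], List[float]]:
    """Chain the three stand-in stages: key points -> reference -> tracked states."""
    keypoints = reconstruct_keypoints(video_frames)
    reference = optimize_reference(keypoints, max_step)
    states = track_online(reference, gain, start=reference[0])
    return keypoints, reference, states


if __name__ == "__main__":
    # Made-up key-point heights with one outlier frame (0.50).
    frames = [0.30, 0.31, 0.50, 0.32, 0.33, 0.33, 0.33]
    keypoints, reference, states = run_pipeline(frames)
    for k, r, s in zip(keypoints, reference, states):
        print(f"keypoint={k:.3f}  reference={r:.3f}  tracked={s:.3f}")
```

The split mirrors the structure described above: the expensive steps (reconstruction and reference optimization) run offline, and only the tracking loop runs online against a fixed reference.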

arXiv | Code | Video

Experiment Videos

Dog Reach

Cat Walk

Dog CPR

NEW: Dog Crazyflie

NEW: Cat Walk 2

Human Stretch

Human Wave

Human Jumping Jack

Dog Reach Comparison

Yellow: Reconstructed key-point reference. Teal: Optimized reference. Purple: Imitation learning policy.

Cat Pace Comparison

Yellow: Reconstructed key-point reference. Teal: Optimized reference. Purple: Imitation learning policy.

Random Seed Comparison

Imitation learning policies with 4 different random seeds.