RoboTurk — A System for Large-Scale Teleoperation of Robots

Learning Multi-Arm Manipulation Through Collaborative Teleoperation

Albert Tung*¹, Josiah Wong*¹, Ajay Mandlekar¹, Roberto Martín-Martín¹, Yuke Zhu², Li Fei-Fei¹, Silvio Savarese¹

* These authors contributed equally. ¹Stanford Vision and Learning Lab. ²The University of Texas at Austin.

Imitation Learning (IL) is a powerful paradigm for teaching robots to perform manipulation tasks by letting them learn from human demonstrations collected via teleoperation, but it has mostly been limited to single-arm manipulation. Many real-world tasks, however, require multiple arms, such as lifting a heavy object or assembling a desk. Applying IL to multi-arm manipulation tasks has been challenging: asking a human to control more than one robotic arm can impose a significant cognitive burden and is often only possible for a maximum of two robot arms.

To address these challenges:

We present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks.
We show that learning from such data presents challenges for centralized agents that directly attempt to model all robot actions simultaneously.
We propose and evaluate a base-residual policy framework that allows trained policies to better adapt to the mixed coordination setting common in multi-arm manipulation, and show that a centralized policy augmented with a decentralized residual model outperforms all other models on our set of benchmark tasks.
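
The idea of a centralized base policy corrected by decentralized residuals can be illustrated with a minimal sketch. This is not the authors' implementation: the linear "policies", the `tanh` bounding, the `residual_scale` parameter, and all function names are assumptions made only for illustration. The base policy sees the observations of every arm at once, while each residual sees only its own arm's observation and adds a small, bounded correction to that arm's share of the base action.

```python
import numpy as np


def linear_policy(weights, obs):
    """Toy stand-in for a learned policy: a single linear map (assumption)."""
    return weights @ obs


def base_residual_action(base_w, residual_ws, arm_obs, residual_scale=0.1):
    """Combine a centralized base action with per-arm decentralized residuals.

    base_w:         weights of the centralized base policy (hypothetical)
    residual_ws:    one weight matrix per arm for the residual policies (hypothetical)
    arm_obs:        list of per-arm observation vectors
    residual_scale: bound on each residual component (illustrative choice)
    """
    n_arms = len(arm_obs)

    # Centralized base: conditions on the concatenated observations of all arms.
    joint_obs = np.concatenate(arm_obs)
    base_per_arm = np.split(linear_policy(base_w, joint_obs), n_arms)

    actions = []
    for w, obs, a_base in zip(residual_ws, arm_obs, base_per_arm):
        # Decentralized residual: conditions only on this arm's observation.
        # tanh keeps the correction within [-residual_scale, residual_scale].
        residual = residual_scale * np.tanh(linear_policy(w, obs))
        actions.append(a_base + residual)

    return np.concatenate(actions)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = [rng.normal(size=3), rng.normal(size=3)]       # two arms, 3-D observations
    base_w = rng.normal(size=(4, 6))                     # 2 arms x 2-D actions
    residual_ws = [rng.normal(size=(2, 3)) for _ in range(2)]
    print(base_residual_action(base_w, residual_ws, obs))
```

Bounding the residual keeps the combined action close to the centralized prediction, so in this sketch the per-arm terms can only make small local adjustments; whether and how the residual is bounded in the actual system is not stated here.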