The RoboTurk Real Robot Dataset

We collected a large-scale dataset on three different real-world tasks: Laundry Layout, Tower Creation, and Object Search. All three datasets were collected remotely by crowdsourced workers through the RoboTurk platform. The dataset consists of 2144 demonstrations from 54 unique users. We provide both the complete dataset for training and smaller subsamples for exploration.

Unlike datasets that consist largely of 2D manipulation tasks, this comprehensive and diverse dataset includes tasks with complex 3D motions, so it can be utilized for similar 3D manipulation tasks. Furthermore, our tasks are long-horizon, so prediction models must be able to reason about different parts of a task given some context or history. In addition, our dataset can be used for action-conditioned video prediction for model-based predictive control [1] or for action-free video prediction [2].

We will describe the structure of the dataset in the sections below and demonstrate video prediction results.

Laundry Layout (17.2 GB)
Object Search (18.8 GB)
Tower Creation (18.8 GB)

We provide a GitHub repository that includes scripts for exploring the dataset and for training video prediction models:

After You Download

After unzipping the dataset, the following files can be found within each task directory. Every directory has the same structure, described below:

{task_name}_aligned_dataset.hdf5: a postprocessed, aligned set of data that contains control data from the user and joint data from the robot, along with the corresponding timestamps, with the structure described in the next section.
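The alignment between the user's control stream and the robot's joint stream is timestamp-based. As an illustration only (this is not the actual preprocessing code, and the sampling rates are hypothetical), a nearest-timestamp matcher of the kind used to build such aligned datasets can be sketched with the standard library:

```python
import bisect

def align_to_nearest(ref_ts, query_ts):
    """For each query timestamp, return the index of the nearest
    reference timestamp (ref_ts must be sorted ascending)."""
    idx = []
    for t in query_ts:
        i = bisect.bisect_left(ref_ts, t)
        if i == 0:
            idx.append(0)
        elif i == len(ref_ts):
            idx.append(len(ref_ts) - 1)
        else:
            # pick whichever neighbor is closer in time
            idx.append(i if ref_ts[i] - t < t - ref_ts[i - 1] else i - 1)
    return idx

# hypothetical timestamps (seconds): robot joint states at ~30 Hz,
# user control messages arriving at irregular rates
joint_ts = [0.000, 0.033, 0.066, 0.100, 0.133]
control_ts = [0.01, 0.09, 0.14]
print(align_to_nearest(joint_ts, control_ts))  # → [0, 3, 4]
```

Each control message is thus paired with the joint state recorded closest to it in time, which is what makes a single per-timestep record possible in the aligned HDF5 file.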

{task_name}.zip: a postprocessed, aligned set of folders that, for each demonstration, contains three MP4 videos: depth video from the Kinect (kinect_depth_aligned), containing monochromatic data from the Kinect depth sensor; RGB video from the Kinect (kinect_rgb_aligned); and RGB video from the webcam (usb_aligned).
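Assuming one folder per demonstration with one MP4 per stream (the exact layout and demo ID format below are assumptions — check the unzipped archive and the repository scripts), mapping a demo to its three video files is a simple path construction:

```python
from pathlib import Path

# Stream names from the dataset description; the directory layout
# ({task_name}/{demo_id}/{stream}.mp4) is an assumption for illustration.
STREAMS = ("kinect_depth_aligned", "kinect_rgb_aligned", "usb_aligned")

def demo_video_paths(root, task_name, demo_id):
    """Map a demonstration ID to its three expected video files."""
    demo_dir = Path(root) / task_name / demo_id
    return {s: demo_dir / f"{s}.mp4" for s in STREAMS}

paths = demo_video_paths("data", "laundry_layout", "demo_0001")
print(paths["usb_aligned"])  # data/laundry_layout/demo_0001/usb_aligned.mp4
```

The returned paths can then be handed to any MP4 reader (e.g. OpenCV's `VideoCapture`) to extract frames per demonstration.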

The file structure of the video data is as follows, and we provide scripts below that map the demos to the video files:

Video Prediction

We show results from our experiments with Stochastic Variational Video Prediction to demonstrate the use of our dataset for action-conditioned video prediction, which has possible applications in model predictive control. The scripts to reproduce these experiments are described in the GitHub wiki.
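Action-conditioned models of this kind are typically trained on sliding windows of (context frames, actions, target frames) cut from each demonstration. As a rough sketch — the window lengths and action indexing convention here are illustrative, not those of our training scripts:

```python
def make_windows(frames, actions, context=2, horizon=3):
    """Slice one demonstration into training examples for
    action-conditioned video prediction: the model sees `context`
    past frames plus the actions over the prediction horizon, and
    must predict the next `horizon` frames.

    Convention assumed here: actions[i] is the action applied
    between frames[i] and frames[i + 1]."""
    examples = []
    for t in range(context, len(frames) - horizon + 1):
        examples.append({
            "context_frames": frames[t - context:t],
            "actions": actions[t - 1:t - 1 + horizon],
            "target_frames": frames[t:t + horizon],
        })
    return examples

# toy stand-ins: integers for frames, letters for actions
frames = list(range(6))
actions = list("abcde")
print(len(make_windows(frames, actions, context=2, horizon=2)))  # → 3
```

An action-free variant of the same pipeline simply drops the `"actions"` entry, which is why the dataset supports both settings mentioned above.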

Qualitative Results

[Video: predictions on Laundry Layout alongside ground truth]
[Video: predictions on Tower Creation alongside ground truth]