Collect Data


We provide 1,000 pre-collected trajectories per task as part of the open-source RoboSynChallenge Dataset. The datasets are hosted on HuggingFace and are available here.

However, we still strongly recommend that you collect data yourself.

bash launch/run_task.sh {task_name} [random|clear] [3_0|2_1] [Other Extra Arguments]
# View supported tasks and extra arguments: bash launch/run_task.sh -h
# bash launch/run_task.sh click_bell clear 3_0
# Collects data for the click_bell task without domain randomization and stores it in the LeRobot 3.0 format.
# bash launch/run_task.sh mixer_operating random 2_1
# Collects data for the mixer_operating task with domain randomization and converts it to the LeRobot 2.1 format.
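To collect several tasks with the same settings, the command above can be wrapped in a small loop. The following is a minimal sketch, not part of the release: the function name collect_tasks and the DRY_RUN variable are hypothetical, and only the three positional arguments shown above are passed through. By default it only prints the commands it would run.

```shell
# Sketch: run launch/run_task.sh for several tasks in sequence.
# collect_tasks and DRY_RUN are hypothetical names introduced for illustration.
collect_tasks() {
    local mode="$1"      # random | clear
    local format="$2"    # 3_0 | 2_1
    shift 2

    # Reject values that the usage line above does not list.
    case "${mode}" in
        random|clear) ;;
        *) echo "unknown mode: ${mode}" >&2; return 2 ;;
    esac
    case "${format}" in
        3_0|2_1) ;;
        *) echo "unknown format: ${format}" >&2; return 2 ;;
    esac

    local task
    for task in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): print the command instead of running it.
            echo "bash launch/run_task.sh ${task} ${mode} ${format}"
        else
            # Stop at the first task whose collection fails.
            bash launch/run_task.sh "${task}" "${mode}" "${format}" || return 1
        fi
    done
}
```

For example, `DRY_RUN=0 collect_tasks clear 3_0 click_bell mixer_operating` would collect both tasks without domain randomization; leaving DRY_RUN unset only lists the commands.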

Once collection finishes, the data is stored under lerobot_dataset/{task_name}/.
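A quick way to confirm that a run produced output is to check that the task directory exists and is not empty. This is a minimal sketch: check_dataset_dir is a hypothetical helper, and it makes no assumption about which files the directory should contain.

```shell
# Sketch: verify that lerobot_dataset/{task_name}/ exists and is non-empty.
# check_dataset_dir is a hypothetical helper, not part of the release.
check_dataset_dir() {
    local task="$1"
    local root="${2:-lerobot_dataset}"   # output root named in the text above
    local dir="${root}/${task}"

    if [ -d "${dir}" ] && [ -n "$(ls -A "${dir}")" ]; then
        echo "ok: ${dir}"
    else
        echo "missing or empty: ${dir}" >&2
        return 1
    fi
}
```

Calling `check_dataset_dir click_bell` from the repository root checks lerobot_dataset/click_bell/; the optional second argument overrides the root directory.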

If you want to convert a dataset from the LeRobot 3.0 format to the LeRobot 2.1 format manually, we also provide a ready-made conversion script:

python scripts/convert_lerobot3.0_to_2.1.py --repo-id {repo_id} --root /path/to/datasets

For pre-collected simulated and real datasets, see Download Data.

If you want to train on multiple datasets together (e.g., multi-task training, or mixed training on simulated and real data), use the lerobot-edit-dataset tool or the helper script launch/collect_combined_dataset.sh.