DexScale: Automating Data Scaling for Sim2Real Generalizable Robot Skills

Guiliang Liu1*, Yueci Deng2*, Runyi Zhao1, Huayi Zhou1, Jian Chen2, Jietao Chen2, Ruiyan Xu1, Yunxin Tai2, Kui Jia1,2†
1The Chinese University of Hong Kong, Shenzhen
2Dexforce Co. Ltd
*Equal Contribution; †Corresponding author: kuijia@cuhk.edu.cn

Abstract

A critical prerequisite for achieving generalizable robot control is the availability of a large-scale robot training dataset. Due to the expense of collecting realistic robotic data, recent studies have explored simulating and recording robot skills in virtual environments. While simulated data can be generated at higher speed, lower cost, and larger scale, its applicability remains questionable due to the gap between simulated and realistic environments.

To advance Sim2Real generalization, we present DexScale, a data engine designed to automatically simulate and scale skills for learning deployable robot manipulation policies. Specifically, DexScale ensures the usability of simulated skills by integrating diverse forms of realistic data into the simulated environment while preserving semantic alignment with the target tasks.

For each simulated skill in the environment, DexScale facilitates effective Sim2Real data scaling by automating the processes of domain randomization and adaptation. Tuned on the scaled dataset, the control policy achieves zero-shot Sim2Real generalization across diverse tasks, multiple robot embodiments, and widely studied policy model architectures, highlighting its importance in advancing Sim2Real embodied intelligence.
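For intuition, the sketch below shows how per-episode domain randomization could be sampled before rendering each simulated rollout. The parameter names, ranges, and sampling function are illustrative assumptions, not the actual DexScale implementation.

import random

# Hypothetical scene-parameter ranges; the real DexScale randomization
# schedule and parameter set are not specified here.
RANDOMIZATION_RANGES = {
    "light_intensity": (0.4, 1.6),    # relative to nominal lighting
    "camera_jitter_cm": (-2.0, 2.0),  # wrist-camera translation noise
    "table_texture_id": (0, 49),      # index into a texture library
    "object_scale": (0.9, 1.1),       # uniform scaling of the target object
}

def sample_domain_randomization(rng):
    # Draw one randomized scene configuration for a simulated episode.
    cfg = {}
    for name, (low, high) in RANDOMIZATION_RANGES.items():
        if isinstance(low, int) and isinstance(high, int):
            cfg[name] = rng.randint(low, high)   # discrete choice (e.g. texture id)
        else:
            cfg[name] = rng.uniform(low, high)   # continuous perturbation
    return cfg

# Example: configurations for a batch of simulated rollouts.
rng = random.Random(0)
episode_configs = [sample_domain_randomization(rng) for _ in range(1000)]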

Data Scaling Pipeline


As a data engine, DexScale takes task-descriptive data as input and generates a skill dataset to support Sim2Real transfer. This enables the zero-shot deployment of robot policies in realistic environments.
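A minimal Python sketch of what such a data-engine interface could look like is given below. The TaskDescription, SkillDataset, and generate_skill_dataset names and their fields are hypothetical; the sketch only illustrates the input-to-output flow, not the released DexScale API.

from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskDescription:
    # Task-descriptive input (illustrative fields): what to manipulate and how.
    task: str                 # e.g. "grasping", "open-box", "re-arrangement"
    object_assets: List[str]  # paths to scanned or CAD object models
    robot: str                # e.g. "single-arm" or "dual-arm"

@dataclass
class SkillDataset:
    # Output skill dataset: recorded, randomized simulated trajectories.
    trajectories: list = field(default_factory=list)

def generate_skill_dataset(desc, num_episodes):
    # Hypothetical engine loop: build a scene semantically aligned with the
    # task description, apply domain randomization/adaptation, then roll out
    # and record the skill trajectory.
    dataset = SkillDataset()
    for episode in range(num_episodes):
        dataset.trajectories.append({"task": desc.task, "episode": episode})
    return dataset

demo = generate_skill_dataset(
    TaskDescription(task="grasping", object_assets=["mug.obj"], robot="single-arm"),
    num_episodes=10,
)
print(len(demo.trajectories))  # 10 recorded (placeholder) trajectories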

Examples of Data Scaling

Figures:

Data Scaling for Object Grasping

Data Scaling for Object Manipulation & Object Re-arrangement


Videos:

Data Scaling for Object Grasping

Data Scaling for Object Manipulation

Data Scaling for Object Re-arrangement

Data Scaling for Object Grasp-then-Manipulation

Figures and Videos above illustrate examples of action trajectories for the tasks of object grasping, box manipulation, and table rearrangement. To demonstrate the scalability of DexScale, the control policies are deployed on different robots, including two single-arm robots and a dual-arm robot equipped with wrist-mounted cameras.

Experiment Results

1. Success rates of imitation policies trained on different datasets under various Sim2Real gaps. For the first eight domain gaps, we employ the transformer-based policy ACT (Zhao2023ACT) to tackle grasping tasks; for the last two domain gaps, we use the diffusion-based policy (Chi2023DiffusionPolicy) to address the open-box task. A sketch of the success-rate evaluation loop follows the table.

Table showing success rates
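For reference, a success rate of the kind reported above can be computed with a simple evaluation loop such as the sketch below; policy, make_env, and the episode interface are placeholder assumptions standing in for the ACT / Diffusion Policy rollout code.

def evaluate_success_rate(policy, make_env, num_trials=50):
    # Run the policy for num_trials episodes and return the fraction of successes.
    # `policy` maps observations to actions (e.g. ACT or Diffusion Policy inference);
    # `make_env` builds one environment instance for the domain gap being tested.
    successes = 0
    for _ in range(num_trials):
        env = make_env()
        obs = env.reset()
        done = False
        info = {}
        while not done:
            action = policy(obs)
            obs, done, info = env.step(action)
        successes += int(info.get("success", False))
    return successes / num_trials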

2. Robot control performance on different tasks in both realistic (upper) and simulated (lower) environments. The Re-Orientation task is tackled by a diffusion-based Vision-Language-Action (VLA) model (Liu2024RDT) trained on simulation data.

Experiment 2 Bar Plot

3. Real-world deployment videos of policies trained on DexScale datasets for different tasks.

Object Grasp

Object Manipulation

Object Rearrangement

Object Grasp-then-Manipulation

Citation

If you find this work useful, please cite:
@inproceedings{liu2025dexscale,
  title={DexScale: Automating Data Scaling for Sim2Real Generalizable Robot Control},
  author={Guiliang Liu and Yueci Deng and Runyi Zhao and Huayi Zhou and Jian Chen and Jietao Chen and Ruiyan Xu and Yunxin Tai and Kui Jia},
  booktitle={International Conference on Machine Learning, ICML},
  year={2025},
}