Automating household chores is a long-standing goal in robotics. Many roboticists are working toward machines that learn generalist policies able to handle tasks in any environment. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are taking a different route: training robot policies tailored to one specific environment, with the aim of making household robots more effective and practical.
“We aim for robots to perform exceptionally well under disturbances, distractions, varying lighting conditions, and changes in object poses, all within a single environment,” says Marcel Torne Villasevil, MIT CSAIL research assistant in the Improbable AI Lab and lead author on a recent paper about the work.
“We propose a method to create digital twins on the fly using the latest advances in computer vision. With just their phones, anyone can capture a digital replica of the real world, and the robots can train in a simulated environment much faster than the real world, thanks to GPU parallelization. Our approach eliminates the need for extensive reward engineering by leveraging a few real-world demonstrations to jump-start the training process.”
Using RialTo involves several steps. It starts with scanning the target environment using tools such as NeRFStudio, ARCode, or Polycam. Once the scene is reconstructed, users can fine-tune it and customize the robots before exporting the refined scene into the simulator. Real-world demonstrations, recorded as actions and observations, are then used to develop a policy in simulation, where they provide valuable data for reinforcement learning.
“This helps in creating a strong policy that works well in both the simulation and the real world. An enhanced algorithm using reinforcement learning helps guide this process, to ensure the policy is effective when applied outside of the simulator,” says Torne.
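The idea of using a few demonstrations to jump-start training, then refining the policy in simulation, can be illustrated with a toy example. The sketch below is not RialTo's algorithm: it assumes a hypothetical one-dimensional reaching task, a policy with a single gain parameter, a least-squares fit to five made-up demonstrations, and simple random-search hill climbing standing in for the reinforcement learning step. The reward is sparse (success or failure at the end of an episode), so no reward shaping is involved.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    start: float          # randomized initial offset of the object from the goal
    pushes: List[float]   # physical disturbance applied at each time step


def make_episodes(n, horizon, rng):
    """Sample n simulated episodes with random start poses and one random push."""
    episodes = []
    for _ in range(n):
        pushes = [0.0] * horizon
        pushes[rng.randrange(horizon)] = rng.uniform(-0.3, 0.3)
        episodes.append(Episode(rng.uniform(-1.0, 1.0), pushes))
    return episodes


def rollout(gain, ep):
    """Run the policy action = -gain * state; return a sparse 0/1 reward."""
    x = ep.start
    for push in ep.pushes:
        x = x - gain * x + push
    return 1.0 if abs(x) < 0.05 else 0.0


def success_rate(gain, episodes):
    return sum(rollout(gain, ep) for ep in episodes) / len(episodes)


def clone_from_demos(demos):
    """Least-squares fit of action = -gain * state to (state, action) pairs."""
    num = sum(s * a for s, a in demos)
    den = sum(s * s for s, _ in demos)
    return -num / den


def fine_tune(gain, episodes, rng, iters=200, step=0.1):
    """Hill climbing: keep a perturbed gain only if it raises the success rate."""
    best = success_rate(gain, episodes)
    for _ in range(iters):
        candidate = gain + rng.gauss(0.0, step)
        score = success_rate(candidate, episodes)
        if score > best:
            gain, best = candidate, score
    return gain, best


rng = random.Random(0)
# Five hypothetical demonstrations from a cautious "expert" with gain 0.5.
demos = [(s, -0.5 * s) for s in (-1.0, -0.4, 0.3, 0.8, 1.0)]
bc_gain = clone_from_demos(demos)          # policy from imitation alone
sim = make_episodes(500, 6, rng)           # randomized simulated episodes
bc_rate = success_rate(bc_gain, sim)
rl_gain, rl_rate = fine_tune(bc_gain, sim, rng)
print(f"imitation only: gain={bc_gain:.2f}, success={bc_rate:.2f}")
print(f"after fine-tuning in simulation: gain={rl_gain:.2f}, success={rl_rate:.2f}")
```

Because a candidate is accepted only when it scores higher on the same set of simulated episodes, the fine-tuned policy can never do worse on that set than the imitation-only starting point. Whether the gain transfers to a real system is a separate question that this toy does not address.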
RialTo has produced robust policies for a range of tasks, in both controlled lab settings and less predictable real-world environments. It achieved a 67% improvement over imitation learning across tasks such as opening a toaster, placing a book on a shelf, and other everyday activities.
The system was tested under increasingly difficult conditions: randomized object poses, added visual distractors, and physical disturbances applied during task execution. When combined with real-world data, RialTo outperformed traditional imitation-learning methods, particularly in scenarios with many visual distractions or physical disruptions.
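As a rough illustration of what an escalating test protocol of this kind might look like, the sketch below samples trial configurations at three difficulty levels. The level numbering, value ranges, and field names are all assumptions made for the example and are not taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrialConfig:
    object_xy: Tuple[float, float]   # object position offset in meters
    object_yaw_deg: float            # object rotation in degrees
    num_distractors: int             # extra objects placed in view
    push_step: Optional[int]         # time step of a physical push, if any


def sample_trial(level, rng, horizon=50):
    """Sample one trial.

    Level 1 randomizes the object pose, level 2 adds visual distractors,
    and level 3 also schedules a physical disturbance.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2, or 3")
    xy = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
    yaw = rng.uniform(-30.0, 30.0)
    distractors = rng.randint(1, 5) if level >= 2 else 0
    push = rng.randrange(horizon) if level >= 3 else None
    return TrialConfig(xy, yaw, distractors, push)


rng = random.Random(0)
for level in (1, 2, 3):
    print(level, sample_trial(level, rng))
```

Each level keeps the perturbations of the previous one, so a drop in success rate between levels can be attributed to the newly added factor.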
“These experiments show that if we care about being very robust to one particular environment, the best idea is to leverage digital twins instead of trying to obtain robustness with large-scale data collection in diverse environments,” says Pulkit Agrawal, director of Improbable AI Lab, and senior author on the work.
As for its limitations, RialTo currently takes three days to train fully. To speed this up, the team is looking at improving the underlying algorithms and using foundation models. Training in simulation also has its own challenges: seamless sim-to-real transfer remains difficult, as does simulating deformable objects or liquids.
Looking ahead, the next phase of RialTo’s development will build on previous research to keep the model robust against various disturbances while improving its ability to adapt to new environments.
“Our next endeavor is this approach to using pre-trained models, accelerating the learning process, minimizing human input, and achieving broader generalization capabilities,” says Torne.
“We’re incredibly enthusiastic about our ‘on-the-fly’ robot programming concept, where robots can autonomously scan their environment and learn how to solve specific tasks in simulation. While our current method has limitations — such as requiring a few initial demonstrations by a human and significant compute time for training these policies (up to three days) — we see it as a significant step towards achieving ‘on-the-fly’ robot learning and deployment,” says Torne.
“This approach moves us closer to a future where robots won’t need a preexisting policy that covers every scenario. Instead, they can rapidly learn new tasks without extensive real-world interaction. In my view, this advancement could expedite the practical application of robotics far sooner than relying solely on a universal, all-encompassing policy.”
“To deploy robots in the real world, researchers have traditionally relied on methods such as imitation learning from expert data, which can be expensive, or reinforcement learning, which can be unsafe,” says Zoey Chen, a computer science PhD student at the University of Washington who wasn’t involved in the paper.
“RialTo directly addresses both the safety constraints of real-world RL [reinforcement learning] and efficient data constraints for data-driven learning methods, with its novel real-to-sim-to-real pipeline. This novel pipeline not only ensures safe and robust training in simulation before real-world deployment but also significantly improves the efficiency of data collection. RialTo has the potential to significantly scale up robot learning and allows robots to adapt to complex real-world scenarios much more effectively.”
“Simulation has shown impressive capabilities on real robots by providing inexpensive, possibly infinite data for policy learning,” adds Marius Memmel, a computer science PhD student at the University of Washington who wasn’t involved in the work.
“However, these methods are limited to a few specific scenarios, and constructing the corresponding simulations is expensive and laborious. RialTo provides an easy-to-use tool to reconstruct real-world environments in minutes instead of hours. Furthermore, it makes extensive use of collected demonstrations during policy learning, minimizing the burden on the operator and reducing the sim2real gap. RialTo demonstrates robustness to object poses and disturbances, showing incredible real-world performance without requiring extensive simulator construction and data collection.”
Journal reference:
- Marcel Torne, Anthony Simeonov, Zechu Li, April Chan, Tao Chen, Abhishek Gupta, Pulkit Agrawal. Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation. arXiv, 2024; DOI: 10.48550/arXiv.2403.03949