
MIT researchers are launching a new era in robotics, where general-purpose robots can learn a variety of tasks without starting from scratch each time. This development could soon make the versatile dexterity of Rosie, the robotic maid from "The Jetsons," a practical reality. The research group's technique leans heavily on heterogeneous data, a vast collection pulled from diverse sources and varying modalities, so that a robot can rapidly adapt to different environments or tasks, according to MIT News.
By combining this vast array of data, including input from vision sensors and robotic arm position encoders, the scientists have created a shared "language" that allows the AI model within a robot to process a wide range of commands. Lirui Wang, an electrical engineering and computer science graduate student at MIT and lead author of the paper, says, "Our work shows how you'd be able to train a robot with all of them put together," in a statement obtained by MIT News. In both simulation and real-world experiments, this method outperformed traditional training methods by more than 20 percent.
At the heart of this new architectural framework is a machine-learning model known as a transformer, which underlies many large language models like GPT-4. The transformer processes inputs from both vision and proprioception, encoding these varied data into a uniform structure of tokens. Once the data is seamlessly integrated, the transformer learns from it, enhancing its capabilities to handle a greater diversity of tasks.
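To make the idea of a uniform token structure concrete, here is a minimal sketch in plain Python. It is not the researchers' implementation: the token width, the feature sizes, the random projection weights, and the function names (`make_projection`, `project`, `tokenize`) are all assumptions chosen for illustration. It shows only the general pattern of mapping camera features and joint readings into same-width tokens that a transformer could then consume as a single sequence.

```python
import random

TOKEN_DIM = 4  # hypothetical shared token width, chosen for illustration


def make_projection(in_dim, out_dim, rng):
    """Random in_dim x out_dim matrix standing in for a learned, per-modality encoder."""
    return [[rng.uniform(-1, 1) for _ in range(out_dim)] for _ in range(in_dim)]


def project(features, weights):
    """Map each feature vector to a vector of the shared width (a plain matrix product)."""
    return [
        [sum(x * w for x, w in zip(vec, col)) for col in zip(*weights)]
        for vec in features
    ]


def tokenize(vision_patches, joint_readings, vision_w, proprio_w):
    """Encode both modalities separately, then join them into one token sequence."""
    return project(vision_patches, vision_w) + project(joint_readings, proprio_w)


rng = random.Random(0)
vision_w = make_projection(6, TOKEN_DIM, rng)   # 6 features per image patch (made up)
proprio_w = make_projection(3, TOKEN_DIM, rng)  # 3 joint angles (made up)

vision_patches = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
]
joint_readings = [[0.0, 1.57, -0.5]]

tokens = tokenize(vision_patches, joint_readings, vision_w, proprio_w)
print(len(tokens), "tokens, each of width", len(tokens[0]))
```

In this sketch the two inputs start with different sizes (six values per image patch, three per joint reading) but leave `tokenize` as rows of identical width, which is what lets one model treat them as a single sequence. In a trained system the projections would be learned rather than random.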
"In robotics, people often claim that we don’t have enough training data. But in my view, another big problem is that the data come from so many different domains, modalities, and robot hardware," Wang told MIT News. This new training method requires only a small subset of data specific to a user's robot setup to quickly to teach it new tasks
David Held, associate professor at the Carnegie Mellon University Robotics Institute, who was not involved with this work, commended the innovative approach, highlighting its ability to quickly adapt to new robot designs, which he deems crucial as the field sees continuous production of new robotic forms.
The researchers, funded by the Amazon Greater Boston Tech Initiative and the Toyota Research Institute, hold ambitious plans for the future. They aim to create a "universal robot brain" that could be downloaded and used without any preliminary training, according to MIT News. While the technology is still in its infancy, this development hints at a future where general-purpose robots might become an integral part of our everyday lives. The full details of the research were presented at the recent Conference on Neural Information Processing Systems.









