Robotic helper making mistakes? Just nudge it in the right direction
Imagine a scenario where you’re tidying up your kitchen, and a robot is assisting you with dishwashing chores. You instruct the robot to pick up a soapy bowl from the sink, but its gripper slightly misaligns with the target. What if you could correct its behavior with a simple nudge or a screen interaction? Thanks to a new framework developed by researchers from MIT and NVIDIA, that possibility is edging closer to reality.
The approach allows users to correct a robot’s actions through intuitive interactions, bypassing the cumbersome process of gathering new data and retraining the robot’s machine-learning model. You can simply point to the desired object, draw a path on a screen to guide it, or manually nudge its arm to redirect its actions. The framework lets a robot incorporate real-time human feedback and select the action sequence that best matches the user’s intent, improving its behavior out of the box without requiring technical adjustments from the user.
This novel framework achieved a remarkable 21 percent improvement in success rate compared to other methods that don’t benefit from direct human intervention, according to testing results. This advancement holds the potential to allow individuals to seamlessly guide a factory-configured robot to undertake a broad range of tasks across various home settings.
“We can’t expect laypeople to perform data collection and fine-tune a neural network model. The consumer will expect the robot to work right out of the box, and if it doesn’t, they would want an intuitive mechanism to customize it. That is the challenge we tackled in this work,” explains Felix Yanwei Wang, an electrical engineering and computer science graduate student and lead author of a paper on the method.
The research, a collaborative effort including Lirui Wang PhD ’24, Yilun Du PhD ’24, and senior author Julie Shah, who leads the Interactive Robotics Group at MIT, as well as Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Perez-D’Arpino PhD ’19, and Dieter Fox from NVIDIA, will be presented at the International Conference on Robotics and Automation.
Mitigating Misalignment
The quest for aligning robot actions with human intent has seen researchers leveraging pre-trained generative AI models to derive policies—a set of guiding rules that robots adhere to for task completion. These models are adept at solving varied complex tasks and generate valid trajectories based on feasible robot movements seen during training.
While these pathways are technically valid, they don’t always align seamlessly with user intentions in reality. For instance, a robot trained to carefully retrieve boxes from shelves might falter when confronted with a differently arranged bookshelf in a real-world setting. Traditionally, engineers would need to gather new task-specific data and retrain the model, a process that’s both costly and time-intensive.
The MIT research team, however, strived to let users refine the robot’s behavior during actual deployment, at the moment errors occur. Yet merely allowing human corrections could drive the robot into performing invalid actions, like knocking over books while reaching for a desired box.
“We want to allow the user to interact with the robot without introducing those kinds of mistakes, so we get a behavior that is much more aligned with user intent during deployment, but that is also valid and feasible,” states Wang.
User-Friendly Correction Techniques
The innovative framework presents three effortless pathways for users to correct a robot’s course: pointing to the target in the robot’s camera view interface, tracing a trajectory to it on the screen, or manually positioning the robot’s arm towards the target. Each approach carries unique advantages.
“When you are mapping a 2D image of the environment to actions in a 3D space, some information is lost. Physically nudging the robot is the most direct way of specifying user intent without losing any of the information,” Wang elaborates.
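To make the information-loss point concrete: turning a screen click into a 3D goal requires extra information the image alone doesn’t carry, typically a depth reading and the camera’s intrinsic parameters. The sketch below is purely illustrative (the function name and intrinsic values are assumptions, not from the paper) and uses the standard pinhole-camera back-projection.

```python
import numpy as np

def pixel_to_point(u, v, depth, fx, fy, cx, cy):
    """Back-project a clicked pixel (u, v) into a 3D point in the
    camera frame, given a depth reading and pinhole intrinsics
    (focal lengths fx, fy and principal point cx, cy)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# A click exactly at the principal point maps straight down the
# optical axis: without the depth value, the goal would be ambiguous.
goal = pixel_to_point(u=320, v=240, depth=0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```

A physical nudge sidesteps this step entirely, which is why it loses no information: the corrected arm pose already lives in 3D.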
Strategic Sampling for Success
To avoid leading the robot into an invalid action, like colliding with other objects, a specific sampling procedure is utilized. This enables the robot to choose actions that align closely with user intent from the set of feasible actions learned during training.
“Rather than just imposing the user’s will, we give the robot an idea of what the user intends but let the sampling procedure oscillate around its own set of learned behaviors,” Wang explains.
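The article doesn’t spell out the sampling procedure, but one common way to realize this kind of inference-time steering is best-of-N sampling: draw many candidate trajectories from the learned policy, discard any that are infeasible (for example, ones that collide with obstacles), and keep the feasible candidate whose endpoint lands closest to the user’s indicated goal. The sketch below is a toy version under those assumptions; the random walk stands in for a learned generative policy, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_policy(n, horizon=10):
    """Stand-in for a learned generative policy: n candidate 2D
    end-effector trajectories, each of shape (horizon, 2)."""
    return rng.normal(size=(n, horizon, 2)).cumsum(axis=1) * 0.05

def is_feasible(traj, obstacles, radius=0.1):
    """Reject any trajectory that passes within `radius` of an obstacle."""
    dists = np.linalg.norm(traj[:, None, :] - obstacles[None, :, :], axis=-1)
    return dists.min() > radius

def steer(user_goal, obstacles, n=256):
    """Best-of-N steering: sample from the policy's own distribution,
    filter out invalid motions, then pick the sample whose endpoint
    is closest to the goal the user indicated."""
    feasible = [t for t in sample_policy(n) if is_feasible(t, obstacles)]
    return min(feasible, key=lambda t: np.linalg.norm(t[-1] - user_goal))

obstacles = np.array([[0.3, 0.3]])
best = steer(user_goal=np.array([0.2, -0.1]), obstacles=obstacles)
```

The key property matches the quote: the user’s input only biases which learned behavior gets selected; every candidate still comes from the policy’s own set of valid motions.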
In tests and simulations within a toy kitchen environment, this approach allowed the framework to outperform other methods significantly. Though a task might not be instantly completed, users benefit from the ability to immediately tweak the robot’s path if they notice it erring, rather than waiting for it to finish its task before redirecting it.
With repeated nudges, the robot learns corrective actions over time, improving its future interactions by incorporating these adjustments into its operational repertoire. Essentially, continual improvement in robot behavior is made possible through continual user interaction.
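One simple way to picture “incorporating these adjustments into its operational repertoire” is a memory of past corrections, keyed by task context, that future action selection can consult. This toy class is purely illustrative, an assumption about the mechanism rather than the paper’s actual method.

```python
class CorrectionBuffer:
    """Toy memory of user corrections: remembers, per task context,
    the trajectory the user last steered the robot toward, so later
    sampling can be biased toward it without retraining."""

    def __init__(self):
        self._memory = {}

    def record(self, context, corrected_traj):
        # Overwrite with the most recent correction for this context.
        self._memory[context] = corrected_traj

    def prior(self, context):
        # Return the remembered correction, or None if the robot
        # has never been corrected in this context.
        return self._memory.get(context)

buf = CorrectionBuffer()
buf.record("grasp_soapy_bowl", [(0.10, 0.20), (0.15, 0.25)])
```

With such a memory, a previously corrected context no longer needs a fresh nudge: the stored trajectory can seed the next round of sampling.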
In looking ahead, the researchers aim to further enhance the speed of the sampling process while maintaining, if not improving, its efficacy. They are also keen to test robot policy generation within wholly new environments, broadening the landscape for potential applications.