ROS-LLM Connects Language Models like GPT with Robots via ROS

A team from Huawei, the Technical University of Darmstadt, and ETH Zürich presents ROS-LLM, an open-source framework that bridges advanced language models and the Robot Operating System (ROS). Published in Nature Machine Intelligence, it lets a user give a robot an order in natural language, such as "pick up the blue cup", and have the robot carry out the corresponding physical action without writing task-specific code. 🤖

A kitchen robot receives the order "pick up the blue cup" through a GPT text interface while its robotic arm moves toward the dishes.

How the framework works: from instruction to executable action ⚙️

ROS-LLM receives a task in natural language and breaks it down into a sequence of atomic steps that the robot can understand. It consults a library of skills preprogrammed by experts and generates planning and execution code compatible with ROS. The system supports direct, step-by-step, and confirmation-based execution. It can also learn new sequences by demonstration, combining base skills to solve unforeseen tasks.
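The flow described above can be illustrated with a minimal sketch in plain Python. This is not the ROS-LLM API: the skill names (`move_to`, `grasp`, `lift`), the `plan` function, and the `run` generator are all hypothetical, and the language model is replaced by a hard-coded rule so that the example runs without ROS or a network connection.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical skill library. In a real system each entry would wrap a
# ROS action or service; here each skill only returns a description.
def move_to(target: str) -> str:
    return f"moved to {target}"

def grasp(target: str) -> str:
    return f"grasped {target}"

def lift(target: str) -> str:
    return f"lifted {target}"

SKILLS: Dict[str, Callable[[str], str]] = {
    "move_to": move_to,
    "grasp": grasp,
    "lift": lift,
}

Step = Tuple[str, str]  # (skill name, argument)

def plan(task: str) -> List[Step]:
    """Stand-in for the language model (assumption): maps one known
    phrasing to a sequence of atomic steps drawn from SKILLS."""
    prefix = "pick up "
    if not task.lower().startswith(prefix):
        raise ValueError(f"no plan available for task: {task!r}")
    obj = task[len(prefix):].strip()
    if obj.lower().startswith("the "):
        obj = obj[4:]
    return [("move_to", obj), ("grasp", obj), ("lift", obj)]

def run(steps: List[Step],
        confirm: Optional[Callable[[str, str], bool]] = None) -> Iterator[str]:
    """Execute steps lazily, one per iteration.

    - Direct execution: consume the whole generator, e.g. list(run(steps)).
    - Step by step: call next() on the generator when ready for the next step.
    - Confirmation: pass a confirm callback; execution stops at the first
      step the callback rejects.
    """
    for name, arg in steps:
        if name not in SKILLS:
            raise KeyError(f"unknown skill: {name}")
        if confirm is not None and not confirm(name, arg):
            return
        yield SKILLS[name](arg)

if __name__ == "__main__":
    steps = plan("pick up the blue cup")
    for result in run(steps):
        print(result)
```

Modelling execution as a generator is only one design choice. It lets the same loop serve all three modes, since the caller decides whether to drain it at once, advance it manually, or gate each step behind a callback.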

Goodbye to manuals, hello to literal misunderstandings 😅

The promise is clear: you no longer need to learn to program a robotic arm. Just tell it what you want. That said, you have to be extremely precise. An order like "clean up this mess a bit" could end with the robot filing away the important report that was lying untidily on the table, or watering the plant with coffee. The era of accessible robotics also ushers in the era of semantic arguments with your metal assistant.