Who hasn't lost their keys or the remote control at home? A team of researchers has presented a robotic system designed to find them. It combines 3D mapping and computer vision with a language model trained on internet text. The model supplies the common sense needed to deduce where an object might be, so the robot searches in a logical order.
The fusion of perception and common sense 🤖
The system operates in two phases. First, the robot explores the room, building a 3D map in real time and identifying objects with its cameras. Then, when asked to find a specific item, it consults the language model. The model, trained on internet data, suggests likely locations: a mobile phone might be near a power outlet, a book on a table. The robot scans those areas first, which makes the search more efficient.
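To make the second phase concrete, here is a minimal Python sketch of how a robot might rank mapped landmarks by the model's suggestions. Everything in it is an assumption made for illustration: the article does not describe the researchers' code, so `MappedObject`, `suggest_locations`, `plan_search`, and the scores are hypothetical, and a small lookup table stands in for the real language model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MappedObject:
    """A landmark found during the exploration phase (hypothetical type)."""
    label: str
    position: Tuple[float, float, float]  # x, y, z in the 3D map, in meters


def suggest_locations(target: str) -> Dict[str, float]:
    """Hypothetical stand-in for the language model.

    Returns a plausibility score per landmark label. A real system would
    query a model here; these numbers are invented for the example.
    """
    priors = {
        "mobile phone": {"power outlet": 0.6, "table": 0.3, "sofa": 0.1},
        "book": {"table": 0.7, "sofa": 0.2, "power outlet": 0.1},
    }
    return priors.get(target, {})


def plan_search(target: str, semantic_map: List[MappedObject]) -> List[MappedObject]:
    """Order mapped landmarks from most to least plausible for the target.

    Landmarks the model says nothing about get a score of 0.0, so they are
    still visited, just last. The sort is stable, so ties keep map order.
    """
    scores = suggest_locations(target)
    return sorted(
        semantic_map,
        key=lambda obj: scores.get(obj.label, 0.0),
        reverse=True,
    )


if __name__ == "__main__":
    room = [
        MappedObject("sofa", (1.0, 2.0, 0.4)),
        MappedObject("table", (3.0, 1.0, 0.7)),
        MappedObject("power outlet", (0.0, 4.0, 0.3)),
    ]
    for stop in plan_search("mobile phone", room):
        print(stop.label, stop.position)
```

With this toy map, a search for a mobile phone starts at the power outlet, moves on to the table, and leaves the sofa for last. For an object the lookup table doesn't know, the robot simply falls back to the order in which it mapped the room.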
Goodbye to the excuse of "I had it here a minute ago" 😅
This development could mark the end of an era: the era of wandering around the house, opening drawers at random and blaming the house gnomes. Now a robotic witness with access to the collective wisdom of the internet will be able to point out, with data, that your charger usually ends up behind the sofa cushion. Get ready for your own lack of logic to be documented and analyzed by a machine.