DOMINO and PUMA: Advancing Dynamic Robotic Manipulation with VLA

Published on March 17, 2026 | Translated from Spanish

Vision-Language-Action (VLA) models dominate static manipulation, but their performance drops in dynamic scenarios with moving targets. This bottleneck has two causes: a lack of training data for dynamic scenes, and architectures that rely on single-instant observations, which limits their spatio-temporal reasoning. We present DOMINO, a large-scale dataset for dynamic manipulation, and PUMA, a VLA architecture that integrates historical optical flow for motion-aware perception. 🤖

Robotic arm interacting with a moving cube on a surface, illustrating dynamic manipulation.

Methodology: DOMINO Dataset and PUMA Architecture for Implicit Prediction 🧠

DOMINO is a comprehensive benchmark with 35 tasks spanning hierarchical levels of complexity, over 110,000 expert trajectories, and a multidimensional evaluation system. To leverage this data, we propose PUMA, an architecture that overcomes the single-observation limitation. PUMA integrates scene-centric historical optical flow with specialized world queries. This design couples history-aware perception with short-horizon prediction, allowing the model to implicitly infer the future states of moving objects, which is crucial for interacting with them successfully.
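To see why a single observation is not enough, consider a deliberately simplified sketch. It is not PUMA's implementation: PUMA learns its prediction implicitly from optical flow, whereas this toy tracks one object's 2D position and extrapolates it explicitly with a constant-velocity assumption. The class name `MotionHistory` and its methods are hypothetical, introduced only for illustration.

```python
from collections import deque


class MotionHistory:
    """Toy motion-aware buffer (hypothetical; not PUMA's API).

    Keeps the last `max_len` observed 2D positions of one object and
    extrapolates its position under a constant-velocity assumption.
    """

    def __init__(self, max_len=4):
        # Oldest positions are dropped automatically once the buffer is full.
        self.positions = deque(maxlen=max_len)

    def observe(self, xy):
        """Record the object's position at the current time step."""
        self.positions.append((float(xy[0]), float(xy[1])))

    def mean_displacement(self):
        """Average per-step displacement over the buffer, or None if unknown."""
        pts = list(self.positions)
        if len(pts) < 2:
            return None  # a single frame carries no motion information
        steps = len(pts) - 1
        dx = (pts[-1][0] - pts[0][0]) / steps
        dy = (pts[-1][1] - pts[0][1]) / steps
        return (dx, dy)

    def predict(self, horizon=1):
        """Extrapolate the position `horizon` steps ahead."""
        if not self.positions:
            raise ValueError("no observations yet")
        x, y = self.positions[-1]
        step = self.mean_displacement()
        if step is None:
            # With one observation, the best guess is "the object stays put".
            return (x, y)
        return (x + horizon * step[0], y + horizon * step[1])


history = MotionHistory(max_len=3)
history.observe((0.0, 0.0))
single_frame_guess = history.predict(horizon=2)  # no motion cue available

history.observe((1.0, 0.5))
history.observe((2.0, 1.0))
history_guess = history.predict(horizon=2)  # uses the observed displacement
```

With only the first frame, the prediction falls back to the current position, so a gripper would aim at where the object was. Once a short history is available, the same call returns a target ahead of the object along its path.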

Dynamic Awareness: An Improvement that Transcends the Dynamic ⚡

The results show that PUMA improves the success rate on dynamic tasks by 6.3 percentage points (absolute) over the baselines. Beyond that, training on DOMINO's dynamic data yields robust spatio-temporal representations that improve performance even on static manipulation tasks. This suggests that dynamic awareness is not a specialized module, but a fundamental capability that enriches the robot's general understanding of its environment.

How can VLA (Vision-Language-Action) models overcome the limitations of static manipulation to robustly handle real-time dynamic interaction with moving objects?

(P.S.: Simulating robots is fun, until they decide not to follow your orders.)