Gemini 3.5 Flash: fast and cheap intelligence for developers

Published on May 23, 2026 | Translated from Spanish

Google has launched Gemini 3.5 Flash, the first model in its new family, designed to deliver frontier intelligence at higher speed and for less than half the price of its rivals. According to Google DeepMind, the model generates four times as many tokens per second as competing models and outperforms Gemini 3.1 Pro on key benchmarks such as Terminal-Bench 2.1 and CharXiv Reasoning. It is also the first model to occupy the top-right quadrant of the Artificial Analysis index, the region that combines high intelligence with high speed.


How the new generation of models works ⚡

The Gemini 3.5 Flash architecture optimizes parallel processing, which reduces latency and raises throughput without sacrificing accuracy. In internal tests, the model shows notable improvements in visual reasoning and in executing complex terminal tasks. Because it is more efficient, developers can run applications that previously required expensive hardware, and at a lower cost per query. Google is betting on democratizing access to high-performance models and is competing directly with slower, more expensive solutions on the market.
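To make the speed and price claims concrete, here is a minimal back-of-the-envelope sketch in Python. The baseline throughput and price are hypothetical placeholders, not published figures. Only the two ratios come from the article: four times the tokens per second, and a price at most half that of rivals ("less than half" is treated here as an upper bound of 0.5).

```python
def generation_time_s(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream a response of the given length."""
    return output_tokens / tokens_per_second


def query_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Output-token cost of a single query, in US dollars."""
    return output_tokens / 1_000_000 * price_per_million_usd


# Hypothetical baseline for a rival model (assumed values, for illustration only).
BASELINE_TOKENS_PER_SECOND = 100.0
BASELINE_PRICE_PER_MILLION_USD = 10.0

# Ratios taken from the article's claims.
SPEEDUP = 4.0       # "four times" the tokens per second
PRICE_RATIO = 0.5   # "less than half the price", so 0.5 is an upper bound

tokens = 2000  # length of one example response

baseline_time = generation_time_s(tokens, BASELINE_TOKENS_PER_SECOND)
fast_time = generation_time_s(tokens, BASELINE_TOKENS_PER_SECOND * SPEEDUP)

baseline_cost = query_cost_usd(tokens, BASELINE_PRICE_PER_MILLION_USD)
fast_cost = query_cost_usd(tokens, BASELINE_PRICE_PER_MILLION_USD * PRICE_RATIO)

print(f"time: {baseline_time:.1f}s -> {fast_time:.1f}s")
print(f"cost: ${baseline_cost:.4f} -> at most ${fast_cost:.4f}")
```

Under these assumptions, the same response arrives in a quarter of the time and costs at most half as much. The absolute numbers depend entirely on the placeholder baseline.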

The AI that responds before you finish asking 🤯

Gemini 3.5 Flash is so fast that it has likely already generated a response before you finish reading this sentence. At this rate, we will soon see models that answer questions we haven't even asked yet. Meanwhile, rivals watch with envy as Google sells intelligence at bargain prices, making paying more for fewer tokens seem almost like old-school fraud.