Untether AI Boqueria: AI inference without leaving memory

AI inference has a classic bottleneck: moving data between memory and the processor. Untether AI's Boqueria accelerator attacks that bottleneck directly. Its massively parallel architecture computes at-memory, that is, right where the data is stored, which cuts the energy spent on data movement and raises performance per watt. It's not magic; it's well-thought-out engineering.


How Boqueria's at-memory architecture works 🚀

Boqueria integrates thousands of compute cores directly into SRAM, so data does not have to cross external buses to reach them. Each core executes simple operations, but all of them run in parallel, which lets neural network models be processed efficiently. By minimizing the latency and energy cost of data movement, the chip sustains its performance in inference tasks without relying on expensive HBM or extreme cooling.
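To see why data movement matters so much, here is a minimal back-of-the-envelope sketch in Python. It models total energy as compute energy plus data-movement energy. Every number in it is a hypothetical placeholder chosen only to show the shape of the argument, not a Boqueria specification or a measured figure, and the names `MemoryTier`, `inference_energy_pj`, and `movement_share` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTier:
    """Where operands are fetched from, with an assumed energy cost."""
    name: str
    picojoules_per_byte: float  # hypothetical placeholder, not a measured value


def inference_energy_pj(macs: int, bytes_moved: int, tier: MemoryTier,
                        pj_per_mac: float = 1.0) -> float:
    """Total energy = compute energy + data-movement energy (in picojoules)."""
    return macs * pj_per_mac + bytes_moved * tier.picojoules_per_byte


def movement_share(macs: int, bytes_moved: int, tier: MemoryTier,
                   pj_per_mac: float = 1.0) -> float:
    """Fraction of the total energy that is spent just moving data."""
    total = inference_energy_pj(macs, bytes_moved, tier, pj_per_mac)
    return bytes_moved * tier.picojoules_per_byte / total


# Illustrative tiers: the 100x gap is an assumption made for this example.
OFF_CHIP = MemoryTier("off-chip memory over an external bus", 100.0)
AT_MEMORY = MemoryTier("SRAM next to the compute core", 1.0)

if __name__ == "__main__":
    macs, bytes_moved = 1_000_000, 1_000_000  # toy workload
    for tier in (OFF_CHIP, AT_MEMORY):
        energy = inference_energy_pj(macs, bytes_moved, tier)
        share = movement_share(macs, bytes_moved, tier)
        print(f"{tier.name}: {energy:,.0f} pJ total, "
              f"{share:.0%} spent on data movement")
```

Under these assumed costs, the same arithmetic workload is dominated by data movement when operands come from off-chip memory, while computing next to the data leaves movement at about half of a much smaller total. Real ratios depend on the process node, memory technology, and workload.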

The smart cousin who doesn't need to commute to work 🏠

While other accelerators put on a logistical circus to bring data closer to the processor, Boqueria is that colleague who works from home. Literally, it processes information where it lives. So if your GPU sounds like a noisy, hot vacuum cleaner, maybe you should consider a change. After all, you don't need to travel to the other side of the chip to do the math.