The race for AI efficiency has a new contender. Etched has unveiled the Sohu, an ASIC designed from the ground up to run only Transformer models such as Llama or GPT. Forget general-purpose GPUs; this application-specific integrated circuit promises inference speeds that leave conventional hardware behind.
Fixed architecture vs. flexibility: the necessary sacrifice 🎯
Unlike GPUs, which can run almost any graphics or compute workload, the Sohu is a functional monolith. Its circuitry is tuned down to the last detail for the key operations of Transformers: attention, projections, and feed-forward layers. By eliminating the overhead of general programmability, it achieves much higher performance per watt. The downside is obvious: if a different AI architecture takes over tomorrow, the chip cannot adapt and becomes obsolete.
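To make those three operations concrete, here is a minimal sketch of a single Transformer block in plain Python. It is an illustration only, not Etched's design: it assumes one attention head, skips layer normalization and masking, and all function and parameter names (`transformer_block`, `wq`, `w1`, and so on) are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def add(a, b):
    """Element-wise sum of two equally shaped matrices (residual connection)."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]

def transformer_block(x, wq, wk, wv, wo, w1, w2):
    """Simplified single-head block: no layer norm, no masking (assumption)."""
    d = len(wq[0])
    # 1. Projections: map the input to queries, keys, and values.
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    # 2. Attention: scaled dot-product scores, softmax, weighted sum of values.
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    h = add(x, matmul(matmul(weights, v), wo))
    # 3. Feed-forward: expand, apply ReLU, project back.
    hidden = [[max(0.0, t) for t in row] for row in matmul(h, w1)]
    return add(h, matmul(hidden, w2))
```

Almost all of the work above is matrix multiplication plus a softmax and a simple activation. That small, fixed menu of operations is what makes it plausible to hard-wire the data path instead of keeping it programmable.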
The drama of owning a Ferrari that can only go in a straight line 🏎️
Imagine buying a race car that is incredibly fast but only works on a straight toll highway. That's the Sohu. While GPUs are like a van that carries everything, this ASIC is an F1 car that stalls at the first roundabout. If you are a company that lives and dies by Llama, it's your ace in the hole. For everyone else, it is a matter of waiting to see whether the market decides that specialization pays the bills.