Meta has introduced the second generation of its inference accelerator, the MTIA v2, codenamed Artemis. This chip is not designed for gaming or generative text AI, but for a very specific task: making Facebook and Instagram's recommendation algorithms run faster and with greater energy efficiency.
A specific chip for the recommendation engine 🚀
The MTIA v2 is an inference accelerator built for low-precision deep learning models, such as those behind Meta's ranking and recommendation systems. With an 8x8 grid of 64 processing elements and 256 MB of on-chip SRAM, Artemis delivers up to 354 TOPS of dense INT8 compute. Built on TSMC's 5nm process, it runs within a 90 W power envelope, balancing speed against heat in the server rack. The key lies in its memory architecture, which reduces latency in embedding lookups and product search tasks.
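To give a feel for what "low-precision" means in a recommendation workload, here is a minimal Python sketch of INT8 embedding scoring. It is an illustration under stated assumptions, not Meta's implementation: the function names (`quantize_int8`, `int8_dot`, `rank_candidates`), the symmetric per-vector quantization scheme, and the toy embedding vectors are all made up for this example.

```python
from typing import Dict, List, Tuple


def quantize_int8(vec: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization: map floats to integers in [-127, 127].

    Returns the integer vector and the scale needed to map it back.
    """
    max_abs = max(abs(x) for x in vec) or 1.0  # avoid dividing by zero
    scale = max_abs / 127.0
    return [round(x / scale) for x in vec], scale


def int8_dot(a: List[int], b: List[int]) -> int:
    """Integer dot product; an accelerator would accumulate in a wider integer."""
    return sum(x * y for x, y in zip(a, b))


def rank_candidates(
    user: List[float], items: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Score every item against the user embedding, best match first."""
    q_user, s_user = quantize_int8(user)
    scored = []
    for name, emb in items.items():
        q_item, s_item = quantize_int8(emb)
        # Rescale the integer result back to an approximate float score.
        score = int8_dot(q_user, q_item) * s_user * s_item
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, invented for this sketch.
user = [0.9, 0.1, -0.3]
items = {
    "aunt_recipe": [-0.2, 0.9, 0.4],
    "cat_piano": [0.8, 0.2, -0.1],
}

ranking = rank_candidates(user, items)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The multiplications happen on small integers, and only the final score is rescaled to a float. That is the general trade-off behind INT8 inference: a small loss of precision in exchange for cheaper arithmetic and less memory traffic per embedding.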
Artemis: because your Reels feed isn't going to recommend itself 🔥
So, Meta has built a dedicated processor so the algorithm can decide whether that video of a cat playing the piano deserves a spot in your feed ahead of your aunt's recipe. Now, instead of waiting for a generic server to work it out, Artemis does it in a flash while using less power. All so you stay glued to the scroll, watching things you didn't even know you wanted to see. Energy efficiency is an excuse; the real goal is that you can't put your phone down.