The d-Matrix Jayhawk II, an AI Accelerator for Efficient Inference

Published on January 05, 2026 | Translated from Spanish
Illustration of the d-Matrix Jayhawk II accelerator chip showing its modular chiplet design and the integration of memory and processing.

The industry is seeking specialized hardware to run artificial intelligence models faster and with less energy. The d-Matrix Jayhawk II emerges as an accelerator specifically designed to optimize the inference phase of generative language models in data center environments. 🚀

Innovative Architecture: Chiplets and In-Memory Processing

This hardware departs from traditional monolithic designs. Its core is a chiplet architecture that organizes several specialized modules to work in parallel. The key is that each chiplet integrates processing units and memory in close physical proximity, a strategy known as in-memory computing.

Key advantages of this approach:

- Far less data shuttled between separate memory and compute dies, which cuts energy per operation.
- Higher effective memory bandwidth for the matrix operations that dominate inference.
- Lower latency, since operands are processed close to where they are stored.

“Moving data consumes more energy and time than processing it.” This idea, present for decades in research, now takes shape in commercial hardware like the Jayhawk II.
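The gap that quote describes can be made concrete with rough per-operation energy figures. The numbers below are illustrative ballpark estimates (of the kind popularized by Mark Horowitz's ISSCC 2014 keynote for a 45 nm process), not measurements of the Jayhawk II; real values vary widely by process node and design.

```python
# Illustrative per-operation energy estimates in picojoules (45 nm ballpark
# figures from the research literature; NOT d-Matrix specifications).
ENERGY_PJ = {
    "fp32_multiply": 3.7,    # one 32-bit floating-point multiply
    "sram_read": 10.0,       # read from a small on-chip SRAM
    "dram_read": 640.0,      # read from off-chip DRAM
}

def movement_vs_compute(op: str, memory: str) -> float:
    """Ratio of data-movement energy to compute energy for one operand."""
    return ENERGY_PJ[memory] / ENERGY_PJ[op]

print(f"DRAM fetch: ~{movement_vs_compute('fp32_multiply', 'dram_read'):.0f}x "
      "the energy of the multiply it feeds")
print(f"Near-memory SRAM fetch: "
      f"~{movement_vs_compute('fp32_multiply', 'sram_read'):.1f}x")
```

With these figures, fetching an operand from DRAM costs two orders of magnitude more energy than the arithmetic itself, while a near-memory SRAM fetch is only a few times the compute cost; that ratio is the motivation for placing memory and compute in the same chiplet.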

Optimized for the Transformer Ecosystem

The d-Matrix Jayhawk II is not a general-purpose accelerator. It is finely tuned to handle the workload of models like GPT, Llama, and others based on the Transformer architecture. Its main goal is to reduce the cost per query, a decisive economic factor for large-scale cloud AI services.
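A minimal sketch of why cost per query is the decisive metric: it combines energy cost and amortized hardware cost, and both fall as throughput rises at a fixed power budget. Every figure below is a made-up placeholder for illustration, not a d-Matrix or model-specific number.

```python
def cost_per_query_usd(tokens_per_query: int,
                       tokens_per_second: float,
                       board_power_watts: float,
                       electricity_usd_per_kwh: float,
                       hardware_usd_per_hour: float) -> float:
    """Toy cost model: energy cost plus amortized hardware cost per query.
    All parameters are hypothetical placeholders."""
    seconds = tokens_per_query / tokens_per_second
    energy_kwh = board_power_watts * seconds / 3.6e6   # W*s -> kWh
    energy_cost = energy_kwh * electricity_usd_per_kwh
    hardware_cost = hardware_usd_per_hour * seconds / 3600.0
    return energy_cost + hardware_cost

# Doubling throughput at the same power halves both cost components.
base = cost_per_query_usd(500, 1000.0, 600.0, 0.10, 2.0)
fast = cost_per_query_usd(500, 2000.0, 600.0, 0.10, 2.0)
print(f"${base:.6f} -> ${fast:.6f} per query")
```

The point of the sketch is structural: an accelerator that serves more tokens per second per watt cuts the per-query bill twice over, which is exactly the lever an inference-specialized design targets.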

How it benefits language model inference:

- More tokens generated per second at a given power budget, which directly lowers cost per query.
- Better handling of the memory-bound matrix-vector operations typical of autoregressive decoding, where fetching weights dominates over arithmetic.
- More predictable latency for interactive workloads such as chatbots and assistants.

A Step Toward Smarter AI Hardware

The development of the Jayhawk II signals a clear trend in the industry: hardware specialization for specific AI workloads. By prioritizing efficiency in inference and addressing the fundamental problem of data movement, this accelerator represents a practical evolution of long-standing research concepts. Its success could redefine how massive language models are deployed and operated in the future. đź’ˇ