DeepSeek has launched its V4 model, an open-weight system that takes on Claude Opus 4.6 and GPT-5.4. With two versions, Pro and Flash, it promises competitive performance at a cost up to seven times lower. Its internal benchmarks show solid results, though independent verification is still pending.
MoE Architecture and One Million Token Context 🧠
DeepSeek V4 uses a Mixture-of-Experts architecture to reduce computational costs. The Pro version features 1.6 trillion parameters (49 billion active), while Flash has 248 billion (13 billion active). Both support a one million token context window, far exceeding the 128K of their predecessor. On LiveCodeBench, V4 Pro achieves 93.5%, matching Claude Opus 4.6 and Gemini 3.1 Pro.
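To see where the Mixture-of-Experts savings come from, a quick back-of-the-envelope sketch using the parameter counts quoted above (the function name and workload are illustrative, not from DeepSeek's documentation):

```python
# Figures from the article: MoE models route each token through only a
# small subset of "expert" sub-networks, so the compute per token scales
# with the active parameters, not the total.

def active_fraction(total_billions: float, active_billions: float) -> float:
    """Share of parameters that participate in each forward pass."""
    return active_billions / total_billions

pro_fraction = active_fraction(1600, 49)    # V4 Pro: 1.6T total, 49B active
flash_fraction = active_fraction(248, 13)   # V4 Flash: 248B total, 13B active

print(f"Pro activates   {pro_fraction:.1%} of its parameters per token")
print(f"Flash activates {flash_fraction:.1%} of its parameters per token")
```

In other words, Pro touches roughly 3% of its weights per token, which is how a 1.6-trillion-parameter model can be priced like a far smaller dense one.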
The Price That Makes OpenAI's Accounts Cry 💸
DeepSeek V4 Pro costs $1.74 per million input tokens and $3.48 per million output tokens. That's up to seven times less than Opus 4.6 and almost nine times less than GPT-5.4. Flash is even cheaper. If real-world performance confirms the promises, competitors' billing teams may have to start cutting back on specialty coffee expenses.
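To put those rates in concrete terms, a minimal cost sketch: only the per-token prices come from the announcement, while the monthly workload below is a hypothetical example.

```python
# Published V4 Pro rates (USD per million tokens).
INPUT_PRICE_PER_M = 1.74
OUTPUT_PRICE_PER_M = 3.48

def workload_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload billed at V4 Pro rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Hypothetical monthly workload: 50M input tokens, 10M output tokens.
monthly = workload_cost(50_000_000, 10_000_000)
print(f"Estimated monthly cost: ${monthly:,.2f}")  # about $121.80
```

At a claimed 7x price gap, the same hypothetical workload on a competitor would run into the hundreds of dollars, which is the whole pitch in one number.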