Why did DeepSeek V4 Pro API prices drop?

DeepSeek says the permanent 75% cut to DeepSeek-V4-Pro API prices is made possible by technical optimizations in its MLA (Multi-Head Latent Attention) architecture and Engram system, which lower hardware costs.

What are the new prices for DeepSeek V4 Pro API?

The new prices are ¥0.025 per million input tokens on a cache hit and ¥6 per million output tokens. Some analysts estimate this at roughly 2% to 3% of the price of competitors such as GPT-5.5.

How does this price drop affect AI developers?

This significant cost reduction makes AI application development more affordable, especially for independent developers, startups, and students who previously found API costs too high.

When did the new DeepSeek V4 Pro API prices become effective?

The new, permanent pricing structure took effect once the previous limited-time discount ended on May 31, 2026.

What is DeepSeek's strategy with these low API prices?

DeepSeek aims to build a large developer ecosystem by pairing very low API prices with open-source model weights, and says it prioritizes AGI breakthroughs over short-term commercialization.

DeepSeek V4 API Price Drops 75% Permanently for Users

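API prices for the DeepSeek-V4-Pro model will be permanently cut to one quarter of the original price. The adjustment began as a limited-time promotion and has now become a long-term pricing policy.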

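Under the new price structure, input costs ¥0.025 per million tokens on a cache hit, and output costs ¥6 per million tokens. The earlier promotion offered 75% off (billing at 25% of the original price) through May 31, 2026, so the model will continue to be offered at close to that discounted level.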
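To make the per-call arithmetic concrete, here is a minimal sketch of a cost estimator built on the two prices quoted above. The `Pricing` class, the `estimate_cost_cny` function, and the token counts are illustrative assumptions and not part of any DeepSeek SDK. The article does not give a cache-miss input price, so the sketch covers cache-hit input only.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Prices in CNY per million tokens (hypothetical helper type)."""
    cached_input: float  # input tokens served from the cache
    output: float        # generated tokens


# Figures quoted in the article for DeepSeek-V4-Pro.
V4_PRO = Pricing(cached_input=0.025, output=6.0)


def estimate_cost_cny(pricing: Pricing, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call in CNY, assuming all input hits the cache."""
    input_cost = cached_input_tokens / TOKENS_PER_MILLION * pricing.cached_input
    output_cost = output_tokens / TOKENS_PER_MILLION * pricing.output
    return input_cost + output_cost


# Example: a long-document call with 200k cached input tokens and 4k output tokens.
cost = estimate_cost_cny(V4_PRO, cached_input_tokens=200_000, output_tokens=4_000)
print(f"{cost:.4f} CNY")
```

Under these assumptions, output tokens dominate the bill: the 4,000 output tokens cost ¥0.024, while the 200,000 cached input tokens cost ¥0.005.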

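According to DeepSeek, the cut is not merely a promotional tactic but rests on technical optimizations in its MLA (Multi-Head Latent Attention) architecture and its Engram system. MLA is designed to sharply compress the GPU memory footprint of the attention mechanism, while Engram keeps most static knowledge in CPU DRAM and leaves core inference to the GPU, which raises GPU memory utilization and spreads hardware costs significantly thinner.

"This is not a short-lived promotion but a genuine shift in pricing strategy."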

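Compared with other large language models on the market, DeepSeek-V4-Pro's pricing shows a marked cost advantage. Some analyses suggest its input and output prices may be only 2% to 3% of those of competitors such as GPT-5.5. In token-heavy scenarios such as code generation and long-document analysis, the per-call cost of DeepSeek-V4-Pro is expected to be far below that of competitors.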

Technical Foundations and Cost Structure

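The DeepSeek team says its technical innovations underpin the price adjustment. V4 further optimizes the MLA architecture introduced in V2, cutting GPU memory overhead per inference by about 60%. In addition, deep operator-level adaptation for Huawei's Ascend 910B chips is intended to improve the performance and cost-effectiveness of domestically produced chips.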
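A back-of-the-envelope calculation shows why a 60% memory reduction matters for cost. The sketch below assumes, purely for illustration, that GPU memory is the only bottleneck and that every request needs the same amount of it. The function name `concurrency_gain` is hypothetical.

```python
def concurrency_gain(memory_reduction: float) -> float:
    """Factor by which requests per GPU can grow if each request needs less memory.

    Assumes memory is the sole limit: if each request uses (1 - r) of its
    former memory, 1 / (1 - r) times as many requests fit in the same space.
    """
    if not 0 <= memory_reduction < 1:
        raise ValueError("memory_reduction must be in [0, 1)")
    return 1 / (1 - memory_reduction)


# The article's figure: about 60% less memory per inference.
print(f"{concurrency_gain(0.60):.1f}x")
```

Under that simplified assumption, the same hardware could hold about 2.5 times as many concurrent inferences, which is one way the hardware cost per request is spread thinner.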

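The Engram system is described as a "hot/cold separation" architecture: it places 80% of static knowledge in CPU DRAM and concentrates GPU resources on inference, reportedly improving GPU memory utilization several times over.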
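The general idea of hot/cold separation can be illustrated with a toy two-tier store. This is a generic tiering pattern, not DeepSeek's implementation: the `TieredStore` class is hypothetical, plain dictionaries stand in for GPU memory and CPU DRAM, and entries are assigned to tiers arbitrarily rather than by access frequency as a real system would likely do.

```python
class TieredStore:
    """Toy hot/cold store: a small fast tier backed by a larger slow tier."""

    def __init__(self, entries: dict, cold_fraction: float = 0.8):
        keys = list(entries)
        n_cold = round(len(keys) * cold_fraction)
        # The first n_cold keys go to the cold tier (stand-in for CPU DRAM),
        # the remainder to the hot tier (stand-in for GPU memory).
        self.cold = {k: entries[k] for k in keys[:n_cold]}
        self.hot = {k: entries[k] for k in keys[n_cold:]}
        self.cold_reads = 0

    def get(self, key):
        """Look in the hot tier first; fall back to the cold tier."""
        if key in self.hot:
            return self.hot[key]
        self.cold_reads += 1  # count the slower lookups
        return self.cold[key]


store = TieredStore({f"fact{i}": i for i in range(10)})
print(len(store.cold), len(store.hot))
```

With ten entries and the article's 80% figure as `cold_fraction`, eight entries land in the cold tier and two in the hot tier, so the fast tier only has to hold a fifth of the data.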

Impact and the Potential Developer Ecosystem

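The price adjustment is expected to have a material effect on the cost of building AI applications, especially for independent developers, startups, and students on limited budgets. AI use cases that were previously too expensive to build or scale may now become more feasible.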

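DeepSeek also disclosed a ¥70 billion financing plan and stressed that AGI (Artificial General Intelligence) breakthroughs take priority over short-term commercialization. This suggests the company is unlikely to raise prices sharply in the near term because of financing pressure, and is instead inclined to build its developer ecosystem through low prices.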

Competitor Comparison and Model Selection

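DeepSeek offers two models, DeepSeek-V4-Flash and DeepSeek-V4-Pro, to cover different application needs. Flash emphasizes fast responses and suits simple tasks and high-concurrency workloads (it supports 2,500 concurrent requests). Pro focuses on deep reasoning and suits complex tasks (it supports 500 concurrent requests), producing higher-quality output in areas such as code generation and architecture design.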
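One way an application might act on this split is to route each request by task type and cap in-flight requests per model. The sketch below is an assumption-laden illustration: `ModelRouter`, `pick_model`, and `fake_call` are hypothetical names, `fake_call` stands in for a real API client, and the article does not say whether the quoted concurrency limits apply per account or per key.

```python
import asyncio

# Concurrency ceilings quoted in the article.
LIMITS = {"DeepSeek-V4-Flash": 2500, "DeepSeek-V4-Pro": 500}


def pick_model(needs_deep_reasoning: bool) -> str:
    """Send complex tasks to Pro and everything else to Flash."""
    return "DeepSeek-V4-Pro" if needs_deep_reasoning else "DeepSeek-V4-Flash"


class ModelRouter:
    """Caps in-flight requests per model with one semaphore per model."""

    def __init__(self, limits=None):
        limits = limits or LIMITS
        self._slots = {name: asyncio.Semaphore(n) for name, n in limits.items()}

    async def run(self, prompt: str, needs_deep_reasoning: bool, call_model):
        model = pick_model(needs_deep_reasoning)
        async with self._slots[model]:  # wait here if the model is at its limit
            return await call_model(model, prompt)


async def fake_call(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real API call."""
    await asyncio.sleep(0)
    return f"[{model}] {prompt}"


async def main():
    router = ModelRouter()  # created inside the running event loop
    return await asyncio.gather(
        router.run("Summarize this sentence", False, fake_call),
        router.run("Design a sharded storage layer", True, fake_call),
    )


results = asyncio.run(main())
print(results)
```

Keeping the limits in a dictionary means they can be changed in one place if the published ceilings change.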

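On a cache hit, Flash is priced as low as $0.0028 per million tokens and Pro at just $0.003625 per million tokens, which is lower than the cost of running many locally deployed open-source models.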

Background: Price War and Ecosystem Building

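API pricing for AI models has long been a focus of industry attention. In an increasingly competitive large-model market, DeepSeek's move is a clear shock to existing price benchmarks. By sharply lowering API costs, DeepSeek intends to attract more developers to its models and thereby quickly build and expand its ecosystem. The strategy resembles the logic behind Meta's open-sourcing of LLaMA: establish an ecosystem moat first, then pursue commercialization. DeepSeek's dual-track approach of "ultra-low-price API plus open-source weights" further increases its appeal to developers.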

The price adjustment formally takes effect once the promotion ends on May 31, 2026.
