DeepSeek V4: 75% Price Cut, Huawei Chips, and the Trust Gap

DeepSeek V4 — Crowd Intelligence Report

SEO Brief

SEO title: DeepSeek V4: 75% Price Cut, Huawei Chips, and the Trust Gap
Meta description: DeepSeek V4 Pro scores 80.6% on SWE-bench at 11x less than GPT-5.5. The price cut is permanent, it runs on Huawei chips, and governments are banning it.
Canonical path: /research/deepseekv4
Primary search intent: Understand whether DeepSeek V4 is worth switching to from GPT or Claude, given its low price and China-origin tradeoffs.
Target keywords: DeepSeek V4 review, DeepSeek V4 vs GPT-5.5, DeepSeek V4 vs Claude, DeepSeek V4 pricing, DeepSeek V4 worth it, DeepSeek V4 Pro benchmark, DeepSeek Huawei Ascend, is DeepSeek V4 good

Report Status

Readiness: publishableseed (90.0/100)
Generated: 2026-06-03T09:58:27.282534+00:00
Entity type: topic
Industry: Artificial Intelligence / Foundation Models
Data foundation: 3,653 content items, 1,276 extracted opinion units, 84 entity insights, 37 sampled evidence links.

The Model That Made Everyone Check Their AI Bill

DeepSeek V4 is a family of large language models built by DeepSeek, a Chinese AI lab headquartered in Hangzhou, Zhejiang. The company was founded in July 2023 by Liang Wenfeng, the co-founder of High-Flyer, a quantitative hedge fund that bankrolled the lab’s early research. DeepSeek had no outside investors and survived on hedge fund profits until June 2026, when it began raising its first external round at a reported $59 billion valuation, with Tencent and CATL among the backers.

V4 launched on April 24, 2026 in two variants: V4-Pro (1.6 trillion parameters, flagship) and V4-Flash (284 billion parameters, cost tier). Both ship with a 1-million-token context window and open weights under the MIT License on Hugging Face. What makes V4 unusual is not the architecture but the price. V4-Pro matches GPT-5.2 on agentic benchmarks while costing roughly one-sixth as much. The model is also the first major frontier release optimized for Huawei Ascend AI processors rather than NVIDIA hardware, which means it runs on infrastructure that is not subject to U.S. export controls.

For everyone else, the question is simpler: why are you still paying full price?

"It does not matter if you are 10% better if your competition is 90% cheaper." YouTube commenter @mattbenz99

"If frontier cloud models are that overpriced for equivalent quality, it makes me question how much of my daily work really needs cloud at all." r/LocalLLaMA user after benchmarking production workloads against DeepSeek V4 pricing

r/LocalLLaMA: DeepSeek V4 being 17x cheaper got me to actually try it

The 75% Price Cut That Became Permanent

On May 22, 2026, DeepSeek made its 75% promotional discount on V4-Pro permanent. The discount had originally been framed as expiring May 31, but instead of letting it lapse, DeepSeek locked it in as the new standard rate. The standing prices are now:

V4-Pro (flagship): $0.435 per million input tokens (cache miss), $0.87 per million output tokens. Cached input: $0.003625/MTok, the lowest first-party frontier-model cache price on the market.

V4-Flash (cost tier): $0.14 per million input tokens (cache miss), $0.28 per million output tokens.

For context, GPT-5.5 Standard costs $5.00/$30.00 per million input/output tokens. That makes V4-Pro roughly 11.5x cheaper on input and 34.5x cheaper on output. On SWE-bench Verified, V4-Pro scores 80.6%, tying Gemini 3.1 Pro and nearly matching Opus 4.6 at 80.8%, at about 1/11th the input price and 1/29th the output price.
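The multiples quoted for GPT-5.5 follow directly from the listed per-million-token prices. Below is a minimal Python sketch of that arithmetic, using only the prices stated in this section. The dictionary keys are illustrative labels (not official API model IDs), the function names are hypothetical, and the workload in the last lines (2 million input tokens, 0.5 million output tokens) is a made-up figure for illustration, not a measured one. Caching is ignored.

```python
# Prices in USD per million tokens, as quoted in this report (cache-miss input).
# Keys are illustrative labels, not official API model identifiers.
PRICES = {
    "deepseek-v4-pro": {"input": 0.435, "output": 0.87},
    "deepseek-v4-flash": {"input": 0.14, "output": 0.28},
    "gpt-5.5-standard": {"input": 5.00, "output": 30.00},
}


def price_ratio(expensive: str, cheap: str, kind: str) -> float:
    """How many times more the expensive model costs per token of this kind."""
    return PRICES[expensive][kind] / PRICES[cheap][kind]


def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens (no caching)."""
    price = PRICES[model]
    return input_mtok * price["input"] + output_mtok * price["output"]


if __name__ == "__main__":
    for kind in ("input", "output"):
        ratio = price_ratio("gpt-5.5-standard", "deepseek-v4-pro", kind)
        print(f"GPT-5.5 Standard vs V4-Pro, {kind}: {ratio:.1f}x")

    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for model in ("deepseek-v4-pro", "gpt-5.5-standard"):
        print(f"{model}: ${workload_cost(model, 2.0, 0.5):.2f}")
```

Because output tokens carry the larger multiple, the gap on a real bill depends on the input-to-output mix of the workload; output-heavy jobs see a larger difference than input-heavy ones.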

The reaction on r/LocalLLaMA was a mix of disbelief and spreadsheet warfare. Developers started sharing actual cost comparisons from their production workloads, and the numbers were compelling enough to make people question their entire cloud AI budget.

"The $6M figure is only the compute cost of training the base model and does not include RL, R&D, experimentation, or failed runs." YouTube commenter @joandaniels7461

But one Redditor topped up $20 to try V4 Pro as a replacement for their expiring Copilot subscription and was disappointed: "To my horror, it was not as good as advertised." The only upside they cited was the very cheap price. The pricing story is powerful, but it has to survive contact with real work.

"The only pro is the very cheap price." u/[deleted] on r/DeepSeek, after testing V4 Pro against Opus 4.7

https://www.youtube.com/watch?v=bAtuUeRWV3w

Built on Huawei, Not NVIDIA

V4 is the first frontier model trained and deployed on Huawei Ascend processors rather than NVIDIA hardware. Huawei confirmed its chips were used for part of V4-Flash’s training and that V4 runs on Ascend 950-based supernode clusters. For Chinese AI application teams facing restricted NVIDIA access, this stack represents what one YouTube commenter called "the most practical path forward" for domestic AI development.

The strategic implications extend beyond China. Multiple Western and Asian governments have banned their institutions from using DeepSeek, citing data privacy concerns. The U.S. State Department sent a diplomatic cable to embassies worldwide instructing staff to warn foreign governments about alleged IP theft by DeepSeek and other Chinese AI firms. DeepSeek faces scrutiny over whether the NVIDIA chips it acknowledged using were subject to export restrictions.

"DeepSeek v4 is trained and run on Huawei Ascend, which beats Nvidia at scale. And other Chinese domestic AI labs will follow suit." YouTube commenter on a V4 infrastructure analysis video

On Reddit, someone demonstrated running DeepSeek-V4 locally on four legacy RTX 2080 Ti GPUs within a $2,000 budget, highlighting custom Turing kernels and W8A8 quantization at 255 prefill tokens per second. The full FP8 V4-Pro checkpoint is roughly 862 GB and needs about 900 GB of VRAM in production: practically, eight H100s, eight MI300Xs, or two Blackwell B200 nodes. The hardware story positions DeepSeek as more than a model. It is the anchor of an alternative infrastructure ecosystem that does not depend on American chips.

The Censorship Jokes and the Server That Is Always Busy

Not everything in the DeepSeek conversation is admiration. On YouTube, benchmark comparisons frequently mix up DeepSeek V3, R1, and V4, making it hard to know which model is actually being evaluated. On the censorship front, DeepSeek’s refusal to engage with Taiwan or Tiananmen-related prompts is a running joke, but it also generates real concern from users who wonder what other topics might be silently filtered.

"DeepSeek will admit anything except that Taiwan is a real country." YouTube commenter @PhoveusMLBB2025

Users are explicitly warning not to send sensitive information to DeepSeek, while others debate whether China-specific risk is materially different from U.S. or EU cloud risk. The "server is busy" errors that surface even during off-peak hours compound the reliability perception problem. For a model competing primarily on price, being cheap but unreachable is a losing proposition.

https://www.youtube.com/watch?v=E2Am7aEyQ5I

The Integration Tax Is Real

Developers trying to use DeepSeek V4 in production are surfacing concrete integration issues. The API breaks when reasoning_content is combined with the tool_choice parameter, causing 400 and 500 errors that halt agent workflows. The compatibility layer designed to let teams swap DeepSeek into Claude or GPT pipelines does not support image input, Anthropic prompt caching semantics, or MCP server tools.
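One way to surface these gaps before a migration is to scan outgoing requests for the features reported as problematic. The Python sketch below is an illustration under stated assumptions, not part of any DeepSeek or DeepClaude tooling: the function name is hypothetical, the field spellings `reasoning_content` and `tool_choice` are assumed, and the request shape (a `messages` list whose `content` is either a string or a list of typed blocks, `cache_control` on blocks, a top-level `mcp_servers` list) is assumed from common OpenAI-style and Anthropic-style payloads.

```python
from typing import Any


def find_compat_risks(request: dict[str, Any]) -> list[str]:
    """Return the reported problem features that this request payload uses.

    The payload shape is an assumption; adapt the checks to your own client.
    """
    risks: set[str] = set()
    messages = request.get("messages", [])

    # Reported: reasoning_content together with tool_choice causes 400/500 errors.
    has_reasoning = any("reasoning_content" in message for message in messages)
    if has_reasoning and "tool_choice" in request:
        risks.add("reasoning_content combined with tool_choice")

    for message in messages:
        content = message.get("content")
        # Content may be a plain string; only block lists can carry these features.
        blocks = content if isinstance(content, list) else []
        for block in blocks:
            if not isinstance(block, dict):
                continue
            # Reported: image input is not supported through the compatibility layer.
            if block.get("type") in ("image", "image_url"):
                risks.add("image input")
            # Reported: Anthropic prompt caching semantics do not carry over.
            if "cache_control" in block:
                risks.add("Anthropic prompt caching (cache_control)")

    # Reported: MCP server tools do not work through the compatibility layer.
    if request.get("mcp_servers"):
        risks.add("MCP server tools")

    return sorted(risks)
```

Running a check like this over a sample of logged production requests gives a rough count of how many calls would need rework, which is a more concrete migration estimate than the price sheet alone.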

On Reddit, a project called "DeepClaude" attempting to run the full Claude Code agent loop on DeepSeek V4 Pro documents these gaps explicitly. For developers evaluating a migration, the price advantage is clear but the integration tax is real. You cannot simply swap DeepSeek in and expect your existing toolchain to work.

"Image input is not supported through the compatibility layer, Anthropic prompt caching semantics do not carry over, and MCP server tools do not work through the proxy." DeepClaude project documentation on r/ClaudeCode

r/ClaudeCode: DeepClaude — full Claude Code agent loop

Who Is Actually Using It in Production

Despite the integration friction, the production adoption signal is real. V4-Pro is recommended for complex agent workflows, deep codebase analysis, and multi-step reasoning tasks where you need near-frontier performance without paying frontier prices. The savviest organizations are already running tiered model strategies, using DeepSeek V4-Flash for routine agent steps and routing only complex decisions to premium providers like Claude or GPT-5.5.
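A tiered strategy of this kind comes down to a routing rule applied per agent step. The Python sketch below shows one possible shape for that rule. Everything in it is an assumption for illustration: the `AgentStep` type, the 1-to-10 complexity score (which the caller would have to assign by some heuristic), the threshold of 8, and the tier labels are all hypothetical, and it does not describe how any particular organization routes traffic.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AgentStep:
    description: str
    complexity: int  # hypothetical 1-10 score assigned by the caller


def choose_tier(step: AgentStep, premium_threshold: int = 8) -> str:
    """Send routine steps to the cost tier and only complex ones to a premium provider."""
    if step.complexity >= premium_threshold:
        return "premium"
    return "v4-flash"


def tier_counts(steps: list[AgentStep]) -> Counter:
    """Count how many steps land on each tier, to sanity-check the split."""
    return Counter(choose_tier(step) for step in steps)


# Illustrative steps with made-up complexity scores.
example_steps = [
    AgentStep("parse tool output", 2),
    AgentStep("summarize a single file", 3),
    AgentStep("plan a multi-service refactor", 9),
]
print(tier_counts(example_steps))
```

The threshold is the lever: lowering it shifts more traffic to the premium provider and raises cost, so it is worth tuning against observed failure rates on the cheaper tier rather than fixing it up front.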

On coding and software engineering tasks, developers broadly confirm V4-Pro performs within striking distance of top closed models. One TikTok commenter who runs QA across "over a hundred different tests" described V4-Pro as "the cheapest SOTA model at 1/20th the cost of Opus 4.7." Another r/technology poster called it "near state-of-the-art intelligence at 1/6th the cost."

"DeepSeek V4 Pro is about 17 times cheaper for the same agentic workload." r/LocalLLaMA post benchmarking V4 Pro against GPT-5.2 on FoodTruck Bench

https://www.tiktok.com/@barneslucas/video/7632405284085910804

The Pricing Precedent Nobody Can Ignore

DeepSeek V4 represents the most serious price pressure the frontier model market has seen. It is not disrupting on quality alone; it is disrupting on the economics of AI inference, and doing it on non-NVIDIA hardware that is not subject to export controls. The decision to lock in the 75% discount one month after launch suggests DeepSeek is prioritizing market share over per-unit revenue, positioning V4 as the default for applications processing large documents, codebases, or conversational histories where token costs compound fast.

For agentic workflows requiring reliable tool calling, the API compatibility issues are a genuine blocker today. For teams that need image processing or prompt caching, the compatibility layer gaps mean a rewrite, not a swap. And for anyone handling sensitive data, the China-origin concerns and governme