The Production of Local Chips: How China Built Its Own AI Ecosystem
Years ago, the world struggled to grasp what the digital industry really meant. Today, what does production mean amid the AI revolution? It's not just hardware manufacturing. It's the creation of a complete ecosystem where innovation and independence come together. Over more than six years, China has gone from hoping to develop its own solutions to genuinely gaining computing power. This story begins with a crisis and leads to a revolution.
The Bottleneck Isn't Hardware: Why CUDA Is the Real Barrier
In 2018, ZTE experienced a crippling event. A denial order from the U.S. Bureau of Industry and Security overnight paralyzed a multinational corporation with 80,000 employees and annual revenue in the billions of dollars. No Qualcomm chips, no Google Android license: the entire operation ground to a halt.
But at a deeper level, the real problem isn’t just hardware. It’s an ecosystem called CUDA.
CUDA, or Compute Unified Device Architecture, is a parallel computing platform introduced by NVIDIA in 2006. During the deep learning revolution, it became the foundation of the entire AI industry. Training large AI models is, at its core, massive matrix arithmetic, and GPUs are uniquely suited to that kind of parallel work. Over more than a decade, NVIDIA built a complete ecosystem around CUDA, from hardware to software tools.
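The claim that training reduces to matrix arithmetic can be seen in a toy sketch. NumPy stands in here for a GPU kernel; the layer sizes and values are illustrative, not taken from any real model:

```python
import numpy as np

# A single dense layer: both the forward and backward passes are matrix products,
# which is exactly the workload GPUs (and CUDA) accelerate.
rng = np.random.default_rng(0)
batch, d_in, d_out = 32, 512, 256

x = rng.standard_normal((batch, d_in))   # input activations
W = rng.standard_normal((d_in, d_out))   # layer weights

y = x @ W                                # forward pass: one matmul
grad_y = rng.standard_normal(y.shape)    # stand-in for the upstream gradient
grad_W = x.T @ grad_y                    # weight gradient: another matmul
grad_x = grad_y @ W.T                    # input gradient: a third matmul

print(y.shape, grad_W.shape, grad_x.shape)
```

Scaled up to billions of parameters and trillions of tokens, almost all of a training run's compute is spent in products like these, which is why the hardware and its software stack matter so much.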
Today, all major AI frameworks, including Google's TensorFlow and Meta's PyTorch, rely heavily on CUDA. Nearly every AI researcher builds on top of it from day one. The result is a relentless flywheel: more developers use it, more tools are created, the ecosystem grows, and still more developers join.
By 2025, the CUDA ecosystem has over 4.5 million developers and covers more than 3,000 GPU-accelerated applications. Over 90% of AI developers worldwide depend on this ecosystem. How do you replace this foundation? Yes, it’s possible—but you’d have to rewrite all the experience, tools, and code accumulated by brilliant minds over a decade. Who will bear this cost?
From Inference to Training: The Qualitative Shift in Local Production
Faced with successive U.S. export controls (first on NVIDIA's A100 and H100 in 2022, then the A800 and H800 in 2023, and finally the H20 in 2024), Chinese AI companies could not compete head-to-head in hardware. They chose a different path: algorithm-level optimization.
From late 2024 through 2025, Chinese AI companies converged on one technical direction: the mixture-of-experts (MoE) architecture. DeepSeek's V3 is an example. It has 671 billion parameters, but during any single inference only 37 billion are active, just 5.5% of the total. Its reported training cost was $5.576 million, versus an estimated $78 million for GPT-4.
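The MoE idea can be sketched minimally as follows. The sizes here are toy assumptions, not DeepSeek's: this sketch routes each input to 2 of 16 experts (12.5% active), whereas V3 activates roughly 37B of 671B parameters (about 5.5%):

```python
import numpy as np

def moe_forward(x, experts, gate_W, top_k=2):
    """Route the input to its top-k experts; only those experts' weights run."""
    scores = x @ gate_W                       # one gating score per expert
    top = np.argsort(scores)[-top_k:]         # indices of the chosen experts
    exp_s = np.exp(scores[top])
    weights = exp_s / exp_s.sum()             # softmax over the chosen experts
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return out, top

rng = np.random.default_rng(1)
d, n_experts = 64, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # expert weights
gate_W = rng.standard_normal((d, n_experts))                       # router weights
x = rng.standard_normal(d)

out, chosen = moe_forward(x, experts, gate_W)
active_fraction = len(chosen) / n_experts
print(f"experts used per token: {len(chosen)}/{n_experts} = {active_fraction:.1%}")
```

The compute saving comes from the routing step: the model stores all experts' parameters, but each token pays only for the few experts it is routed to.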
This algorithmic optimization shows up directly in price: DeepSeek's API charges $0.028–$0.28 per million input tokens, while GPT-4 charges $5 and Claude Opus $15. The Chinese models are 25 to 75 times cheaper.
But the real breakthrough isn’t just price. By 2025, AI applications shifted from simple conversations to agent scenarios, where token usage increases 10 to 100 times. At this volume, price becomes a decisive factor.
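A back-of-the-envelope sketch shows why price becomes decisive at agent-scale volumes. The per-million-token prices are the ones quoted above; the monthly token volume is a hypothetical workload, not a measured figure:

```python
# Monthly input-token cost at an agent-scale volume, using the quoted
# $/million-token prices. The workload below is an assumption for illustration.
PRICES = {  # $ per million input tokens, as quoted in the text
    "DeepSeek (high)": 0.28,
    "GPT-4": 5.00,
    "Claude Opus": 15.00,
}

tokens_per_month = 3_000_000_000  # assumed: ~100M tokens/day for 30 days

for model, price in PRICES.items():
    cost = tokens_per_month / 1_000_000 * price
    print(f"{model:>16}: ${cost:>9,.0f}/month")
```

At a chat-scale volume the absolute gap is small; multiply usage by 10–100x for agents and the gap becomes the difference between a rounding error and a budget line.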
Now, local production has expanded from inference to training—a qualitative leap. Inference only runs pre-trained models; training requires higher computing power, interconnect bandwidth, and a robust software ecosystem.
The Production Ecosystem: From Jiangsu Xinghua to Global Deployment
Amid this industrial transformation, a server production line was established in Xinghua, a small city in Jiangsu province known for stainless steel and health foods. From contract signing to operation took only 180 days. The line uses two local chips: the Loongson 3C6000 processor and the TaiChu Yuanqi T100 AI accelerator.
This is a milestone: today’s production isn’t just about volume but capability. When fully operational, one server is produced every 5 minutes, totaling 100,000 units annually.
More importantly, clusters of thousands of local chips have begun to support real training of large models. In January 2026, Huawei announced GLM-Image—the first SOTA image generation model trained entirely with local chips. In February, China Telecom completed the full Xingchen model training in its local compute pool in Shanghai Lingang.
The main driver behind this is Huawei's Ascend series. By the end of 2025, the Ascend ecosystem had 4 million developers and over 3,000 partners. Forty-three major large models have completed pre-training on Ascend, and more than 200 open-source models have been adapted to it.
On March 2, 2026, Huawei unveiled its new SuperPoD infrastructure. The FP16 computing power of the Ascend 910B is equivalent to NVIDIA's A100. If the ecosystem develops in parallel with chip improvements, rather than waiting for perfect hardware, the pace of upgrades becomes extremely rapid.
Energy Advantage and Computing Power: Why China Has the Lead
In early 2026, Virginia stopped issuing new data-center permits. Georgia, Illinois, and Michigan followed. The U.S. grid is running up against its capacity.
In 2024, U.S. data centers consumed 183 terawatt-hours of electricity—4% of total consumption. By 2030, this is projected to reach 426 TWh, or 12%. Arm’s CEO said AI data centers could consume 20–25% of U.S. electricity by 2030.
While the U.S. worries about power, China has a significant advantage. China generates about 10.4 trillion kWh of electricity annually, roughly 2.5 times the U.S. total of 4.2 trillion kWh. More importantly, residential use accounts for just 15% of China's total, versus 36% in America, leaving far more industrial capacity available for computing.
Electricity prices in western China’s industrial areas are about $0.03 per kWh, while in U.S. AI hubs it’s $0.12–$0.15. China has a 4- to 5-fold advantage.
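A rough sketch of what that price gap means for a cluster's annual power bill. Only the $/kWh figures come from the text above; the cluster size and per-server power draw are hypothetical assumptions:

```python
# Annual electricity cost for a hypothetical AI cluster at the quoted rates.
kw_per_server = 10        # assumed: one 8-GPU AI server, ~10 kW under load
servers = 1_000           # assumed cluster size
hours_per_year = 24 * 365

kwh = kw_per_server * servers * hours_per_year  # annual consumption in kWh

RATES = {  # $ per kWh, as quoted in the text
    "western China": 0.03,
    "U.S. AI hub (high)": 0.15,
}
for region, rate in RATES.items():
    print(f"{region:>18}: ${kwh * rate:>12,.0f}/year")
```

At these assumed numbers the same cluster costs roughly five times more to power in a U.S. AI hub than in western China, matching the 4- to 5-fold advantage stated above; the gap scales linearly with cluster size.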
Tokens and Markets: The Growing New Digital Commodity
Today, as the U.S. invests in energy infrastructure, China exports a new product: tokens. A token is the smallest unit of text an AI model processes, and it has become a digital commodity, produced in computing-power factories and shipped worldwide.
The distribution of DeepSeek users tells the story: 30.7% from China, 13.6% from India, 6.9% from Indonesia, 4.3% from the U.S., 3.2% from France. It supports 37 languages and is rapidly expanding into emerging markets like Brazil.
In February 2026, on the OpenRouter platform, weekly usage of Chinese AI models surged 127% over three weeks, surpassing U.S. models for the first time. A year earlier, Chinese models' market share was below 2%. Now it has reached 6%, a 421% increase.
On February 27, 2026, three local chip companies reported results. Cambrian’s revenue soared 453%, achieving full-year profitability for the first time. Moore Threads rose 243% but still posted a net loss of $1 billion. Muxi increased 121% but incurred nearly $8 billion in losses.
Half fire, half water. The fire is market demand. The 95% of the market that Jensen Huang conceded is being filled, one by one, with revenue numbers from local companies. The market needs a second option beyond NVIDIA. This is an unprecedented structural opportunity.
Lessons from History
A strikingly similar story unfolded decades ago. In 1986, the US-Japan Semiconductor Agreement reshaped the industry. Japan led: by 1988 it controlled 51% of the global semiconductor market, versus 36% for the U.S., and six of the top ten companies were Japanese.
But through strategic trade measures, regulatory scrutiny, and supporting Korean competitors, the U.S. reduced Japan’s DRAM share from 80% to 10%. The former giants were divided, bought out, or faded away.
The tragedy: Japan was content with a global division of labor in which it was the best manufacturer, but it never built an independent ecosystem. When the wave receded, it had only manufacturing.
China’s current AI industry faces similar pressure but has chosen a harder path: extreme algorithmic optimization, local chip development from inference to training, ecosystem building with 4 million Ascend developers, and global token distribution. Every step is creating an independent industrial ecosystem.
The water is the cost of ecosystem building. Every loss is real money spent chasing CUDA: learning costs, software subsidies, and engineer-deployment expenses. It is the necessary tax for digital independence.
These three financial reports reflect the true state of the competition more honestly than any industry report. This is not a celebration but a fierce battle for position, fought while bleeding.
But the question itself has truly changed. Eight years ago it was "Can we survive?" Now it is "What price must we pay to survive?" And that price itself is progress.