Ark Invest: The Current State and Future of AI Infrastructure

Source: Frank Downing, Ark Invest; Compilation: Jinse Finance Claw

AI infrastructure spending is surging

Over the three years since ChatGPT’s release, demand for accelerated computing has grown explosively. NVIDIA’s annual revenue has jumped nearly 8x, from $27.0 billion in 2022 to $216.0 billion in 2025; the market consensus expects it to grow another 62% in 2026, to $350.0 billion. Growth in global data center systems investment (including computing, networking, and storage hardware) has accelerated from an average of 5% per year in the decade through 2022 to 30% over the past three years; it is expected to grow another 30% in 2026, to $653.0 billion.

ARK’s research shows that accelerated computing (relative to general-purpose CPUs), driven by GPUs and AI-dedicated application-specific integrated circuits (ASICs), now dominates server investment, accounting for 86% of compute server sales.

Cost crashes drive adoption

Spending on the accelerated computing infrastructure needed to run AI models keeps climbing for two reasons: generative AI use cases continue to expand across both consumer and enterprise settings, and labs keep demanding compute to train more capable foundation models in the pursuit of “superintelligence.”

Rapidly falling costs are accelerating demand growth further. According to our research, AI training costs fall 75% year over year. Inference costs are falling even faster: among the benchmarks tracked by Artificial Analysis, the median cost of models scoring above 50% has declined at an annualized rate of up to 95%.

Two forces have driven the sharp drop in costs: hardware, as industry leaders such as NVIDIA release new products every year that deliver generation-over-generation performance gains; and software, as algorithmic improvements continuously raise training and inference efficiency on the same hardware.
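To see how quickly these decline rates compound, here is a minimal sketch. The decline rates (75% for training, up to 95% for inference) come from the article; the three-year horizon and the constant-rate assumption are illustrative.

```python
# Illustrative arithmetic: how the article's cited annual cost declines
# compound over multiple years, assuming a constant year-over-year rate.

def remaining_cost_fraction(annual_decline: float, years: int) -> float:
    """Fraction of the original cost left after `years` of a constant
    year-over-year decline (0.75 means costs fall 75% each year)."""
    return (1 - annual_decline) ** years

# Training costs falling 75% per year: after 3 years, ~1.6% remains.
print(remaining_cost_fraction(0.75, 3))   # 0.015625

# Inference costs falling up to 95% per year: after 3 years, ~0.0125% remains.
print(remaining_cost_fraction(0.95, 3))   # ~0.000125
```

At these rates, a workload that is uneconomic today can become affordable within a year or two, which is why falling costs and rising demand reinforce each other.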

Consumers and enterprises both send strong demand signals

Consumers are adopting AI markedly faster than they adopted the internet: AI penetration reached roughly 20% within three years, more than twice the pace of the consumer shift to the internet.

Enterprise demand is growing at a similarly astonishing pace. Token demand on OpenRouter, for example, has increased 28x since December 2024.

Anthropic, the AI lab most favored by enterprise customers, grew revenue roughly 100x over the past two years: from $100 million in annualized revenue at the end of 2023 to an estimated $8.0 billion to $10.0 billion by the end of 2025. Its momentum has continued into 2026: in February it announced annualized revenue of $14.0 billion and closed a $30.0 billion funding round valuing the company at $380.0 billion.

OpenAI, which competes on both the consumer and enterprise fronts, has also seen strong enterprise growth: as of November 2025 it had 1 million enterprise customers. Chief Financial Officer Sarah Friar has said that OpenAI’s enterprise revenue is growing faster than its consumer business and that she expects it to account for 50% of total revenue in 2026. In a January 2026 blog post, Friar also laid out the rationale for further infrastructure investment: over the past three years, OpenAI’s revenue has grown in direct proportion to its compute capacity.

Private markets provide funding for AI buildout

Meeting these demand signals requires large-scale infrastructure investment. According to Crunchbase, private AI funding exceeded $200.0 billion in 2025, with about $80.0 billion flowing to foundation-model developers such as OpenAI, Anthropic, and xAI. In public markets, hyperscale cloud companies are tapping cash reserves and seeking other financing options to support AI capital expenditure plans that could reach $700.0 billion in 2026.

The $30.0 billion deal between Meta and Blue Owl is reportedly the largest private-capital transaction on record. It is structured as a joint venture financed primarily with debt; its special purpose vehicle (SPV) keeps the project debt off Meta’s balance sheet, a structure that has already drawn significant controversy.

AMD and other vendors become strong challengers to NVIDIA

Beyond physical data centers, compute chips remain at the core of AI capex. NVIDIA has led the accelerated computing era, but the biggest buyers of AI chips are now trying to increase the compute they get per dollar invested. Advanced Micro Devices (AMD), which has sold GPUs alongside NVIDIA in the consumer market since acquiring ATI Technologies in 2006, has become an emerging competitor in the enterprise market as well: since launching the EPYC product line in 2017, its share of the server CPU market has grown from nearly zero to 40% in 2025.

For small-model inference, AMD GPUs are now roughly on par with NVIDIA on performance relative to total cost of ownership (TCO). TCO accounts for both the chip’s upfront purchase cost (capital expenditure) and its operating cost over its service life (operating expenditure). Performance benchmarks use SemiAnalysis’s InferenceMax metric, measured as tokens processed per GPU per second when optimized for throughput; cost benchmarks use SemiAnalysis’s estimates of capital and operating expenditure per hour.

Although AMD has “caught up” in small-model performance, NVIDIA still holds a significant lead on large models, as the chart below shows.

NVIDIA’s rack-scale Grace Blackwell solution (GB200 NVL72) links 72 Blackwell GPUs so that they operate like one ultra-large GPU with shared memory. This tight chip-to-chip interconnect strengthens large-model inference: large models must distribute their weights across multiple GPUs and therefore need far more communication bandwidth than small models. AMD’s rack-scale solution is planned for the second half of 2026, aiming to close the gap before NVIDIA’s Vera Rubin launches. So far, AMD has won orders from customers including Microsoft, Meta, OpenAI, xAI, and Oracle.

Hyperscalers lead the custom-chip revolution

In addition to merchant GPU suppliers, hyperscalers and AI labs are developing their own chips to check NVIDIA’s pricing power and reduce AI compute costs. Google has been designing its own AI-specific integrated circuits, tensor processing units (TPUs), for more than a decade, originally to run recommendation models for its search business, and has optimized the latest generation, TPU v7, for generative AI. SemiAnalysis estimates that running internal workloads on its own TPUs cuts Google’s cost per unit of compute by 62% relative to NVIDIA. That Anthropic and Meta are using Google’s TPUs to expand their compute capacity suggests the 62% estimate is not far from the real-world outcome.

Amazon’s Trainium chips appear to be a similarly advanced solution. After acquiring Annapurna Labs in 2015, Amazon was among the first to develop custom chips for its cloud business, building its ARM-based Graviton CPUs and Nitro data processing units (DPUs) into core compute for Amazon Web Services (AWS). Amazon recently announced that in 2025, for the third consecutive year, Graviton supplied more than half of AWS’s incremental CPU capacity. Beyond Google’s TPUs, Anthropic also uses AWS and Trainium as its preferred training platform.

Microsoft entered the custom-chip arena only in 2023, launching its Maia 100 AI accelerator, which at the time was not focused on generative AI; its second-generation part, now rolling out, targets AI inference.

Broadcom leads the custom-chip services market

Google and Amazon focus on front-end chip design (architecture and functionality), while backend design partners turn that logic into silicon, manage advanced packaging, and coordinate production with foundries such as TSMC. With Intel’s foundry business struggling, TSMC has become the preferred partner for most major AI chip projects, and Broadcom has become the leading backend design partner for Google’s TPU, Meta’s MTIA, and the custom chips OpenAI plans to launch in 2026. Apple has historically handled end-to-end design for its phone and PC chips, but reports suggest it may also be working with Broadcom on AI chips. Citigroup predicts Broadcom’s AI revenue could grow fivefold over the next two years, from $20.0 billion in 2025 to $100.0 billion in 2027.

Amazon’s Trainium development path is unusual among peers: Trainium 2 was reportedly built with Marvell, and after Marvell underperformed, Trainium 3 and Trainium 4 shifted to Alchip. Amazon’s ability to swap backend partners shows that the backend-design business carries real risk for companies like Broadcom. Notably, Apple and Tesla work directly with foundries, and Google may do something similar with TPU v8: the product reportedly has two SKUs, one co-designed with Broadcom and the other designed and controlled independently by Google with MediaTek support.

Chip startups heat up

Our research shows that a long tail of startups experimenting with new architectural paradigms could further challenge incumbent chipmakers. Cerebras is known for its wafer-scale engine (a massive chip built from a single silicon wafer, roughly the size of a pizza box), offers the fastest tokens-per-second speeds on the market, and is reportedly planning to go public this year. It recently announced a partnership with OpenAI to launch the high-speed coding model Codex Spark, following an initial agreement between the two companies in January. Groq has likewise built on outstanding tokens-per-second performance; it recently signed a non-exclusive intellectual-property licensing agreement with NVIDIA worth $20.0 billion that brings over 90% of Groq’s employees, along with CEO and TPU co-inventor Jonathan Ross. The deal is essentially an acquisition of Groq’s team and technology, a transaction structure increasingly popular in M&A because tech giants want to avoid delays from regulatory scrutiny. In other acquisition-related developments, Intel reportedly failed in acquisition talks and pivoted to a partnership with SambaNova. Since 2014 Intel has made four AI acquisitions, yet it has never launched an AI product widely recognized and adopted by the market, a sobering track record.

Looking ahead: the scale will reach $1.4 trillion by 2030

Based on our research, sustained demand growth and continual performance improvements will drive the development of AI software and cloud services over the next five years, and AI infrastructure spending will nearly triple, from $500.0 billion in 2025 to roughly $1.4 trillion in 2030.

Our forecast is grounded in the historical relationship between data center systems investment and software revenue. In the early 2010s, as cloud computing rose, systems investment ran at about 50% of global software spending; by 2021, after pandemic-era overinvestment and subsequent customer optimization, the ratio had fallen to the low-20% range. Our $1.4 trillion forecast assumes that in 2030, investment equals 20% of our neutral-case forecast for global software spending ($7.0 trillion in 2030), a forecast we laid out in detail in a blog post last year. We believe the 20% level sufficiently accounts for the risk of overinvestment before 2030, as well as the possibility that software revenue grows more slowly than in the neutral case; even then, we think infrastructure investment would keep growing rapidly, just as it did in the early 2010s.
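The forecast arithmetic can be spelled out directly. The 20% ratio, the $7.0 trillion software base, and the $500 billion 2025 starting point are the article’s figures; the calculation simply applies the stated ratio.

```python
# The stated forecast ratio applied to the software-spending base.
software_spend_2030 = 7.0e12   # neutral-case global software spending, 2030
infra_ratio = 0.20             # assumed systems-investment share of software spend
infra_spend_2030 = infra_ratio * software_spend_2030   # ~$1.4 trillion

infra_spend_2025 = 5.0e11      # $500 billion in 2025
growth_multiple = infra_spend_2030 / infra_spend_2025  # ~2.8x over five years

print(infra_spend_2030, growth_multiple)
```

Note that 20% of $7.0 trillion is $1.4 trillion, a roughly 2.8x increase over 2025 spending.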

As AI-driven compute demand continues to grow, we expect custom chips’ share of compute spending to keep rising: at scale, the time and capital invested in designing chips for specific workloads translate into increasingly important performance-per-dollar advantages. We think custom ASICs could exceed one-third of the compute market by 2030.

Taken together, our research suggests that the infrastructure buildout now underway is not a bubble about to burst, but the foundation of a once-in-a-generation platform-level transformation. ARK forecasts that annual AI infrastructure spending will approach $1.4 trillion in 2030, a market driven by real, continuously accelerating demand from both consumers and enterprises, with steadily falling costs validating and unlocking new use cases. We believe the companies that stand out over the next five years will be those that can design the most efficient chips, build the strongest models, and deploy both at massive scale.

As NVIDIA CEO Jensen Huang noted on the company’s Q4 FY2026 earnings call, truly practical AI agents have begun rolling out at scale only in the past few months. They consume enormous numbers of tokens, but their capabilities far exceed what most users were previously accustomed to in AI products. Scaling these agents to millions of enterprises will be extremely compute-intensive, and in our view the resulting productivity gains will be well worth the investment.
