Inside the AI Compute Boom


👋 Hi, it’s Rohit Malhotra and welcome to the FREE edition of Partner Growth Newsletter, my weekly newsletter doing deep dives into the fastest-growing startups and S1 briefs. Subscribe to join readers who get Partner Growth delivered to their inbox every Wednesday morning.

Latest posts

If you’re new, not yet a subscriber, or just plain missed it, here are some of our recent editions.

 Partners

Learn AI in 5 minutes a day

This is the easiest way for a busy person to learn AI in as little time as possible:

  1. Sign up for The Rundown AI newsletter

  2. They send you 5-minute email updates on the latest AI news and how to use it

  3. You learn how to become 2x more productive by leveraging AI

Interested in sponsoring these emails? See our partnership options here.

Subscribe to the Life Self Mastery podcast, which guides you on getting funding and growing your business like a rocket ship.

Previous guests include Guy Kawasaki, Brad Feld, James Clear, Nick Huber, Shu Nyatta and 350+ incredible guests.

The Age of Compute Power

In late 2025, as generative AI models push deeper into everyday life—from copilots in enterprise software to synthetic biology labs—another story is quietly unfolding in the background: the global race for compute power. Across continents, data centers are being built at a pace not seen since the dawn of the internet. They hum 24/7, stacked with servers running the foundation models, large language models, and machine learning applications that define the modern AI economy.

Amid the AI boom, compute power has emerged as one of this decade’s most critical and constrained resources. The processors, memory, storage, and energy needed to fuel AI have turned physical infrastructure into the new frontier of innovation and competition. In an era where software intelligence scales exponentially, it’s the hardware underneath—the data centers, chips, and gigawatts—that now determine who leads and who follows.

McKinsey's analysis shows the scale of the transformation: by 2030, data centers will require nearly $6.7 trillion in capital expenditures to meet compute demand. Of that, $5.2 trillion will go toward AI-specific workloads—training and inference for ever-larger foundation models—while $1.5 trillion will support traditional IT applications. In total, the global compute economy is on track to rival the GDP of Japan.

Behind these figures lies a sprawling, interdependent value chain—from the real estate developers building hyperscale campuses to the utilities wiring them with clean energy, from semiconductor firms crafting next-generation GPUs to the cloud giants provisioning staggering volumes of data. Each link in this chain is grappling with the same dilemma: how to deploy capital fast enough to capture the opportunity without stranding assets in a technology cycle that evolves every six months.

The stakes are enormous. Overinvest, and you risk empty data halls and stranded capital. Underinvest, and you risk being shut out of the future of intelligence itself. The coming decade will test not just balance sheets but foresight—decisions on where and how to allocate compute will shape national competitiveness, corporate hierarchies, and innovation ecosystems.

Global Compute Economy

By mid-2025, the scramble to forecast the world’s appetite for compute power has become one of the most complex—and consequential—exercises in business strategy. Every hyperscaler, semiconductor manufacturer, and sovereign AI initiative is now asking the same question: how much compute will the world actually need?

The answer is far from simple. Compute demand is expanding at a rate that defies traditional forecasting models. Global data center capacity is expected to nearly triple by 2030, with AI workloads accounting for roughly 70 percent of that growth. But this trajectory rests on two pivotal uncertainties: how rapidly enterprises turn AI into real economic value, and how efficiently new technologies reshape the cost curve of compute itself.

1. The Application Layer Drives the Real Demand

The first uncertainty lies in AI use cases. The value in AI doesn’t come from the models—it comes from how organizations operationalize them. Today’s enterprise adoption is still early-stage, often limited to copilots, content generation, and analytics augmentation. But as AI systems move deeper into decision-making, R&D, and automation, demand for compute could explode.

If these applications deliver tangible productivity gains—measured not in novelty but in dollars—compute consumption could accelerate beyond every current forecast. On the other hand, if enterprise adoption stalls or ROI remains elusive, trillions in projected infrastructure spending could vaporize. In that sense, compute investment is less about technology forecasting and more about betting on the pace of business transformation.

2. The Efficiency Paradox

The second uncertainty lies in technological disruption. Every few months, new breakthroughs reshape assumptions about performance per watt and cost per flop.

In early 2025, DeepSeek, a leading Chinese LLM developer, announced that its V3 model achieved remarkable efficiency gains—roughly 18× lower training costs and 36× lower inference costs than OpenAI’s GPT-4o. It was the kind of improvement that, on paper, should flatten the compute demand curve.

But here’s the paradox: efficiency doesn’t always reduce consumption—it often fuels more of it. Each leap in cost efficiency triggers more experimentation, more retraining, and more model deployment. The net result is that total compute demand rarely contracts. Instead, it compounds. The more efficient compute becomes, the more it’s consumed.

This “efficiency paradox” has defined every technological epoch—from semiconductors to cloud—and AI is no different. McKinsey’s analysis suggests that while individual model costs will fall, the total market’s compute usage will continue to rise sharply through 2030.

3. The Scale of Investment

Translating demand into dollars, the numbers are staggering. To support AI workloads alone, $5.2 trillion in capital investment will be required by 2030. That equates to 156 gigawatts of AI-related data center capacity, with 125 GW added between 2025 and 2030—a buildout exceeding the entire installed nuclear capacity of the United States.

When accounting for traditional IT workloads, the total compute infrastructure investment climbs to $6.7 trillion, underscoring just how central data centers have become to global economic competitiveness.

To frame it differently: the race to scale compute is not just a technology story—it’s an industrial revolution in capital formation.

4. Scenarios for the Next Five Years

Given the volatility of the AI landscape, a single forecast is insufficient. Instead, we model three scenarios:

  • Accelerated Demand: A breakout trajectory with 205 GW of incremental capacity and nearly $7.9 trillion in capital expenditures.

  • Base Case: A balanced growth path requiring 125 GW and $5.2 trillion in spending.

  • Constrained Demand: A slower buildout with 78 GW and $3.7 trillion invested—tempered by efficiency gains, regulation, or supply chain friction.

In every scenario, one truth persists: the scale of investment required dwarfs anything seen in prior technology cycles.
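
As a rough cross-check on these scenario figures, the implied capital intensity per gigawatt can be computed directly from the numbers above. A minimal sketch in Python; the per-GW outputs are illustrative back-of-the-envelope values, not figures from the underlying analysis:

```python
# Back-of-the-envelope: implied capital cost per incremental gigawatt of AI
# capacity in each scenario, using only the capacity and capex figures above.
scenarios = {
    "Accelerated": {"capacity_gw": 205, "capex_usd": 7.9e12},
    "Base case":   {"capacity_gw": 125, "capex_usd": 5.2e12},
    "Constrained": {"capacity_gw": 78,  "capex_usd": 3.7e12},
}

for name, s in scenarios.items():
    dollars_per_gw = s["capex_usd"] / s["capacity_gw"]
    print(f"{name:>11}: ~${dollars_per_gw / 1e9:.0f}B of capital per incremental GW")
```

The arithmetic works out to roughly $39 billion, $42 billion, and $47 billion of capital per gigawatt across the three scenarios.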

5. The Forces Powering the Boom

Several forces are converging to sustain this momentum:

  • Mass adoption of generative AI. Training and inference workloads are multiplying, with inference expected to dominate by 2030 as models shift from R&D to real-world deployment.

  • Enterprise integration. From banking to manufacturing, companies are embedding AI across workflows—each deployment drawing on massive compute reserves.

  • Infrastructure competition. Hyperscalers are racing to build proprietary AI capacity as a strategic moat, optimizing every layer of the stack to reduce compute costs.

  • Geopolitical imperatives. Governments are pouring billions into sovereign compute capacity to secure economic independence and national security.

Compute Power Value Chain

Behind every AI breakthrough—every chatbot reply, image generation, or reasoning agent—stands an immense, unseen industrial machine. It stretches from the silicon in Taiwan to wind farms in Texas, from hyperscale data halls in Virginia to liquid-cooled racks outside Shanghai.

This ecosystem—known as the compute power value chain—is now one of the most capital-intensive supply networks in the world. It binds together five investor archetypes: Builders, Energizers, Technology Developers, Operators, and AI Architects. Each plays a distinct role in constructing and maintaining the physical foundation of artificial intelligence.

Collectively, they represent a $5.2 trillion capital race that will define the AI economy through 2030.

1. Builders

At the front line of the compute revolution are the Builders—real estate developers, design firms, and construction companies responsible for expanding global data center capacity.

Their challenge is brutally simple: build faster, cheaper, and smarter than ever before. Builders are expected to deploy $800 billion in AI-related capital expenditure by 2030, acquiring land, securing materials, and orchestrating complex supply chains of steel, fiber, and cooling systems.

But the bottlenecks are tightening. Skilled labor shortages, zoning restrictions, and power grid constraints mean that every new megawatt of data center capacity is harder to deliver. Some forward-looking firms are turning to modular construction, prefabricating large components off-site to accelerate assembly and reduce costs.

In this race, location becomes strategy. Builders who can identify optimal geographies—where land, energy, and latency intersect—will dominate. They are the master planners of the AI age.

2. Energizers

If Builders provide the skeleton, Energizers are the lifeblood of the compute economy. They include utilities, energy producers, cooling system manufacturers, and telecom operators—the industries wiring intelligence into the grid.

AI’s hunger for power is immense. Each hyperscale data center can consume as much electricity as a mid-sized city, and AI workloads are projected to add 125 GW of new demand by 2030. Energizers are therefore on track to invest $1.3 trillion into generation, transmission, and cooling infrastructure.

The pivot toward sustainable energy is both an opportunity and an existential constraint. As processor densities rise, thermal management becomes a limiting factor. Air cooling gives way to direct-to-chip liquid cooling and immersion systems. Meanwhile, utilities are diversifying their portfolios—investing in nuclear microreactors, geothermal energy, and long-duration storage—to sustain the relentless growth of AI computing.

By 2030, renewables are expected to power nearly half of all AI-related data centers, up from one-third today. For Energizers, the future is not just about keeping the lights on—it’s about keeping the machines thinking.

3. Technology Developers and Designers

At the center of it all sit the Technology Developers and Designers—the semiconductor firms, chipmakers, and IT hardware suppliers that produce the GPUs, CPUs, and servers running AI workloads.

They are the single largest investors in this ecosystem, expected to deploy $3.1 trillion in AI-related capital expenditures by 2030. Their fabs are the new cathedrals of the digital age, requiring billions in precision engineering and years to bring online.

But this group also faces the sharpest volatility. A handful of companies—NVIDIA, TSMC, Intel, AMD—control the lion’s share of global chip production. Any disruption in supply can cascade across the entire compute value chain.

Their path forward depends on scale, specialization, and supply diversification. Expanding fabrication capacity in the U.S., Europe, and Southeast Asia could mitigate geopolitical risk. Meanwhile, innovation at the architecture level—like domain-specific accelerators or chiplet-based designs—will shape the next generation of compute efficiency.

In short: the firms that design intelligence are now the ones that manufacture intelligence.

4. Operators

Next come the Operators—the hyperscalers, colocation providers, and GPU-as-a-service platforms that make compute accessible to the world. Their role is less about building and more about optimizing—driving utilization, efficiency, and automation across data centers already in operation.

Though not included in the $5.2 trillion capex figure, Operators are the market’s most influential demand signal. When Amazon, Microsoft, or Google announces a new region, the ripple effects extend across construction, energy, and silicon.

Operators are increasingly investing in AI-driven orchestration—software that balances workloads in real time to minimize energy waste and latency. Some are even designing custom silicon to control inference costs, a critical move as inference workloads begin to dominate AI operations by the end of the decade.

Their mantra: fewer idle servers, smarter power usage, and tighter integration between physical and digital layers.

5. AI Architects

Finally, at the top of the value chain are the AI Architects—model developers, foundation model providers, and enterprises building proprietary AI systems.

While their capital outlays are often buried within R&D budgets, their choices dictate the entire system’s direction. Each model architecture—whether GPT, Gemini, Claude, or DeepSeek—determines how much compute, memory, and energy the ecosystem must deliver.

Yet AI Architects face mounting pressure. Inference costs—the expense of running models once trained—are spiraling. OpenAI’s o1 reasoning model, for instance, costs roughly six times more per inference than GPT-4o. That delta shapes everything from product pricing to hardware design.

To stay viable, Architects are exploring model optimization techniques like sparse activations and distillation—reducing the compute needed per query without sacrificing quality. In many ways, they are reverse-engineering intelligence to make it economically sustainable.

These are the designers of digital cognition—and the ultimate consumers of global compute supply.

Economics of AI Infrastructure

For all its promise, the AI infrastructure boom is not a simple growth story. It is an investment paradox—a race between technological acceleration and financial prudence.

By 2030, AI-related data center capacity could demand between $3.7 trillion and $7.9 trillion in capital outlays, depending on how fast adoption unfolds. The base case—$5.2 trillion—is staggering in itself, roughly equal to the annual GDP of Germany. Yet even this estimate masks the uncertainty that shadows every decision in the compute economy: how much capacity is enough?

Unlike traditional IT infrastructure, where demand curves were gradual and predictable, AI compute demand scales in quantum leaps. A single new model architecture, like OpenAI’s GPT-4 or DeepSeek’s V3, can trigger billions in incremental data center investment almost overnight.

As a result, capital allocators—from hyperscalers and chipmakers to sovereign funds—must walk a tightrope between overbuilding and falling behind.

1. The Risk of Overbuilding

History offers a cautionary parallel. During the telecom boom of the late 1990s, carriers poured billions into fiber networks that sat dark for years after the dot-com crash. Something similar could happen in compute if capacity ramps faster than AI adoption.

Overbuilding risk is real. Each hyperscale data center costs between $2 billion and $5 billion, depending on configuration. If demand plateaus or energy constraints limit operations, these assets can quickly become underutilized.

Moreover, technological obsolescence adds another layer of risk. AI hardware evolves on an 18-month cadence; a data hall built for H100 racks in 2025 could require a complete retrofit for Blackwell- or Rubin-generation architectures by 2027. The result: depreciation curves that look more like consumer electronics than industrial infrastructure.

To mitigate this, investors are adopting staged build strategies—phasing construction based on demand signals, modular designs, and forward contracts with compute buyers. Flexibility is becoming the new efficiency.

2. The Cost of Underinvestment

Yet being too conservative carries its own penalty. In AI, compute capacity is market share. The companies that can train and serve models faster—and cheaper—own the customer relationships, data pipelines, and application ecosystems that follow.

Underinvestment risks being locked out of the next generation of AI innovation. The economic consequences cascade: fewer foundation model experiments, slower product iteration, and dependency on foreign compute suppliers.

For nations, it becomes a matter of digital sovereignty. The U.S., China, and increasingly the EU are all treating compute as a strategic resource, not just a commercial one. Countries that fail to secure domestic AI capacity risk falling behind in both innovation and security.

In this sense, compute is not just capital expenditure—it’s industrial policy.

3. The New Rules of Capital Efficiency

Balancing these two extremes—excess and scarcity—requires a new playbook for capital efficiency in AI infrastructure. Three principles are emerging:

a. Dynamic Demand Forecasting
Traditional infrastructure forecasting assumes stability; AI shatters that. Investors now rely on AI-driven demand modeling that integrates LLM adoption curves, inference growth, and chip supply data. The goal is to predict not just usage—but behavioral inflection points where demand suddenly spikes.
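
As an illustration of what such a model might look like, here is a minimal sketch in Python: a logistic enterprise-adoption curve scaled to the capacity figures cited above (the 31 GW starting point is implied by 156 GW in 2030 minus 125 GW added; the adoption midpoint and steepness are hypothetical placeholders, not parameters from the analysis):

```python
import math

def adoption_share(year, midpoint=2027.5, steepness=0.9):
    """Hypothetical logistic S-curve for enterprise AI adoption (0 to 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (year - midpoint)))

def ai_capacity_gw(year, start_gw=31, added_gw=125):
    """AI data center capacity implied by the adoption curve, in gigawatts."""
    return start_gw + added_gw * adoption_share(year)

for year in range(2025, 2031):
    print(year, f"~{ai_capacity_gw(year):.0f} GW")
```

The logistic midpoint is exactly the kind of behavioral inflection point the text describes: demand looks flat on one side of it and steep on the other.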

b. Compute Elasticity as a KPI
Companies are starting to measure “compute elasticity”—the ability to scale capacity up or down without financial penalty. This includes flexible power agreements, modular campuses, and GPU leasing models that mirror cloud elasticity but on a physical layer.
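
The KPI is named here but not formally defined, so one hypothetical way to express it as a metric is sketched below (the 400 MW campus figures are made up for illustration):

```python
def compute_elasticity(flexible_mw: float, locked_in_mw: float, installed_mw: float) -> float:
    """Hypothetical KPI: share of installed capacity that can scale up or down
    without financial penalty (modular halls, leased GPU blocks, flexible power),
    net of capacity locked into take-or-pay or fixed-term commitments."""
    return (flexible_mw - locked_in_mw) / installed_mw

# Example: a 400 MW campus with 120 MW in modular or leased capacity,
# of which 20 MW is tied to take-or-pay power contracts.
print(f"{compute_elasticity(120, 20, 400):.0%} of capacity can flex penalty-free")
```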

c. Vertical Integration and Cross-Archetype Partnerships
No player can control the full value chain alone. Forward-thinking investors are co-developing ecosystems—chipmakers partnering with utilities, hyperscalers investing in renewable energy startups, or sovereign funds co-funding regional AI hubs. These partnerships reduce bottlenecks and create shared upside across archetypes.

4. The Geopolitics of Compute Capital

Beneath the spreadsheets lies geopolitics. Compute power has quietly become a strategic currency—as vital to 21st-century economies as oil was to the 20th.

Export controls on advanced chips, rising tariffs, and energy nationalism are fragmenting the global supply chain. The U.S. CHIPS Act, China’s domestic GPU drive, and Europe’s push for “technological sovereignty” all reflect a deeper truth: whoever controls compute capacity controls innovation velocity.

Investors must therefore evaluate not just financial ROI, but strategic exposure—where their capital sits within global supply risk maps. Proximity to stable grids, friendly regulation, and talent hubs will increasingly outweigh traditional tax or land cost advantages.

5. The Compute Dividend

Despite the volatility, the long-term economics of AI infrastructure remain compelling. Compute is the core enabler of every downstream AI application—from autonomous vehicles to personalized medicine, climate modeling, and robotics. Each new workload feeds back into the system, creating a compounding cycle of investment and innovation.

Those who play the long game—allocating capital not just to the fastest-growing regions but to the most resilient architectures—stand to capture extraordinary returns.

Compute may be costly, but intelligence is priceless.

Investment Scenarios

Forecasting compute demand over the next five years is not simply an exercise in data modeling—it’s an attempt to predict the speed of intelligence itself. Every new model, chip, and algorithm reshapes the curve. Yet one conclusion remains clear: AI is creating the largest wave of infrastructure investment since the Industrial Revolution.

McKinsey’s analysis estimates that by 2030, global data center capacity will almost triple, requiring a cumulative $6.7 trillion in capital investment. Of this, roughly $5.2 trillion will be directed toward AI workloads alone—servers, chips, memory, and power systems purpose-built for training and inference. The remaining $1.5 trillion will support traditional IT applications.

But how that capital unfolds depends on which future materializes. The compute economy could evolve along three distinct trajectories—each defined by different assumptions about AI adoption speed, hardware efficiency, and capital formation.

1. Scenario One: The Accelerated Demand Era

In this scenario, AI adoption outpaces all expectations, driven by breakthroughs in model performance and widespread enterprise integration. Foundation models become not just productivity tools but core infrastructure for global business operations, embedded in everything from manufacturing control systems to logistics and consumer applications.

  • Capacity Added (2025–2030): 205 GW

  • Capital Expenditure: $7.9 trillion

  • AI Share of Total Compute: 75–80%

This is the “AI Supercycle”, where hyperscalers, sovereign funds, and corporates all rush to secure capacity. Data centers become the new factories—vast, automated “AI plants” converting energy into intelligence.

In this world, power and land availability—not capital—become the primary constraints. Nations with stable grids and pro-infrastructure policy (such as the U.S., UAE, Singapore, and parts of Scandinavia) emerge as global compute exporters, hosting workloads for countries unable to meet their own energy or space demands.

Yet the downside risk is overextension. The $7.9 trillion capex wave could produce stranded assets if efficiency gains or regulatory shifts slow utilization. It’s a high-reward, high-risk cycle—driven by technological optimism and capital liquidity.

2. Scenario Two: The Base Case — “The Rational Boom”

This is the most likely path—a measured, sustainable buildout of AI capacity that mirrors past industrial cycles. AI continues to grow rapidly but within the boundaries of power, policy, and prudent corporate adoption.

  • Capacity Added (2025–2030): 125 GW

  • Capital Expenditure: $5.2 trillion

  • AI Share of Total Compute: ~70%

Here, hyperscalers continue to dominate investment, accounting for more than half of new capacity. However, the capital stack begins to diversify: infrastructure funds, sovereign wealth funds, and energy companies increasingly participate in compute projects.

Investors favor modular data centers, hybrid power models, and flexible chip architectures to hedge against volatility. Efficiency gains—while real—are offset by the scale effect of AI proliferation: every new application, model, and startup drives incremental compute demand.

This “rational boom” scenario assumes that the AI economy continues to compound without a speculative bubble or sharp correction—resulting in a durable, trillion-dollar infrastructure opportunity.

3. Scenario Three: The Constrained Future

In the constrained-demand case, global compute demand grows—but at a slower pace. Efficiency breakthroughs, model compression, and regulatory pressure cap the speed of infrastructure expansion.

  • Capacity Added (2025–2030): 78 GW

  • Capital Expenditure: $3.7 trillion

  • AI Share of Total Compute: ~60%

This future could emerge if AI adoption hits friction—for instance, if enterprise ROI fails to materialize, data regulation slows innovation, or energy grid limits restrict expansion.

While this scenario tempers the growth story, it doesn’t eliminate it. Even with conservative adoption, compute demand still represents one of the most capital-intensive sectors on Earth, with steady 10–12% CAGR in capacity and multi-decade investment durability.

4. The Power Equation

Across all scenarios, one constraint dominates the forecast: energy.
Compute is not abstract—it’s physical. A single AI data center can consume up to 500 MW, equivalent to powering a mid-sized city. As global AI capacity scales beyond 150 GW, the challenge shifts from chip availability to grid capacity and sustainable generation.

By 2030, the AI ecosystem could consume more than 1,000 terawatt-hours of electricity annually—roughly the combined usage of Japan and Germany today. This will drive massive secondary investment in renewables, nuclear, and energy storage, as well as innovations in direct-to-chip liquid cooling and heat recapture technologies.
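
A quick sanity check on that electricity figure, converting installed AI capacity into annual consumption (a sketch; the 75 percent average utilization is an assumption, not a figure from the analysis):

```python
HOURS_PER_YEAR = 8760  # 24 * 365

def annual_twh(capacity_gw: float, avg_utilization: float) -> float:
    """Annual electricity use in TWh for a given installed capacity and utilization."""
    return capacity_gw * HOURS_PER_YEAR * avg_utilization / 1000  # GWh -> TWh

# 156 GW of AI capacity (the base-case 2030 figure cited earlier) at 75% utilization
print(f"~{annual_twh(156, 0.75):,.0f} TWh per year")  # ~1,025 TWh
```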

Energy, not semiconductors, could soon become the rate-limiting factor for AI growth.

5. Forecast to Strategy

The difference between the $3.7 trillion and $7.9 trillion worlds will hinge on five signals:

  1. Inference Costs: Whether efficiency gains outpace model complexity.

  2. Power Availability: Grid modernization and renewable capacity timelines.

  3. Capital Liquidity: Access to long-term financing from sovereign and institutional investors.

  4. Enterprise ROI: The measurable value of AI deployments in revenue and productivity.

  5. Geopolitical Stability: The trajectory of export controls, tariffs, and regional alliances.

Tracking these indicators will help investors, governments, and enterprises gauge which scenario the world is moving toward—and adjust their capital strategies accordingly.

6. Shape of the Compute Economy

Regardless of the path, the trendline is unmistakable: AI compute is becoming a macroeconomic force.
The capital intensity of this transformation rivals that of the railroads, the power grid, and the internet combined.

By 2030, compute power will define national competitiveness, corporate valuations, and innovation velocity. The winners will be those who not only invest early—but invest intelligently, balancing speed, sustainability, and strategic control.

The next industrial age won’t be powered by steam or silicon alone—it will be powered by intelligence at scale.

Critical Challenges Ahead

For all the optimism surrounding the trillion-dollar AI infrastructure boom, there is a harder, more sobering reality: compute is not infinite.
The very forces driving this revolution—model growth, power demand, and global competition—are also introducing structural constraints that threaten to slow it down.

The race to scale intelligence is entering a new phase—one where the bottlenecks are no longer just technical, but physical, financial, and geopolitical.

Power Shortages and Grid Constraints

The first and most urgent challenge is energy.
AI data centers are voracious power consumers. A single hyperscale site can draw between 300 and 500 megawatts (MW), and global AI-related compute could surpass 150 gigawatts (GW) of demand by 2030. That’s equivalent to the current energy needs of countries like France or Brazil.

Yet the world’s electrical grids were never designed for this.
Delays in transmission build-outs, permitting bottlenecks, and inconsistent renewable integration are already slowing new data center projects. In the U.S., over 2 terawatts of clean-energy projects are stuck in interconnection queues; in Europe, similar grid delays stretch to five years.

This power bottleneck is becoming the defining constraint of the AI age. Nations that can expand generation and transmission—particularly via nuclear, geothermal, hydro, and advanced battery storage—will dictate where intelligence clusters form. In essence, energy policy has become compute policy.

Supply Chain Fragility and Chip Concentration

Compute power depends on a remarkably narrow industrial base.
A handful of companies—NVIDIA, TSMC, ASML, Samsung, and Intel—control the critical nodes of chip design and fabrication. The world’s most advanced GPUs are produced in a single geography: Taiwan, which fabricates more than 90% of global leading-edge semiconductors.

This concentration creates systemic risk. Any disruption—whether from geopolitical tension, export controls, or natural disasters—could ripple across the global AI economy within weeks.

Moreover, supply chains remain constrained. High-bandwidth memory (HBM), essential for AI training workloads, faces chronic shortages. So do advanced packaging and liquid cooling components, creating a domino effect in cost and delivery timelines.

In response, the U.S., EU, Japan, and India are all deploying industrial policy to “onshore” compute manufacturing through subsidies and incentives. But scaling fabrication is a decade-long endeavor, not a quarterly one. Until diversification materializes, semiconductors remain the single point of failure in the global AI build-out.

Capital Intensity and Return Uncertainty

The sheer scale of investment—between $3.7 trillion and $7.9 trillion by 2030—raises a question that few want to ask: will all this capital earn a return?

AI infrastructure projects are long-duration, capex-heavy, and uncertain in utilization. The technology refresh cycle for chips is 12–18 months, but data centers are built on 20-year depreciation schedules. This mismatch introduces risk.
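
A simple way to see the mismatch is to put the two depreciation clocks side by side. A sketch with hypothetical dollar figures; neither capex number comes from the analysis above:

```python
# Hypothetical single-site economics: the building depreciates over 20 years,
# but the accelerators inside it turn over on the chip refresh cadence.
facility_capex = 3.0e9        # shell, power, cooling (illustrative)
facility_life_years = 20      # typical book depreciation schedule
gpu_fleet_capex = 1.5e9       # accelerators and servers (illustrative)
gpu_refresh_years = 1.5       # 12-18 month hardware cadence

facility_depreciation_per_year = facility_capex / facility_life_years
gpu_refresh_spend_per_year = gpu_fleet_capex / gpu_refresh_years

print(f"Facility depreciation: ${facility_depreciation_per_year / 1e6:,.0f}M / year")
print(f"GPU refresh spend:     ${gpu_refresh_spend_per_year / 1e6:,.0f}M / year")
```

Under these assumptions the recurring hardware spend dwarfs the building’s depreciation, which is the heart of the stranded-asset concern.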

Hyperscalers can absorb volatility through scale and balance sheet strength. But smaller operators, energy providers, and infrastructure funds face exposure to stranded asset risk if utilization dips or efficiency gains outpace deployment.

In essence, the cost of being wrong is rising. The next generation of investors must treat compute power not as a one-way growth story, but as a high-beta infrastructure class that requires hedging, flexibility, and multi-scenario planning.

Technological Volatility

AI progress is exponential—but so is obsolescence.
Every new generation of chips and architectures redefines the economics of compute. DeepSeek’s V3 model, launched in early 2025, reduced training costs 18× and inference costs 36× compared to GPT-4o. Yet instead of lowering aggregate demand, it triggered a wave of new model training experiments—canceling out the efficiency gain.

This is the efficiency paradox: the cheaper it becomes to compute intelligence, the more we compute.
History offers analogies—from the Jevons paradox in energy economics to Moore’s Law in chips—but AI’s scale magnifies it. As models become more capable, they demand larger datasets, longer context windows, and deeper reasoning—all of which multiply compute requirements.
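
The paradox can be framed as a simple price-elasticity argument: if demand for compute responds more than proportionally to falling unit costs, total consumption rises even as each unit gets cheaper. A toy illustration in Python (the elasticity value of 1.4 is hypothetical; only the 18× cost reduction comes from the DeepSeek example above):

```python
def total_compute_demanded(unit_cost: float, elasticity: float, k: float = 1.0) -> float:
    """Toy constant-elasticity demand curve: Q = k * cost^(-elasticity)."""
    return k * unit_cost ** (-elasticity)

before = total_compute_demanded(unit_cost=1.0, elasticity=1.4)
after = total_compute_demanded(unit_cost=1.0 / 18, elasticity=1.4)  # 18x cheaper training
print(f"Compute consumed rises ~{after / before:.0f}x despite the efficiency gain")
```

With any elasticity above 1, cheaper compute strictly increases total compute consumed—the Jevons logic applied to FLOPs.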

Efficiency will buy time—but it won’t buy equilibrium.

Geopolitical Fragmentation

Compute power is now a strategic resource, on par with oil and semiconductors.
Governments see data centers, chips, and AI clusters as tools of national influence and resilience. Export bans, tariffs, and data-sovereignty laws are fragmenting the once-global AI supply chain into regional compute blocs.

  • The U.S. leads in GPU design and hyperscale infrastructure.

  • China is racing to self-sufficiency through indigenous chip development and sovereign AI factories.

  • The EU is emphasizing sustainability, regulation, and energy diversification.

  • The Middle East is positioning itself as a neutral compute hub, powered by cheap energy and sovereign wealth.

This fragmentation introduces friction but also diversification. We are entering an era where compute alliances—cross-border agreements on energy, data, and chip sharing—will define who participates in the next wave of AI growth.

Sustainability and the Environmental Question

Perhaps the most under-examined challenge is climate impact.
AI’s energy intensity is climbing rapidly. By some estimates, inference and training workloads could consume 4–6% of global electricity by 2030—up from less than 1% today. Water usage for cooling already exceeds a million liters per data center per day in arid regions.

If left unchecked, the compute revolution could undermine global decarbonization targets.
The industry is responding—pushing for zero-carbon power, heat recovery systems, and AI-optimized energy routing—but scaling these solutions will take coordinated effort across utilities, regulators, and tech firms.

The irony is sharp: the same AI systems that help optimize grids and model climate risk are now contributing to it.

The Human Constraint

Finally, even with capital and chips secured, there’s one bottleneck money can’t immediately solve—talent.
Building and operating AI-ready data centers requires an interdisciplinary workforce: electrical engineers, cooling specialists, data architects, and ML infrastructure experts. The industry faces a shortage across all these roles.

Labor scarcity is already delaying projects in North America and Western Europe, with skilled technicians commanding 30–50% wage premiums. Without workforce expansion, the pace of physical deployment could lag far behind demand projections.

Playbook: Winning the Compute Race

The race for compute power is not a sprint—it’s a generational marathon. The companies that emerge as leaders won’t simply outspend their peers; they’ll outstrategize them across three dimensions: demand intelligence, efficiency innovation, and supply-side resilience. Winning in this trillion-dollar infrastructure cycle requires a clear playbook—one that blends foresight, flexibility, and operational precision.

1. See the Demand Before It Arrives

The first rule of the compute race: anticipate, don’t react.
Winners will build dynamic demand-forecasting models that integrate signals from AI model development, enterprise adoption, and macro-energy trends. These systems must go beyond backward-looking utilization metrics and model the pace of innovation—how fast context windows are expanding, how inference workloads are scaling, and which industries are deploying AI at volume.

The best investors and hyperscalers are already using AI to forecast AI, deploying predictive analytics to optimize where, when, and how to expand capacity. Those that can read demand shifts early will time capital deployment with surgical precision—avoiding both stranded assets and capacity shortfalls.

2. Design for Efficiency at Every Layer

Raw scale alone is no longer the advantage. Efficiency is.
From chips to cooling, leaders are designing systems where every watt and every FLOP compounds productivity. Semiconductor firms are racing to deliver domain-specific accelerators and energy-optimized architectures. Data center operators are adopting liquid cooling, modular rack design, and AI-driven energy routing to cut costs and emissions simultaneously.

The frontier of competition is now cost per unit of intelligence, not just cost per unit of compute. Companies that can train and deploy models faster, cheaper, and cleaner will create compounding cost advantages that rivals can’t easily replicate.

3. Build Supply-Side Resilience

As power grids strain and chip supply chains tighten, resilience becomes strategy.
Winning players are locking in long-term power purchase agreements, investing in micro-grids and on-site generation, and forging alliances with energy innovators in nuclear, geothermal, and long-duration storage. On the chip front, forward-thinking hyperscalers are vertically integrating—designing custom silicon, diversifying fabrication partners, and pre-ordering GPU capacity years in advance.

The logic is simple: secure the bottlenecks before they become bottlenecks.

4. Partner Across the Stack

No single company can master every layer of the compute economy. The leaders of the next decade will build ecosystems, not empires.
From sovereign governments funding AI clusters to cloud providers partnering with semiconductor startups, cross-stack collaboration will define competitive advantage. The most successful players will align incentives across developers, energy providers, and operators to capture the full compute value chain.

5. Balance Growth with Governance

Finally, winning sustainably means investing responsibly. Regulators are sharpening their focus on energy efficiency, carbon impact, and AI safety. Companies that integrate compliance, transparency, and sustainability from the outset will not only reduce risk—they’ll unlock access to cheaper capital and stronger public trust.

Compute as a Competitive Moat

In the first era of AI, breakthroughs came from algorithms. In the second, they came from data. But in this new era, compute itself has become the ultimate moat—the scarce, defensible resource separating those who merely deploy AI from those who define it.

Access to compute determines who can train frontier models, who can afford large-scale inference, and who can iterate fast enough to stay relevant as architectures evolve. Every advantage—product differentiation, model accuracy, time-to-market—now traces back to one question: how much compute can you command, and how efficiently can you use it?

The companies winning today—NVIDIA, Microsoft, Google, Amazon, and rising sovereign players in the Middle East and Asia—understand this. They are not just scaling infrastructure; they are transforming compute into strategic leverage. By controlling the full stack—from chip design to data center energy sources—they convert what was once a cost center into a barrier to entry.

Compute moats are also flywheels. The more compute an organization controls, the faster it can experiment, train, and deploy new models. Faster iteration drives better performance, attracting more users, generating more data—and justifying even greater compute investments. This compounding cycle mirrors the early cloud era, but with exponentially higher stakes.

However, moats come with responsibility. Unchecked concentration of compute in a few hands risks creating new asymmetries—between nations, companies, and innovators. The next decade will be defined not only by who builds the biggest AI factories, but by how equitably access to compute is distributed across ecosystems. The world will need both hyperscalers and open infrastructure alliances to sustain balanced innovation.

In the end, the frontier of AI will not be limited by ideas or capital—it will be limited by compute. Those who understand its scarcity, secure it early, and deploy it wisely will shape the trajectory of the intelligent economy.

Casey Woo is the founder of Operators Guild, a community of 800+ top operators. He is also a General Partner at FOG Ventures, where he's witnessing how AI is killing the 18-month fundraising cycle, enabling 1000x returns in half the time.

In this conversation, Casey and I discuss:

  • How has the “two founders, two laptops, and $20/month ChatGPT” reality changed startup economics?

  • Is defensibility in AI gone for good?

  • From the operator and investor perspective, what pricing models are actually working in today’s market?

If you enjoyed our analysis, we’d very much appreciate you sharing with a friend.

Tweets of the week

Here are the options I have for us to work together. If any of them are interesting to you - hit me up!

And that’s it from me. See you next week.

What do you think about my bi-weekly Newsletter? Love it | Okay-ish | Stop it
