Authored by Holden Spaht
“As long as the music is playing, you’ve got to get up and dance,” Citi’s former CEO Chuck Prince infamously commented in the summer of 2007 — just months before the financial crisis. Some investors are finding themselves at risk — knowingly or not — of a similar dilemma when it comes to today’s massive generative AI infrastructure build-out.
The bull case for building and investing in AI infrastructure is well-known. AI is a transformative technology that is expected to sweep across every economic sector, much like the web and mobile phases of digital transformation. Evidence of productivity gains from AI is emerging across industries, and expectations for what might be possible in the next few years are stunning and fascinating, even if somewhat speculative.
Building out the infrastructure for AI has already been an incredible (and profitable) experience for the companies that supply the “picks and shovels” and the investors who own them. NVIDIA is the icon — stock up 200x in a decade, sales last year up 265%, and now the third-largest US company by market capitalization, with margins around 75%. AMD (an NVIDIA competitor) estimates that the total addressable market for bespoke AI chips might hit $400 billion by 2027. And though GPU chips get the most attention, the infrastructure system includes many other interconnected elements such as memory, network cards, and hyperscale cloud storage, all connected with high-speed networking in data centers that need to be powered and cooled.
Estimates at this stage should be taken with healthy skepticism as the market continues to evolve rapidly, but reasonable assumptions point to an overall AI infrastructure market of over $1 trillion by 2027. (To show how disparate projections can be, Sam Altman of OpenAI would argue that my framing of a “reasonable” market is actually far too conservative — by at least sevenfold. And that’s for semiconductor production alone, which represents just a slice of the overall AI infrastructure stack.)
To put it into context: total global IT expenditures in 2023 were around $4.7 trillion, about 3.3% higher than the year before. Simply adding new AI infrastructure investments to existing trends could double the growth rate to over 7%. You don’t need science fiction scenarios about robots taking over the world to make the case that the modern AI infrastructure build-out will be one of the great investment opportunities of this decade.
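The growth-rate claim above can be checked with back-of-envelope arithmetic. The sketch below uses the article’s figures ($4.7 trillion of 2023 IT spending, ~3.3% baseline growth); the net-new annual AI infrastructure spending of roughly $180 billion is an illustrative assumption, consistent with a market ramping toward $1 trillion by 2027, not a figure from the article.

```python
# Back-of-envelope check of the "growth rate could double to over 7%" claim.
base_it_spend = 4.7e12      # 2023 global IT expenditures (USD), per the article
baseline_growth = 0.033     # ~3.3% year-over-year growth, per the article

organic_increase = base_it_spend * baseline_growth   # ~$155B of "normal" growth
ai_increment = 180e9        # ASSUMED net-new annual AI infrastructure spend

combined_growth = (organic_increase + ai_increment) / base_it_spend
print(f"Baseline growth: {baseline_growth:.1%}")
print(f"Combined growth: {combined_growth:.1%}")
```

Under these assumptions the combined growth rate comes out to roughly 7%, about double the baseline — the magnitude the paragraph describes.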
Of course, there is also the contrarian bear case about an AI bubble. Consider this chart that maps the path of NVIDIA stock over the last few years to Cisco, the poster child stock of the dot-com infrastructure build-out in the 1990s. It’s uncanny.
History may not repeat, but sometimes it rhymes. The 1990s birth of the World Wide Web came with confident proclamations of a “new economy with new rules” and an explosion of new start-up companies where only customer acquisition and data — not turning a profit — mattered. The internet drove a projected infrastructure demand for massive fiber optic cables and switches to carry what was supposed to be a historic exponential increase in internet traffic across what we then called “the information superhighway.”
The fiber optic network build-out was indeed historic. In just the two years leading up to 2001, roughly 100 million miles of fiber optic cable were laid at a cost (in today’s dollars) of about $65 billion. But the dot-com bubble burst was also an internet infrastructure burst — demand projections and the economics of that massive infrastructure investment collapsed. By 2001, an estimated 95% of the newly laid fiber was “dark” (unutilized), and almost $14 billion of telecommunications bonds were in default halfway through the year.
In some cases, the price of bandwidth fell by as much as 80%. As a result, the most iconic fiber optic infrastructure stocks of the time collapsed: Global Crossing hit a market cap of around $47 billion in 1999; in 2002, it filed for bankruptcy and was acquired almost a decade later — for $3 billion, including the assumption of just over $1 billion of debt — by Level 3 Communications, another major fiber optic provider that was once dubbed “the best-funded startup in history.” Level 3’s stock price tells a similar story.
To be clear, while investors in these early infrastructure build-outs were battered by losses, the physical and technical infrastructure they paid for didn't disappear or go to waste. In fact, the underutilized fiber later became the foundation for the next generation of digital innovators (think Google and YouTube) who used underpriced infrastructure to super-charge their business models. As Carlota Perez explains so well in Technological Revolutions and Financial Capital, this is great for the development of technology and the overall economy — but that’s cold comfort to the bondholders and other investors in the red who paid for it all.
So, what does history suggest about the investment case for AI infrastructure in 2024?
I have a somewhat contrarian point of view on where I think the risks will reveal themselves. The question that most bearish investors pose right now is whether we are building too much infrastructure too quickly and whether the level of demand for that infrastructure will match the optimists’ projections.
From my perspective, it’s not so much the volume of demand that will be the problem. Rather, it is the nature and shape of that demand. We could indeed end up in an oversupply situation — not because demand isn’t strong, but because we are building the wrong kind of technical infrastructure. Supply-side innovation is what’s missing from the bears’ story. I would argue that cheaper and more distributed infrastructure may ultimately be more appropriate (and cost-efficient) to handle most of the demand.
Put differently, my hypothesis is that the demand for AI services is more likely to surprise on the upside — it may meet or even exceed projections — because I agree that AI is a transformative general-purpose technology that will shape industries and sectors across the economy. But I see three indicators why the infrastructure needed to meet that massive demand might very well be different (and likely smaller) than most investors today expect.
The first indicator: Smaller and open-source language models demonstrate surprisingly robust performance on many business-relevant tasks. The chart below shows how the performance of open-source models is improving and converging on a major metric with those of the most expensive proprietary models.
Compared to intricately complex and expensive-to-run large language models (the better-known LLMs), small language models (SLMs) run leaner — and for enterprises that use them, that means access to affordable AI that may otherwise have been out of reach.
Companies in areas such as drug discovery research and advanced material science may need the most advanced and expensive frontier AI models for high-end work. But if a business aims to automate a call center with AI or optimize an inventory management system, it’s more likely to use a smaller and cheaper model — perhaps an open-source model — that requires much less elaborate infrastructure to fine-tune and run.
Many of the companies building foundation AI models are now building for exactly this future: in their most recent product releases, Google, Anthropic, and Mistral offered multiple versions of their models at different complexities and price points.
In each case, the smallest models can increasingly run on the edge, not in the cloud — in other words, on a laptop or even a smartphone. High-end infrastructure is still needed by AI companies to train their models, but once the technology is in the hands of its users to solve business problems (what’s called “inference”), the requisite infrastructure may not only be drastically less expensive; it may already be in the palm of your hand.
CIOs with limited budgets are deeply interested in this evolution of versioning for AI, as they should be. The idea of “AI-at-all-costs” (implementation of the technology without regard to immediate ROI for the enterprise) may have seemed like a growing norm in early 2023, but a year later, that freewheeling mindset seems to have already come back down to earth.
That points us to the second indicator of a different and smaller market for AI infrastructure: the emergence of finely curated, high-quality data sets and refined data engineering, making model training much more efficient.
The prominence of Databricks — a platform that prioritizes data engineering to scale and hones down AI infrastructure to its clients’ specific needs — is an important signal of this trend. (It’s no coincidence that Databricks just released its own high-performance open-source model as a complement to its data engineering services.)
Curated, domain-specific data sets lead to more efficient training runs and enable the construction of AI models that require fewer parameters to achieve better results — like replacing dull sledgehammers with sharp scalpels. The infrastructure needs here are migrating toward data engineering for efficiency.
The third infrastructure market indicator may seem obvious, but it’s easy to forget amid the hype: competition actually works to reduce prices and advance technology in ways that innovate around infrastructure constraints.
All the attention (and the profit margins) on the most advanced NVIDIA chips massively raised the urgency and stakes for competitors to build and deploy alternatives. And those alternatives are proliferating across the industry barely 18 months into the LLM era. AMD recently released its competitive MI300 chip. Google, Microsoft, and Amazon deploy bespoke chips on their respective cloud platforms. Specialized chip architectures from companies like Groq focus on particular parts of the AI value chain (for Groq, it’s inference, not training).
I do not believe this means the market for NVIDIA’s most advanced chips is going away. It’s also important to acknowledge the lock-in effects of the full NVIDIA development environment and the robust developer community that is already familiar with that integrated system.
Still, even in these early days of Gen AI, the incentives and pressure to create more cost-efficient alternative development environments with bespoke chips are already extraordinarily high. As the Econ 101 saying about supply and demand goes: the cure for high prices is high prices. For AI infrastructure, that cure may come sooner than many expected.
That also means that I don’t expect the demand for AI products and services to diminish over time (imagine how much the world will spend on AI chips in 2027 if inference gets much cheaper than it is in early 2024!). My point is that those expenditures will probably be spread over a much more diverse infrastructure base. In other words, the AI infrastructure could be much larger than we expect but also quite different from what the bulk of new investment is funding today.
While I intend this as a cautionary tale for investors, I want to be clear that it is not a pessimistic view of AI’s potential to set off a new era of digital transformation.
The internet and the digital economy of the 2000s changed the shape of the modern economy. That transformation took advantage of underpriced infrastructure assets paid for by investors who lost money and subsidized the next generation of technology with their losses. It was good for technology and the economy overall but not very good for the investors who got too far ahead of the game — and who fell behind in their understanding of the technology dynamics driving demand and supply.
The AI era is moving even faster, and investors will need to keep pace.
Originally published on LinkedIn