10x compute, 2x smarter
Every few months, a single quote rewires the entire AI investment narrative. This time it's Elon Musk's claim that applying 10x the compute to LLM training will effectively double a model's "intelligence." Morgan Stanley picked it up, published a sweeping report, and suddenly it became the number that justifies another hundred billion dollars in capital expenditure. The framing is seductive. More compute, more intelligence, more value. But if you're actually building with these models, the claim raises more questions than it answers. What does "2x smarter" even mean? And is raw intelligence really the thing holding anyone back?
The quote and the narrative machine
In a recent interview, Musk laid out the math simply: 10x compute yields roughly 2x intelligence. It's a shorthand for the scaling laws that have governed LLM training since OpenAI's landmark 2020 paper by Jared Kaplan and others, which showed that model loss falls as a smooth power law as you increase parameters, data, and compute. Morgan Stanley ran with it. In a March 2026 report, the bank warned that a "transformative leap" in AI is imminent, that scaling laws are "holding firm," and that most of the world isn't ready. The report pointed to OpenAI's GPT-5.4 scoring 83% on the GDPval benchmark as evidence that the curve is only getting steeper. This is how narratives get built. A researcher's observation becomes a founder's talking point, then an analyst's thesis, then a reason to deploy capital at unprecedented scale. But somewhere along the way, the nuance gets lost.
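Before accepting the "2x" translation, it's worth running the arithmetic those scaling laws actually support. The sketch below assumes the approximate compute exponent of about 0.05 reported in the Kaplan paper; the exact value depends on the fit, and nothing in the paper converts a drop in loss into a multiple of "intelligence."

```python
# Back-of-the-envelope scaling-law arithmetic. Kaplan et al. (2020) fit test
# loss to a power law in training compute, roughly L(C) proportional to
# C**(-alpha) with alpha around 0.05. The exponent is an approximation taken
# from that fit; the "2x smarter" mapping is not part of the paper.

alpha = 0.05  # approximate compute exponent from the Kaplan et al. fits

for multiplier in (10, 100, 1000):
    loss_ratio = multiplier ** (-alpha)  # L(k * C) / L(C) under the power law
    print(f"{multiplier:>5}x compute -> loss falls to {loss_ratio:.3f} "
          f"of its previous value ({(1 - loss_ratio) * 100:.1f}% lower)")

# Each additional 10x of compute buys roughly the same ~11% reduction in loss.
# Whether that constitutes "2x intelligence" depends entirely on how you map
# loss onto capability, which the scaling laws themselves never define.
```

The point of the exercise isn't the exact percentages but the shape: the curve promises steady, modest returns per order of magnitude of compute, and the leap from there to "doubled intelligence" happens outside the math.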
What does "2x smarter" actually mean?
This is the part nobody defines carefully enough. "Intelligence" in the LLM context isn't a single number. It could mean benchmark performance, where models score higher on standardized tests. It could mean reasoning depth, the ability to chain logical steps together. It could mean instruction following, reliably doing what you ask. Or it could mean something fuzzier, like generating output that feels more useful in practice. Each of these paints a very different picture of progress. On benchmarks, the gains look real but increasingly narrow. GPT-5's launch in mid-2025 was met with widespread disappointment. Cal Newport, writing in The New Yorker, called the improvements "more like the targeted improvements you'd expect from a software update than like the broad expansion of capabilities in earlier breakthroughs." Users on Reddit called it "the biggest piece of garbage even as a paid user." Gary Marcus summarized it as "overdue, overhyped and underwhelming." On reasoning, the story is even murkier. Apple researchers published a paper titled "The Illusion of Thinking" that found state-of-the-art reasoning models showed "performance collapsing to zero" when puzzle complexity was extended beyond a modest threshold. Researchers at Arizona State University went further, calling what AI companies label reasoning "a brittle mirage that vanishes when it is pushed beyond training distributions." So when someone says 10x compute will make models "2x smarter," you have to ask: smarter at what? Scoring higher on benchmarks that may not reflect real-world usefulness? Or actually solving the problems that matter to the people building products?
The scaling wall is already here
The uncomfortable truth is that pure pre-training scaling, the strategy that powered the leap from GPT-3 to GPT-4, has already hit diminishing returns. OpenAI's internal project Orion, which was supposed to become a blockbuster successor to GPT-4, disappointed. According to The Information, "the increase in quality was far smaller compared with the jump between GPT-3 and GPT-4." Musk's own xAI tried to brute-force its way ahead with Grok 3, training on roughly 100,000 H100 GPUs, many times the compute used for GPT-4. It didn't significantly outperform competitors. The industry's response has been a pivot to post-training improvements: reinforcement learning, chain-of-thought reasoning, and inference-time compute. These techniques squeeze more performance out of existing models rather than build bigger ones. Microsoft CEO Satya Nadella called it "a new scaling law." Others called it a "second era of scaling." But as Newport observed, this is less like building a faster car and more like tuning up the one you already have. A lot of utility can come from souping up a Camry, but no amount of tweaking will turn it into a Ferrari. The gap between GPT-3 and GPT-4 felt like a generational leap. The gap between GPT-4 and GPT-5 felt like a version bump. If 10x more compute can't even reliably produce a model that feels meaningfully better to its users, the "2x smarter" framing starts to look more like marketing than science.
The Jevons paradox angle
There's a more interesting question hiding behind the Musk quote: even if models do get smarter, does that reduce AI spending? History says no. This is the Jevons paradox at work. In the 1860s, economist William Stanley Jevons observed that as coal-powered engines became more efficient, total coal consumption went up, not down. Cheaper energy didn't mean less demand. It meant new uses, new industries, and more burning. The AI version plays out the same way. When DeepSeek showed you could build competitive models at a fraction of the cost, Nadella immediately invoked Jevons: "As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of." He's probably right, at least directionally. Cheaper intelligence means more people use it, for more things, more often. Companies that saved money on model costs will spend it on building agents, fine-tuning, evaluation pipelines, and inference at scale. The total spend goes up even as the unit cost goes down. This is great news if you're selling compute. It's less obviously great news if you're a builder hoping the next model will make everything cheaper and easier.
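A toy calculation makes the dynamic concrete. The prices and volumes below are invented purely for illustration; they are not DeepSeek's, Microsoft's, or anyone else's real figures.

```python
# Toy Jevons-paradox arithmetic with made-up numbers: the unit cost falls 10x,
# but demand grows faster than the price drops, so total spend still rises.

old_price_per_million_tokens = 10.00  # hypothetical dollars per million tokens
new_price_per_million_tokens = 1.00   # 10x cheaper after efficiency gains

old_volume = 1_000    # hypothetical monthly usage, in millions of tokens
new_volume = 30_000   # demand expands as new uses become economical

old_spend = old_price_per_million_tokens * old_volume  # $10,000
new_spend = new_price_per_million_tokens * new_volume  # $30,000

print(f"unit cost: 10x lower, total spend: {new_spend / old_spend:.0f}x higher")
```

Any demand growth that outpaces the price drop produces the same shape: the efficiency gain is real, and the bill still goes up.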
The real bottleneck isn't intelligence
Here's what the "10x compute, 2x smarter" framing completely misses: for most people building AI products today, model intelligence is not the bottleneck. The hard problems are orchestration, getting multiple AI components to work together reliably. They're evaluation, knowing whether your system is actually working. They're reliability, making sure it works the same way on the thousandth request as it did on the first. They're user experience, designing interfaces that make AI useful rather than frustrating. These are engineering problems, not intelligence problems. A model that's 2x smarter on benchmarks doesn't fix your agent architecture. It doesn't make your eval suite more comprehensive. It doesn't solve the fact that your users don't trust the output. The builders who are shipping real products aren't waiting for the next model. They're building robust systems around the models that already exist. The next GPT won't fix a broken feedback loop or a poorly designed workflow.
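To make "engineering problems, not intelligence problems" concrete, here is a minimal sketch of the scaffolding builders actually spend their time on. `call_model` is a hypothetical placeholder that simulates a flaky upstream API, not any particular vendor's client, and the eval check is deliberately simplistic.

```python
import random
import time

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a hosted LLM API; swap in your real client."""
    if random.random() < 0.2:  # simulate an intermittent upstream failure
        raise TimeoutError("upstream model timed out")
    return f"summary of: {prompt[:40]}"

def reliable_call(prompt: str, retries: int = 3, backoff: float = 0.5) -> str:
    """Retry with exponential backoff so the thousandth request behaves like
    the first. None of this changes when the underlying model gets smarter."""
    for attempt in range(retries):
        try:
            return call_model(prompt)
        except TimeoutError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)
    raise RuntimeError("unreachable")

def passes_eval(output: str) -> bool:
    """Toy output check. Real eval suites encode product-specific judgments
    that no benchmark score hands you for free."""
    return bool(output.strip()) and len(output) < 500

if __name__ == "__main__":
    result = reliable_call("Summarize this support ticket for the on-call engineer.")
    print(result, "| passes eval:", passes_eval(result))
```

A model that scores higher on a benchmark slots into this scaffolding exactly the same way; the retry logic, the eval suite, and the trust problem don't go away with it.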
Who actually benefits from 10x compute?
It's worth asking who the "10x compute" narrative actually serves. Morgan Stanley's report projects a net U.S. power shortfall of 9 to 18 gigawatts through 2028, a 12% to 25% deficit in the power needed to run AI infrastructure. Data center developers are converting Bitcoin mining operations, firing up natural gas turbines, and deploying fuel cells. The bank describes an emerging "15-15-15" dynamic: 15-year data center leases at 15% yields, generating $15 per watt in net value creation. This is an infrastructure story, not an intelligence story. The primary beneficiaries of 10x compute are the companies selling the compute: the GPU manufacturers, the cloud providers, the energy companies, and the data center operators. It's the same handful of companies that have been winning the entire AI cycle so far. For startups building AI writing tools or coding assistants, 10x more training compute in the next foundation model is largely irrelevant to their day-to-day challenges. They're not training models. They're calling APIs and trying to build reliable products on top of them.
Intelligence is a commodity, direction is not
The most important insight for builders right now might be this: intelligence is rapidly becoming a commodity. Multiple labs produce models of roughly comparable capability. Prices are falling. Access is democratizing. The gap between the best model and the fifth-best model matters far less than it did two years ago. What isn't commoditizing is taste. The ability to identify the right problem, frame it clearly, design a system that solves it elegantly, and ship something people actually want to use. No amount of compute produces that. More compute doesn't tell you which features to build. It doesn't tell you which problems are worth solving. It doesn't give you the judgment to know when AI is the right tool and when it isn't. The "10x compute, 2x smarter" narrative is compelling because it's simple. It turns the messy, uncertain future of AI into a clean input-output function. But the messy version is closer to the truth. The next generation of valuable AI companies won't be built by whoever has the most GPUs. They'll be built by the people who know what to point them at.
References
- Kaplan, J. et al., "Scaling Laws for Neural Language Models," OpenAI, January 2020. arxiv.org/abs/2001.08361
- Lichtenberg, N., "Morgan Stanley warns an AI breakthrough is coming in 2026, and most of the world isn't ready," Fortune, March 13, 2026. fortune.com
- Newport, C., "What if A.I. Doesn't Get Much Better Than This?," The New Yorker, August 12, 2025. newyorker.com
- Rosalsky, G., "Why the AI world is suddenly obsessed with Jevons paradox," NPR Planet Money, February 4, 2025. npr.org
- Zeff, M., "AI Scaling Laws Are Showing Diminishing Returns, Forcing AI Labs to Change Course," TechCrunch, November 20, 2024. techcrunch.com
- "Introducing GPT-5," OpenAI, 2025. openai.com/index/introducing-gpt-5
- Edwards, W., "Morgan Stanley says markets are unprepared for AI disruptions in the next few months," Business Insider, March 11, 2026. businessinsider.com
- "GPT-5 May Be Proof That Scaling Alone Can't Save AI," Artificial Corner, September 2025. medium.com