screen time

The Future of AI Shouldn’t Be Taken at Face Value


It costs a lot to build an AI company, which is why the most competitive ones are either existing tech giants with an abundance of cash to burn or start-ups that have raised billions of dollars largely from existing tech giants with an abundance of cash to burn. A product like ChatGPT was unusually expensive to build for two main reasons. One is constructing the model itself, in this case a large language model, by extracting patterns and relationships from enormous amounts of data using massive clusters of processors and a lot of electricity. This is called training. The other is actively providing the service, allowing users to interact with the trained model, which also requires access to or ownership of a lot of powerful computing hardware. This is called inference.
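To make the distinction concrete, here is a minimal, purely illustrative sketch in PyTorch (an assumption; the article names no tools). The toy model and random tokens are stand-ins for a real LLM and its training corpus, but the split is the same one the big labs pay for at vastly larger scale: a gradient-updating training step done up front, and forward-only inference paid on every user request.

```python
# Minimal sketch: the two cost centers of an LLM product.
# The tiny model and fake data below are illustrative stand-ins.
import torch
import torch.nn as nn

vocab_size, embed_dim, seq_len = 100, 32, 16
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),  # stand-in for many transformer layers
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# --- Training: extract patterns from data (huge one-time compute bill) ---
tokens = torch.randint(0, vocab_size, (8, seq_len))  # fake token batch
inputs, targets = tokens[:, :-1], tokens[:, 1:]      # predict the next token
logits = model(inputs)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()   # computing gradients is what makes training expensive
optimizer.step()
optimizer.zero_grad()

# --- Inference: serve the trained model (paid again on every request) ---
with torch.no_grad():  # no gradients, so cheaper per call, but never free
    prompt = torch.randint(0, vocab_size, (1, 4))
    for _ in range(8):  # one forward pass per generated token
        next_token = model(prompt)[:, -1].argmax(dim=-1, keepdim=True)
        prompt = torch.cat([prompt, next_token], dim=1)
```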

After ChatGPT was released in 2022, money quickly poured into the industry — and OpenAI — based on the theory that training better versions of similar models would become much more expensive. This was true: Training costs for cutting-edge models have continued to climb (“GPT-4 used an estimated $78 million worth of compute to train, while Google’s Gemini Ultra cost $191 million for compute,” according to Stanford’s AI Index Report for 2024). Meanwhile, training also got a lot more efficient. Building a “frontier” model might still be out of reach for all but the largest firms due to the sheer scale involved, but training a fairly functional large language model — one with capabilities similar to the frontier models of just a year ago — has become relatively cheap. Inference, too, has become much more affordable, meaning that deploying AI products once they’ve been built has gotten cheaper. The result was that companies trying to get users for their AI products were able, or at least tempted, to give those products away for free, either as open access to chatbots like ChatGPT or Gemini or built into software that people already use. Plans to charge for access to AI tools were further complicated by the fact that basic chatbots, summarization, text generation, and image-editing tools were suddenly and widely available for free; inference has gotten cheap enough that Apple Intelligence, for example, can handle a lot of it directly on users’ iPhones and Macs rather than in the cloud.

These industry expectations — high and rising training costs, falling inference costs, and downward price pressure — set the direction of AI funding and development for the past two years. In 2024, though, AI development swerved in a major way. First, word started leaking from the big labs that straightforward LLM scaling wasn’t producing the results they’d hoped for, leading some in the industry to worry that progress was approaching an unexpected and disastrous wall. AI companies needed something new. Soon, though, OpenAI and others got results from a new approach they’d been working on for a while: so-called reasoning models, starting with OpenAI o1, which, in the company’s words, “thinks before it answers,” producing a “long internal chain of thought before responding to the user” — in other words, doing something roughly analogous to running lots of internal queries in the process of answering one. This month, OpenAI reported that, in testing, its new o3 model, which is not available to the public, had jumped ahead on industry benchmarks; AI pioneer François Chollet, who created one of those benchmarks, described the model as “a significant breakthrough in getting AI to adapt to novel tasks.”
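OpenAI has published little about how o1 actually produces its hidden chain of thought, so the sketch below is only that analogy rendered in Python: the query_model function, the prompt wording, and the fixed step count are all hypothetical stand-ins, not the company’s method. What it does show is the economic point: one visible answer quietly consumes many billable inference calls.

```python
from typing import Callable

def answer_with_reasoning(
    question: str,
    query_model: Callable[[str], str],  # hypothetical: one LLM inference call
    steps: int = 5,
) -> str:
    """Roughly analogous to a 'reasoning' model: run several internal
    queries, then produce one visible answer from the hidden notes."""
    thoughts: list[str] = []
    for i in range(steps):
        # Each internal step is a full inference pass the user never sees.
        notes = "\n".join(thoughts)
        thoughts.append(query_model(
            f"Question: {question}\nNotes so far:\n{notes}\n"
            f"Write reasoning step {i + 1}:"
        ))
    # The final pass turns the hidden chain of thought into the reply,
    # so one user question has cost steps + 1 model invocations.
    return query_model(
        f"Question: {question}\nReasoning:\n" + "\n".join(thoughts) +
        "\nFinal answer:"
    )
```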

If this sounds like good news for OpenAI and the industry in general — a clever way around a worrying obstacle that allows them to keep building more capable models — that’s because it is! But it also presents some new challenges. Training costs are still high and growing, but these reasoning models are also vastly more expensive at the inference phase, meaning they’re costly not just to create but to deploy. There were hints of what this might mean when OpenAI debuted its $200-a-month ChatGPT Pro plan in early December, and benchmark testing offers more: The cost of achieving high benchmark scores has crossed into the thousands of dollars. In the near term, this has implications for how and by whom leading-edge models might be used. A chatbot that racks up big charges and takes minutes to respond is going to have a fairly narrow set of customers, but if it can accomplish work that would otherwise be expensive, it might be worth it — a big departure from the high-volume, lower-value interactions most users are accustomed to having with chatbots, in the form of conversational chats or real-time assistance with programming. AI researchers expect techniques like this to become more efficient, making today’s frontier capabilities available to more people at lower cost. They’re optimistic about this new form of scaling, although, as was the case with pure LLMs, the limits of “test-time scaling” might not be apparent until AI firms start to hit them.
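Some back-of-envelope arithmetic shows why the inference bill balloons. Every number below is hypothetical, chosen only to illustrate the mechanism by which hidden “thinking” tokens multiply the cost of a single answer.

```python
# Illustrative only: all prices and token counts below are made up.
PRICE_PER_1K_TOKENS = 0.06  # hypothetical serving cost, in dollars

standard_tokens = 1_000            # a visible answer, nothing else
reasoning_tokens = 1_000 + 50_000  # same answer plus hidden chain of thought

standard_cost = standard_tokens / 1_000 * PRICE_PER_1K_TOKENS
reasoning_cost = reasoning_tokens / 1_000 * PRICE_PER_1K_TOKENS

print(f"standard query:  ${standard_cost:.2f}")   # $0.06
print(f"reasoning query: ${reasoning_cost:.2f}")  # $3.06
# Multiply by the hundreds of tasks in a benchmark run and the totals
# reach the thousands-of-dollars range described above.
```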

It remains an exciting time to work in AI research, in other words, but it also remains an extremely expensive time to be in the business of AI: The needs and priorities and strategies might have been shuffled around, but the bottom line is that AI companies are going to be spending, and losing, a lot of money for the foreseeable future (OpenAI recently told investors its losses could balloon to $14 billion by 2026). This represents a particular problem for OpenAI, which became deeply entangled with Microsoft after raising billions of dollars from the company. CEO Sam Altman has announced a plan to complete the conversion of OpenAI into a for-profit entity — the firm began as a nonprofit — and is in a better position than ever to raise money from other investors, even if actual profits remain theoretical. But Microsoft, a vastly larger company, still retains the rights to use OpenAI’s technology and acts as its primary infrastructure provider. It’s also entitled, for a term, to 20 percent of the company’s revenue. As OpenAI grows, and as its independent revenue climbs (the company should reach about $4 billion in revenue this year, albeit while operating at a major loss), this arrangement is becoming less tolerable to the company and its other investors.

OpenAI’s agreement does provide a way out: Microsoft loses access to OpenAI’s technology if the company achieves AGI, or artificial general intelligence. This was always a bit of a strange feature of the arrangement, at least as represented to the outside world: The definition of AGI is hotly contested, and an arrangement in which OpenAI could simply declare its own products so good and powerful that it had to exit its comprehensive agreement with Microsoft seemed like the sort of deal a competent tech giant wouldn’t make. It turns out, according to a fascinating report in The Information, that it didn’t:

Microsoft Chief Financial Officer Amy Hood has told her company’s shareholders that Microsoft can use any technology OpenAI develops within the term of the latest deal between the companies. That term currently lasts until 2030, said a person briefed on the terms.

In addition, last year’s agreement between Microsoft and OpenAI, which hasn’t been disclosed, said AGI would be achieved only when OpenAI has developed systems that have the “capability” to generate the maximum total profits to which its earliest investors, including Microsoft, are entitled, according to documents OpenAI distributed to investors. Those profits total about $100 billion, the documents showed.

This one detail explains an awful lot about what’s going on with OpenAI — why its feud with Microsoft keeps spilling into public view; why it’s so aggressively pursuing a new corporate structure; and why it’s raising so much money from other investors. It also offers some clues about why so many core employees and executives have left the company. In exchange for taking a multibillion-dollar risk on OpenAI before anyone else, Microsoft got the right to treat OpenAI like a subsidiary for the foreseeable future.

Just as interesting, perhaps, is the mismatch between how AI firms talk about concepts like AGI and how they write them into legally binding documents. At conferences, in official materials, and in interviews, people like Altman and Microsoft CEO Satya Nadella opine about machine intelligence, speculate about what it might be like to create and encounter “general” or humanlike intelligence in machines, and suggest that profound and unpredictable economic and social changes will follow. Behind closed doors, with lawyers in the room, they’re less philosophical, and the prospect of AGI is rendered in simpler and perhaps more honest terms: It’s when the software we currently refer to as “AI” starts making lots and lots of money for its creators.
