In the week since the US government endorsed a plan to pump half a trillion dollars into building the world's most powerful compute infrastructure, tech share prices have plummeted.
That's because a small company in China showed that there could be another way, and it has terrified AI's most vocal proponents.
While US tech is stuck in a war over compute and who can deploy the largest data center, China's DeepSeek has launched R1, a cheap chatbot model that appears to be just as good as OpenAI's latest o1 models.
The company's open-sourced DeepSeek-V3 model was built on a training run that cost $5.6 million (although there are many caveats, which we will look at), using hardware that is inferior to US deployments.
Due to sanctions, DeepSeek claims it used 2,048 H800 GPUs - the China-only version of the H100, with roughly half the chip-to-chip interconnect bandwidth. Papers published by the company show it also has 10,000 A100 GPUs.
This, if true, upends the current Silicon Valley philosophy of ever-greater and costlier data center builds, raising the question of how necessary it is to pump $500 billion into Stargate, and hundreds of billions more into data centers, chips, and networking equipment.
There is, as ever, more nuance to the situation. Even if DeepSeek is telling the truth, the reality is that the model did not cost $5.6 million to develop - that is only the cost of the final training run. Building a model takes a great deal of iteration and experimentation, sometimes with different clusters running concurrently, so the total cost is likely much higher. The $5.6 million figure is also based on an assumed cloud rental price, when DeepSeek is believed to have used its own systems.
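For context, DeepSeek's own technical report arrives at the figure by multiplying roughly 2.788 million H800 GPU hours by an assumed rental price of $2 per GPU hour, for a total of about $5.58 million.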
Critics and doubters of DeepSeek, meanwhile, are convinced the company is lying about its compute capabilities, and hiding a larger H100 cluster that it smuggled into China.
Dylan Patel, the well-respected analyst at SemiAnalysis, claims the company has a 50,000-GPU H100 cluster, a figure also cited by Scale AI CEO Alexandr Wang. Such a cluster would cost billions before factoring in black market premiums, pushing project costs much higher.
However, neither has provided any proof of such claims. Nor does the existence of such a cluster necessarily mean it was used for these models - parent company High-Flyer initially built out its compute for quantitative hedge fund work. With the company managing $8 billion in assets, there's no sign it has completely shifted its portfolio over to LLMs.
A third version of events could be that a larger model was trained on a hidden H100 cluster, and then through the process of distillation, a smaller model was developed for the H800 system.
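Distillation, in this context, means training a smaller 'student' model to reproduce the output distribution of a larger 'teacher' model, so the smaller model can be trained and run on less capable hardware. A minimal sketch of the standard soft-label approach - a generic illustration, not DeepSeek's actual pipeline, with an arbitrary temperature and toy dimensions:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Classic soft-label distillation loss (Hinton et al., 2015)."""
    # Soften both distributions, then push the student towards the teacher
    # via KL divergence; scaling by T^2 keeps gradient magnitudes comparable.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2

# Toy usage: a batch of 4 predictions over a 10-token vocabulary.
teacher_logits = torch.randn(4, 10)                      # from the large "teacher"
student_logits = torch.randn(4, 10, requires_grad=True)  # from the small "student"
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```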
Meta's chief AI scientist Yann LeCun has another suggestion for how the models ended up so cheap - the benefits of open source, once core to the mission of OpenAI, but now long discarded.
DeepSeek has "profited from open research and open source," LeCun said. "They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it."
Whether DeepSeek used more compute than claimed or not, it almost certainly used less than its US rivals. Its efforts appear to bear out the old adage that necessity is the mother of invention: sanctions and growing hostility towards China have forced developers in the country to focus on optimization.
While Sam Altman can always raise another billion and build another data center, keeping OpenAI focused on scale, DeepSeek and other Chinese developers have had to learn to make the best of what they have.
This, at the very least, is the perception - and it has completely upended the current generative AI narrative: give us enough money, compute, data, and time, and we will build unparalleled intelligence.
The entire conceit rests on a promise: that only a few great minds clustered around San Francisco will be able to create something worth countless trillions.
Microsoft has already shown it is unsure, deciding not to participate in Stargate funding. Nvidia is hedging its bets, pushing the use of GPUs for traditional CPU workloads and hyping robotics as the next big thing.
It could very well be that the Silicon Valley model is correct, and that spending more and building more will win out if given the chance. But these companies will only get that chance if investors are willing to keep taking the expensive bet.
Winning them over just got significantly harder, which could cause funding to dry up and momentum to slow - the ultimate killer of any hype wave.
For any bubble to last and transition to a sustainable business model, investors have to be placated by an endless flurry of announcements and promises. The entire edifice has to perpetually feel on the edge of reaching the promised land, and any data showing the opposite can quickly cause everything to unravel.
Doubts about the viability of the generative AI business model predate DeepSeek, but its appearance as a viable, affordable competitor has allowed those concerns to come to the fore.
Expect OpenAI and others pushing the 'build it big' model to ramp up their promises in the days and weeks ahead to counter the narrative.
They will likely point to the Jevons paradox - the observation that increased efficiency in resource use can lead to greater overall consumption. They will argue that they, too, are efficient, and ask: if someone can build this with 2,000 limited GPUs, imagine what we could do with a million more powerful ones.
The question, however, is whether the investment community will give them the time to prove it, or if the golden era is coming to a close.