A report claiming that H100s now rent for $2 per hour and that the GPU bubble is on the verge of bursting has drawn wide attention in the domestic market. Morgan Stanley says Nvidia's Blackwell-series GPUs are sold out for the next 12 months, and Jensen Huang has said existing datacenters will need roughly $1 trillion in GPU upgrades. Can the decline in H100 rental prices really be equated with a "GPU bubble burst"?
"Star Daily" October 14th News Recently, a report on the rental of H100 for $2 per hour and the eve of the GPU bubble burst has sparked high attention in the domestic market. The related article points out:
After Nvidia's H100 GPU launched in March 2023, surging demand drove its rental price from an initial $4.7 per hour to over $8 per hour. Since the start of this year, however, the H100 has been in oversupply, and hourly rental prices have fallen to around $2.
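The economics behind these price swings can be sketched with a rough payback calculation. The purchase price and utilization figures below are illustrative assumptions, not numbers from the report:

```python
# Back-of-the-envelope payback estimate for an H100 owner renting out capacity.
# The ~$25,000 purchase price and 80% utilization are hypothetical assumptions;
# power, hosting, and financing costs are ignored for simplicity.

def payback_years(purchase_price: float, hourly_rate: float, utilization: float) -> float:
    """Years of rental revenue needed to recoup the hardware cost."""
    annual_revenue = hourly_rate * utilization * 24 * 365
    return purchase_price / annual_revenue

# At the peak ~$8/hr rate vs. the current ~$2/hr rate:
print(f"at $8/hr: {payback_years(25_000, 8.0, 0.8):.2f} years")  # well under a year
print(f"at $2/hr: {payback_years(25_000, 2.0, 0.8):.2f} years")  # closer to two years
```

Under these assumed figures, the same card goes from paying for itself in months to taking years, which is why a sustained $2/hr rate reads as a sign of oversupply.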
The price drop has several causes: 1) some companies that had reserved H100s long-term are reselling idle capacity after finishing model training; 2) many companies no longer train new models from scratch, instead fine-tuning open-source models, which sharply reduces compute demand; 3) the number of new startups focused on building large foundation models has fallen significantly; 4) alternatives to the H100 have emerged, such as GPUs from AMD and Intel.
Tracing the report back, it turns out that mainstream overseas outlets and major tech media have not yet covered it. The original article, titled "$2 H100s: How the GPU Bubble Burst," comes from a website called Latent Space and is authored by Eugene Cheah.
According to the site's own introduction, Latent Space focuses on AI, combining news content, blog posts, and a community. It is co-hosted by Swyx and Alessio Fanelli; the former's social media account gives no detailed self-introduction, while the latter is a partner and CTO at the early-stage venture firm Decibel VC.
The original author, Eugene Cheah, is CEO of the startup Featherless.Ai.
According to Cheah's introduction at the end of the "GPU bubble" article, Featherless.Ai currently hosts the world's largest range of open-source AI models: "starting at $10 per month, with immediate access, unlimited requests, and a fixed price; inference runs serverless, with no need for expensive dedicated GPUs."
Has the decline in H100 rental prices led to the bursting of the GPU bubble?
In the original article about the 'GPU bubble,' there is an illustration of the oil painting 'Le Duel à la tulipe' created by French artist Jean-Léon Gérôme in 1882.
The painting depicts the first recorded speculative bubble in history, the "Tulip Mania" of the 17th-century Netherlands. Tulip prices climbed steadily from 1634 before collapsing in February 1637, leaving speculators with as little as 5% of their initial investment.
Will a speculative bubble from over three hundred years ago repeat itself? The question weighs on the nerves of every AI investor, and it is perhaps why the "H100 rental price drop" article has attracted so much attention in the AI community.
Quotes on the Vast.ai website show that hourly rental prices for a single H100 are indeed in the $2–3 range.
Vast.ai Quote
However, can the drop in H100 rental prices really be equated with the collapse of the 'GPU bubble'?
On one hand, according to Eugene Cheah's article, the "H100 price drop" may be more aptly described as "differentiation": the sustained decline is concentrated in rental prices for small clusters, while prices for large-scale compute clusters may remain at higher levels.
Behind those large-scale compute clusters are usually technology giants such as Tesla, Microsoft, and OpenAI. According to Omdia data, in the third quarter of 2023, following the H100's release, shipments reached 650,000 units, with Meta and Microsoft alone taking 150,000 units each, together accounting for nearly half.
On the other hand, electronic products have upgrade cycles, and GPU chips are no exception. Earlier reports said Nvidia's next-generation Blackwell series had design flaws that might delay shipments, but Morgan Stanley's report last week stated that Blackwell production is proceeding "as planned," with the next 12 months of supply already sold out. Customers placing orders now will not receive product until the end of 2025, which will continue to drive strong near-term demand for the existing Hopper-architecture parts.
H100 rental prices did not collapse overnight; they have been sliding for some time. From the A100 to the H100, from the H100 to the H200, and on to the coming Blackwell, each new product inevitably pushes down prices of the previous generation, not to mention that Blackwell's cost per unit of compute may be lower still than Hopper's.
Nvidia CEO Jensen Huang has also spoken up recently. In an interview with Altimeter Capital, he stressed that the sustained bullishness around Nvidia is entirely different from the frenzy around Cisco at the peak of the dot-com bubble: Nvidia is "reshaping computing," and the future will be an era defined by machine learning.
"Moore's Law has essentially come to an end," he stated. In order to provide the necessary computing power to keep up with the pace of future compute-intensive software, existing datacenters will need approximately $1 trillion worth of GPUs for upgrades in the next 4-5 years.
It must be acknowledged that alarm bells over an "AI bubble" have been ringing repeatedly, and doubts about whether AI investments can deliver the expected returns are growing: even as OpenAI complains of insufficient and delayed compute and Nvidia's new products sell out, rental prices keep falling and some companies are even offloading their GPUs.
Yet localized, short-term gluts or shortages of compute say less and less about the overall state of AI. For a field caught in a tug-of-war between supply and demand, bulls and bears, what may be needed most, beyond hardware, are new stories.