
Nvidia GPU: Alarm Bells Ringing

Source: Semiconductor Industry Watch. At yesterday's Computex conference, Dr. Lisa Su presented AMD's latest roadmap. Afterwards, the outlet morethanmoore published her post-conference interview, which we have translated and summarized as follows:

Q: How does AI help you personally in your work?
A: AI affects everyone's life. Personally, I am a loyal user of GPT and Copilot. I am very interested in the AI that AMD uses internally. We often talk about AI for customers, but AI is also a priority for us internally, because it can make our company better: for example, designing better and faster chips. We hope to integrate AI into the development process, as well as into marketing, sales, human resources, and every other area. AI will be ubiquitous.

Q: Nvidia has explicitly told investors that it plans to shorten its development cycle to one year, and now AMD plans to do the same. How and why are you doing this?
A: This is what we see in the market. AI is our company's top priority. We are making full use of the development capabilities of the entire company and increasing investment. There are new changes every year, because the market needs updated products and more features. Our product portfolio can address a variety of workloads. Not every customer will use every product, but there will be something new every year, and it will be the most competitive. This requires investment, making sure hardware and software systems are part of it, and we are committed to making AI our biggest strategic opportunity.

Q: The TOPS count in the PC world has increased significantly with Strix Point (Ryzen AI 300). TOPS cost money. How do you weigh TOPS against CPU/GPU resources?
A: Nothing is free! Especially in power- and cost-constrained designs. What we see is that AI will be everywhere. Today, Copilot+ PCs and Strix offer more than 50 TOPS and will start at the top of the stack, but AI will run through our entire product stack. At the high end we will expand TOPS, because we believe the more local TOPS, the more capable the AI PC; putting it on the chip increases its value and helps offload part of the computing from the cloud.

Q: Last week you said AMD will produce 3nm chips using GAA. Samsung Foundry is currently the only one producing 3nm GAA. Will AMD choose Samsung Foundry for this?
A: Refer to last week's keynote at imec. What we said is that AMD will always use the most advanced technology. We will use 3nm. We will use 2nm. We did not name the supplier for 3nm or GAA. Our cooperation with TSMC is currently very strong; we talked about the 3nm products we are developing now.

Q: Regarding sustainability: AI means more power consumption. As a chip supplier, can you optimize the power consumption of devices that run AI?
A: In everything we do, and especially in AI, energy efficiency is as important as performance. We are working to improve energy efficiency in every future product generation. We have said we will improve energy efficiency 30x between 2020 and 2025, and we expect to exceed that goal. Our current goal is a 100x improvement in energy efficiency over the next 4-5 years. So yes, we can focus on energy efficiency, and we must, because it will become the limiting factor for future computing.

Q: We had CPUs, then GPUs, and now NPUs. First, how do you see the scalability of NPUs? Second, what is the next big chip: neuromorphic chips?
A: You need the right engine for each workload. CPUs are well suited to traditional workloads. GPUs are well suited to gaming and graphics. NPUs provide AI-specific acceleration. As we move forward and research new, specific acceleration technologies, we will see some of these evolve, but ultimately it is driven by the applications.

Q: You initially broke Intel's status quo by increasing core counts, but your consumer core counts have plateaued across recent generations. Is this enough for consumers and the gaming market, or should we expect core counts to rise again?
A: Our strategy is to keep improving performance. Especially in games, software developers do not always use all the cores. There is no reason we could not go beyond 16 cores; the key is pacing our development so that software developers can actually make use of those cores.

Q: Regarding desktops, do you think more capable NPU accelerators are needed?
A: We do see NPUs having an impact on desktops. We have been evaluating which product segments can use this capability. You will see desktop products with NPUs in the future as we expand our product portfolio.
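As an aside on the efficiency targets mentioned above, a minimal sketch of what they imply as annual rates of improvement; the 30x and 100x goals and the 4-5 year window come from the interview, and the per-year figures below are simply derived from them.

```python
# Implied annual energy-efficiency gains from the targets cited in the interview.
goal_2020_2025 = 30 ** (1 / 5)      # 30x over the 5 years from 2020 to 2025
goal_next_4yr = 100 ** (1 / 4)      # 100x over 4 years (lower bound of "4-5 years")
goal_next_5yr = 100 ** (1 / 5)      # 100x over 5 years (upper bound)

print(f"30x over 5 years  -> ~{goal_2020_2025:.2f}x per year")   # ~1.97x
print(f"100x over 4 years -> ~{goal_next_4yr:.2f}x per year")    # ~3.16x
print(f"100x over 5 years -> ~{goal_next_5yr:.2f}x per year")    # ~2.51x
```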

Shortly after France announced it would launch an antitrust investigation against Nvidia, there was more bad news.

According to Margrethe Vestager, the EU's competition chief, Nvidia's AI chip supply faces a "huge bottleneck", but regulators are still considering how to address the issue.

"We have been asking them questions, but these are just preliminary issues," she told Bloomberg during her visit to Singapore. So far, this "does not meet the conditions for regulatory action."

Since Nvidia became the biggest beneficiary of the AI spending spree, regulators have been watching it closely. Its graphics processing units (GPUs) are favored by data center operators for their ability to process the vast amounts of data needed to develop AI models.

Chips have become one of the hottest commodities in the tech industry, with cloud computing providers competing with one another for them. Nvidia's H100 processors are in high demand and are estimated to have given the company more than 80% of the market, well ahead of competitors Intel and AMD.

Despite the supply shortage, Vestager said a secondary market for AI chips could help stimulate innovation and fair competition.

But she said dominant companies may face certain behavioral restrictions in the future.

"If you have that kind of dominance in the market, there are things you cannot do that small companies can do," she said. "But beyond that, as long as you do your business and respect this, you're fine."

The $600 billion dilemma

Although tech giants have invested heavily in AI infrastructure, revenue growth from AI has not kept pace, pointing to a large gap in end-user value across the ecosystem. In fact, David Cahn, an analyst at Sequoia Capital, estimates that AI companies need to earn about $600 billion a year to pay for their AI infrastructure, such as data centers.

Last year, Nvidia's data center hardware revenue reached $47.5 billion (most of it from compute GPUs for AI and HPC applications). AWS, Google, Meta, Microsoft, and others invested heavily in their AI infrastructure in 2023 for applications such as OpenAI's ChatGPT. But can they earn a return on this investment? David Cahn believes this may mean we are watching a financial bubble inflate.

According to David Cahn, the $600 billion figure can be derived with simple math.

Take Nvidia's run-rate revenue forecast and multiply it by 2 to reflect the total cost of AI data centers (GPUs account for half of total investment; the other half covers energy, buildings, backup generators, and so on). Then multiply by 2 again to reflect a 50% gross margin for GPU end users (for example, startups or businesses buying AI compute from Azure, AWS, or GCP also need to make money).
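A minimal sketch of that arithmetic, assuming an annualized Nvidia data-center run rate of roughly $150 billion (an assumption implied by the $600 billion result rather than a figure quoted in this article):

```python
# Back-of-envelope version of David Cahn's $600B calculation.
nvidia_dc_run_rate = 150e9                      # assumed annualized data-center revenue forecast
total_dc_capex = nvidia_dc_run_rate * 2         # GPUs ~half of data-center cost; rest is energy, buildings, etc.
required_end_user_revenue = total_dc_capex * 2  # end users need ~50% gross margin on the compute they buy

print(f"AI revenue needed: ${required_end_user_revenue / 1e9:.0f}B per year")  # -> $600B per year
```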

So what has changed since September 2023, when he framed AI as a $200 billion question?

1. The supply shortage has dissipated: The end of 2023 was the peak of GPU supply shortages. Startups were calling venture capital firms, calling anyone who would talk to them, seeking help to get GPUs. Now, these concerns are almost completely gone. For most of the people I've talked to, getting GPUs on reasonable delivery times is relatively easy now.

2. GPU inventory continues to grow: Nvidia reported in the fourth quarter that about half of its data center revenue came from large cloud providers. Microsoft alone may have accounted for roughly 22% of Nvidia's fourth-quarter revenue. Hyperscale capital expenditure is reaching historic levels, and these investments were a major theme of big tech's first-quarter 2024 earnings calls, with CEOs effectively telling the market: "Like it or not, we're going to invest in GPUs." Stockpiling hardware is not a new phenomenon, and once stockpiles are large enough that demand slows, it becomes the catalyst for a reset.

3. OpenAI still dominates AI revenue: The Information recently reported that OpenAI's revenue is now $3.4 billion, up from $1.6 billion at the end of 2023. While a handful of startups have reached revenue below the $100 million mark, the gap between OpenAI and everyone else remains large. Beyond ChatGPT, how many AI products do consumers really use today? Consider how much value you get from Netflix for $15.49 a month or Spotify for $11.99. In the long run, AI companies will need to deliver substantial value for consumers to keep paying.

4. The $125 billion shortfall has now become a $500 billion shortfall: In my previous analysis, I generously assumed that Google, Microsoft, Apple, and Meta could each generate $10 billion a year in new AI-related revenue. I also assumed that Oracle, ByteDance, Alibaba, Tencent, X, and Tesla would each generate $5 billion in new AI revenue a year. Even if these assumptions hold, and even if we add a few more companies to the list, the $125 billion shortfall now becomes a $500 billion shortfall (the rough arithmetic is sketched after this list).

5. But that's not all: the B100 is coming. Earlier this year, Nvidia announced the B100 chip, which is 2.5x more powerful while only 25% more expensive. I expect this will ultimately lead to a surge in demand for Nvidia chips. Compared with the H100, the B100's cost/performance is significantly better, and with everyone looking to buy B100s later this year, supply will likely run short once again.
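A minimal sketch of the numbers behind points 4 and 5 above; the allowance for "a few more companies" is an assumption chosen only to reproduce the round figures in the text.

```python
# Point 4: assumed new AI revenue vs. the $600B requirement.
big_four = 4 * 10e9      # Google, Microsoft, Apple, Meta at $10B of new AI revenue each
next_six = 6 * 5e9       # Oracle, ByteDance, Alibaba, Tencent, X, Tesla at $5B each
extra = 30e9             # assumed allowance for "a few more companies"
print(f"Implied shortfall: ${(600e9 - (big_four + next_six + extra)) / 1e9:.0f}B")  # -> $500B

# Point 5: B100 at 2.5x the performance for 25% more money.
print(f"B100 cost/performance gain vs. H100: {2.5 / 1.25:.1f}x")                    # -> 2.0x
```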

When David Cahn is asked about GPUs, one of the main objections he hears is that "GPU capital expenditure is like building railroads": the trains will eventually come, and so will the destinations: new agricultural exports, amusement parks, shopping malls, and so on.

David Cahn said he agreed with this, but he believed that the argument ignored some key points:

1. Lack of pricing power: With physical infrastructure, the asset you build has some inherent value. If you own the track between San Francisco and Los Angeles, you probably have some monopoly-like pricing power, because there are only so many tracks that can be laid between A and B. With GPU data centers, pricing power is much weaker. GPU computing is increasingly a commodity billed by the hour. Unlike the CPU cloud, which became an oligopoly, new entrants building dedicated AI clouds keep flooding the market. Without a monopoly or oligopoly, businesses with high fixed costs and low marginal costs almost always see prices competed down at the margin (airlines, for example), as the break-even sketch after this list illustrates.

2. Investment waste: Even in railroads, and in many new technology industries, speculative investment manias often lead to large-scale capital destruction. Engines That Move Markets is one of the best textbooks on technology investing, and its key takeaway (drawn largely from the railroad era) is that many people lose enormous sums in speculative technology manias. Picking winners is hard, but picking losers (canals, in the railroad era) was much easier.

3. Depreciation: The history of technology teaches us that semiconductors keep getting better. Nvidia will keep producing better next-generation chips like the B100, and that leads to faster depreciation of the previous generation. Because the market underestimates how quickly the B100 and subsequent chips will improve, it overestimates what an H100 bought today will be worth in three to four years. Physical infrastructure has no such analogue: it does not follow any "Moore's Law"-type curve in which cost-to-performance constantly improves.

4. Winners and losers: We need to look carefully at winners and losers; there are always winners during periods of infrastructure overbuilding. AI is likely to be the next transformative technology wave, and falling prices for GPU computing actually benefit long-term innovation and startups. If David Cahn's predictions come true, the main casualties will be investors. Founders and company builders will keep building in AI, and they are more likely to succeed because they will benefit from lower costs and from the lessons learned during this period of experimentation.

5. Artificial intelligence will create tremendous economic value. Company builders focused on delivering value to end users will be richly rewarded. We are living through what may be a generation-defining technology wave. Companies like Nvidia have played an important role in enabling this transformation, deserve credit for it, and are likely to remain critical to the ecosystem for a long time to come.
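As mentioned under point 1 above, here is a minimal break-even sketch for selling commodity GPU-hours; every number is an illustrative assumption, not a figure from the article.

```python
# Why GPU compute tends toward commodity pricing: almost all cost is fixed and amortized.
gpu_capex = 35_000                    # assumed all-in hardware cost per GPU
other_capex = 35_000                  # assumed share of energy, building, networking per GPU
lifetime_hours = 4 * 365 * 24         # assumed 4-year useful life
utilization = 0.6                     # assumed fraction of hours actually billed

fixed_cost_per_hour = (gpu_capex + other_capex) / (lifetime_hours * utilization)
marginal_cost_per_hour = 0.30         # assumed power + operations per billed GPU-hour

print(f"Break-even price: ~${fixed_cost_per_hour + marginal_cost_per_hour:.2f} per GPU-hour")
# With many sellers offering interchangeable GPU-hours and low marginal costs,
# rental prices tend to get competed down toward this break-even level.
```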

However, David Cahn also reiterated that speculative manias are part of technology, so there is nothing to fear. Those who keep a level head at this moment have the chance to build extremely important companies. But we must make sure not to believe the delusion, now spreading from Silicon Valley to the rest of the country and the world, that we will all get rich quickly because AGI is arriving tomorrow and we therefore all need to hoard the only valuable resource: GPUs.

"In fact, the road ahead will be long. It will have ups and downs. But it is almost certain that it is worth it," David Cahn emphasized.

Potential challengers

Although this is a much-discussed theme, it is beginning to produce results. Even so, as Futurum Group CEO Daniel Newman said, "There is currently no Nvidia nemesis in the world."

The reasons are as follows: Nvidia's graphics processing units (GPUs) were originally created in 1999 for ultra-fast 3D graphics in PC video games, and were later found to be well suited to training large generative AI models. Models from companies such as OpenAI, Google, Meta, Anthropic, and Cohere have grown ever larger, requiring huge numbers of AI chips to train. For years, Nvidia's GPUs have been regarded as the most powerful and have been the most sought after.

Of course, this does not come cheap: training a top-tier generative AI model requires tens of thousands of top-end GPUs at $30,000 to $40,000 apiece. For example, Elon Musk recently said that his company xAI's Grok 3 model would need to be trained on 100,000 top-end Nvidia GPUs to be "special", which would bring Nvidia more than $3 billion in chip revenue.
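A quick sanity check of the xAI example, using the per-GPU prices quoted above:

```python
# 100,000 top-end GPUs at $30,000-$40,000 each.
gpus = 100_000
print(f"Chip spend: ${gpus * 30_000 / 1e9:.0f}B to ${gpus * 40_000 / 1e9:.0f}B")
# -> $3B to $4B, consistent with "more than $3 billion" in chip revenue for Nvidia.
```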

However, Nvidia's success is not just a product of its chips, but also of the software that makes those chips easy to use. Nvidia's software ecosystem has become the default choice for a large number of AI-focused developers, who have little incentive to switch. At last week's annual shareholder meeting, Nvidia CEO Jensen Huang called the company's software platform CUDA (Compute Unified Device Architecture) a "virtuous circle": with more users, Nvidia can afford to invest more in upgrading the ecosystem, which in turn attracts more users.

In contrast, Nvidia's semiconductor rival AMD controls about 12% of the global GPU market. The company does have competitive GPUs and is improving its software, Newman said. But while it offers an alternative for companies that don't want to be locked into Nvidia, it lacks Nvidia's established base of developers, who already find CUDA easy to use.

In addition, while large cloud service providers such as Amazon's AWS, Microsoft Azure and Google Cloud all produce their own proprietary chips, they do not intend to replace Nvidia. Instead, they hope to have a variety of AI chips to choose from to optimize their data center infrastructure, lower prices, and sell their cloud services to the widest potential customer base.

J. Gold analyst Jack Gold explained, "Nvidia has early momentum, and when you're building a fast-growing market, it's tough for others to catch up." He said Nvidia has done a good job of creating a unique ecosystem that others do not have.

Matt Bryson, Senior Vice President of Stock Research at Wedbush, added that replacing Nvidia's chips for training large-scale AI models would be particularly difficult. He explained that most of the current spending on computing power flows into this area. "I don't think this dynamic will change anytime soon," he said.

However, a growing number of AI chip startups, including Cerebras, SambaNova, Groq, and most recently Etched and Axelera, see an opportunity to take a share of Nvidia's AI chip business. They focus on the specific needs of AI companies, especially so-called "inference", which means running data through an already-trained model to produce outputs (each ChatGPT answer, for example, requires inference).

For example, just last week Etched raised $120 million to develop a chip dedicated to running transformer models, the AI architecture behind OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude. The chip, called Sohu, will reportedly be produced by TSMC on its 4nm process. The company claims Sohu is an order of magnitude faster and cheaper than Nvidia's upcoming Blackwell GPU, and that an eight-chip Sohu server can process more than 500,000 Llama 70B tokens per second. Etched CEO Uberti said in an interview that one Sohu server would replace 160 H100 GPUs.
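Taking Etched's own figures at face value (these are the company's claims, not independent benchmarks), the implied per-chip comparison works out as follows:

```python
# Unpacking the Sohu claims quoted above.
server_tokens_per_s = 500_000            # claimed Llama 70B throughput of an 8-chip Sohu server
sohu_chips = 8
h100s_replaced = 160                     # claimed H100 equivalence of one Sohu server

per_sohu_chip = server_tokens_per_s / sohu_chips          # 62,500 tokens/s per Sohu chip
implied_per_h100 = server_tokens_per_s / h100s_replaced   # ~3,125 tokens/s per H100
print(f"Implied per-chip advantage: {per_sohu_chip / implied_per_h100:.0f}x")  # -> 20x
```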

Dutch startup Axelera AI is also developing chips for AI applications. The Eindhoven-based company announced last week that it had raised $68 million to support its ambitious growth plans, and it aims to become the European answer to Nvidia, offering AI chips it claims are 10 times more energy-efficient and five times cheaper than competitors'. At the core of Axelera's design is the Thetis Core chip, which it says can perform 260,000 computations per cycle, versus 16 or 32 for a conventional processor. That makes it well suited to neural-network computing, which consists mainly of vector-matrix multiplication. The company says its chips deliver high performance and usability at a fraction of the cost of existing solutions, which could make AI accessible to more applications and users.
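For readers unfamiliar with why vector-matrix multiplication is the workload that matters here, a tiny sketch: a neural-network layer is essentially one vector-matrix multiply, so the number of multiply-accumulates a chip can do per cycle largely determines its AI throughput. The layer sizes below are arbitrary.

```python
import numpy as np

x = np.random.rand(512)            # activations from the previous layer
W = np.random.rand(512, 1024)      # layer weights
y = np.maximum(x @ W, 0)           # vector-matrix multiply followed by a ReLU

macs = x.size * W.shape[1]         # 512 * 1024 = 524,288 multiply-accumulates for one layer
print(f"Multiply-accumulates in this single layer: {macs:,}")
```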

Meanwhile, Groq, which focuses on running models at lightning speed, is reportedly raising new funds at a valuation of $2.5 billion, while Cerebras has reportedly filed confidentially for an initial public offering shortly after releasing its latest chip, which it claims can train AI models ten times larger than GPT-4 or Gemini.

All of these startups may initially focus on a small market, such as providing more efficient, faster, or cheaper chips for certain tasks. They may also focus more on dedicated chips for specific industries or artificial intelligence devices such as personal computers and smartphones. "The best strategy is to develop a niche market rather than trying to conquer the world, which is what most of them are trying to do," said Jim McGregor, chief analyst at Tirias Research.

Therefore, perhaps the more relevant question is: how much market share can these startups take with cloud providers and semiconductor giants such as AMD and Intel? This remains to be seen, especially as the chip market for running AI models or inferences is still very new.



