
After beating expectations once again, what did Nvidia's earnings call say? (Full text attached)

wallstreetcn ·  May 23 11:09

Source: Wall Street News

Jensen Huang said the company will see a significant increase in Blackwell chip revenue this year, and that another chip will follow Blackwell. CFO Colette Kress said that over the past four quarters, inference has driven about 40% of data center revenue, and that Spectrum-X, the world's first high-performance Ethernet architecture designed specifically for AI, will reach multi-billion-dollar revenue within a year.

On Wednesday local time, Jensen Huang said on the $NVIDIA (NVDA.US)$ earnings call that Blackwell chip products will ship in the second quarter of this year, ramp production in the third quarter, and stand up in customers' data centers in the fourth quarter. This year will see a “significant increase in Blackwell chip revenue,” and another chip will follow Blackwell, keeping to a cadence of “one generation per year.”

Regarding networking revenue, Nvidia broke out networking revenue separately for the first time. Jensen Huang said the company will push forward on three networking links: from NVLink for a single computing domain, to InfiniBand, to the Ethernet networking compute fabric.

Regarding the revenue contribution of Ethernet networking, Chief Financial Officer Colette Kress, who also attended the call, said that Spectrum-X, the world's first high-performance Ethernet architecture designed specifically for AI, is opening a brand-new market for Nvidia and is already in volume production with many customers, including one massive 100,000-GPU cluster. The company expects Spectrum-X to jump to a multi-billion-dollar product line within a year.

Regarding inference revenue, Colette Kress said the company expects inference to scale up as model complexity, the number of users, and the number of queries per user all increase. Over the past four quarters, inference is estimated to have driven approximately 40% of data center revenue.

Regarding AI PCs, Jensen Huang said that even the PC computing stack will be revolutionized; the company is currently “completely redesigning the way computers work.”

Below is the full text of Nvidia's Q1 earnings call analyst Q&A session:

Bernstein Research analyst Stacy Rasgon:

My first question: I'd like to dig into the comment that Blackwell is in production now. What does that suggest about shipment and delivery timing? If the product is no longer just sampling, what does it mean for when it actually reaches customers?

Jensen Huang:

It is in production now and we will be shipping; we've been producing it for a while.

Our production shipments will begin in the second quarter and ramp in the third quarter, and customers should have data centers stood up in the fourth quarter.

Stacy Rascan:

Understood. So it sounds like we're going to see Blackwell revenue this year.

Jensen Huang:

We're going to see a lot of Blackwell revenue this year.

UBS analyst Timothy Arcuri:

I'd like to ask Jensen about the Blackwell rollout. How is deploying this product different from Hopper, given the system characteristics and the demand you have for, you know, GB200?

I'm asking because liquid cooling at this scale hasn't been done before, and there are engineering challenges both at the node level and within the data center. Will this complexity prolong the transition? How do you see all of this progressing? Thank you.

Jensen Huang:

Yes, Blackwell comes in many configurations. Blackwell is a platform, not a GPU. The platform includes support for air cooling, liquid cooling, x86 and Grace, InfiniBand, and now Spectrum-X, plus a very large NVLink domain, which I showed at GTC.

So some customers will deploy into their existing data center installed base, where Hoppers are already shipping. They will transition easily from H100 to H200 to B100.

The Blackwell systems have been designed to be backwards compatible, if you will, electrically and mechanically. And of course, software that runs on Hopper will run wonderfully on Blackwell.

We've also been priming our entire ecosystem for liquid cooling. We've been talking to the ecosystem about Blackwell for a long time. Nobody, from the CSPs to the data centers, ODMs, system makers, our supply chain, their supply chain, the liquid-cooling supply chain, and the data center supply chain, will be surprised by Blackwell's arrival or by our ability to deliver Grace Blackwell as GB200. GB200 will be outstanding.

Bank of America Securities analyst Vivek Arya:

Thanks for taking my question, Jensen. How do you ensure that your products are actually being utilized, and that there's no pull-ahead or stockpiling driven by tight supply, competition, or other factors?

What checks have you built into the system to give us confidence that monetization is keeping pace with your very, very strong shipment trajectory?

Jensen Huang:

I'd like to give you a macro perspective and then answer your questions directly.

Demand for GPUs across all the data centers is incredible. We're racing every day, because applications like ChatGPT and GPT-4o, which is now multimodal, and Gemini and its ramp, and all the work the CSPs are doing, are consuming every GPU out there. There's also a long list of generative AI startups, some 15,000 to 20,000 of them, in fields from multimedia to digital characters, and of course all kinds of design-tool applications, productivity applications, and digital biology.

The autonomous vehicle industry is turning to video data so it can train end-to-end models and expand the operating domain of autonomous vehicles. The list is quite extraordinary.

We're racing; in fact, our customers are putting a lot of pressure on us to deliver and stand up their systems as soon as possible.

And of course, I haven't even mentioned all the countries that want to train on their region's natural resource, that is, their data, to build their own regional models. There's a lot of pressure to stand these systems up as soon as possible. Anyway, demand is really high and will outstrip supply for some time. That's why I made some of those longer-run comments, right?

You know, we're completely redesigning how computers work. Of course, this has been compared to past platform shifts, but time will clearly show that it's far more profound than any previous one. The reason is that the computer is no longer just an instruction-driven machine. It's a computer that understands intent. It understands how we interact with it, it understands what we're asking it to do, and it has the ability to reason, plan, iterate, and come up with solutions.

As a result, every layer of computing is changing: instead of retrieving pre-recorded files, it now generates context-aware, intelligent answers. That will transform computing stacks all over the world. And as you saw at Build, even the PC computing stack is going to be revolutionized.

And what we're seeing today is just the beginning of the work we're doing in our labs and with all the startups, large companies, and developers around the world. It's going to be quite extraordinary.

Morgan Stanley analyst Joe Moore:

I understand your comments just now about how strong demand is; there's a lot of demand for the H200 and Blackwell products. Do you expect a pause in Hopper and H100 as customers transition to those products? Will people wait for the new products, or do you think demand for the H100 is strong enough to keep growing?

Jensen Huang:

We see demand for Hopper increasing through this quarter, and we expect demand to outstrip supply for some time as we transition to H200 and then to Blackwell. Everyone is anxious to get their infrastructure online, because they're saving money and making money with it, and they want to do so as soon as possible.

Goldman Sachs analyst Toshiya Hari:

I want to ask about competition. Many of your cloud customers have announced new or updated internal chip programs that run in parallel with their work with you. To what extent do you view them as medium- to long-term competitors? Do you think they'll mostly stay limited to internal workloads in the near term, or could they address needs more broadly in the future? Thank you.

Huang Renxun:

Yes, we're different in several ways.

First, Nvidia's accelerated computing architecture lets customers handle every stage of their pipeline, from unstructured data processing in preparation for training, to structured data processing and data-frame processing such as SQL in preparation for training, to training and inference. As I mentioned in my remarks, inference has fundamentally changed: it's now generation. It's no longer just trying to detect the cat, which was hard enough in itself; it now has to generate every pixel of the cat. So the generation process is a fundamentally different processing architecture, and that's one of the reasons TensorRT-LLM was so well received.
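A rough sketch of why generation has a different compute profile from detection (our own illustration, not from the call): a detector runs one forward pass over its input, while a generative model runs one forward pass per output token, so inference cost grows with the length of the answer.

```python
# Toy model of inference compute (an illustrative simplification,
# not Nvidia's methodology): a detector does one forward pass per
# input; a generator does one forward pass per generated token.
def detection_passes(num_requests: int) -> int:
    return num_requests  # one pass per request

def generation_passes(num_requests: int, tokens_per_answer: int) -> int:
    return num_requests * tokens_per_answer  # one pass per token

requests = 1_000
print(detection_passes(requests))        # 1,000 forward passes
print(generation_passes(requests, 500))  # 500,000 forward passes
```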

We've tripled the performance of our architecture on the same chips, which speaks to the richness of our architecture and software. So first, you can use Nvidia for every mode of computing, from computer vision to image processing to computer graphics. And as the world now struggles with rising computing costs and computing energy consumption, because general-purpose computing has run its course, accelerated computing is the only sustainable path forward. Accelerated computing is how you save on computing cost and energy. The versatility of our platform therefore gives their data centers the lowest total cost of ownership (TCO).

Second, we're in every cloud. So for developers looking for a platform to build on, starting with Nvidia is always a great choice. We're on-premises. We're in the cloud. We're in computers of every size and shape. We're practically everywhere. That's the second reason.

The third reason has to do with the fact that we build AI factories. It's becoming more and more apparent that AI isn't just a chip problem. It starts, of course, with very good chips, and we build a lot of chips for our AI factories, but it's a systems problem.

In fact, AI itself is now a systems problem. It's more than just one large language model; it's a complex system of many large language models working together. So the fact that Nvidia builds the whole system means we can optimize all of our chips to work together as a system, ship software that operates as a system, and optimize across the entire system.

Now, to look at it in simple numerical terms: if you have a $500 million infrastructure and you double its performance, which is what we typically do, the value of that infrastructure rises to $1 billion. All the chips in the data center don't cost that much. So the value is truly extraordinary. That's why performance matters so much today.

You know, this is the moment when the highest performance is also the lowest cost, because the infrastructure that carries all these chips is expensive: funding the data centers, operating them, and everything that comes with them, the complexity, the electricity, the real estate. So the highest performance is also the lowest TCO.
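As a rough illustration of that arithmetic (a sketch of our own; only the $500 million figure and the doubling come from the remarks above):

```python
# Back-of-the-envelope version of the value argument above (our own
# simplification; only the $500M figure and the 2x come from the call).
infra_cost = 500e6          # infrastructure cost, in dollars
baseline_throughput = 1.0   # normalized work per year, old platform
speedup = 2.0               # performance doubling from the new generation

# The same plant now does 2x the work, so it is worth what a
# 2x-larger plant would have been.
effective_value = infra_cost * speedup

# Cost per unit of work falls in proportion: highest performance
# is also the lowest total cost of ownership (TCO).
tco_before = infra_cost / baseline_throughput
tco_after = infra_cost / (baseline_throughput * speedup)

print(f"Effective infrastructure value: ${effective_value / 1e9:.1f}B")
print(f"TCO per unit of work: ${tco_before / 1e6:.0f}M -> ${tco_after / 1e6:.0f}M")
```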

TD Cowen analyst Matt Ramsay:

I've worked in the data center industry my whole career, and I've never seen anything like the pace at which you're launching new platforms or the performance gains you're getting: 5x in training and, as you said at GTC, up to 30x in inference. It's amazing.

But it also creates an interesting juxtaposition: the current generation of product, on which your customers are spending billions of dollars, will become uncompetitive with your new products far faster than the asset's depreciation cycle. So, if you don't mind, I'd like you to talk about how you see that situation evolving as Blackwell launches.

Customers will have a very large installed base, obviously software-compatible, but one whose performance is far below that of your next-generation product. I'd love to hear what you think customers are thinking as they go down this path.

Jensen Huang:

I'd make three points. If you're 5% into the build-out versus 95% into the build-out, you feel very differently. And because they're only 5% into the build-out anyway, they know they have to build as fast as they can, and when Blackwell arrives, it will be terrific.

Then after Blackwell, as you mentioned, we have other Blackwells coming. And, as we've explained to the world, we move at a one-year rhythm. We want our customers to see our roadmap as far out as they possibly can.

But they're early in their build-out anyway, so they have to keep building. A whole lot of chips are coming at them, and they'll just keep building and, if you like, performance-average their way into it. So that's the smart thing to do.

They need to make money today. They want to save money today, and time is really, really valuable to them.

Let me give you an example of how valuable time is, of why the idea of standing up a data center right away is so valuable, and why getting to training time is so valuable. The reason is that the next company to reach the next major milestone gets to announce a groundbreaking AI. The second company after it gets to announce something only 0.3% better.

So the question is, do you want to be the company repeatedly delivering breakthrough AI, or the company delivering something 0.3% better? That's why this race matters; of all technology races, this one is crucial. And you're seeing the race play out across multiple companies, because being trusted as a leader is vital: companies want to build on your platform and to know that the platform they're building on is going to keep getting better.

So leadership matters a great deal, and training time matters a great deal. Finishing a project three months earlier is the whole point of getting training time; to finish a three-month project three months earlier, starting three months earlier is everything. That's why we're standing up Hopper systems like mad right now, because the next major milestone is just around the corner. So that's the second point.

And the first comment you made was really great: we're able to move so fast and improve so fast because we have the whole stack. We build the entire data center ourselves, and we can monitor everything, measure everything, and optimize across everything.

We know where all the bottlenecks are. We're not guessing. We're not putting up PowerPoint slides that look good. Well, we do like our PowerPoint slides to look good, but we're delivering systems that perform at scale. The reason we know they perform at scale is that we built them all.

Now, one thing we do that's something of a miracle is that we build entire AI infrastructures here, and then we disaggregate them and integrate them into our customers' data centers however they prefer. But we know how it's going to perform, we know where the bottlenecks are, we know where we need to optimize with them, and we know where we need to help them improve their infrastructure to achieve the best performance.

That deep, intimate knowledge at full data center scale is fundamentally what sets us apart today. You know, we build every chip from the ground up. We know exactly how processing happens across the entire system, so we know how it will perform and how to get the most out of every generation. So I appreciate those three points.

Evercore ISI analyst Mark Lipacis:

In the past, you've said that general-purpose computing ecosystems typically dominated each era of computing. I believe the argument was that they could adapt to different workloads, achieve higher utilization, and drive compute costs down, and that this motivated your push for the general-purpose GPU computing ecosystem of accelerated computing. Please tell me if I've mischaracterized that observation.

So the question is: the workloads driving demand for your solutions are neural-network training and inference, which on the surface looks like a limited set of workloads that might also lend themselves to custom solutions. So is the general-purpose computing framework becoming more at risk? Or is there enough variability, or rapid enough evolution, in these workloads to sustain that historical general-purpose framework? Thank you.

Jensen Huang:

Yes, accelerated computing is versatile, but I wouldn't call it general-purpose. For example, we're not very good at running spreadsheets; that was designed for general-purpose computing. And the control loops of operating system code probably aren't great for accelerated computing either; general-purpose computing is better at that.

So I'd say we're versatile, which is usually how I describe it. There's a rich set of application domains we've been able to accelerate over the years, and they all have a lot in common. There may be some deep differences, but one thing they share is that they all run in parallel. They're all highly threaded. Five percent of the code represents 99% of the runtime, for example. Those are all properties of accelerated computing.
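As a quick sketch of what that 5%-of-code, 99%-of-runtime property implies for speedup (our own Amdahl's-law-style arithmetic, not a figure from the call):

```python
# Why a hot section dominating runtime is the ideal acceleration
# target (our own illustrative arithmetic, not from the call).
def overall_speedup(hot_fraction: float, hot_speedup: float) -> float:
    """Total speedup when only the hot fraction of runtime is accelerated."""
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)

# 5% of the code accounts for 99% of the runtime; suppose a GPU
# accelerates that hot 99% of runtime by 50x.
print(f"{overall_speedup(hot_fraction=0.99, hot_speedup=50.0):.1f}x")  # ~33.6x

# If the hot section were only half the runtime, gains would cap out fast:
print(f"{overall_speedup(hot_fraction=0.50, hot_speedup=50.0):.2f}x")  # ~1.96x
```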

The versatility of our platform, and the fact that we design entire systems, is why, over the past ten years or so, the number of startups you've all asked me about on these calls has been fairly large. And every one of them, because their architectures were so brittle, struggled the moment generative AI appeared, or diffusion models appeared, or the next models appeared; and now the next batch of models is about to appear.

Then all of a sudden, look: large language models with memory. Large language models need memory so they can carry on a conversation with you and understand context. All of a sudden, the versatility of Grace's memory becomes very important. So each of these advances in generative AI, and in AI broadly, calls not for a little widget designed for one model, but for something that's really good across this entire domain.

But everything follows the first principle of software: software continues to evolve, and software keeps getting better and bigger. We believe in the scaling of these models. There are plenty of reasons we'll easily scale a hundred-fold over the next few years; we're looking forward to it, and we're prepared for it. So the versatility of the platform is really the key. If you're too brittle or too specific, you might as well just build an FPGA or an ASIC or something like that, but that's hardly a computer.

Raymond James analyst Srini Pajjuri:

Actually, I wanted a clarification on what you just said about the GB200 systems. Demand for the systems seems very strong. Historically, I believe you've sold a lot of HGX boards and some GPUs, while the systems business has been relatively small. So I'm curious why you're seeing such strong demand for systems now. Is it TCO (total cost of ownership), or something else? Or is it the architecture?

Jensen Huang:

Thank you for the question. In fact, we sell GB200 the same way: we disaggregate all the components where it makes sense and integrate them with computer makers. We have 100 different computer system configurations coming for Blackwell this year, which is off the charts.

Hopper, frankly, had only half that at its peak, and it started with far fewer. You'll see liquid-cooled versions, air-cooled versions, x86 versions, Grace versions, and so on. A great many systems are being designed and delivered by all of our excellent partners. Fundamentally, nothing has changed.

Of course, the Blackwell platform has greatly expanded our offering. CPU integration and much denser compute, plus liquid cooling, will save data centers a lot of money on power provisioning. Not only is it more energy efficient, it's also a better solution. It carries more content, meaning we supply more of the data center's components, and everyone benefits.

Data centers will get much higher performance from high-performance networking and network switches, and of course Ethernet, which lets us bring Nvidia AI at large scale to customers who only operate with, or are familiar with, Ethernet, because that's their ecosystem. So Blackwell is more expansive, and this generation gives our customers many more products.

Truist Securities analyst William Stein:

Great. Thanks for taking my question, Jensen. At some point, Nvidia decided that although there were some fairly good CPUs for data center operations, your ARM-based Grace CPU delivered real advantages that made the technology worth bringing to customers, whether in cost, power consumption, or the technical synergy between Grace and Hopper, or Grace and Blackwell.

Can you talk about whether similar dynamics might emerge on the client side? Even though there are very good solutions, and as you mentioned, Intel and AMD are excellent partners that deliver excellent x86 products, might Nvidia offer advantages in emerging AI workloads that others can't match?

Jensen Huang:

You mentioned some very good reasons. For many applications, we work really well with our x86 partners, and together we build fantastic systems. But Grace lets us do things that aren't possible with today's system configurations. The memory system between Grace and Hopper is coherent and connected. The interconnect between the two chips, and it's almost odd to call them two chips because they're like one superchip, links them at terabytes per second. It's fantastic. And Grace uses LPDDR, the first data-center-grade low-power memory, so we save a lot of power on every node.

Finally, because we can architect the entire system, we can create something like the really large NVLink domain, which is vitally important for next-generation large language model inference. So you see that GB200 has a 72-GPU NVLink domain: it's like 72 Blackwells joined together into one giant GPU. We needed Grace Blackwell to do that. So when there are opportunities like that, we'll explore them. And, as you saw at yesterday's Build, I think it's pretty terrific.

Satya announced the next-generation PC, the Copilot+ PC, which runs beautifully on the Nvidia RTX GPUs shipping in laptops. But it also supports ARM nicely. So it opens up opportunities for system innovation, even in PCs.

Cantor Fitzgerald analyst C.J. Muse:

I'd like to ask Jensen a slightly longer-term question. I know Blackwell hasn't even launched yet, but investors are obviously forward-looking, and amid rising competition from other GPUs and from customers' own chips, how do you think about Nvidia's pace of innovation: the million-x scaling of the past decade, and the truly impressive progress in precision, in Grace, and in coherent connectivity? Looking ahead, what frictions need to be solved in the next decade? And, perhaps more importantly, what are you willing to share with us today?

Jensen Huang:

Well, I can announce that after Blackwell there will be another chip, and we're on a one-year rhythm. You can also count on us to bring out new networking technology at a very fast cadence.

We announced Spectrum-X for Ethernet, and we're fully committed to Ethernet; we have a very exciting Ethernet roadmap. We have a rich ecosystem of partners: Dell announced they're taking Spectrum-X to market, and we have a wealth of partners and customers who will announce they're taking our entire AI factory architecture to market.

For companies that want the ultimate performance, we have the InfiniBand computing fabric. InfiniBand is a computing fabric; Ethernet is a network. Over the years, InfiniBand started as a computing fabric and became a better and better network. Ethernet is a network, and with Spectrum-X we're going to make it a much better computing fabric. We are committed to all three links: the NVLink computing fabric for a single computing domain, then the InfiniBand computing fabric, then the Ethernet networking computing fabric.

So we're going to advance all three at a very fast pace. You'll see new switches coming, new NICs coming, new capabilities, and new software stacks running across all three. New CPUs, new GPUs, new networking NICs, new switches, a mountain of chips coming, and all of it runs CUDA, all of it runs our entire software stack.

So if you invest in our software stack today, without doing anything at all, it's just going to get faster and faster. If you invest in our architecture today, without doing anything, it will reach more and more clouds and more and more data centers, and everything will just run better. So I think the pace of innovation we're bringing will, on the one hand, raise capability and, on the other, lower TCO. We should be able to scale up this new era of computing with the Nvidia architecture and begin this new industrial revolution, where we're no longer making just software but manufacturing artificial intelligence tokens, and we're going to do it at scale. Thank you.

Editor/jayden


