
NVIDIA's Market Value Reaches $5 Trillion! A Recap of Jensen Huang's GTC Speech with Chinese Translation

Barron's ·  Oct 29, 2025 19:04


During his keynote speech at the GTC Washington conference, Jensen Huang, CEO of NVIDIA, announced a $1 billion investment in Nokia, introduced new technologies and products such as NVIDIA Arc, NVQLink, and the next-generation Vera Rubin superchip, and consistently emphasized ensuring the United States' leadership through investments in AI infrastructure.

Enthusiasm for artificial intelligence has grown to the point that the "Spring Festival Gala of AI," as Chinese media have dubbed GTC, is now held twice a year.

From October 27 to 29 Eastern Time, NVIDIA’s GTC conference, a major AI event, was held in Washington, D.C. for the first time as an additional session. Unlike the flagship spring conference in San Jose in March, which focused more on technology and products, the October Washington conference was more of a “policy-focused” event, with many discussions centered on industrial policies and the role of government in the AI field.

NVIDIA had previously announced that the highlight of the conference, a keynote speech by CEO Jensen Huang, would not only reveal product-related information but also outline a roadmap for how AI is reshaping industries, infrastructure, and the public sector. Pre-conference reports from Wccftech noted that geopolitical issues have become increasingly important to NVIDIA, and that the Washington conference might place greater emphasis on "how to ensure the U.S. maintains its leading position in the AI race." More notably, while the conference was underway, U.S. President Trump expressed his desire to congratulate Jensen Huang and mentioned that the two would meet shortly thereafter.

At approximately noon Eastern Time on October 28, Jensen Huang took the stage as scheduled to deliver his speech. After reviewing the company’s development journey once again, he focused on introducing NVIDIA’s latest progress and ambitions in areas such as 6G, quantum computing, and AI infrastructure. Behind these announcements lay pervasive considerations about maintaining the United States' competitive edge in the AI domain.

In the field of 6G, Jensen Huang provided a detailed explanation of NVIDIA’s recently announced $1 billion investment plan in Nokia. He stated that the two companies would collaborate to develop a 6G AI platform, and Nokia’s future base stations would fully adopt the new product line, NVIDIA Arc architecture, unveiled during the keynote speech. “This will propel the U.S. back to a leadership position in telecommunications,” NVIDIA stated in a press release issued concurrently with Jensen Huang’s keynote address.

Jensen Huang also introduced NVQLink during his speech, a system architecture designed to connect quantum processors with GPU computing systems. He explained that in the future, every supercomputer using NVIDIA GPUs would be hybrid, tightly coupled with quantum processors to expand computational possibilities. “This is the future of computing,” he said, though without disclosing specific technical advancements. Jensen Huang added that 17 quantum computing companies had already committed to supporting NVQLink. Additionally, the keynote speech explicitly mentioned that NVQLink allows quantum processors to connect with supercomputing systems at nine U.S. national laboratories, ensuring the country’s leading position in high-performance computing.

Meanwhile, Jensen Huang announced that NVIDIA would partner with the U.S. Department of Energy to build seven new AI supercomputers to advance American scientific research, which would become the largest AI supercomputers under the Department of Energy.

During the nearly two-hour speech, Jensen Huang also touched on numerous hot topics, including robotics, physical AI, and the reshoring of U.S. manufacturing. He showcased the next-generation Vera Rubin superchip and presented NVIDIA's data center/AI GPU roadmap. Notably, echoing Trump's congratulations from afar, Jensen Huang thanked the president, primarily for his efforts to expand energy supply for data centers.

Reiterating his previous statements, Jensen Huang believes that humanity is at the dawn of an artificial intelligence industrial revolution, and this technology will define the future of every industry and nation. "The United States must lead this race toward the future; it is our generation's Apollo moment. The next wave of inventions, discoveries, and advancements will depend on the nation’s ability to expand its AI infrastructure. We are working with our partners to build the most advanced AI infrastructure ever created, ensuring that America has the foundation for a prosperous future and that global AI benefits everyone based on American innovation, openness, and collaboration," he stated.

As the AI competition between China and the United States intensifies and trade relations remain volatile, Jensen Huang and NVIDIA are under pressure from both sides. The Trump administration previously banned NVIDIA from exporting high-end chips to China. After the ban was lifted, however, NVIDIA faced successive antitrust investigations and allegations of security backdoors in China. According to the company's quarterly financial report and Huang's own admission, NVIDIA's market share in China has plummeted from 95% to zero, and since the second quarter of this year the company has been effectively unable to sell chips into the Chinese market. Meanwhile, China is accelerating the localization of its AI chip supply chain, with domestic chip companies like Cambricon Technologies gaining significant attention and carrying expectations of an impact akin to DeepSeek's. As of October 28, Cambricon Technologies' stock price had once again surpassed that of Kweichow Maotai, making it the highest-priced stock on the A-share market.

However, NVIDIA is the world's most valuable company by market capitalization. Notably, during Jensen Huang's keynote speech, NVIDIA's stock price continued to rise, hitting another all-time high.

Below is the English version of Jensen Huang’s keynote speech, translated with AI assistance and human editing:

Welcome to GTC.

We have so much to share with you today. GTC is where we discuss industries, science, computing, the present, and the future. So today, I’ll cover a wide range of topics.

But before I begin, I want to thank all of our partners who helped sponsor this event. You’ll see them around the exhibition hall. They’re here to meet with all of you, which is fantastic. Without the support of our entire ecosystem of partners, we wouldn’t be able to achieve what we do.

Everyone says this is the 'Super Bowl' of AI. Well, every Super Bowl deserves a great opening show. What did you think of the opening show (referring to the short film reviewing the development of the U.S. and global technology industries) and our 'ALL-CAPS ALL-STARS' lineup (referring to the list of sponsors displayed on the big screen)? All-star athletes and an all-star cast. Just look at these guys.

Over the next three and a half hours, various sectors of the AI ecosystem will come together to discuss key topics.

So, let’s get started.

As you can see in the video, NVIDIA invented a new computing model, the first in 60 years. A new computing model rarely emerges; it takes a significant amount of time and a particular set of conditions, which we observed. We developed this computing model because we wanted to solve problems that general-purpose computers, ordinary computers, cannot address. We also saw that one day, although the number of transistors would continue to increase, improvements in transistor performance and power efficiency would slow down. That means Moore's Law could not be sustained forever, because it is constrained by the laws of physics. That moment has now arrived. Dennard scaling, the steady improvement in transistor performance and power efficiency, essentially stopped about a decade ago, and those gains have since slowed dramatically.

However, the number of transistors continues to grow. We noticed this a long time ago, and over the past 30 years, we have been advancing a form of computing that we call 'accelerated computing.' We invented the GPU. We developed a programming model called CUDA. We observed that if we could add a processor that leverages an increasing number of transistors, applies parallel computing, and combines it with sequential processing CPUs, we could scale computing power far beyond what was previously possible. And that moment has indeed arrived.

We are now at the inflection point: the era of accelerated computing has arrived.

However, accelerated computing represents a fundamentally different programming model. You cannot simply take manually written, sequentially executed CPU software and place it on a GPU expecting it to run properly. In fact, if you do that, it would actually run slower. Therefore, you need to reinvent algorithms, create new libraries, and in fact, rewrite applications. That is why it has taken so long. It took us nearly 30 years to get here, but we have taken it step by step. This is truly the jewel of our company.
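The point about rewriting algorithms rather than porting sequential code can be sketched minimally in Python. Here NumPy's vectorized operations stand in for the GPU's data-parallel model; the function names are illustrative, not NVIDIA APIs:

```python
import numpy as np

def saxpy_sequential(a, x, y):
    # Hand-written sequential loop: one element at a time, CPU-style.
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_parallel(a, x, y):
    # The same math re-expressed as one data-parallel operation. On a
    # GPU, each element would map to a thread; here NumPy vectorization
    # plays that role. The algorithm was re-expressed, not merely ported.
    return a * x + y

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
assert saxpy_sequential(2.0, list(x), list(y)) == list(saxpy_parallel(2.0, x, y))
```

The two functions compute identical results; the difference is that the second form exposes the parallelism an accelerator needs, which is why libraries (and applications) had to be rewritten rather than recompiled.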

Most people talk about the importance of GPUs, but without a programming model built on top of them, without our dedication to generational compatibility (we are now at CUDA 13, soon to release CUDA 14), and without millions of GPUs running CUDA perfectly across every computer, developers would not target this computing platform. If we had not created these libraries, developers would not know how to use the algorithms and fully exploit the architecture's potential. I mean, this is truly the crown jewel of our company.

(Referring to the on-screen display explaining in sequence) It took us nearly seven years to bring cuLitho to its current level. Taiwan Semiconductor is using it, Samsung is using it, and ASML is using it as well. This is an incredible library for computational lithography, which is the first step in chip manufacturing.

cuOpt has broken nearly every record in numerical optimization…

cuDF, a data-frame library, essentially accelerates SQL, data frames, and specialized data-frame databases.

cuDNN is the foundational library for AI, and Megatron Core, which is built on top of it, enables us to simulate and train extremely large language models. There are many examples like this. MONAI, very important, is the world's leading medical-imaging AI framework. By the way, we won't go into too much detail about healthcare today, but be sure to watch Kimberly's keynote. She will discuss our work in healthcare in depth. There are many more examples, such as genomics processing, aerial imagery... Please note, today we are going to do something very important: cuQuantum, quantum computing.

What you see on the screen is just representative of 350 different libraries within our company. Each of these libraries has redesigned the algorithms required for accelerated computing. Each library enables all our ecosystem partners to harness accelerated computing, and each opens up new markets for us. Let’s take a look at what CUDA-X can achieve.

Are you ready? Let's begin.

(The event showcased several segments of games and animated CG clips.)

Isn't it amazing? Everything you saw is a simulation. No art, no animation. This is the beauty of mathematics. This is deep computer science, deep mathematics, and its beauty is incredible. It spans every industry, from healthcare and life sciences to manufacturing, robotics, autonomous vehicles, computer graphics... even video games. The first shot you saw was the first application NVIDIA ever ran. That's where we started in 1993. We have always believed in what we were striving to achieve, and it's hard to believe that the company that brought that first virtual fighter scene to life is the same company standing here today. What an incredible journey. I want to thank all NVIDIA employees; please give them a round of applause for everything they've done. It's truly remarkable.

Today, we span multiple industries. My presentation will cover AI, 6G, quantum computing, enterprise computing, robotics, and factories.

Let’s get started. We have a lot to cover, many major announcements, and new partnerships that might surprise you.

Telecommunications is the backbone and lifeline of our economy, our industries, and our national security. Since the advent of wireless technology, we have defined technologies, set global standards, and exported American technology worldwide so that the world could build upon American technology and standards. However, that era feels like a distant past now.

Much of the wireless technology deployed globally today is based on foreign technology. Our fundamental communication architecture relies heavily on foreign innovations. This must change.

And we have the opportunity to do just that, especially during this pivotal platform transition.

As you know, computer technology serves as the foundation for every industry. It is the most essential tool for science and the most critical instrument for industry. I mentioned earlier that we are undergoing a platform transition. This platform transition represents a once-in-a-lifetime opportunity to bring us back into the game and start innovating based on American technology. Today, we are announcing that we are doing just that. We have formed a significant partnership with Nokia.

Nokia is the world’s second-largest telecommunications equipment manufacturer. This is a $3 trillion industry, encompassing hundreds of billions of dollars in infrastructure and millions of base stations worldwide. If both sides collaborate, we can build upon this incredible, fundamentally new technology rooted in accelerated computing and AI. Moreover, we can position the United States at the center of the next 6G revolution. So today, we are announcing that NVIDIA has a new product line. It’s called the NVIDIA Aerial RAN Computer Arc. Arc is built upon three fundamental new technologies: an outstanding CPU—the Blackwell GPU—and our ConnectX networking solution designed for this application.

All of this allows us to operate this library. The CUDA-X library I mentioned earlier, called Aerial, is essentially a wireless communication system running on top of CUDA.

For the first time, we will create a software-defined, programmable computer that can communicate wirelessly while simultaneously performing AI processing.

This is absolutely revolutionary. We call it NVIDIA Arc.

Nokia will partner with us to integrate our technology and rewrite their software stack. This is a company that owns 7,000 essential patents for 5G.

It's hard to imagine a greater leader in telecommunications. So, we will collaborate with Nokia. Their future base stations will fully adopt NVIDIA Arc, and NVIDIA Arc is also compatible with Nokia’s current AirScale base stations.

This means that we will adopt this new technology and be able to upgrade millions of base stations worldwide with 6G and AI. Now, 6G and AI are indeed crucial, and for the first time, we will be able to utilize AI technology — 'AI for RAN' — by employing artificial intelligence and reinforcement learning to adjust beamforming in real-time based on environmental conditions, traffic, mobility, weather, and more, thereby enhancing spectral efficiency in wireless communications.

All of these factors can be taken into account so that we can improve spectral efficiency. Base stations consume approximately 1.5% to 2% of the world’s electricity. Therefore, improving spectral efficiency means we can transmit more data over wireless networks without increasing energy consumption.

Another thing we can do is 'AI on RAN.' This represents an entirely new opportunity. Remember, the internet enabled communication, but what was remarkable was that smart companies like AWS built cloud computing systems on top of the internet. We are now going to do the same thing on top of the telecom network. This new cloud will be an edge industrial robotics cloud.

That is to say, 'AI for RAN' improves radio spectrum efficiency, while 'AI on RAN' essentially represents cloud computing for telecommunications. Cloud computing will extend directly to the edge, where there are no data centers, because we have base stations all around the world. This announcement is incredibly exciting. Justin Hotard, Nokia's CEO, I believe he is somewhere in the venue, thank you for partnering with us and helping bring telecom technology back to the United States. This is truly an extraordinary collaboration. Thank you very much.

(The classic Nokia ringtone plays in the venue) This is the best way to celebrate Nokia.

Next, let us talk about quantum computing. In 1981, the physicist Richard Feynman envisioned a new type of computer that could directly simulate nature. He called it a quantum computer.

Forty years later, the industry has achieved a fundamental breakthrough.

Just last year, a fundamental breakthrough was realized. It is now possible to create a logical qubit—a coherent, stable, and error-corrected logical qubit. This single logical qubit is composed of sometimes dozens, sometimes hundreds of physical qubits working together. As you know, qubits, these particles, are extremely fragile. They can be highly unstable.

Any observation, any sampling, or any environmental conditions may cause them to lose coherence.

Therefore, they require an extremely controlled environment. Many additional physical qubits are also needed to work together so that we can perform error correction. These are referred to as ancilla qubits, or syndrome qubits, which allow us to infer the state of the logical qubit and correct errors. There are various types of quantum computers: superconducting, photonic, trapped ion, neutral atom, all employing different approaches to building quantum computers.
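The idea of spending many redundant physical bits to protect one logical bit can be sketched with a classical toy: a 3-bit repetition code with majority-vote decoding. Real quantum codes (surface codes, for example) measure ancilla/syndrome qubits rather than the data directly, but the principle, redundancy plus inference of the error, is the same. This is an illustrative sketch, not NVIDIA's or any vendor's scheme:

```python
import random

def encode(bit):
    # One logical bit becomes three physical bits.
    return [bit, bit, bit]

def noisy_channel(bits, p_flip):
    # Each physical bit independently flips with probability p_flip.
    return [b ^ (random.random() < p_flip) for b in bits]

def decode(bits):
    # Majority vote: corrects any single physical-bit flip.
    return int(sum(bits) >= 2)

random.seed(0)
logical = 1
received = noisy_channel(encode(logical), p_flip=0.1)
assert decode(received) == logical
```

The cost is visible even in this toy: three physical bits per logical bit, and it only tolerates one flip. Quantum codes pay a far larger overhead, which is why a logical qubit can require dozens or hundreds of physical qubits, as the talk notes.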

We now realize that it is crucial to connect quantum computers directly to GPU supercomputers so that we can perform error correction, enable AI-driven calibration and control of quantum computers, conduct collaborative simulations, and ensure cooperation between the two systems. The right algorithms run on GPUs, while the appropriate algorithms operate on QPUs (Quantum Processing Units), allowing these two processors, these two computers, to work side by side.

This is the future of quantum computing. Let’s take a look.

(A video related to quantum computing was played on-site, stating that quantum error correction is the solution, and that NVQLink is a new interconnect architecture that directly connects quantum processors to NVIDIA GPUs, able to coordinate quantum devices with AI supercomputers to run quantum-GPU applications.)

So today, we officially launch NVQLink. This interconnect does two things. First, it enables quantum computer control and calibration, quantum error correction, and hybrid simulation by connecting the two computers (the QPU and our GPU supercomputer). Second, it is fully scalable: it is designed not only for error correction with the small number of qubits available today, but also for the future, when these quantum computers scale from a few hundred qubits to tens of thousands, and eventually hundreds of thousands, of qubits.

So now we have an architecture capable of control, collaborative simulation, quantum error correction, and future scalability. The support from the industry has been incredible.

During the development of CUDA-Q, remember that CUDA was designed for GPU-CPU accelerated computing, essentially using the right tool for the right job. Now, CUDA-Q extends beyond CUDA so that we can support QPUs and enable both processors (QPU and GPU) to work together, with computation moving back and forth in just a few microseconds, the key latency requirement for collaborating with quantum computers. Thus, CUDA-Q represents an incredible breakthrough and is being adopted by a wide range of developers. Today, we are announcing that 17 different companies in the quantum computing industry support the NVQLink architecture. I am incredibly excited about this. Additionally, eight different laboratories under the U.S. Department of Energy are involved: Berkeley, Brookhaven, Fermilab, Lincoln Laboratory, Los Alamos, Oak Ridge… nearly every U.S. Department of Energy laboratory is collaborating with us, partnering with our quantum computing companies and the ecosystem of these quantum controllers, to integrate quantum computing into the science of the future.

We also have one more significant announcement.

Today, we are announcing that the U.S. Department of Energy is collaborating with NVIDIA to build seven new AI supercomputers to advance our nation’s scientific progress.

I must pay tribute to Secretary Chris Wright (U.S. Secretary of Energy). He has brought so much energy, a surge of vitality and passion to the Department of Energy, ensuring that the United States once again leads in science.

As I mentioned, computing is a fundamental tool for science, and we are in the midst of several major platform transitions. On one hand, we are shifting toward accelerated computing; this is why every future supercomputer will be a GPU-based supercomputer. On the other hand, we are transitioning to AI, where first-principles solvers and physics-based simulations remain relevant but can be scaled up by working in concert with AI surrogate models.

We also understand that first-principles solvers and classical computing can be enhanced with quantum computing to better understand the states of nature. Additionally, we know that in the future there will be vast numbers of signals and enormous amounts of data to sample from the world; remote sensing is more important than ever. And unless these laboratories operate as robotic factories or robotic labs, we cannot conduct experiments at the scale and speed we need. Therefore, all these technologies are entering the field of science simultaneously.

Secretary Wright understands us, and he wants the Department of Energy to seize this opportunity, inject itself with tremendous momentum, and ensure that the United States remains at the forefront of science.

I want to thank all of you for this. Thank you.

Next, let's talk about AI.

What is AI? Most people would say AI is a chatbot, and that is indeed correct. Undoubtedly, ChatGPT is considered the cutting edge of AI. However, as you can now see, these scientific supercomputers are not intended to run chatbots. They will perform foundational scientific AI.

The world of AI goes far beyond chatbots, of course. Chatbots are extremely important, while AGI is fundamentally crucial. Deep computer science, incredible computing power, and great breakthroughs remain essential for AGI. But beyond that, AI has even more depth.

In fact, I will describe AI in several different ways.

The first way, which comes to mind immediately, is that AI has completely reshaped the computing stack.

In the past, the way we wrote software was through manually coded programs running on CPUs. Today, AI involves machine learning, training, and data-intensive programming—training and learning via AI, if you will—and it runs on GPUs. To achieve this, the entire computing stack has changed. Note that you don't see Windows here. You don't see CPUs.

What you see is a completely different and fundamentally distinct stack.

We can also start by discussing energy needs, another area where our government and President Trump deserve significant credit for supporting initiatives promoting energy development. He recognized that the industry requires energy to grow, energy to progress, and that we need energy to succeed. He acknowledged this and placed the nation's strength behind supporting energy growth, completely changing the game. If this had not happened, we might have been in a dire situation. For this, I thank President Trump.

Above energy are these GPUs. These GPUs are connected and built into the infrastructure that I will show later. On top of this infrastructure are massive data centers, easily many times the size of this room, consuming enormous amounts of energy and transforming that energy, through new machines called GPU supercomputers, into numbers. These numbers are referred to as tokens, which can be considered the language, the computational units, the vocabulary of artificial intelligence. You can tokenize almost anything. You can certainly tokenize English words. You can tokenize images, which is why AI can recognize or generate images. You can tokenize videos, 3D structures, chemicals, proteins, genes, even ourselves, or nearly anything with structure, anything with informational content.

Once you can tokenize something, AI can learn that language and its meaning. Once it understands the meaning of that language, it can translate, respond, just as you interact with ChatGPT. It can also generate, much like ChatGPT generates. So, all the basic things you see ChatGPT doing, you just need to imagine: what if it were proteins?

What if it were chemicals? What if it were a 3D structure like a factory? What if it were a robot, and the tokens represented understanding behaviors, actions, and movements? All these concepts are essentially the same, which is why AI is making such extraordinary progress.
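The tokenization idea above can be made concrete with a toy character-level tokenizer in Python. Production systems use learned subword schemes (such as byte-pair encoding), but the core move, mapping any structured sequence to integers a model can learn from, is the same whether the "characters" are words, amino acids, or DNA bases:

```python
# Toy character-level tokenizer: maps any sequence with structure
# (text, gene bases, amino acids...) to integer tokens and back.
def build_vocab(text):
    return {ch: i for i, ch in enumerate(sorted(set(text)))}

def tokenize(text, vocab):
    return [vocab[ch] for ch in text]

def detokenize(tokens, vocab):
    inv = {i: ch for ch, i in vocab.items()}
    return "".join(inv[t] for t in tokens)

corpus = "GATTACA"           # the same idea applies to proteins, chemicals, images...
vocab = build_vocab(corpus)  # {'A': 0, 'C': 1, 'G': 2, 'T': 3}
tokens = tokenize(corpus, vocab)
assert tokens == [2, 0, 3, 3, 0, 1, 0]
assert detokenize(tokens, vocab) == corpus
```

Once a domain is expressed as tokens like these, the same model machinery that learns the "language" of English text can learn the language of genomes or molecules, which is the point the speech is making.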

Above these models are the applications. The Transformer is not a universal model. It is a highly efficient model, but there is no single universal model. Rather, AI has universal impact. We have so many different types of models. In recent years, we have witnessed the invention of multimodal systems and experienced innovative breakthroughs. There are so many different types of models: CNN models, state-space models, graph neural network models.

Above these model architectures are the applications, which are essentially the software of the past. This represents a profound understanding of artificial intelligence—a deep insight.

The software industry of the past was about creating tools. Excel is a tool, Word is a tool, and a web browser is also a tool. We know they are tools because people use them.

Tools, much like screwdrivers and hammers, define an industry of a certain size. These IT tools are valued at approximately $1 trillion.

But AI is not a tool; AI is a worker. That is the fundamental distinction. AI is essentially a worker capable of using tools.

One of the things that truly excites me is the work done by Aravind at Perplexity, where they use a web browser to book holidays or shop.

Essentially, it is an AI that uses tools. Cursor is an AI and agent-based AI system that we use at NVIDIA. Every software engineer at NVIDIA uses Cursor. It has significantly increased our productivity. It is essentially a partner for every one of our software engineers in generating code. It also uses a tool called VS Code. So, Cursor is an AI, an agent-based AI system, and VS Code is the tool it utilizes.

All these various industries—whether chatbots, digital biology (where we have AI research assistants), or robotaxis—involve AI. Speaking of robotaxis, though unseen, there is clearly an AI driver inside. That driver is working, and the tool it uses to perform its job is the car.

Everything we have created so far in this world—all of it—are tools for our own use. For the first time in history, technology is now able to operate autonomously and assist us in boosting productivity.

The list of opportunities continues, which is why the economic sectors touched by AI are unprecedented in scope.

It encompasses trillions, if not quadrillions, of dollars of the global economy underneath the layer of tools. Now, for the first time, AI will tap into this quadrillion-dollar economy, making it more productive, enabling faster growth, and achieving greater scale. We face severe labor shortages, and having AI that augments the workforce will help drive growth.

From the perspective of the technology industry, what is interesting is that beyond the fact that AI is a new technology pioneering new economic frontiers, AI itself is also an emerging industry.

As I explained earlier, these tokens, these numbers, after all different modalities and information are tokenized, require a factory to produce these numbers, which is different from the past. The computer and chip industries of the past — note this, if you look at the chip industry of the past, it accounted for only about 5% to 10% of the IT industry, perhaps even less.

The reason is that using Excel does not require much computation, using a browser does not require much computation, and using Word does not require much computation.

But in this new world, there needs to be a computer that constantly understands context. It cannot precompute, because every time you use an AI computer, every time you ask AI to do something, the context is different. So it must process the entire information environment. For example, in the case of autonomous vehicles, it must handle the context of the car. What instruction did you give? What did you ask the AI to do? Then it must break down the problem step by step, reason, plan, and execute. Each step requires generating a large number of tokens, which is why we need a new type of system, which I call an 'AI factory.'

It is, of course, an AI factory. It is different from the data centers of the past; it is an AI factory.

Because this factory produces only one thing, unlike the data centers of the past that did everything — storing files for all of us, running various applications. You could use those data centers like computers to handle all kinds of applications. One day you might use it to play games, another day to browse the web, and yet another day to do accounting.

So that was the computer of the past: a general-purpose, multi-functional computer.

The computer I am talking about here is a factory. It essentially runs only one thing: it runs AI. Its purpose is designed to produce tokens that are as valuable as possible, meaning they must be intelligent, and you want to produce these tokens at an astonishing speed. Because when you ask AI a question, you want it to respond quickly. And notice that during peak hours, the responses from these AIs become slower because it has to perform a lot of work for many people.

So you want to produce valuable tokens at an astonishing speed, and do so efficiently. Every word I just used describes a factory, whether an AI factory, a car factory, or any other kind of factory.

It absolutely is a factory. And these factories have never existed before. Inside these factories are mountains of chips, which brings us to today's topic.

So what has happened over the past few years? In fact, something rather profound occurred last year.

In fact, if you look at the beginning of this year, everyone had an opinion about AI. That opinion was generally: this is going to be big, this is the future. A few months ago, it entered turbocharged mode. There are several reasons for this. First, over the past few years, we have figured out how to make AI smarter. This is not just about pre-training. Pre-training essentially means giving AI all the information created by humans to learn from. This is fundamentally about memory and generalization. It’s not like when we were children in school. The first phase of learning, pre-training, was never meant to be the endpoint of education, just as preschool was never the end of education. Pre-training merely teaches the basic skills of intelligence so that one can understand how to learn everything else.

Next comes post-training. After pre-training, it teaches you problem-solving skills, breaking down problems, and reasoning.

It involves how to solve math problems, how to code, how to think through these issues step by step, using first-principles reasoning. And after that is when computation really starts to play a role. As you know, I went to school, which was decades ago for me. But since then, I’ve learned more and thought more. The reason is that we are constantly reinforcing ourselves with new knowledge. We are constantly researching, constantly thinking, and that is really the essence of intelligence.

So now we have three fundamental techniques: pre-training, which still requires massive amounts of computation; post-training, which uses even more computation; and now "thinking," which places an incredible computational load on the infrastructure, because it is thinking for each and every one of us.

The amount of computation required for AI to reason and "think" is truly enormous. I've heard people say that inference is easy, so NVIDIA should just focus on training.

But how could thinking be easy?

Thinking is difficult, and that's why these three scaling laws put such immense pressure on computational capacity. Another thing has happened: from these three scaling laws we get smarter models, and these smarter models require more computation. The smarter the model, the more people use it, which requires even more computation. And now, these models are worth paying for. NVIDIA pays for every Cursor license, and we are happy to do so, because Cursor helps employees who cost hundreds of thousands of dollars a year, our software engineers and AI researchers, become many times more productive. So, of course, we are very willing to do this.

These AI models have become good enough to be worth paying for. Cursor, ElevenLabs, Synthesia, Abridge, OpenEvidence: there are many examples. Of course, there are also OpenAI and Claude. These models are now so good that people are willing to pay for them. And because people pay and use them more, every additional use requires more computation. Now we have two exponential growths. One is the exponential increase in computational demand driven by the three scaling laws.

The second exponential is that the smarter the model, the more people use it, and user growth and model capability improvements reinforce each other, both leading to exponential growth in computational demand.

Both exponentials are now simultaneously exerting pressure on the world's computing resources, precisely at the moment when, as I just told you, Moore's Law has essentially come to an end.
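The compounding of these two demand curves can be illustrated with a toy calculation (the growth rates below are made up for illustration; only the idea that the two exponentials multiply comes from the talk):

```python
# Toy model, not NVIDIA data: two growth factors compounding on the
# same infrastructure each period.
def compound_demand(periods, scaling_growth=2.0, usage_growth=1.5):
    """Total compute-demand multiplier when scaling-law-driven demand and
    usage-driven demand both compound every period."""
    return (scaling_growth * usage_growth) ** periods

# Either exponential alone gives 16x or about 5x after four periods;
# together they give (2.0 * 1.5)^4 = 81x.
print(compound_demand(4))  # 81.0
```

The point of the sketch is simply that two simultaneous exponentials multiply rather than add, which is why the pressure on computing resources rises faster than either trend suggests on its own.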

So the question is, what should we do?

If we face these two exponentially growing demands and fail to find ways to reduce costs, this positive feedback system will collapse. A virtuous cycle is crucial for almost any industry, especially for any platform-based industry. It is also vital for NVIDIA.

We have now reached a virtuous cycle for CUDA.

The more applications created, the more valuable CUDA becomes; the more CUDA computers purchased, the more developers want to create applications for the platform. After 30 years, NVIDIA's virtuous cycle has finally been realized. AI has reached the same virtuous cycle in just 15 years.

AI has now reached a virtuous cycle. Therefore, the more you use it, because AI is intelligent and we are willing to pay for it, the more profits can be generated. The more profits generated, the more computational resources invested into the power grid, the more computation dedicated to AI factories, the smarter AI becomes, the more people use it, the more applications utilize it, and the more problems we can solve. This virtuous cycle is now beginning to operate.

What we need to do is significantly reduce costs in order to improve user experience—when you prompt AI, it responds much faster—and sustain this virtuous cycle by driving down costs so that it becomes smarter, leading to more users and thus perpetuating this virtuous cycle which is now accelerating.

But when Moore's Law truly reaches its limit, what should we do? The answer is co-design. You cannot simply design a chip and expect that whatever runs on it will automatically become faster. The best you can do in chip design is increase the number of transistors by perhaps 50% every few years. Adding more transistors is merely percentage growth, not exponential growth. We need compounded exponential growth to sustain this virtuous cycle, and that is what we call extreme co-design. NVIDIA is currently the only company in the world that can start with a blank slate and simultaneously rethink new infrastructure, computer architecture, new chips, new systems, new software, new model architectures, and new applications. Many of you here today are part of different layers of this stack. By collaborating with NVIDIA, we fundamentally redesign everything from the bottom of the stack to the top.
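The gap the speech is pointing at, percentage transistor gains versus compounded stack-wide gains, can be sketched numerically (all factors here are hypothetical illustrations, not NVIDIA figures):

```python
# Hypothetical contrast: transistor scaling alone yields modest per-generation
# gains, while co-design across chips, systems, software, and models targets
# order-of-magnitude gains per generation. Factors are illustrative only.
def growth_after(generations, per_generation_factor):
    """Cumulative speedup after compounding a per-generation factor."""
    return per_generation_factor ** generations

transistors_only = growth_after(3, 1.5)   # ~3.4x after three generations
co_designed      = growth_after(3, 10.0)  # 1000x after three generations
print(transistors_only, co_designed)
```

Compounding a 1.5x factor and a 10x factor over the same three generations is the difference between a few-fold gain and a thousand-fold one, which is the argument for redesigning the whole stack rather than one chip.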

Then, because AI is such a massive problem, we scale it up. For the first time, we have created a computer that has expanded into an entire rack. That is one computer, one GPU, and then we scale it horizontally by inventing a new AI Ethernet technology (which we call Spectrum-X Ethernet).

Everyone says Ethernet is just Ethernet. But Spectrum-X Ethernet is designed for AI performance, which is why it has been so successful. Even that is not large enough. We fill entire rooms with AI supercomputers and GPUs. That is still insufficient, because the number of AI applications and users continues to grow exponentially. So we connect multiple data centers together, which we call scaling across, using Spectrum-XGS, giga-scale Ethernet.

By doing so, we are engaging in co-design on such a massive scale, an extreme level, that the performance improvements are astonishing. Not a 50% increase per generation, not a 25% increase per generation, but much more.

This is the most extreme co-designed computer we have ever built. Frankly, in the modern era since IBM System 360, I don’t think there has been a computer that has been so thoroughly reinvented from scratch. Creating this system was extremely challenging. I will show you its benefits later. But essentially, what we did was create NVLink 72. If we were to create a massive chip, a huge GPU, it would look like this. This is the level of wafer-scale processing we had to achieve, which is incredible. All of these, all these chips are now placed into a massive rack, this enormous rack that allows all these chips to work together as one. It looks absolutely incredible.

(Live demonstration session)

Anyway, essentially, what we used to build before was this: NVLink 8. Now these models are so large that the way we solve them is by taking this massive model and breaking it down into a group of experts, somewhat like a team. Each expert is skilled at certain types of problems, and we gather a large number of experts together. So this massive trillion-parameter AI model has all these different experts, and we place these various experts onto GPUs. Now it's NVLink 72. We can put all the chips into a massive switching network, allowing every expert to communicate with one another. The primary expert can communicate with all subordinate experts, along with all the necessary context, prompts, and the vast amount of data we need to send to every expert. Whichever expert is chosen to answer the question responds, and it proceeds layer by layer, sometimes eight layers, sometimes sixteen. Sometimes there are **** experts, sometimes two hundred fifty-six. But the key point is that the number of experts keeps increasing.

So here, with NVLink 72, we have 72 GPUs, which means we only need to fit four experts onto each GPU. The most important job of each GPU is to generate tokens, and how fast it can do that depends on the bandwidth of its HBM memory.

On an H100 system, by contrast, each machine can only hold eight GPUs, so we must fit thirty-two experts onto one GPU, and each GPU has to do the thinking for thirty-two experts.

In contrast, in this system each GPU does the thinking for only four experts. Because of this, the speed difference is astonishing. A benchmark just came out from SemiAnalysis. They did a very thorough job, benchmarking all the GPUs that could be benchmarked. It turns out there aren't that many: if you look at the list, about 90% of the GPUs that can actually be benchmarked are NVIDIA's.
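The expert-placement arithmetic above can be sketched as follows. The 256-expert count is the example figure from the talk; the ceiling division is my assumption about how a fractional split would round:

```python
import math

# Sketch of mixture-of-experts placement: how many experts each GPU
# must host when a fixed expert pool is spread across a GPU domain.
def experts_per_gpu(num_experts, num_gpus):
    """Experts each GPU must host (rounded up). Fewer experts per GPU
    means each GPU spends its HBM bandwidth thinking for fewer experts,
    so tokens come out faster."""
    return math.ceil(num_experts / num_gpus)

print(experts_per_gpu(256, 8))   # NVLink 8 domain: 32 experts per GPU
print(experts_per_gpu(256, 72))  # NVLink 72 domain: 4 experts per GPU
```

Going from an 8-GPU domain to a 72-GPU domain drops the per-GPU load from 32 experts to 4, which is the mechanism behind the speed difference described above.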

So we are largely comparing ourselves against ourselves, and the second-best GPU in the world, across all workloads, is the H200.

Grace Blackwell delivers ten times the performance per GPU compared to the H200. How do you achieve ten times the performance when you’ve only increased transistors by two times? The answer is extreme co-design. By understanding the nature of future AI models and thinking through the entire stack, we can create architectures for the future. This is a big deal. It means we can now respond faster. But then comes the even bigger deal.

This image shows that the lowest-cost tokens in the world are generated by Grace Blackwell NVLink 72, which is the most expensive computer. On the one hand, GB200 is the most expensive computer. On the other hand, its token-generating capability is so powerful that it produces tokens at the lowest cost.

Because the number of tokens per second divided by Grace Blackwell's total cost of ownership is so favorable that it represents the lowest-cost way to generate tokens. By doing this, it delivers amazing performance with a 10x performance improvement and offers 10x lower costs, allowing the virtuous cycle to continue.
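A minimal sketch of the tokens-per-second-over-TCO reasoning, with entirely made-up costs and throughputs; only the relationship itself comes from the talk:

```python
# Illustrative only: cost per token = total cost of ownership divided by
# total tokens produced over the system's lifetime. Numbers are invented.
def cost_per_token(total_cost_of_ownership, tokens_per_second, lifetime_seconds):
    """Dollars per token over the system's operating lifetime."""
    total_tokens = tokens_per_second * lifetime_seconds
    return total_cost_of_ownership / total_tokens

# A pricier system can still yield cheaper tokens if throughput is high enough:
expensive_fast = cost_per_token(3_000_000, 10_000, 1_000_000)  # high TCO, high throughput
cheap_slow     = cost_per_token(  300_000,    500, 1_000_000)  # low TCO, low throughput
print(expensive_fast < cheap_slow)  # True
```

This is why the argument that the most expensive computer produces the cheapest tokens is not a contradiction: what matters is the ratio, not the sticker price.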

In any case, there are two platform transitions happening simultaneously. One platform transition is from general-purpose computing to accelerated computing.

Remember, as I previously mentioned, accelerated computing handles data processing, image processing, computer graphics, and many other kinds of computation. It runs SQL, runs Spark; whatever you need, I'm pretty confident we have an excellent library for you. You might be running a data center that creates photomasks for semiconductor production. We've got a great library for you. So, at the foundational level, irrespective of AI, the world is transitioning from general-purpose computing to accelerated computing, and that has nothing to do with AI. In fact, many CSPs had services well before AI emerged. Remember, they were born in the era of machine learning, with classical machine learning algorithms like XGBoost used for recommendation systems, collaborative filtering, and content filtering, and tools such as data frames; all of these technologies were created during the era of general-purpose computing.

Even those algorithms, even those architectures, now perform better through accelerated computing. So, even without AI, the world’s CSPs will invest in acceleration. And our GPU is the only one capable of handling all these tasks.

An ASIC may be able to handle AI, but it can't do anything else.

NVIDIA can handle all of that.

This explains why relying solely on NVIDIA's architecture is so secure. We have now reached our virtuous cycle, our inflection point. This is quite unusual.

I have many partners here today, and you are all part of our supply chain. I know how hard you work.

I want to thank all of you for working so diligently. Thank you very much.

Now let me show you why. This is what's happening with our company's business. We see astonishing growth in Grace Blackwell, for the reasons I just mentioned. It is driven by two exponential factors. Our visibility is currently very high, and I think we may be the first technology company in history to foresee cumulative Blackwell orders and early Rubin (the next-generation platform) orders totaling $500 billion through 2026. As you know, 2025 hasn't ended yet, and 2026 hasn't begun. This represents pre-booked business valued at $500 billion.

To date, we have shipped 6 million Blackwell GPUs...

In the initial few quarters, I believe it was the first four production quarters or three and a half production quarters. We still have one quarter left in 2025 to complete, followed by four more quarters. Therefore, over the next five quarters, achieving $500 billion would represent a fivefold increase.

This tells you something. This figure represents the entire lifecycle of the previous generation Hopper. It does not include the China and Asian markets.

Thus, Hopper reached 4 million GPUs over its entire lifecycle. Each Blackwell module contains two GPUs, making it a large package. In its early stages, Blackwell already has 20 million GPUs, showing remarkable growth.

I would like to thank all our supply chain partners. I know how hard you have worked. I created a video to celebrate your efforts. Let’s watch it.

(A video about U.S. manufacturing is played)

We are once again manufacturing in the U.S., which is absolutely incredible.

The first thing President Trump spoke about was bringing manufacturing back because it is essential for national security. Bringing manufacturing back because we want those jobs. We want that part of the economy.

Nine months later, we are now in full production of Blackwell in Arizona: the GB200, Grace Blackwell NVLink 72. Extreme co-design has brought us a 10x generational improvement, which is absolutely extraordinary. The truly incredible part is this: the first AI supercomputer we ever built.

This was in 2016, when I delivered it to a startup in San Francisco, which later turned out to be OpenAI. That is the computer.

And in order to build that computer, we designed a new chip so that we could co-design.

Now we have to design all the chips. That's what it takes. You can't take one chip and make a computer 10 times faster. That won't happen. The way to make a computer 10 times faster, the way for us to continue to increase performance exponentially and drive costs down exponentially, is extreme co-design, developing all these different chips at the same time. We now have the next-generation Rubin chip.

This is our third-generation NVLink 72 rack-scale computer. GB200 was the first generation. To all of our partners around the world: I know how hard you work. The first generation was very difficult; the second generation went much more smoothly. And this generation, look at this (live demonstration), it's really effortless for us. These are now in the lab. This is the next-generation Rubin. We are preparing to put it into production probably around this time next year, maybe even a little earlier. So every year, we will introduce the most extreme co-designed systems so that we can continue to improve performance and reduce token-generation costs. Look at this, it's beautiful. This is amazing.

(Next is a live demonstration and introduction, including Vera Rubin compute trays, BlueField, NVLink switches, etc.)

Now, as you've noticed, NVIDIA initially started by designing chips, then we began designing systems, we designed AI supercomputers. Now we're designing entire AI factories.

Every time we expand outward and integrate more problems to solve, we can come up with better solutions. We now build entire AI factories. This AI factory is what we built for Vera Rubin. We created a technology that allows all of our partners to digitally integrate into this factory. Let me show you.

(A related video is played on-site)

Fully digital. Long before Vera Rubin exists as a physical entity, long before these AI factories exist, we optimize and operate them as digital twins. To all of our partners working with us: we're glad you're all supporting us. Together we build the AI factory.

Next, let's talk about models, open-source models.

Over the past few years, several things have happened. One is open-source models: thanks to fairly powerful reasoning capabilities, from players such as Stability AI, open-source models have become useful to developers for the first time, and they are now the lifeblood of startups.

Every industry has its own use cases, and startups in different industries need to embed domain-specific expertise into a model. Open source makes this possible. Researchers need open source, developers need open source. Companies around the world—we need open-source models; it’s very important.

The United States must also lead in open source. We have amazing proprietary models, but we also need amazing open-source models.

Our nation depends on it, our startups depend on it, so NVIDIA is committed to making this happen.

We are now the leading contributor to open source. We have 23 models on the leaderboards. We cover all these different domains, from language models to physical AI models to biology models. Each model has a large team behind it. This is one of the reasons we built our supercomputers: to facilitate the creation of all these models. We have the top-ranked speech model, the top-ranked reasoning model, and the top-ranked physical AI model. The download numbers are significant. We are committed to this because science needs it, researchers need it, and startups need it.

I am thrilled that AI startups are building on NVIDIA. There are several reasons for this. Of course, our systems are robust, and our tools work well. All our tools run across all our GPUs. Our GPUs are ubiquitous, available on every cloud, and you can download our software stack, and it just works. We have the advantage of a rich developer ecosystem that is continuously enriching itself.

So I am genuinely excited to build relationships with all the startups we collaborate with. Thank you all. Similarly, many of these startups are now starting to find more ways to leverage our GPUs, hire talent, and scale up.

Nebius, Lambda, and all these companies are fantastic.

All the CUDA-X libraries I mentioned, and all the open-source AI models I discussed: we integrate them into AWS, we integrate them into Google Cloud... We also integrate our libraries into SaaS offerings around the world, so that every SaaS eventually becomes an agent-based SaaS.

One day, I would love to hire an AI agent as a chip designer to work alongside our own designers, essentially a Cursor for Synopsys, if you will. We are also collaborating with Anirudh Devgan at Cadence.

Earlier today, he was part of the opening show, and Cadence is doing incredible work accelerating their stack to create AI agents, so that we can have Cadence AI chip designers and system designers working alongside us. Today, we are announcing a new collaboration. AI will greatly enhance productivity. AI will transform nearly every industry.

But AI will also greatly exacerbate cybersecurity challenges at scale; there will be malicious AIs. Therefore, we need an extraordinary defender, and I can't imagine a better defender than CrowdStrike.

George Kurtz was here just now; yes, I saw him earlier. We are collaborating with CrowdStrike to bring cybersecurity to the speed of light. We are building a system with a cloud-based cybersecurity AI agent as well as truly excellent AI agents on-premises or at the edge, so that whenever a threat emerges, you can detect it instantly. We need speed; we need fast, agent-based AI, super-intelligent agents.

Then, there is another announcement to make.

Palantir is one of the fastest-growing and most valuable enterprises in the world, with perhaps the most important enterprise stack in today's global market. They take information, data, and human judgment and transform them into business insights. We are partnering with Palantir to accelerate everything they do, so that we can process data on a larger scale and at faster speeds.

Whether it's structured data from the past or unstructured data, we will process it for our government agencies for national security purposes, as well as for enterprises around the world: processing this data at the speed of light and deriving insights from it. This is what the future looks like. Palantir will integrate NVIDIA technology so that we can process data at the speed of light.

Next, let’s talk about physical AI.

Physical AI requires three computers, whereas training a language model requires two: one for training and one for evaluation and inference; that's the large GB200 you see. For physical AI, you need three computers. You need a computer to train it: that is the Grace Blackwell NVLink 72. Then there is the computer that performs all the simulations I showed you earlier using Omniverse DSX. Essentially, it is a place for robots to learn how to be good robots, and it allows factories to be built essentially as digital twins.

(Live demonstration) This computer must excel in generative AI and also be proficient in computer graphics, sensor simulation, ray tracing, and signal processing.

This computer is called the Omniverse computer. Once we have trained the model, simulating that AI in a digital twin—which could be a digital twin of a factory or a large number of robots—you then need to operate that robot. And this is the robotics computer. It goes into autonomous vehicles. Half of it can go into a robot. Okay? Or you can actually have robots that are quite agile and capable of rapid operation. It might require two such computers. All three of these computers run CUDA.

This enables us to advance physical AI, allowing AI to understand the physical world, comprehend physical laws, causality, and persistence. We have incredible partners working with us to create physical AI for factories. We are also using it ourselves to build our factory in Texas. Once we have created a robotic factory with a bunch of robots inside, these robots will also need physical AI, applying physical AI and operating within visualized twins.

Let us examine the re-industrialization of the United States. In Houston, Texas, Foxconn is constructing a state-of-the-art robotics facility for manufacturing NVIDIA's AI infrastructure systems. Amid labor shortages and skill gaps, digitalization, robotics, and physical AI are more crucial than ever. The factory is digitally conceived in Omniverse. Foxconn engineers assemble their virtual factory using Siemens' digital twin solution based on Omniverse technology. Every system—mechanical, electrical, and plumbing—is validated before construction.

Siemens Plant Simulation conducts design space exploration optimization to determine the optimal layout. When bottlenecks arise, engineers use changes managed by Siemens TeamCenter to update the layout.

In Isaac Sim, the same digital twin is used to train and simulate robotic AI. In the assembly area, Fanuc robotic arms build GB300 pallet modules. FII’s dexterous robotic hands and skilled AI install busbars into the pallets. AMRs (Autonomous Mobile Robots) transport pallets to testing chambers. Foxconn uses Omniverse for large-scale sensor simulation, where robotic AI learns to operate as a fleet. In Omniverse, visual AI agents built on NVIDIA Metropolis and Cosmos monitor the fleet of robots and workers from above to oversee operations and respond to anomalies, safety violations, or even human-robot collaboration. This is the future of manufacturing—the future of factories. I would like to thank our partner Foxconn, whose CEO is here with us today.

All these ecosystem partners have made it possible for us to create robotic factories. The amount of software required to accomplish this task is so vast that unless you can complete it within a digital twin—design it on this planet and operate it in a digital twin—the likelihood of success diminishes significantly. I am also thrilled to see Caterpillar, my friend Joe Creed and his century-old company, integrating digital twins into their manufacturing processes.

We are moving toward the future of robotic systems, and one of the most advanced is Figure. Brett Adcock is here with us today; he founded his company just three years ago, and it is now valued at nearly $40 billion. Together, we are collaborating to train the AI, train the robots, simulate the robots, and, of course, develop the computing systems for Figure's robots. It is truly remarkable. I had the privilege to witness it, and it is extraordinary.

My friend Elon Musk is also working on humanoid robots, which will likely become one of the largest new consumer electronics markets and certainly one of the largest industrial equipment markets. Peggy Johnson and colleagues from Agility are collaborating with us on warehouse automation robots. Colleagues from Johnson & Johnson are once again partnering with us to train robots, simulate them in digital twins, and operate them. These surgical robots developed by Johnson & Johnson will perform fully modern non-invasive surgeries with unprecedented precision. And of course, there are the most endearing robots: Disney robots. They hold a special place in our hearts. We are collaborating with Disney Research to develop a revolutionary framework and simulation platform based on groundbreaking technology that allows robots to learn how to be good robots in a physically accurate, physics-based environment. Let’s take a look at it.

(A video related to robots was played live.)

Now, remember everything you just saw… That wasn’t animation or a movie—it was a simulation. That simulation was realized in Omniverse as a digital twin. These digital twins of factories, warehouses, and operating rooms allow robots to learn how to operate, navigate, and interact with the world—all in real time. This will become the largest consumer electronics product line in the world—the future of humanoid robots. Of course, humanoid robots are still under development. Meanwhile, one type of robot has clearly reached an inflection point and is already here: wheeled robots. Robotaxis, essentially, are AI drivers.

One thing we are going to do today is launch NVIDIA Drive Hyperion.

This is a big deal! We have created this architecture so that every automobile company in the world can build vehicles on it: commercial vehicles, passenger vehicles, or vehicles designed specifically as robotaxis.

Essentially, surround cameras, radar, and lidar enable us to achieve the highest level of surround-cocoon sensor perception and redundancy, which is essential for the highest safety standards. Drive Hyperion has now been adopted by companies such as Mercedes-Benz, with many other automakers expected to follow.

(A video related to robot autonomous driving was played on site.)

Well, that's what we've covered today. We discussed a wide range of topics, but at the core were two platform transitions. The first is from general-purpose computing to accelerated computing with NVIDIA CUDA. The set of libraries known as CUDA-X lets us enter almost every industry, and we are at an inflection point: growth is now compounding in the virtuous cycle I described, and the second inflection point is upon us.

The second platform transition is AI, moving from traditional handwritten software to artificial intelligence. Two platform transitions happening simultaneously—this is why we are experiencing such tremendous growth.

We talked about quantum computing, open-source models. On the enterprise side, we are working with CrowdStrike and Palantir to accelerate their platforms. We discussed robotics, which is poised to become one of the largest consumer electronics and industrial manufacturing sectors. And of course, we also mentioned 6G, where NVIDIA provides a new platform called Arc. For robotic vehicles, we offer a new platform called Hyperion.

We even have a new platform for factories, covering two types: AI factories, which we call DSX, and AI-enabled manufacturing facilities, which we call Mega. So now we are also engaged in manufacturing within the United States.

Ladies and gentlemen, thank you for being here today and allowing me to bring GTC to Washington, D.C. We hope to hold it here annually. Thank you all for making America great again. Thank you!

(The keynote speech concluded amidst the audience's applause and video footage of robot dancing.)

Editor/Rocky

The translation is provided by third-party software.

