
Supply chain upgrades accelerate across the board: Nvidia announces "one iteration per year"

wallstreetcn ·  Jun 3 21:09

Source: Hard AI
Author: Zhang Yifan

HBM4 memory and 3.2T optical modules may become mainstream in 2026.

At Computex 2024, Jensen Huang, holding up a Blackwell chip, once again demonstrated the full-stack capabilities of NVIDIA (NVDA.US).

Unlike a pure GPU seller, a full-stack vendor has to consider not only GPUs but also software platforms, networking, cooling, and companion CPUs.

Nvidia CEO Jensen Huang addressed each of these in turn at the event:

• Annual chip iteration: Blackwell Ultra GPU in 2025, Rubin GPU in 2026, and Rubin Ultra GPU in 2027;

• Next-generation architecture: Rubin, the successor to Blackwell, will launch in 2026;

• Spectrum-X "annual update": in 2026, the Spectrum-X1600 is expected to connect millions of GPUs;

• Cooling is not limited to liquid cooling: the Blackwell generation ships as both air-cooled DGX and liquid-cooled MGX servers;

• Software platform: software is not only a moat for Nvidia but will also become a huge business in its own right;

1. Processors

First, let's look at Nvidia's processors, namely its GPUs and CPUs.

At the conference, Jensen Huang said, “Next, the pace of updates will be on a one-year cycle to push all products to the limit of technology.”

He also unveiled the next three generations of the product roadmap (see figure below):

• Blackwell Ultra GPU (8S HBM3e 12H), launching in 2025;

• Rubin GPU (8S HBM4), the next-generation Arm-based Vera CPU, and the NVLink 6 Switch (3600 GB/s), launching in 2026;

• Rubin Ultra GPU (12S HBM4), launching in 2027;

Detailed performance specifications for the Rubin GPU and Vera CPU have not been disclosed. Nvidia did, however, make clear its goal of raising efficiency and cutting costs in model training:

• Over the past 8 years, the energy needed to train the 1.8-trillion-parameter GPT-4 has fallen to 1/350 of its earlier level, and the energy needed for inference to 1/45,000;

• Over the past 8 years, computing power has increased 1,000-fold;
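
As a rough sanity check on the 8-year figures above, the sketch below converts each quoted total into the constant yearly multiplier that would compound to it. The steady-compounding assumption and the helper name `annualized_factor` are illustrative only and do not come from Nvidia's presentation; in practice the gains arrived in uneven generational steps.

```python
def annualized_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`.

    Assumes smooth compounding, which is a simplification.
    """
    return total_factor ** (1.0 / years)


# Totals quoted in the article, all over an 8-year span.
YEARS = 8
claims = {
    "compute": 1000,            # 1,000x more computing power
    "training energy": 350,     # training energy reduced to 1/350
    "inference energy": 45000,  # inference energy reduced to 1/45,000
}

yearly = {name: annualized_factor(total, YEARS) for name, total in claims.items()}

for name, factor in yearly.items():
    print(f"{name}: about {factor:.2f}x improvement per year")
```

Under this assumption, the 1,000-fold compute claim works out to a little under 2.4x per year.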

2. Processor architecture

Jensen Huang revealed that the successor to Blackwell will be the Rubin architecture, which will debut in 2026. Its headline feature is HBM4 memory.

According to a report by Wccftech, Nvidia's Rubin GPU will use TSMC's CoWoS-L advanced packaging and the N3 process.

Additionally, Nvidia will equip the Rubin GPU to be launched in 2026 with next-generation HBM4 memory. Currently, Nvidia uses the fastest HBM3E memory in its B100 GPU.

This means that by the end of 2025, HBM4 memory will probably be produced on a large scale.

Nvidia will also launch a new-generation Arm-based CPU, the Vera CPU, which pairs with Rubin GPUs to form the new Vera Rubin superchip platform. The platform will support the new CX9 SuperNIC and NVLink 6, with connectivity speeds of up to 1600 Gb/s and 3600 GB/s, respectively.

3. Communication network -- Ethernet

At this conference, Nvidia mentioned for the first time an Ethernet solution for interconnecting millions of GPUs, which is expected to launch in 2026. By then, 3.2T optical modules may become mainstream.

“The era of multi-million GPU data centers is coming!” At the conference, Jensen Huang laid out the Spectrum Ethernet product roadmap for the next three years and announced that a new Spectrum-X product will launch every year.

• In 2024, the Spectrum-X800, designed for tens of thousands of GPUs;

• In 2025, the X800 Ultra, designed for hundreds of thousands of GPUs;

• In 2026, the X1600, which can scale to millions of GPUs;

Previously, Arista and Nvidia had each announced products that connect only up to 100,000 GPUs:

• Nvidia: Spectrum-X has entered mass production with a number of customers, including a large cluster of 100,000 GPUs;

• Arista: Predicts the company will be able to connect 100,000 GPUs by 2025;

According to the presentation (see figure below), switch speeds in 2026 will be double those of 2024, which implies that optical modules may enter the 3.2T era in 2026 (versus 1.6T today).

4. Air-cooled DGX and liquid-cooled MGX

After Blackwell was launched, the market rumor was that its servers would rely on liquid cooling.

At this conference, Nvidia said it will offer server products with both cooling modes at the same time: air-cooled DGX and liquid-cooled MGX.

In addition, compared with the earlier GTC conference, Jensen Huang disclosed more detailed figures for the Blackwell generation:

• The AI computing power of DGX is 45 times that of the previous generation, reaching 1,440 PFLOPS, while energy consumption is only 10 times that of the previous generation;

• The new-generation DGX can be equipped with 72 GPUs, and its NVLink spine uses 5,000 cables, which saves 20 kW of power per rack;
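
The DGX figures above imply an efficiency gain that the article does not state outright. The sketch below works it out, assuming the 45x compute and 10x energy ratios refer to the same pair of systems under comparable load; the function and variable names are illustrative, not Nvidia's.

```python
def efficiency_gain(compute_ratio: float, energy_ratio: float) -> float:
    """Compute delivered per unit of energy, relative to the previous generation."""
    return compute_ratio / energy_ratio


# Figures quoted in the article for the new-generation DGX.
COMPUTE_RATIO = 45   # AI compute vs. previous generation
ENERGY_RATIO = 10    # energy consumption vs. previous generation
NEW_PFLOPS = 1440    # stated AI compute of the new system

gain = efficiency_gain(COMPUTE_RATIO, ENERGY_RATIO)

# Previous-generation compute implied by the 45x ratio (derived, not quoted).
implied_previous_pflops = NEW_PFLOPS / COMPUTE_RATIO

print(f"compute per unit energy: {gain}x the previous generation")
print(f"implied previous-generation compute: {implied_previous_pflops} PFLOPS")
```

On these numbers, compute per unit of energy improves 4.5-fold, and the 45x ratio implies a previous-generation baseline of 32 PFLOPS.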

5. Software development platform

Software is not only a moat for Nvidia; it will also become a huge business in its own right.

These software businesses include CUDA, NIM, Omniverse, etc. (see figure below).

At the conference, Nvidia once again emphasized the importance of NIM and Omniverse:

1) NIM: NVIDIA NIM inference microservices can cut the time enterprises need to deploy generative AI applications from days to minutes;

2) Omniverse: Omniverse is a virtual-world simulation and development platform that minimizes the gap between simulation and reality. Developers can test, train, and integrate everything inside Omniverse. As the video shown at the event put it, robots can learn how to be robots in a virtual world;

Looking ahead, Nvidia is actively building its position in robotics and is developing Earth, an application based on AI technology. Through continued innovation and exploration, Nvidia is expected to play a greater role in advancing global technology and improving people's lives.

Editor/jayden

The translation is provided by third-party software.

