
Nvidia Unveils an AI "Nuclear Bomb": GPT Processing Up to 30x Faster, and the Three Major Cloud Vendors Are Lining Up to Buy

Geekpark News ·  Mar 22, 2023 07:39

Source: Geek Park
Author: Zheng Xuan

At 11:00 p.m. on March 21, Nvidia CEO Jensen Huang's keynote kicked off GTC 2023.

After ChatGPT and GPT-4 set off the current generative AI craze, Nvidia, which supplies the hardware heart of AI, became the biggest winner behind it, making this year's GTC the most watched edition ever.

Jensen Huang did not disappoint his followers.

"The iPhone moment for AI has arrived." Over the 70-minute keynote, Huang repeated the line four or five times.

Each time, before saying it, he shared a new development in generative AI: a revolution in creative work, medicine, industry, and more; cloud services that let ordinary people train big models from a browser; and a superchip that cuts the processing cost of big models by a factor of 10...

"AI will evolve beyond anyone's imagination." That sentence is the best summary of the keynote.

01 Cutting the processing cost of large language models by an order of magnitude

In 2012, Alex Krizhevsky, Ilya Sutskever, and their mentor Geoffrey Hinton trained AlexNet on 14 million images using two GeForce GTX 580s. This is considered the start of the current AI revolution, because it demonstrated for the first time that GPUs could be used to train artificial intelligence.

Four years later, Jensen Huang personally delivered the first NVIDIA DGX supercomputer to OpenAI. Over the following years, OpenAI's breakthroughs in large language models brought generative AI into public view, and it went fully mainstream after ChatGPT launched at the end of last year. Within a few months, the conversational AI product attracted over 100 million users, making it the fastest-growing app in history.

Originally built as an AI research machine, NVIDIA DGX is now widely used by enterprises to refine data and run AI workloads. According to Jensen Huang, half of the Fortune 100 have DGX installed.

Among these workloads, deploying LLMs like ChatGPT is becoming an increasingly important part of DGX's job. In response, Jensen Huang announced a new GPU: the H100 NVL, a dual-GPU configuration bridged by NVLink.

Based on Nvidia's Hopper architecture, the H100 uses a Transformer Engine designed specifically to handle models like GPT. Compared to the HGX A100 used to process GPT-3, a standard server with four pairs of H100s connected by NVLink is 10 times faster. And according to Nvidia, the H100's combined innovations can speed up large language model processing by as much as 30 times.

"The H100 can reduce the processing cost of large language models by an order of magnitude," Jensen Huang said.

Meanwhile, cloud computing has grown 20% per year over the past decade into a $1 trillion industry. For AI and cloud computing, Nvidia designed the Grace CPU. In the new architecture, the GPU handles the AI workload while the Grace CPU handles data preparation, with the two connected by a 900 GB/s high-speed interconnect.
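The division of labor described here, where a CPU stages and feeds data while an accelerator runs the model, can be sketched with a simple producer-consumer pipeline. This is a conceptual illustration only; all names are made up for the sketch and none of this is an Nvidia API.

```python
import threading
import queue

def cpu_prepare(batches, q):
    """CPU side: stage and preprocess batches, feeding the accelerator."""
    for batch in batches:
        q.put([x * 2 for x in batch])  # stand-in for real preprocessing
    q.put(None)  # sentinel: no more work

def gpu_consume(q, results):
    """Accelerator side: run the AI workload on each staged batch."""
    while True:
        batch = q.get()
        if batch is None:
            break
        results.append(sum(batch))  # stand-in for a model forward pass

def run_pipeline(batches):
    q = queue.Queue(maxsize=2)  # bounded queue models the CPU-GPU link
    results = []
    producer = threading.Thread(target=cpu_prepare, args=(batches, q))
    consumer = threading.Thread(target=gpu_consume, args=(q, results))
    producer.start(); consumer.start()
    producer.join(); consumer.join()
    return results

print(run_pipeline([[1, 2], [3, 4]]))  # [6, 14]
```

The point of the pattern is overlap: while the consumer works on one batch, the producer is already preparing the next, so neither side sits idle, which is the rationale behind pairing a data-moving CPU with a compute-heavy GPU over a fast link.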

"Grace-Hopper is the best choice for processing large data sets," Jensen Huang said. "Our customers want to build big AI models with training data several orders of magnitude larger. Grace-Hopper is the ideal engine."

In a sense, computational cost has become the core issue holding back generative AI today. OpenAI has burned through billions or even tens of billions of dollars, and Microsoft, for cost reasons, has held back from opening the new Bing to a wider audience, even capping the number of conversations users can have each day.

By introducing a more efficient computing solution at this moment, Nvidia has addressed a major pain point for the industry.

02 DGX Cloud: Enabling Any Company to Build AI Capabilities

Another generative AI focus at this year's GTC was DGX Cloud.

In fact, this isn't the first time Nvidia has announced DGX Cloud. Earlier, when Nvidia released its quarterly earnings, Jensen Huang revealed that Nvidia would partner with cloud service providers so that customers can access DGX systems through NVIDIA DGX Cloud in a web browser to train and deploy large language models or run other AI workloads.

Nvidia has already partnered with Oracle; Microsoft Azure is expected to start hosting DGX Cloud next quarter, and Google Cloud will soon join their ranks, offering DGX Cloud services to companies that want to build new products and pursue AI strategies in a managed way.

Jensen Huang said the partnerships bring Nvidia's ecosystem to cloud service providers while expanding Nvidia's market size and reach. Businesses will be able to rent DGX Cloud clusters on a monthly basis, letting them quickly and easily scale large multi-node AI training.

03 ChatGPT is just the beginning

"Accelerated computing is the warp engine, and AI is its energy source," Jensen Huang said. "The rapidly advancing capabilities of generative AI have given companies a sense of urgency to reimagine their products and business models."

Large language models represented by ChatGPT and GPT-4 have swept the world over the past few months, but for Nvidia, ChatGPT and big models aren't the whole of AI. At the conference, Jensen Huang also shared more of Nvidia's explorations and his own observations in the field.

First, generative AI, the hottest area of all.

All it takes is a hand-drawn sketch to generate a 3D modeled house.

Writing code is no problem either.

It can even compose music.

To accelerate the work of those looking to harness generative AI, Nvidia announced NVIDIA AI Foundations, a cloud service and model foundry for users who need to build, refine, and customize LLMs and generative AI. These customers use their own proprietary data to train AI for specific domains.

AI Foundations' services include NVIDIA NeMo, for building text-to-text generative models; Picasso, a visual language model service for users who want to train models on licensed content; and BioNeMo, to help biomedical researchers.

AI is also proving its value as a productivity tool. In his keynote, Jensen Huang shared a few striking examples.

The first involved US telecom giant AT&T. AT&T regularly dispatches 30,000 technicians to serve 13 million customers across 700 regions. At that scale, scheduling is a real pain point: running on CPUs, the schedule optimization takes an entire night to complete.

With Nvidia's cuOpt, AT&T can run the scheduling optimization 100 times faster and update its dispatch plan in real time.
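The underlying problem is a routing and assignment problem: match each service job to a nearby free technician. A minimal greedy sketch of that idea follows; it is purely illustrative of the problem class, not cuOpt's actual API, and the function and data names are made up for the example.

```python
import math

def assign_jobs(technicians, jobs):
    """Greedily assign each job to the nearest still-free technician.

    technicians: dict of name -> (x, y) position
    jobs: list of (x, y) job locations
    Returns a dict of name -> job index.
    """
    free = dict(technicians)
    assignment = {}
    for i, job in enumerate(jobs):
        if not free:
            break  # more jobs than technicians
        # pick the free technician closest to this job
        name = min(free, key=lambda n: math.dist(free[n], job))
        assignment[name] = i
        del free[name]
    return assignment

techs = {"alice": (0, 0), "bob": (10, 10)}
jobs = [(9, 9), (1, 1)]
print(assign_jobs(techs, jobs))  # {'bob': 0, 'alice': 1}
```

A greedy heuristic like this is fast but far from optimal; real dispatch optimization at AT&T's scale also has to respect time windows, skills, and travel routes, which is exactly why the full problem is so expensive on CPUs and benefits from GPU acceleration.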

In a sense, with Nvidia's help, AT&T accomplished what real-time-matching internet companies such as Meituan and Didi took years of accumulated effort to build.

Another example is a partnership with the chip industry. Amid the China-US technology war, most people have heard of the lithography machine, a key piece of equipment in semiconductor manufacturing. What few people know is that as process technology advances, the computing power demanded by chip design has also become a major pain point for the industry.

Today, computational lithography is the biggest computing workload in chip design and manufacturing. It consumes tens of billions of CPU hours every year, and as algorithms become more complex, the cost of computational lithography is increasing.

In response, Nvidia announced cuLitho, a computational lithography library, and is cooperating with giants such as ASML and TSMC to drastically cut the computing power and energy consumed in the chip design process.

In Jensen Huang's view, reducing energy consumption and improving computational efficiency is another major value AI will bring to society. At a time when Moore's Law has run its course, accelerated computing and AI have arrived at an opportune moment.

"All industries are facing the challenges of sustainability, generative AI, and digitalization," Jensen Huang said. "Industrial companies are racing to digitize and reinvent themselves as software-driven technology companies, to be the disruptor, not the disrupted. Accelerated computing lets them meet these challenges. Accelerated computing is the best way to reduce power consumption, achieve sustainability, and reach carbon neutrality."

Editor/jayden
