
Single-node computing power soars to 5 PFLOPS! NVIDIA's latest AI supercomputer opens for testing

智东西 ·  Oct 20, 2020 11:08

Original title: Single-node computing power soars to 5 PFLOPS! NVIDIA's latest AI supercomputer opens for testing. Source: Zhidongxi (智东西)


By Wen Shu

Ping An Technology, a subsidiary of veteran financial player Ping An Insurance Group, built the model behind "Safe and Easy Translation", an AI translation product for corporate office scenarios, in just one month.

Compared with the traditional solution, "Safe and Easy Translation" processes data 7 times faster, and the average time to translate 1,000 English characters fell from 8.3 s to 0.97 s.
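As a quick check on the latency figures quoted above, the following Python sketch computes the implied speedup and throughput. The variable names are illustrative, and the script only restates the article's numbers; it is not a benchmark and says nothing about how the separate "7 times" data processing figure was measured.

```python
# Figures quoted in the article for translating 1,000 English characters.
CHARS = 1000
BEFORE_S = 8.3   # average time with the traditional solution, in seconds
AFTER_S = 0.97   # average time on DGX-1, in seconds

# Latency speedup: how many times shorter the average translation time is.
latency_speedup = BEFORE_S / AFTER_S

# Throughput in characters per second, before and after.
throughput_before = CHARS / BEFORE_S
throughput_after = CHARS / AFTER_S

print(f"latency speedup:   {latency_speedup:.2f}x")
print(f"throughput before: {throughput_before:.0f} chars/s")
print(f"throughput after:  {throughput_after:.0f} chars/s")
```

On these numbers the translation latency improves by roughly 8.6 times.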


Behind the efficiency gains of "Safe and Easy Translation" is the AI performance of DGX-1, the first-generation AI supercomputer in the NVIDIA DGX family. Recently, this star product line gained a new member.

At the GTC 2020 conference held in May this year, NVIDIA, the global leader in AI computing, released NVIDIA DGX A100, which it describes as the world's most advanced AI system. NVIDIA DGX A100 is now about to open for product test applications.

AI computing has the wind in its sails, and intelligent transformation is imminent across thousands of industries. IT and other industries with consumer-Internet genes stand at the crest of the wave. By contrast, industries that lack such capabilities, such as health care and traditional manufacturing, undoubtedly face greater challenges in transforming.

Through its DGX family of products, NVIDIA is offering an AI solution that helps enterprises reduce costs and increase efficiency.

With NVIDIA DGX A100 about to open for product test applications, Zhidongxi looks at which industries the NVIDIA DGX family has changed and unpacks the technology behind NVIDIA DGX A100.

I. NVIDIA DGX family products reduce costs and increase efficiency for AI computing

Viral genome sequencing, infection analysis and prediction: in every subfield of biomedicine, handling complex medical data and simulating complex infection processes is a major challenge. In 2020, with drug research and development valued as never before, NVIDIA's DGX SuperPOD is supporting drug research against the novel coronavirus.

Biomedical companies GlaxoSmithKline PLC and AstraZeneca PLC will be the first to use Cambridge-1, a supercomputer ranked 29th in the world, to tackle medical problems, including the novel coronavirus. Cambridge-1 is built on NVIDIA's latest DGX SuperPOD solution.

NVIDIA DGX SuperPOD is the world's first turnkey AI infrastructure, built from 20 to 140 NVIDIA DGX A100 systems.

With the modular DGX SuperPOD architecture, a system that would take years to deploy as a traditional supercomputer can be installed and running in just a few weeks.

▲ Cambridge-1 supercomputer

In fact, this is not the first time that NVIDIA DGX family products have brought efficiency improvement to the industry by virtue of their eye-catching performance.

For example, the first DGX A100 systems in this product test have been delivered to Argonne National Laboratory in the United States to accelerate novel coronavirus drug research. The previous generations of AI supercomputers, DGX-1 and DGX-2, have also been deployed in many industries.

To tackle fabric defect detection in the traditional textile industry, which has a weak digital foundation, Zhongyuan University of Technology (formerly Zhengzhou Textile Institute) trains its AI models on the NVIDIA DGX-1 supercomputer. DGX-1 integrates 8 NVIDIA V100 GPUs, and a single V100 can on average train on more than 30 times as many images per second as a dual-socket CPU server.

The NVIDIA DGX family also plays a part in helping research institutes put advanced scientific results into practice. For these institutes, researchers' lack of IT skills is a major reason why advanced results are hard to apply.

In response, the computing department of the Network Information Center at Shanghai Jiao Tong University built an AI computing platform on NVIDIA DGX-2 supercomputers, with a peak computing power of 16 PFLOPS.

To date, the platform has optimized AI computing and HPC applications for research teams from the university's Institute of Artificial Intelligence, Bio-X Institute, UM-SJTU Joint Institute and other departments, increasing research efficiency by up to 18,000 times.

▲ NVIDIA DGX A100 system

On the basis of efficiency improvement, NVIDIA DGX family products also have advantages in reducing costs.

A typical AI data center has 50 DGX-1 systems for AI training and 600 CPU systems for AI inference, occupying 25 racks, drawing 630 kW of power, and costing more than $11 million. The time, capital and talent costs of model training and inference come on top of that.

By contrast, a single rack of five DGX A100 systems can do the same work with only 28 kW of power and at a cost as low as $1 million.
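The comparison above can be reduced to a few ratios. The Python sketch below is a minimal illustration using only the figures quoted in the article; the `Deployment` class and `ratios` helper are hypothetical names introduced here, and because the costs are stated as "more than $11 million" and "as low as $1 million", the cost ratio should be read as approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """A data center configuration as described in the article."""
    name: str
    racks: int
    power_kw: float
    cost_musd: float  # cost in millions of US dollars (approximate)


# Figures as quoted in the article.
legacy = Deployment("50x DGX-1 + 600 CPU systems", racks=25, power_kw=630.0, cost_musd=11.0)
dgx_a100 = Deployment("5x DGX A100", racks=1, power_kw=28.0, cost_musd=1.0)


def ratios(old: Deployment, new: Deployment) -> dict:
    """Return how many times larger `old` is than `new` on each axis."""
    return {
        "racks": old.racks / new.racks,
        "power": old.power_kw / new.power_kw,
        "cost": old.cost_musd / new.cost_musd,
    }


result = ratios(legacy, dgx_a100)
for axis, factor in result.items():
    print(f"{axis}: {factor:.1f}x less with {dgx_a100.name}")
```

On the quoted numbers, the DGX A100 rack uses 25 times fewer racks, 22.5 times less power, and costs roughly one eleventh as much.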

Unlike technology companies that built up software, hardware and digital experience during the consumer-Internet era, more traditional organizations with little IT experience, such as biomedical firms, light and heavy industry, and even research institutes, face challenges in the wave of AI technology ranging from basic software and hardware setup to the allocation of digital talent.

To address this dilemma of intelligent transformation, NVIDIA launched DGX-1, an AI supercomputer for building deep learning platforms, in 2017, and DGX-2, a supercomputer aimed at the speed and scale challenges of AI, in 2018.

NVIDIA DGX A100 is already the third generation AI supercomputer product of the NVIDIA DGX family, and it is also the "protagonist" of this NVIDIA product test application program.

II. DGX A100: for all AI workloads, with computing power comparable to a data center

Three years after the launch of DGX-1, at the GTC 2020 conference held on May 14 this year, NVIDIA launched DGX A100, a universal AI infrastructure system for all AI workloads. It offers customers 5 petaFLOPS of AI performance and rapid, "out of the box" deployment.

1. Computing power comparable to a data center

The NVIDIA DGX A100 system, which integrates training, inference and data analytics into one platform, is the first server in the world to deliver 5 petaFLOPS of AI computing power in a single node.

A rack of five NVIDIA DGX A100 systems has the computing power of an AI data center consisting of 50 DGX-1 systems and 600 CPU systems.

▲ Comparison of the NVIDIA DGX A100 system with a traditional AI data center

2. Out-of-the-box rapid deployment

NVIDIA DGX systems not only meet enterprises' data processing and AI deployment needs, but also lower the barrier to entry, aiming to provide a convenient "out of the box" experience.

Take Selene, a system built on NVIDIA DGX SuperPOD and used by Argonne National Laboratory, as an example. Selene, the seventh-fastest computer in the world, can be used to study ways to contain the novel coronavirus and is driving the development of AI in automotive, health care and natural language processing.

Remarkably, building on the open architecture that NVIDIA shares with customers, a small team assembled Selene, a large AI system, in less than a month.

By contrast, it would take dozens of engineers months to build such a large AI system based on other solutions.

III. A powerful enterprise-grade AI solution: comprehensive hardware-to-software technical support, with test applications opening soon

The powerful AI solutions delivered by NVIDIA DGX A100 and NVIDIA DGX SuperPOD rest on the software and hardware that NVIDIA builds into the DGX A100 product.

1. Integrates 8 NVIDIA A100 GPUs

Each NVIDIA DGX A100 system integrates 8 NVIDIA A100 GPUs to create a universal AI solution for training, inference and data analytics.

The NVIDIA A100 GPU is built on Ampere, NVIDIA's eighth-generation GPU architecture, and a 7nm process, and contains more than 54 billion transistors. Its peak computing power reaches 312 TFLOPS for AI training and 1,248 TOPS for AI inference.

Compared with the previous-generation Volta GPU, the A100's peak computing power for AI training and AI inference is up to 20 times higher.

▲NVIDIA A100 GPU

2. Provides 320 GB of memory

Each NVIDIA DGX A100 system connects its eight A100 GPUs over 600 GB/s NVSwitch links, providing 320 GB of total GPU memory and a bandwidth of 12.4 TB per second.
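The memory figures above are system totals across eight GPUs. As a rough sketch, the per-GPU share can be derived by simple division; this assumes memory and bandwidth are split evenly across the GPUs and that the 12.4 TB/s figure is the aggregate across all eight, neither of which the article states explicitly. The constant names are illustrative.

```python
# System-level figures quoted in the article for one DGX A100.
NUM_GPUS = 8
TOTAL_MEMORY_GB = 320
TOTAL_BANDWIDTH_TBPS = 12.4  # assumed here to be the aggregate across all GPUs

# Assumption: resources are divided evenly across the eight GPUs.
per_gpu_memory_gb = TOTAL_MEMORY_GB / NUM_GPUS
per_gpu_bandwidth_tbps = TOTAL_BANDWIDTH_TBPS / NUM_GPUS

print(f"memory per GPU:    {per_gpu_memory_gb:.0f} GB")
print(f"bandwidth per GPU: {per_gpu_bandwidth_tbps:.2f} TB/s")
```

Under these assumptions each GPU contributes 40 GB of memory and about 1.55 TB/s of bandwidth.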

3. Backed by the NVIDIA DGX software stack

In addition to the hardware provided by its 8 NVIDIA A100 GPUs, the NVIDIA DGX A100 system is backed by the NVIDIA DGX software stack.

NGC (NVIDIA GPU Cloud) is a hub of GPU-optimized software for deep learning, machine learning and high-performance computing that accelerates AI model workflows from development to deployment.

Users can run software from the NGC catalog on premises, in the cloud and at the edge, as well as in hybrid and multi-cloud deployments. NGC catalog software can be deployed on bare-metal servers, Kubernetes or virtualized environments to maximize GPU utilization along with application portability and scalability.

4. Superior data center scalability through Mellanox

On April 27, NVIDIA completed its acquisition of Mellanox, an Israeli server hardware company. By integrating Mellanox's high-performance networking technology, NVIDIA will have end-to-end technology from AI computing to networking, as well as full-stack products from processors to software.

This means that the NVIDIA DGX A100 system will further improve network performance and scalability compared to the previous two generations.

NVIDIA DGX A100 is now about to open for product test applications.

Interested users can sign up through the registration link to apply for a product test. They can test DGX A100 remotely or on site through NVIDIA-certified partners, or ask how to build an AI data center made up of DGX A100 systems in a short period of time.

NVIDIA staff will contact applicants within one week of receiving a test application.

Application page: https://jinshuju.net/f/u69TOj

Conclusion: a tailor-made solution for intelligent transformation of enterprises

The wave of industrial intelligence is surging, and thousands of industries are caught up in it. For enterprises with a weak AI computing foundation, finding a way through this transformation is especially important.

On one side are the efficiency gains that come from reshaping every link in the business; on the other are prohibitive costs, a shortage of AI talent and other problems. How to bridge the gap between the two has become a "must-answer question" in this industrial shift.

From this point of view, NVIDIA's DGX family of products offers one set of answers to this "must-answer question".

NVIDIA DGX A100 provides strong computing power, out-of-the-box deployment and strong technical support. In the upcoming product test, this AI computing "killer" may strike more sparks with enterprises eager to become intelligent.
