What other surprises can artificial intelligence chips bring to the market in 2024?

Semiconductor Industry Watch · Jan 1 14:36

Source: Semiconductor Industry Watch

In 2023, as the artificial intelligence market represented by large language models continued to boom, artificial intelligence became the biggest driving force in the semiconductor industry, and Nvidia's sales performance and market capitalization reached astonishing new highs. With the new year upon us, we look ahead to what artificial intelligence chips may bring in 2024.

Market demand: Artificial intelligence will continue to be popular

From the demand side, we believe artificial intelligence will remain hot in 2024, continuing to drive strong momentum in the related chip industry. Unlike 2023, however, we believe that in 2024 demand in the artificial intelligence market will gradually expand from the cloud to terminal devices as well, which will drive development of the corresponding chip markets.

First, judging from cloud demand, large language models will remain the main growth point, and image generation models will continue to grow rapidly alongside them. Large language models are still the core technology under development at major technology companies: Chinese and foreign firms such as OpenAI, Microsoft, Google, Huawei, Alibaba, and Baidu are vigorously developing the next generation of large language models, companies from traditional industries such as China Mobile are entering the field, and a large number of startups, backed by venture capital, are doing the same. The era of large language models has only just begun and is far from over, and demand for chips is expected to keep growing rapidly from this point. Large language models are characterized by their need for massive amounts of data and training-chip resources, and because the competitive landscape has not yet settled, a large number of companies are training new models, so overall demand for training chips will be very large.
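
The scale of that training demand is easier to appreciate with a back-of-envelope estimate. The sketch below uses the common approximation from the scaling-law literature that training cost is about 6 FLOPs per parameter per token; the model size, token count, and sustained throughput are all illustrative assumptions, not figures from this article.

```python
# Back-of-envelope estimate of LLM training compute, using the common
# approximation FLOPs ~= 6 * N_params * N_tokens (a rule of thumb from
# the scaling-law literature, not a figure from this article).

params = 70e9        # hypothetical 70B-parameter model
tokens = 2e12        # hypothetical 2T training tokens
total_flops = 6 * params * tokens

# Assume an H100-class accelerator sustains ~400 TFLOP/s effective
# (peak BF16 is far higher; real utilization is well below peak).
sustained_flops_per_gpu = 400e12
gpu_seconds = total_flops / sustained_flops_per_gpu
gpu_days = gpu_seconds / 86400

print(f"total training compute: {total_flops:.2e} FLOPs")
print(f"~{gpu_days:,.0f} GPU-days, i.e. ~{gpu_days / 30:,.0f} GPUs busy for a month")
```

Even with these rough assumptions, a single such training run ties up hundreds of accelerators for a month, which is why a crowded field of model builders translates directly into very large training-chip demand.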

As artificial intelligence interaction in the cloud enters the multi-modal era, chatbots can not only answer in text but also read images, speak, and generate images and even videos. We therefore believe that image generation models, as well as multi-modal models combining images and language, will become another important growth point for cloud artificial intelligence.

In addition to the cloud, we believe terminals (including mobile phones and smart cars) will become a new growth point for artificial intelligence. Artificial intelligence on mobile phones is nothing new, but as generative models mature we can expect them to land on phones and enable new user experiences. Generative models on phones fall into two types. The first is image generation, represented by diffusion models; these can deliver high-quality super-resolution and retouching, and are expected to transform how users shoot and edit photos. The second is language models: in contrast to the large language models (LLMs) running in the cloud, the past few months have seen the rise of small language models (SLMs). Like their larger counterparts, SLMs are used mainly for language understanding and generation (or conversation with people); with fewer parameters, they can be deployed more flexibly in specific scenarios (rather than trying to cover every scenario as an LLM does) while providing high accuracy, and they are well suited to running on terminal devices.
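
A rough memory calculation shows why smaller, quantized models are the ones that fit on phones. The model sizes, precisions, and device RAM implied below are illustrative assumptions, not figures from the article.

```python
# Rough memory-footprint comparison: a cloud-scale LLM versus an
# on-device SLM. All sizes here are illustrative assumptions.

def weight_bytes(params: float, bits: int) -> float:
    """Bytes needed just to hold the weights at a given precision."""
    return params * bits / 8

GiB = 1024 ** 3
llm = weight_bytes(70e9, 16)   # 70B parameters at FP16
slm = weight_bytes(3e9, 4)     # 3B parameters quantized to int4

print(f"70B FP16 weights: {llm / GiB:6.1f} GiB  (data-center territory)")
print(f" 3B int4 weights: {slm / GiB:6.1f} GiB  (fits in a flagship phone's RAM)")
```

Weights alone put a 70B FP16 model at well over 100 GiB, while a 3B int4 model needs under 2 GiB, which is the gap that makes terminal-side language models practical.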

In the field of smart cars, on the one hand, the revolutionary performance gains from large end-to-end multi-task models (such as the improvement in bird's-eye-view perception brought by BEVFormer, and the significant multi-task performance gains demonstrated by SenseTime in 2023) will drive further deployment of such models and, with it, chip demand. On the other hand, artificial intelligence applications that originated in the cloud, such as human-computer interaction based on language models, are migrating to smart-car scenarios.

Therefore, we predict that 2024 will be another hot year for artificial intelligence. Unlike 2023, beyond continued strength in the cloud, we expect terminal application scenarios to become a new growth point for artificial intelligence demand.

Cloud market pattern analysis

In the cloud-based artificial intelligence chip market, we expect Nvidia to maintain its leading position, but competitors such as AMD are also expected to gain more market share.

First, as mentioned earlier, the main demand in the cloud market currently lies in training and inference for large language models and generative image models. Because these models require enormous computational resources, and training accounts for a large share of the workload, the bar for the corresponding chips is high. The thresholds include:

- Chip computing power: In order to support huge computational volumes, chips need sufficient computing units, memory capacity, and bandwidth

- Distributed computing support: distributed computing is a must for large models (see the sketch after this list)

- Software compatibility and ecosystem: training requires rapid, repeated model iteration, so the ecosystem must be strong enough to support fast iteration across different model operators
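
As a concrete illustration of the distributed-computing requirement above, here is a minimal data-parallel training sketch using PyTorch's DistributedDataParallel. It is a generic example rather than anything tied to a specific vendor mentioned here, and the tiny Linear layer stands in for a real model.

```python
# Minimal data-parallel training sketch with PyTorch DDP.
# Launch with: torchrun --nproc_per_node=8 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")              # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda()   # stand-in for a real model
    model = DDP(model, device_ids=[local_rank])  # syncs gradients across ranks
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(32, 4096, device="cuda")
        loss = model(x).pow(2).mean()            # dummy objective
        opt.zero_grad()
        loss.backward()                          # gradient all-reduce happens here
        opt.step()
        if dist.get_rank() == 0:
            print(f"step {step}: loss {loss.item():.4f}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Even this toy example depends on fast inter-chip communication (the all-reduce in the backward pass) and on framework support, which is exactly why interconnect performance and software ecosystem are listed as thresholds alongside raw computing power.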

Currently, Nvidia still leads this field: it is the first choice in chip and distributed-computing performance as well as software-ecosystem compatibility. This is why Nvidia's H100 became the most sought-after resource among artificial intelligence companies in 2023 and remains in short supply. Nvidia will begin shipping the H200 in the second quarter of 2024; compared with the H100, it offers roughly 40% more memory bandwidth and 80% more memory capacity, so artificial intelligence companies are expected to compete to buy it.
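
Those percentage gains can be cross-checked against the publicly listed specifications of the two parts (H100 SXM: 80 GB at about 3.35 TB/s; H200: 141 GB at about 4.8 TB/s). The figures in the sketch below come from public datasheets rather than from this article.

```python
# Sanity check of the bandwidth/capacity deltas using Nvidia's publicly
# listed specs (H100 SXM vs H200); figures are from public datasheets.
h100 = {"capacity_gb": 80, "bandwidth_tbps": 3.35}
h200 = {"capacity_gb": 141, "bandwidth_tbps": 4.8}

cap_gain = h200["capacity_gb"] / h100["capacity_gb"] - 1
bw_gain = h200["bandwidth_tbps"] / h100["bandwidth_tbps"] - 1
print(f"memory capacity:  +{cap_gain:.0%}")   # ~ +76%, roughly the "80%" cited
print(f"memory bandwidth: +{bw_gain:.0%}")    # ~ +43%, roughly the "40%" cited
```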

In 2024, we expect AMD to gain a stronger foothold in cloud artificial intelligence and slowly begin moving toward a larger market share. In the second half of 2023, AMD released the MI300X, its latest GPU for high-performance computing. It is a chiplet design that integrates twelve compute and I/O dies, and compared with the H200 it offers higher FP8 compute (1.6x) and greater memory capacity and bandwidth (1.2x). Judging from benchmark data released by AMD, the MI300X's inference performance is roughly 20%-40% better than the H100's, while its training performance is on par with the H100. We believe the software ecosystem (including compiler performance) will be the decisive factor in whether AMD can succeed in the cloud AI market, and it is expected to improve in 2024: OpenAI is adding MI300X support to its Triton framework, and AI acceleration software frameworks from major startups are also strengthening support for AMD GPUs. With improving chip performance and a maturing software ecosystem, plus major technology companies' concerns about Nvidia's dominant GPU position, we expect 2024 to be an important year for AMD GPUs in the artificial intelligence market, with more customer deployments to come.

From a supply-chain perspective, cloud artificial intelligence chips create strong demand for high-bandwidth memory such as HBM3, so we believe production capacity for HBM and advanced packaging (such as CoWoS) will remain tight. This will drive the relevant semiconductor companies to expand capacity and actively develop next-generation memory and advanced-packaging technology. From this perspective, artificial intelligence applications will remain the core driving force behind the rapid development of new semiconductor technologies.

Terminal market pattern analysis

In addition to the cloud market, we expect artificial intelligence demand from the terminal market to strengthen, which will make artificial intelligence an increasingly important differentiator in terminal computing chips.

On mobile phones, artificial intelligence will be used more and more frequently, pushing chips to add corresponding computing power and making AI support a core highlight of the SoC. For example, Qualcomm's Snapdragon 8 Gen 3 uses "image generation in under one second" as a key selling point, and such AI capabilities are expected to be deeply integrated into phone makers' operating systems. Beyond third-party chip companies such as Qualcomm, system makers that design their own phone chips are expected to keep increasing their investment in artificial intelligence. Apple keeps a low profile in this area, but it is expected to use various means (more NPU computing power, expanded software support) to bring more artificial intelligence to the iPhone and create new photography experiences. vivo has accumulated several years of experience with self-developed ISP chips, and generative artificial intelligence has strong synergy with the ISP in improving the shooting experience; this is why the V3 ISP chip vivo released in August 2023 emphasizes generative AI as a core highlight. Going forward, we expect more and more such chips to emphasize the enabling role of artificial intelligence in the user experience.

In the smart-car field, although Nvidia is not as dominant as in the cloud, its Orin series chips remain the standard modules considered by major car manufacturers. As AI models strengthen their enabling role in intelligent driving, both third-party chip vendors and car makers designing their own chips will further increase investment in AI computing power, driving rapid improvements in chip performance. Recent moves, from NIO disclosing the computing-power specifications of its new self-developed chip to Tesla announcing that its next-generation chip will use TSMC's 3nm process, all suggest that artificial intelligence will play an increasingly important role in smart-car chips in 2024.

What new technologies are worth watching?

In addition to the chips discussed above, what new technologies are expected to bring new changes to the field of artificial intelligence chips?

First, in-memory computing and near-memory computing/processing techniques are expected to receive growing attention. For artificial intelligence in the cloud, memory access has long been the performance bottleneck, and as the parameter counts of large models increase, so does the cost of memory access. The purpose of in-memory and near-memory computing is to reduce this overhead by completing some computation and processing inside or next to memory. In this field, Samsung's PIM (processing-in-memory) and PNM (processing-near-memory) technologies are well worth watching, and they may become key to the differentiated competitiveness of Samsung's memory technology in the future.
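
A roofline-style back-of-envelope calculation shows why memory access, not arithmetic, bounds large-model inference, which is exactly the overhead PIM/PNM targets. All figures below (model size, peak compute, bandwidth) are illustrative assumptions rather than Samsung or Nvidia numbers.

```python
# Roofline-style illustration of why LLM inference is memory-bound,
# the bottleneck that in-/near-memory computing aims to reduce.
# All numbers are illustrative assumptions.

params = 70e9            # hypothetical 70B model
bytes_per_param = 2      # FP16 weights
peak_flops = 1e15        # ~1 PFLOP/s accelerator (assumed)
mem_bandwidth = 3.35e12  # ~3.35 TB/s HBM (assumed)

# Generating one token at batch size 1 reads every weight once and
# performs ~2 FLOPs per weight (multiply + add).
bytes_moved = params * bytes_per_param
flops_needed = 2 * params

t_compute = flops_needed / peak_flops
t_memory = bytes_moved / mem_bandwidth
print(f"compute time per token: {t_compute * 1e3:.3f} ms")
print(f"memory time per token:  {t_memory * 1e3:.3f} ms")
print(f"memory dominates by ~{t_memory / t_compute:.0f}x at batch size 1")
```

Under these assumptions the chip spends hundreds of times longer moving weights than computing on them, so any technique that shortens the path between memory and compute attacks the dominant cost directly.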

For terminal artificial intelligence, the smart-car scenario has strict latency requirements, which creates many opportunities for new technologies. In the cloud, acceleration chips represented by GPUs are designed primarily to optimize throughput rather than latency, so new architecture designs are needed in the smart-car field. In automotive applications, data enters the processor as streams (not batches), so artificial intelligence chips must process these streams at high speed and low latency. Meanwhile, large models are entering smart-car applications, so supporting large-model inference at low latency will be a key breakthrough direction for new smart-car chip technologies.
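
The throughput-versus-latency tension can be made concrete with a toy cost model: batching amortizes fixed launch overhead (which is what cloud accelerators optimize for), but a streaming automotive workload pays for it in worst-case latency. The cost constants below are invented purely for illustration.

```python
# Toy model of the throughput-vs-latency trade-off: batching amortizes
# fixed per-launch cost (good for cloud throughput) but makes the first
# sample wait for the whole batch (bad for streaming automotive data).
# All cost constants are illustrative assumptions.

FIXED_MS = 5.0        # fixed cost per inference launch (setup, dispatch)
PER_SAMPLE_MS = 0.5   # marginal cost per sample in a batch
ARRIVAL_GAP_MS = 2.0  # a new sensor frame arrives every 2 ms

for batch in (1, 8, 32):
    run_ms = FIXED_MS + PER_SAMPLE_MS * batch
    throughput = batch / run_ms * 1000             # samples per second
    # The first frame also waits for the rest of the batch to arrive.
    worst_latency = (batch - 1) * ARRIVAL_GAP_MS + run_ms
    print(f"batch {batch:2d}: {throughput:7.0f} samples/s, "
          f"worst-case latency {worst_latency:5.1f} ms")
```

Larger batches raise throughput several-fold while multiplying worst-case latency, which is why an architecture tuned for cloud serving is a poor fit for a car that must react to each frame as it arrives.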

Editor/Jeffrey

The translation is provided by third-party software.

