Huafu Securities: How to calculate the computing power required for large-scale AI text model training?

Zhitong Finance ·  Jun 4 19:36

According to a research report from Huafu Securities, starting from a computing power supply-and-demand formula and assuming the industry continues to develop along the Scaling Law, GPU demand can be derived on the demand side from assumptions about factors such as Nvidia GPUs' FP16 computing power, training duration, and computing power utilization rate. The firm estimates that from 2024 to 2026, global GPU demand for large-scale AI text model training, calculated in terms of the FP16 computing power of Nvidia's Hopper/Blackwell/next-generation GPUs, will be 27.1/59.2/124.4 million units. The firm recommends paying attention to the computing chip and server industry chains.

Huafu Securities' main points are as follows:

Demand side: Scaling Law drives the increasing demand for large-scale computing

Scaling Law remains an important guideline driving the industry's development. Its basic principle is that a model's final performance is mainly correlated with the amount of training compute, the number of model parameters, and the size of the training data: when the other two factors are unconstrained, model performance follows a power-law relationship with each individual factor. To improve model performance, parameter count and data size therefore need to be scaled up in tandem. The number of large models has grown markedly in recent years, and because advanced AI models demand substantial resources, industry's influence on large-model development has steadily deepened. The firm has compiled publicly disclosed training data for many large models. From the perspective of computing power demand, parameter count rose from 175B for GPT-3 to 1.8T for GPT-4 (an increase of roughly 9 times), while training data volume (token count) rose from about 0.3T to 13T tokens (an increase of roughly 42 times). In absolute terms, according to the firm's incomplete statistics, the parameter counts of mainstream large models have already reached the billions or even tens of billions, while pre-training data volumes are at the terabyte (TB) level.
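The compute arithmetic behind these comparisons can be reproduced with the commonly used approximation C ≈ 6·N·D (training FLOPs ≈ 6 × parameter count × token count). The report summary does not state the exact formula it uses, so the following is only a minimal sketch under that assumption, plugging in the GPT-3/GPT-4 figures quoted above:

```python
# Minimal sketch, assuming the common rule of thumb C ≈ 6 * N * D (FLOPs) for
# dense Transformer training; the report's own formula is not given in this summary.
# The GPT-4 parameter/token figures are the estimates quoted in the report.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute in FLOPs: C ≈ 6 * N * D."""
    return 6.0 * params * tokens

gpt3 = training_flops(params=175e9, tokens=0.3e12)   # ≈ 3.2e23 FLOPs
gpt4 = training_flops(params=1.8e12, tokens=13e12)   # ≈ 1.4e26 FLOPs

print(f"GPT-3 training compute ≈ {gpt3:.2e} FLOPs")
print(f"GPT-4 training compute ≈ {gpt4:.2e} FLOPs")
print(f"GPT-4 / GPT-3 compute ratio ≈ {gpt4 / gpt3:.0f}x")
```

Under these assumptions, GPT-4's training compute works out to roughly 450 times that of GPT-3, which illustrates how scaling parameters and data together translates directly into computing power demand.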

Supply side: Huang's Law propels Nvidia GPUs upward

Nvidia GPUs remain at the forefront of global AI computing power development. Although Moore's Law has gradually slowed, Huang's Law continues to support rapid increases in Nvidia GPU computing power. On the one hand, Nvidia relies on process-node iteration, larger HBM capacity and bandwidth, dual-die designs, and similar approaches; on the other hand, reducing numerical precision plays a crucial role: Blackwell supports the new FP4 format, and although its low precision may limit the range of applications, it represents another route to raising computing power. Looking only at FP16, the FP16 computing power of the A100/H100/GB200 is 2.5/6.3/2.5 times that of the respective previous-generation product, and it has kept growing rapidly in recent years. By comparison, the parameters of large AI models have grown even faster: from 2018 to 2023, the GPT series expanded its parameter count from roughly 1 billion to 1.8 trillion. Relative to this large-model-driven parameter explosion, the growth rate of GPU computing power still has room to improve.
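To make the gap concrete, the following sketch chains the generational FP16 multiples quoted in the report (2.5/6.3/2.5x) on top of an assumed V100 baseline of about 125 FP16 Tensor TFLOPS and compares the cumulative result with the parameter growth of the GPT series cited above; the baseline is an illustrative assumption, not a figure from the report.

```python
# Minimal sketch comparing cumulative FP16 growth (from the generational multiples
# quoted in the report) with the GPT-series parameter growth it cites.
# The V100 baseline of ~125 FP16 Tensor TFLOPS is an illustrative assumption.

v100_fp16_tflops = 125.0                                   # assumed baseline
gen_multiples = {"A100": 2.5, "H100": 6.3, "GB200": 2.5}   # ratios quoted in the report

tflops = v100_fp16_tflops
for gpu, mult in gen_multiples.items():
    tflops *= mult
    print(f"{gpu}: ~{tflops:,.0f} FP16 TFLOPS ({tflops / v100_fp16_tflops:.1f}x vs baseline)")

param_growth = 1.8e12 / 1e9   # GPT series: ~1B (2018) -> ~1.8T (2023)
print(f"Cumulative GPU FP16 growth: ~{tflops / v100_fp16_tflops:.0f}x")
print(f"GPT-series parameter growth: ~{param_growth:,.0f}x")
```

Chaining the three multiples gives roughly a 39-fold increase in per-GPU FP16 computing power over the same period in which GPT-series parameters grew roughly 1,800-fold, which is the mismatch the report points to.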

Conclusion: Global demand for GPU training cards for large text models is expected to reach 27.1/59.2/124.4 million units in 2024, 2025, and 2026, respectively.

On the demand side, the firm derived GPU demand from assumptions about factors such as Nvidia GPUs' FP16 computing power, training duration, and computing power utilization rate; on the supply side, it applied a computing power supply-and-demand formula. On this basis, the firm estimates that from 2024 to 2026, global GPU demand for large-scale AI text model training, calculated in terms of the FP16 computing power of Nvidia's Hopper/Blackwell/next-generation GPUs, will be 27.1/59.2/124.4 million units.
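The supply-demand balance described here can be written as a simple relation: GPUs required ≈ total training FLOPs ÷ (per-GPU FP16 FLOPS × utilization × training time). The report's actual inputs (how many models are trained, for how long, and at what utilization) are not disclosed in this summary, so the sketch below uses hypothetical placeholder values purely to show the mechanics:

```python
# Minimal sketch of the supply-demand balance the report describes:
#   GPUs required ≈ total training FLOPs / (per-GPU FP16 FLOPS * utilization * training seconds)
# All inputs below are hypothetical placeholders; the report's actual assumptions
# (models trained, training duration, utilization) are not disclosed in this summary.

def gpus_required(total_train_flops: float,
                  gpu_fp16_flops: float,
                  utilization: float,
                  training_days: float) -> float:
    """GPUs needed to finish the given training compute within the given time window."""
    seconds = training_days * 24 * 3600
    return total_train_flops / (gpu_fp16_flops * utilization * seconds)

# Example: one GPT-4-scale run (~1.4e26 FLOPs) on H100-class GPUs
# (assumed ~1e15 FP16 FLOPS each), 40% utilization, 90-day training window.
n = gpus_required(total_train_flops=1.4e26,
                  gpu_fp16_flops=1e15,
                  utilization=0.40,
                  training_days=90)
print(f"GPUs required ≈ {n:,.0f}")
```

Summing such per-model estimates across the models expected to be trained in a given year, and dividing by the FP16 throughput of that year's flagship GPU, presumably yields annual GPU-unit figures of the kind quoted above.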

Recommendations:

Computing chip: Cambricon Technologies (688256.SH), Hygon Information (688041.SH), and Loongson Technology (688047.SH).

Server industry chain: Foxconn Industrial Internet (601138.SH), Wus Printed Circuit (002463.SZ), Shennan Circuits (002916.SZ), and Victory Giant Technology (300476.SZ).

Risk warning: risks that AI demand falls short of expectations, that Scaling Law breaks down, that GPU technology upgrades fall short of expectations, and that the estimation model deviates from reality.
