As large AI models move from the pre-training phase to the inference phase, special-purpose chips, represented by ASICs, may gradually displace general-purpose chips, represented by GPUs, as the "new favorite" of major AI companies. Analysts suggest that if the Broadcom CEO's forecast for the ASIC market proves accurate, Broadcom's ASIC-related AI business could roughly double each year over the next three years.
Last Friday, a "buy Broadcom, sell NVIDIA" trade swept the US stock market: $Broadcom (AVGO.US)$ soared 24%, its largest single-day gain on record, pushing its market capitalization past $1 trillion, while chip leader $NVIDIA (NVDA.US)$ fell 3.3%.
The trend continued today: as of the time of writing, Broadcom was up nearly 10%, while NVIDIA dropped another 2% to a new two-month low.
The catalyst for the rally came from Broadcom CEO Hock Tan's bold prediction on that day's earnings call: by 2027, market demand for custom AI chips (ASICs) will reach between $60 billion and $90 billion.
Some analysts pointed out that if this figure is realized, Broadcom's ASIC-related AI business would roughly double each year over the next three years (2025-2027), significantly raising market expectations for ASICs and potentially signaling the start of an explosive growth period.
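As a rough sanity check on the "doubling each year" claim (taking as a baseline the roughly $12.2 billion of AI revenue Broadcom reported for fiscal 2024 on the same earnings call, a figure not stated above), three consecutive annual doublings multiply revenue by a factor of eight:

```latex
% Three consecutive annual doublings from an assumed FY2024 base
\$12.2\,\text{B} \times 2^{3} = \$97.6\,\text{B}
```

That lands just above the top of the projected $60-90 billion range, so yearly doubling effectively assumes Broadcom captures the bulk of that forecast market.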
Data depletion, diminishing marginal returns... large models are shifting from training to inference.
Pre-training, the first stage in building an AI model, is a process of continuously "feeding" data to the model and iteratively updating its parameters.
To boost model performance, and following the principle that more data, more compute, and more parameters yield better models (i.e., the scaling law), major technology giants have raced to acquire the most powerful NVIDIA GPUs on the market, stockpiling them to guarantee the capability of their AI models.
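For reference, one widely cited formulation of the scaling law (the Chinchilla-style loss curve from Hoffmann et al., 2022; the constants below are empirical and not from this article) expresses model loss as a function of parameter count N and training tokens D:

```latex
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Loss falls as either term grows, but with diminishing returns in both, which is exactly the dynamic described next.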
However, high-intensity, large-scale model training is "exhausting" the world's stock of usable data, and as scaling up yields diminishing returns while compute costs remain high, a debate has emerged over whether the AI training phase is nearing its end.
Recently, Ilya Sutskever, co-founder of OpenAI and founder of SSI, said in a talk at the NeurIPS 2024 conference that the era of pre-training is coming to an end: data, the "fossil fuel" of AI, is finite, and the data available for pre-training has already peaked.
OpenAI researcher Noam Brown similarly noted that AI's remarkable achievements since 2019 stem from scaling up data and compute, and yet large language models still fail at simple problems like tic-tac-toe.
The question that follows: is scaling all you need? Do we really have to pay ever-higher costs to train better AI?
Attention is now turning to the next stage for large AI models: inference.
As the stage that follows pre-training, inference refers to building AI applications on top of existing large models for specific vertical domains and deploying them on end devices.
Looking at the large-model products on the market, including Google's Gemini 2.0 and OpenAI's o1, AI agents have become one of the main areas of focus for major companies.
As large AI models mature, some hold the view that inference-oriented ASICs (application-specific integrated circuits) will gradually displace training-oriented GPUs as the new favorite of major AI companies.
The Broadcom CEO's optimism about the ASIC market lent weight to this expected paradigm shift in AI, fueling last Friday's surge in the stock.
What is an ASIC? More 'specialized' than a GPU.
Semiconductors can broadly be divided into standard semiconductors and application-specific integrated circuits (ASICs). Standard semiconductors follow standardized specifications and can be used in any electronic device that meets basic requirements, while ASICs are built by semiconductor manufacturers to the requirements of a specific product.
ASICs are therefore used in devices designed and built around a specific, fixed set of functions.
AI computing has accordingly split into two paths: the general-purpose path, represented by NVIDIA's GPUs and suited to a broad range of high-performance workloads, and the specialized path, represented by custom ASICs.
As a standard semiconductor product, GPUs excel at large-scale parallel computing, but on large matrix multiplications they can run into the "memory wall," where moving data between memory and compute units becomes the bottleneck. Purpose-built ASICs can be designed around this problem, and once mass production ramps up, their cost-performance advantage grows.
Put simply, GPUs command high prices on the strength of mature products and a mature industry chain, while the appeal of ASICs lies in being more "specialized": higher processing speed and lower energy consumption on a single, fixed computing task, making them better suited to inference at the edge.
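A back-of-the-envelope roofline calculation illustrates the memory-wall point. The sketch below uses purely hypothetical hardware numbers (peak compute, memory bandwidth) rather than the specs of any real GPU or ASIC, and checks whether a matrix multiplication is limited by compute or by memory traffic:

```python
# Roofline back-of-the-envelope: is a matmul compute-bound or memory-bound?
# All hardware numbers here are illustrative assumptions, not real chip specs.

PEAK_FLOPS = 1000e12   # assumed peak compute: 1,000 TFLOP/s
MEM_BW = 3e12          # assumed memory bandwidth: 3 TB/s
BYTES_PER_ELEM = 2     # fp16 operands

def matmul_intensity(m: int, n: int, k: int) -> float:
    """Arithmetic intensity (FLOPs per byte) of C[m,n] = A[m,k] @ B[k,n]."""
    flops = 2 * m * n * k                                    # multiply + add per term
    bytes_moved = BYTES_PER_ELEM * (m * k + k * n + m * n)   # read A and B, write C
    return flops / bytes_moved

ridge = PEAK_FLOPS / MEM_BW  # intensity needed to saturate the compute units

for m, n, k in [(8192, 8192, 8192),   # large training-style matmul
                (1, 8192, 8192)]:     # batch-1 inference (matrix-vector-like)
    ai = matmul_intensity(m, n, k)
    verdict = "compute-bound" if ai > ridge else "memory-bound (the memory wall)"
    print(f"m={m:5d}: {ai:7.1f} FLOP/B vs ridge {ridge:.0f} FLOP/B -> {verdict}")
```

At the small batch sizes typical of inference, intensity collapses to roughly 1 FLOP per byte, far below the ridge point, which is one reason inference-oriented chips tend to emphasize on-chip memory and tailored dataflow over raw FLOPs.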
Customizing AI chips for technology giants has become a 'cash cow' for Marvell and Broadcom.
With GPU supply tight and prices high, many technology giants have begun developing their own ASICs for in-house use.
Google is widely viewed as the pioneer of AI ASICs, having deployed its first-generation TPU (an ASIC) in 2015. Other representative ASICs include Amazon's Trainium and Inferentia, Microsoft's Maia, Meta's MTIA, and Tesla's Dojo.
In the upstream supply chain for these in-house AI chips, Marvell and Broadcom have long been the two dominant players.
Marvell's rise owes much to the strategy of its current leadership. CEO Matt Murphy, who took office in 2016 amid a company restructuring, shifted the strategic focus toward customizing chips for technology giants, a prescient move that positioned the company to capture the AI wave.
Beyond major clients Google and Microsoft, Marvell recently signed a five-year agreement with Amazon AWS to help Amazon design its own AI chips. Industry watchers believe this will help Marvell's custom AI chip business roughly double in the next fiscal year.
Broadcom, Marvell's chief rival, likewise has three major clients: Google, Meta, and ByteDance.
Analysts estimate that by 2027-2028, each of these customers will be purchasing on the order of one million ASICs per year. With a fourth and fifth major customer also ramping up quickly, custom chip orders from these technology companies should bring Broadcom significant AI revenue in the coming years.
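Those unit estimates can be squared with the $60-90 billion forecast under a purely illustrative assumption about pricing (the per-chip figure below is hypothetical and not from the article):

```latex
% Hypothetical: 3 customers x 1M chips/year at an assumed $20k-$30k per chip
3 \times 10^{6}\ \text{chips} \times \$20{,}000\text{--}\$30{,}000 \approx \$60\text{--}90\ \text{billion}
```

Under that assumed price, the three named customers at a million units each would by themselves span Hock Tan's projected range.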
As large AI models enter the 'second half' and the real inference phase begins, another fierce battle over chips is about to start. As Broadcom CEO Hock Tan has predicted:
"In the future, 50% of AI Flops (computing power) will be ASICs, and even CSPs (Cloud Service Providers) will exclusively use 100% ASICs internally."
Editor/Somer