
Where Is the Big-Model Price War "Rolling" Toward? | In Depth

cls.cn ·  May 25 15:29

① Major players are abandoning the "burn money to work miracles" mentality, consolidating their foundation models and putting them to work in order to close the technology gap with overseas rivals as soon as possible; ② startups are drilling deep into vertical niches, using "refinement" to move the big-model industry upmarket. Together, these may be the path to sustainable development for the domestic big-model ecosystem.

"Science and Technology Innovation Board Daily", May 25 (Reporters Huang Xinyi and Mao Mingjiang) — You cut prices? Then I'll go free.

After more than a year of the "war of a hundred models," a wave of price cuts swept through the AI large-model market from early May with almost no warning. Alibaba, Baidu (BIDU.US), Tencent and others joined the fray one after another. The big players, with "more money, more chips, and ample computing power," do not fear a price war and are falling back on their usual free-first playbook to gather users and developers. Small and mid-sized model startups, by contrast, are still tightening their belts and struggling to produce a hit app, yet have no choice but to follow suit in a hurry.

Where is the big-model price war "rolling" toward? From the "Science and Technology Innovation Board Daily"'s industry interviews and observations: major players are abandoning the "burn money to work miracles" mentality, consolidating their foundation models and putting them to work to close the technology gap with overseas rivals as soon as possible, while startups drill deep into vertical niches and use "refinement" to move the big-model industry upmarket. This may be the path to sustainable development for the domestic big-model ecosystem.

▍ Algorithmic innovation and model optimization are the real drivers behind the price war

Surprisingly, the first shot in the domestic model price war was fired by a quantitative private-equity giant operating far outside its core business: Magic Square.

Quantitative trading has risen to prominence in China in recent years, and Magic Square has made a great deal of money in the domestic market through advanced quantitative strategies. Having tasted success, the firm is betting fully on AI-driven quantitative trading and has spent heavily on NVIDIA (NVDA.US) and AMD GPUs. The industry once reported that "Magic Square has more Nvidia H100 chips in reserve than the major tech companies."

On May 6, DeepSeek, a subsidiary of Magic Square, released DeepSeek-V2, priced at roughly one percent of GPT-4-Turbo. This was the first domino in the wave of big-model price cuts.

On May 11, Zhipu AI cut the API price of the personal edition of GLM-3 Turbo to one-fifth of its previous level. On May 15, ByteDance priced its flagship Doubao model for the enterprise market at 0.0008 yuan per thousand tokens.

Next, Alibaba's Tongyi Qianwen and Baidu's Wenxin joined in. Alibaba Cloud cut prices on all nine Tongyi Qianwen models, with its GPT-4-class flagship Qwen-Long dropping 97%. Baidu Smart Cloud went further, announcing that Wenxin's two main models, ERNIE Speed and ERNIE Lite, are now free.

Most recently, two more major model makers, iFLYTEK (002230.SZ) and Tencent, also joined the "price war." iFLYTEK announced that the Spark Lite API is free of charge, with the Spark Pro/Max API as low as 0.21 yuan per 10,000 tokens. Meanwhile, Tencent Cloud adjusted the price of Hunyuan-Lite, one of its main models, from 0.008 yuan per thousand tokens to completely free.

However, "Science and Technology Innovation Board Daily" reporters learned from interviews that the price cuts by Magic Square and the big players are not simply reckless. The deeper reason is that, nearly two years after ChatGPT launched the big-model era, algorithmic innovation and model optimization have genuinely brought costs down.

The head of a leading domestic model startup told the "Science and Technology Innovation Board Daily" that there are currently two main directions for algorithm-framework innovation: making models lighter, and linearization. He pointed specifically to the 100-billion-parameter MoE-architecture model his company had just released. An MoE architecture runs multiple experts in parallel but activates only some of them during inference, using sparsity to cut both the active parameter count and the inference cost. "This can greatly reduce computing-power consumption."
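The sparsity idea above can be illustrated with a toy sketch. This is purely illustrative, not any vendor's actual code: all sizes (8 experts, top-2 routing, 16-dimensional vectors) are made-up assumptions. The point is that only `TOP_K` of `N_EXPERTS` expert networks execute per token, so per-token compute is roughly k/N of a dense model with the same total parameters.

```python
import math
import random

random.seed(0)

N_EXPERTS, TOP_K, D = 8, 2, 16   # hypothetical sizes for illustration

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

experts = [rand_matrix(D, D) for _ in range(N_EXPERTS)]  # expert "FFNs"
router = rand_matrix(N_EXPERTS, D)                       # one scoring row per expert

def moe_forward(x):
    """Route one token vector x through only the top-k scoring experts."""
    scores = [sum(r_j * x_j for r_j, x_j in zip(row, x)) for row in router]
    top = sorted(range(N_EXPERTS), key=lambda i: scores[i])[-TOP_K:]
    z = sum(math.exp(scores[i]) for i in top)
    weights = {i: math.exp(scores[i]) / z for i in top}  # softmax over chosen experts
    out = [0.0] * D
    for i in top:                     # only TOP_K of N_EXPERTS matmuls execute
        y = matvec(experts[i], x)
        out = [o + weights[i] * y_j for o, y_j in zip(out, y)]
    return out

token = [random.gauss(0, 1) for _ in range(D)]
out = moe_forward(token)
print(len(out))                                # 16
print(f"active experts: {TOP_K}/{N_EXPERTS}")  # active experts: 2/8
```

With top-2 routing over 8 experts, only a quarter of the expert weights are touched per token, which is the mechanism behind the "reduced computing-power consumption" described above.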

DeepSeek-V2, which fired the first shot of the price war, is itself the product of cutting large-model costs, especially inference costs, through architectural innovation. According to AMD staff who provide operations and technical support for DeepSeek, DeepSeek-V2 combines a sparse MoE architecture with shared experts and other improvements, saving 42.5% in training costs.

Wang Yu, the founder of Wuwen Xinqiong, once used public data to estimate computing-power costs at an order-of-magnitude level: if GPT-4 Turbo served 1 billion daily active users, the annual computing cost could exceed 200 billion dollars, and that is before counting investment in model training.
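A back-of-envelope calculation shows how an estimate of that magnitude can arise. The token volume and price per thousand tokens below are hypothetical assumptions chosen for illustration, not figures from the article:

```python
# Assumptions (illustrative, not from the article):
#   1 billion daily active users
#   ~20,000 tokens processed per user per day
#   ~$0.03 per 1,000 tokens as a GPT-4-Turbo-era serving-price proxy
DAU = 1_000_000_000
TOKENS_PER_USER_PER_DAY = 20_000
USD_PER_1K_TOKENS = 0.03

daily_cost = DAU * TOKENS_PER_USER_PER_DAY / 1_000 * USD_PER_1K_TOKENS
annual_cost = daily_cost * 365

print(f"daily serving cost:  ${daily_cost / 1e6:.0f}M")   # $600M
print(f"annual serving cost: ${annual_cost / 1e9:.0f}B")  # $219B
```

Under these assumptions the annual serving bill lands just above the $200 billion order of magnitude cited, which is why cutting inference cost per token matters so much.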

Lower inference costs, driven by algorithmic innovation and model optimization, are therefore a key "engine" for unlocking AI applications going forward.

▍ It's not just a fight for users, but for "developers"

Liu Weiguang, senior vice president of Alibaba Cloud Intelligence Group and president of its Public Cloud Division, stated plainly when announcing the price cut that "(the price cut) must genuinely benefit the market" and that "it must truly accelerate the market's early explosion."

Commenting on this round of big-model price cuts, Cheetah Mobile (CMCM.US) chairman and CEO Fu Sheng said the goal of cutting big-model prices is not to get users to use the models directly, but to attract developers.

In the short term, big-model performance has hit a bottleneck. "No one can beat anyone, and no one can produce a killer feature. Cutting inference costs and lowering selling prices is now a top-priority task for every company."

"Nowadays all the major model apps are basically free. Fundamentally, the user numbers of big-model apps are not growing, including OpenAI's. Recently the promotion costs of some big-model apps have been so high that the ROI doesn't add up... More developers must get involved to build apps that are genuinely easier for users to use."

He also believes the core reason for the price cuts is not just the model makers' own internal calculus. As device makers such as Apple, Microsoft (MSFT.US), Qualcomm (QCOM.US), and Lenovo invest heavily in on-device AI computing power and local large models, AI PCs and AI phones will become mainstream, and the usage scenarios for general-purpose large models will be greatly constrained.

As for the impact of this chaotic price war, Fu Sheng said this wave of cuts will have limited effect on enterprise users: open-source small-parameter large models plus application toolkits can already meet the needs of the vast majority of enterprises. And while customization costs cannot be avoided, the cost of privately deployed large models is already very low.

Pan Helin, a member of the Information and Communication Economics Expert Committee of the Ministry of Industry and Information Technology, believes the price cuts are about expanding the customer base. After the coming shakeout in the big-model field, not many companies will survive; seizing scale is therefore a long-term play aimed at a leading position.

"The market space for Chinese large models is currently limited, so not every big model can succeed. In particular, the big model is an ecosystem product: either the winner takes all, or the losers quietly exit. Behind the current price war is the fact that domestic big-model applications are mostly interchangeable. No company has achieved meaningful differentiation, and with little to separate them, the only thing left to compete on is price."

Regarding this "wave of price cuts" in the big-model industry, Song Xujun, a global partner at Kearney, believes it is driven mainly by two factors: supply and demand, and cost. First, the supply-demand balance has shifted, with intensified competition pushing vendors to cut prices proactively to win users. Second, costs are falling: as computing power gets cheaper and model algorithms are optimized, model makers' own costs are declining. Typical examples are Nvidia GPUs, Google TPUs, and Huawei Ascend chips, all rapidly improving in performance.

▍ Competing on technology, on deployment, and on solving industry pain points

Jia Yangqing, a former Alibaba (BABA.US) vice president, posted his view on WeChat Moments: "From the perspective of the AI industry as a whole, cutting prices is a simple strategy anyone can decide on a whim, but building a genuinely successful ToB business is far harder."

Jia Yangqing is now the founder of Lepton AI. He quoted the CIO of a world-class consulting firm: "Today, when companies adopt AI, they are not cost-driven." "It's not that APIs are too expensive for anyone to use. Rather, companies first have to figure out how to use them to generate business value; otherwise, even cheap APIs are a waste. And what's missing today is precisely that implementation layer."

He pointed out that for years the major cloud vendors recoiled from "project-based" consulting-service business models, yet implementing emerging technology still requires exactly that kind of consulting. In Jia Yangqing's words, "it may not be the cheapest way to win a commercial war, but it is the surest way to win profits."

Luo Xuan, COO of Yuanshi Intelligence, told the "Science and Technology Innovation Board Daily" that today's price cuts and free tiers do not solve the core problems of big-model deployment. What matters more is raising the models' computational efficiency by 10-100 times, cutting the cost of computing chips, especially for inference, to 1/10-1/100, and addressing explainability; these three issues are what limit the deployment of large models. "Simple price cuts at this stage just burn money to build a monopoly, with bad money driving out good."

In Pan Helin's analysis, this round of price cuts by the leading cloud vendors will undoubtedly put competitive pressure on big-model startups.

"It is now more expensive for small and mid-sized model companies and startups to enter this field. Unless they differentiate, they have no chance of competing on scale," Pan Helin said.

A number of industry insiders told the "Science and Technology Innovation Board Daily" that the AI big-model race cannot be fought on price alone; it must also be fought on technology, on deployment, and on solving industry pain points. Major players should abandon the "burn money to work miracles" mentality, consolidating their foundation models and putting them to work to close the technology gap with overseas rivals, while startups drill deep into vertical niches and use "refinement" to move the industry upmarket. The benefits of price cuts alone are limited; large models still need real gains in deployment and computational efficiency.

It is worth noting that when asked what impact the big-model price cuts would have on startups, the answers from Li Kaifu and Wang Xiaochuan, two "top players" of the domestic AI industry, were intriguing.

Wang Xiaochuan, founder of Baichuan Intelligence, believes that if big models target B-side enterprise customers, the future model will be direct sales of cloud services; the application layer in between will instead flourish, with quite a few new opportunities.

In Wang Xiaochuan's view, however, a free price is an advantage but not necessarily a competitive moat. "Baichuan will not get involved in the price war, because To B is not the company's main business model and the war's impact on us is limited. The company will focus more on super applications."

Li Kaifu, chairman of Sinovation Ventures and CEO of 01.AI, said during the price war that there are currently no plans to lower API prices for the Yi series, arguing that 01.AI already offers a very strong performance-to-cost ratio and that frenzied price cutting is a lose-lose. Yi-Large, 01.AI's latest 100-billion-parameter model, placed 7th overall in the authoritative LMSYS blind-test arena rankings.

"I think our price is right and worth it. If China keeps 'rolling' like this, with everyone preferring mutual losses over letting a rival win, then we will turn to overseas markets," Li Kaifu said.


