
Where Is the Big-Model Price War "Rolling" Toward? | In Depth

cls.cn ·  May 25 15:29

① Major players are abandoning the "burn money to work miracles" mentality, consolidating their foundation models and putting them to work in order to close the technology gap with overseas rivals as soon as possible; ② startups are drilling deep into vertical niches, using "refinement" to move the big-model industry upmarket. Together, these may be the path to sustainable development for the domestic big-model ecosystem.

"Science and Technology Innovation Board Daily", May 25 (Reporters Huang Xinyi and Mao Mingjiang) — You cut prices? Then I'll go free.

After more than a year of the "war of a hundred models," a wave of price cuts swept through the AI large-model market from early May with almost no warning. Alibaba, Baidu (BIDU.US), Tencent and others joined the fray one after another. The big players, with "more money, more chips, and ample computing power," do not fear a price war and are falling back on their usual free-first playbook to gather users and developers. Small and mid-sized model startups, by contrast, are still tightening their belts and struggling to produce a hit app, yet have no choice but to follow suit in a hurry.

Where is the big-model price war "rolling" toward? From the "Science and Technology Innovation Board Daily"'s industry interviews and observations: major players are abandoning the "burn money to work miracles" mentality, consolidating their foundation models and putting them to work to close the technology gap with overseas rivals as soon as possible, while startups drill deep into vertical niches and use "refinement" to move the big-model industry upmarket. This may be the path to sustainable development for the domestic big-model ecosystem.

▍ Algorithmic innovation and model optimization are the real drivers behind the price war

Surprisingly, the first shot in the domestic model price war was fired by a quantitative private-equity giant operating far outside its core business: Magic Square.

Quantitative trading has risen to prominence in China in recent years, and Magic Square has made a great deal of money in the domestic market through advanced quantitative strategies. Having tasted success, the firm is betting fully on AI-driven quantitative trading and has spent heavily on NVIDIA (NVDA.US) and AMD GPUs. The industry once reported that "Magic Square has more Nvidia H100 chips in reserve than the major tech companies."

On May 6, DeepSeek, a subsidiary of Magic Square, released DeepSeek-V2, priced at roughly one percent of GPT-4-Turbo. This was the first domino in the wave of big-model price cuts.

On May 11, Zhipu AI cut the API price of the personal edition of GLM-3 Turbo to one-fifth of its previous level. On May 15, ByteDance priced its flagship Doubao model for the enterprise market at 0.0008 yuan per thousand tokens.

Next, Alibaba's Tongyi Qianwen and Baidu's Wenxin joined in. Alibaba Cloud cut prices on all nine Tongyi Qianwen models, with its GPT-4-class flagship Qwen-Long dropping 97%. Baidu Smart Cloud went further, announcing that Wenxin's two main models, ERNIE Speed and ERNIE Lite, are now free.

Most recently, two more major model makers, iFLYTEK (002230.SZ) and Tencent, also joined the "price war." iFLYTEK announced that the Spark Lite API is free of charge, with the Spark Pro/Max API as low as 0.21 yuan per 10,000 tokens. Meanwhile, Tencent Cloud adjusted the price of Hunyuan-Lite, one of its main models, from 0.008 yuan per thousand tokens to completely free.

However, "Science and Technology Innovation Board Daily" reporters learned from interviews that the price cuts by Magic Square and the big players are not simply reckless. The deeper reason is that, nearly two years after ChatGPT launched the big-model era, algorithmic innovation and model optimization have genuinely brought costs down.

The head of a leading domestic model startup told the "Science and Technology Innovation Board Daily" that there are currently two main directions for algorithm-framework innovation: making models lighter, and linearization. He pointed specifically to the 100-billion-parameter MoE-architecture model his company had just released. An MoE architecture runs multiple experts in parallel but activates only some of them during inference, using sparsity to cut both the active parameter count and the inference cost. "This can greatly reduce computing-power consumption."
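The sparsity idea above can be illustrated with a toy sketch. This is purely illustrative, not any vendor's actual code: all sizes (8 experts, top-2 routing, 16-dimensional vectors) are made-up assumptions. The point is that only `TOP_K` of `N_EXPERTS` expert networks execute per token, so per-token compute is roughly k/N of a dense model with the same total parameters.

```python
import math
import random

random.seed(0)

N_EXPERTS, TOP_K, D = 8, 2, 16   # hypothetical sizes for illustration

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

experts = [rand_matrix(D, D) for _ in range(N_EXPERTS)]  # expert "FFNs"
router = rand_matrix(N_EXPERTS, D)                       # one scoring row per expert

def moe_forward(x):
    """Route one token vector x through only the top-k scoring experts."""
    scores = [sum(r_j * x_j for r_j, x_j in zip(row, x)) for row in router]
    top = sorted(range(N_EXPERTS), key=lambda i: scores[i])[-TOP_K:]
    z = sum(math.exp(scores[i]) for i in top)
    weights = {i: math.exp(scores[i]) / z for i in top}  # softmax over chosen experts
    out = [0.0] * D
    for i in top:                     # only TOP_K of N_EXPERTS matmuls execute
        y = matvec(experts[i], x)
        out = [o + weights[i] * y_j for o, y_j in zip(out, y)]
    return out

token = [random.gauss(0, 1) for _ in range(D)]
out = moe_forward(token)
print(len(out))                                # 16
print(f"active experts: {TOP_K}/{N_EXPERTS}")  # active experts: 2/8
```

With top-2 routing over 8 experts, only a quarter of the expert weights are touched per token, which is the mechanism behind the "reduced computing-power consumption" described above.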

DeepSeek-V2, which fired the first shot of the price war, is itself the product of cutting large-model costs, especially inference costs, through architectural innovation. According to AMD staff who provide operations and technical support for DeepSeek, DeepSeek-V2 combines a sparse MoE architecture with shared experts and other improvements, saving 42.5% in training costs.

Wang Yu, the founder of Wuwen Xinqiong, once used public data to estimate computing-power costs at an order-of-magnitude level: if GPT-4 Turbo served 1 billion daily active users, the annual computing cost could exceed 200 billion dollars, and that is before counting investment in model training.
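A back-of-envelope calculation shows how an estimate of that magnitude can arise. The token volume and price per thousand tokens below are hypothetical assumptions chosen for illustration, not figures from the article:

```python
# Assumptions (illustrative, not from the article):
#   1 billion daily active users
#   ~20,000 tokens processed per user per day
#   ~$0.03 per 1,000 tokens as a GPT-4-Turbo-era serving-price proxy
DAU = 1_000_000_000
TOKENS_PER_USER_PER_DAY = 20_000
USD_PER_1K_TOKENS = 0.03

daily_cost = DAU * TOKENS_PER_USER_PER_DAY / 1_000 * USD_PER_1K_TOKENS
annual_cost = daily_cost * 365

print(f"daily serving cost:  ${daily_cost / 1e6:.0f}M")   # $600M
print(f"annual serving cost: ${annual_cost / 1e9:.0f}B")  # $219B
```

Under these assumptions the annual serving bill lands just above the $200 billion order of magnitude cited, which is why cutting inference cost per token matters so much.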

Lower inference costs, driven by algorithmic innovation and model optimization, are therefore a key "engine" for unlocking AI applications going forward.

▍ It's not just a fight for users, but for "developers"

Liu Weiguang, senior vice president of Alibaba Cloud Intelligence Group and president of its Public Cloud Division, stated plainly when announcing the price cut that "(the price cut) must genuinely benefit the market" and that "it must truly accelerate the market's early explosion."

Commenting on this round of big-model price cuts, Cheetah Mobile (CMCM.US) chairman and CEO Fu Sheng said the goal of cutting big-model prices is not to get users to use the models directly, but to attract developers.

In the short term, big-model performance has hit a bottleneck. "No one can beat anyone, and no one can produce a killer feature. Cutting inference costs and lowering selling prices is now a top-priority task for every company."

"Nowadays all the major model apps are basically free. Fundamentally, the user numbers of big-model apps are not growing, including OpenAI's. Recently the promotion costs of some big-model apps have been so high that the ROI doesn't add up... More developers must get involved to build apps that are genuinely easier for users to use."

He also believes the core reason for the price cuts is not just the model makers' own internal calculus. As device makers such as Apple, Microsoft (MSFT.US), Qualcomm (QCOM.US), and Lenovo invest heavily in on-device AI computing power and local large models, AI PCs and AI phones will become mainstream, and the usage scenarios for general-purpose large models will be greatly constrained.

As for the impact of this chaotic price war, Fu Sheng said this wave of cuts will have limited effect on enterprise users: open-source small-parameter large models plus application toolkits can already meet the needs of the vast majority of enterprises. And while customization costs cannot be avoided, the cost of privately deployed large models is already very low.

Pan Helin, a member of the Information and Communication Economics Expert Committee of the Ministry of Industry and Information Technology, believes the price cuts are about expanding the customer base. After the coming shakeout in the big-model field, not many companies will survive; seizing scale is therefore a long-term play aimed at a leading position.

"The market space for Chinese large models is currently limited, so not every big model can succeed. In particular, the big model is an ecosystem product: either the winner takes all, or the losers quietly exit. Behind the current price war is the fact that domestic big-model applications are mostly interchangeable. No company has achieved meaningful differentiation, and with little to separate them, the only thing left to compete on is price."

Regarding this "wave of price cuts" in the big-model industry, Song Xujun, a global partner at Kearney, believes it is driven mainly by two factors: supply and demand, and cost. First, the supply-demand balance has shifted, with intensified competition pushing vendors to cut prices proactively to win users. Second, costs are falling: as computing power gets cheaper and model algorithms are optimized, model makers' own costs are declining. Typical examples are Nvidia GPUs, Google TPUs, and Huawei Ascend chips, all rapidly improving in performance.

▍ Competing on technology, on deployment, and on solving industry pain points

Jia Yangqing, a former Alibaba (BABA.US) vice president, posted his view on WeChat Moments: "From the perspective of the AI industry as a whole, cutting prices is a simple strategy anyone can decide on a whim, but building a genuinely successful ToB business is far harder."

Jia Yangqing is now the founder of Lepton AI. He quoted the CIO of a world-class consulting firm: "Today, when companies adopt AI, they are not cost-driven." "It's not that APIs are too expensive for anyone to use. Rather, companies first have to figure out how to use them to generate business value; otherwise, even cheap APIs are a waste. And what's missing today is precisely that implementation layer."

He pointed out that for years the major cloud vendors recoiled from "project-based" consulting-service business models, yet implementing emerging technology still requires exactly that kind of consulting. In Jia Yangqing's words, "it may not be the cheapest way to win a commercial war, but it is the surest way to win profits."

Luo Xuan, COO of Yuanshi Intelligence, told the "Science and Technology Innovation Board Daily" that today's price cuts and free tiers do not solve the core problems of big-model deployment. What matters more is raising the models' computational efficiency by 10-100 times, cutting the cost of computing chips, especially for inference, to 1/10-1/100, and addressing explainability; these three issues are what limit the deployment of large models. "Simple price cuts at this stage just burn money to build a monopoly, with bad money driving out good."

In Pan Helin's analysis, this round of price cuts by the leading cloud vendors will undoubtedly put competitive pressure on big-model startups.

"It is now more expensive for small and mid-sized model companies and startups to enter this field. Unless they differentiate, they have no chance of competing on scale," Pan Helin said.

A number of industry insiders told the "Science and Technology Innovation Board Daily" that the AI big-model race cannot be fought on price alone; it must also be fought on technology, on deployment, and on solving industry pain points. Major players should abandon the "burn money to work miracles" mentality, consolidating their foundation models and putting them to work to close the technology gap with overseas rivals, while startups drill deep into vertical niches and use "refinement" to move the industry upmarket. The benefits of price cuts alone are limited; large models still need real gains in deployment and computational efficiency.

It is worth noting that when asked what impact the big-model price cuts would have on startups, the answers from Li Kaifu and Wang Xiaochuan, two "top players" of the domestic AI industry, were intriguing.

Wang Xiaochuan, founder of Baichuan Intelligence, believes that if big models target B-side enterprise customers, the future model will be direct sales of cloud services; the application layer in between will instead flourish, with quite a few new opportunities.

In Wang Xiaochuan's view, however, a free price is an advantage but not necessarily a competitive moat. "Baichuan will not get involved in the price war, because To B is not the company's main business model and the war's impact on us is limited. The company will focus more on super applications."

Li Kaifu, chairman of Sinovation Ventures and CEO of 01.AI, said during the price war that there are currently no plans to lower API prices for the Yi series, arguing that 01.AI already offers a very strong performance-to-cost ratio and that frenzied price cutting is a lose-lose. Yi-Large, 01.AI's latest 100-billion-parameter model, placed 7th overall in the authoritative LMSYS blind-test arena rankings.

"I think our price is right and worth it. If China keeps 'rolling' like this, with everyone preferring mutual losses over letting a rival win, then we will turn to overseas markets," Li Kaifu said.


