Source: Titanium Media AGI
Author: Lin Zhijia
The AI chip war is heating up, and computing power is set to become the “nuclear weapon” of AI models.
Recently, reports that OpenAI CEO Sam Altman is seeking to raise as much as 7 trillion US dollars (about RMB 50.26 trillion) to build a “chip empire” attracted widespread attention and caused a public uproar.
Titanium Media App learned that on February 11, Beijing time, Altman confirmed on a social platform that OpenAI is launching a chip-building effort, stating that “building large-scale AI infrastructure and a resilient supply chain is essential for economic competitiveness.”
Altman also revealed that OpenAI currently generates about 100 billion words every day and needs a large number of GPU (graphics processing unit) chips for training. This is probably one of the main reasons Altman wants to build chips.
The news was so sensational that Altman's brother Jack Altman, who had just raised a $150 million fund, publicly called out, “Sam, can you give me a week to show up? You need to calm down.”
“Take care of yourself, then we'll do the work together.” Altman responded.
In fact, the $7 trillion that Altman's “chip empire” would require is enormous. It is equivalent to 10% of global GDP (gross domestic product), one-quarter (25%) of US GDP, and two-fifths (40%) of China's GDP, and to the market value of 2.5 Microsofts, 3.75 Googles, 4 Nvidias, 7 Metas, or 11.5 Teslas.
At the same time, some netizens estimate that with 7 trillion US dollars Altman could buy 18 semiconductor giants, including Nvidia, AMD, TSMC, Broadcom, ASML, Samsung, Intel, Qualcomm, and Arm. The remaining money could be used to buy Meta as well, with 300 billion dollars still left over.
Furthermore, $7 trillion is more than 13 times the size of the global semiconductor industry last year, exceeds the national debt of some of the world's major economies, and is larger than even the large sovereign wealth funds.
Only now has everyone realized that Altman's ambitions are far greater than anyone had imagined.
If the $7 trillion funding target is reached, Altman and OpenAI will reshape the global AI semiconductor industry.
CNBC commented bluntly, “This is an incredible number. This (OpenAI's chip-building plan) is like a moon-landing program.”
OpenAI's CEO steps in to “build chips” himself, and Masayoshi Son and Middle Eastern investors may participate
For OpenAI, reducing costs and meeting demand are the two key factors. However, both are currently constrained by Nvidia, a chokepoint that creates challenges for OpenAI's development.
In the large-model boom, AI computing power is constrained in two main ways: first, demand for AI model training has increased dramatically, and second, computing power costs have continued to rise.
First, the surge in demand for AI model training is due to the continuous development and widespread application of deep learning technology. As models become more complex, so do the computational resources required for training. This has led to a surge in demand for high-performance computing equipment to meet large-scale model training tasks.
Currently, training ChatGPT requires about 25,000 Nvidia A100 chips at a time, and training GPT-5 would require a further 50,000 Nvidia H100s. Market analysts believe that as the GPT models continue to iterate and upgrade, GPT-5 may find no chips available.
Altman has “complained” about the shortage of AI chips many times, saying that Nvidia's current chip production capacity is insufficient to meet future needs.
Second, the rising cost of computing power is also a problem that cannot be ignored.
As computing power continues to grow, so does the cost of purchasing and maintaining high-performance computing equipment. This is a significant financial burden for many research institutions and companies, limiting their development and innovation in the field of AI.
The price of the Nvidia H100 has soared to $25,000-$30,000, which means the cost of a single ChatGPT query will rise to around $0.04. Nvidia has thus become an indispensable partner in AI model training.
According to Wells Fargo statistics, Nvidia currently holds 98% of the data center AI market, while AMD's share is only 1.2% and Intel's is less than 1%. Nvidia's data center revenue is expected to reach as much as 45.7 billion US dollars in 2024, which would be a record high.
According to some sources, the Nvidia H100 is already sold out into 2024.
“We have a huge shortage of GPUs, and the fewer people using our products, the better. If people use less, we'll be happy because we don't have enough GPUs,” Altman has said publicly.
So it is easy to understand why Altman feels he has to build his own chips: doing so promises a more secure supply, costs that are more manageable over the long term, and less reliance on Nvidia.
To solve these problems, instead of spending money to buy from Nvidia, OpenAI has chosen to make its own dedicated chips that it fully controls.
As early as October 2023, at the WSJ Live event in the US, Altman responded to the rumors for the first time, saying that developing its own chips was not ruled out.
“We are still evaluating whether to use custom hardware (chips). We're working to determine how we can scale up to meet the world's needs. We may not be developing chips, but we are maintaining good cooperation with partners that have achieved outstanding results,” Altman said.
On January 20 of this year, Bloomberg reported that Altman is raising more than 8 billion US dollars from global investors, including Abu Dhabi-based G42 and Japan's SoftBank Group, to establish a new AI chip company. The goal is to use the capital to build a network of factories to manufacture chips and compete directly with Nvidia. However, negotiations are still in the early stages, and the full list of investors has not been settled.
In addition to advancing financing matters, Altman is also stepping up cooperation with leading chip manufacturers.
On January 25, Altman met with executives from SK Hynix and Samsung Electronics, South Korea's leading memory chip makers, focusing on the establishment of an “AI chip alliance”. OpenAI may cooperate with Samsung and SK Group on AI chip design and manufacturing. Prior to that, Altman had already contacted Intel and TSMC to discuss cooperating on a new chip manufacturing plant.
On February 9, the Wall Street Journal reported that Altman is in talks with investors including the UAE government, the US Department of Commerce, SoftBank, and Middle Eastern sovereign wealth funds to raise funds for projects that would boost global chip manufacturing capacity. The total could reach 5 trillion to 7 trillion US dollars and is expected to expand OpenAI's ability to power AI.
However, such a high investment would dwarf the current scale of the global semiconductor industry.
Global chip sales last year were 527 billion US dollars and are expected to increase to 1 trillion US dollars by 2030. In addition, according to statistics from the industry organization SEMI, global sales of semiconductor manufacturing equipment in 2023 were only 100 billion US dollars.
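The scale of the gap can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in this article (the 7 trillion dollar upper target, last year's chip sales, the 2030 forecast, and the SEMI equipment figure); the constant and function names are illustrative.

```python
# All amounts are in US dollars and are taken from this article as quoted.
FUNDING_TARGET = 7_000_000_000_000           # upper end of the reported 5-7 trillion range
CHIP_SALES_LAST_YEAR = 527_000_000_000       # global chip sales last year
CHIP_SALES_2030_FORECAST = 1_000_000_000_000 # expected global chip sales by 2030
EQUIPMENT_SALES_2023 = 100_000_000_000       # SEMI figure for manufacturing equipment


def times_larger(target: int, baseline: int) -> float:
    """Return how many times larger `target` is than `baseline`."""
    return target / baseline


comparisons = {
    "global chip sales last year": times_larger(FUNDING_TARGET, CHIP_SALES_LAST_YEAR),
    "forecast chip sales in 2030": times_larger(FUNDING_TARGET, CHIP_SALES_2030_FORECAST),
    "equipment sales in 2023": times_larger(FUNDING_TARGET, EQUIPMENT_SALES_2023),
}

for label, ratio in comparisons.items():
    print(f"7 trillion dollars is about {ratio:.1f}x {label}")
```

On these numbers the target is a little over 13 times last year's chip sales, 7 times the sales forecast for 2030, and 70 times one year of equipment sales.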
Therefore, Altman must seek government-level financial backing.
According to people familiar with the matter, Altman recently met with US Secretary of Commerce Gina Raimondo and discussed this topic. Microsoft and SoftBank Group's Masayoshi Son are also aware of the plan and have supported it, and Altman is already discussing joint ventures with SoftBank and TSMC. In addition, Middle Eastern investment institutions and the UAE government also intend to support OpenAI, including Sheikh Tahnoun bin Zayed Al Nahyan, who is the UAE's top security official, the brother of UAE President Mohammed, and the head of several Abu Dhabi sovereign wealth funds.
In addition to factory construction and supply chain cooperation, Altman has also invested in at least 3 chip companies. One of them is Cerebras, a well-known US maker of AI compute chips.
According to reports, Cerebras has launched ultra-large chip products that have broken world records. Its second-generation AI chip WSE-2 has reached 2.6 trillion transistors, and the number of AI cores has reached 850,000.
The second company Altman invested in is Rain Neuromorphics, a startup that builds chips on the open-source RISC-V architecture and mimics the way the brain works in order to train algorithms. In 2019, OpenAI signed a letter of intent to spend 51 million dollars on Rain's chips.
In December of last year, the US government forced a venture capital firm backed by Saudi Aramco to sell its shares in Rain.
The last one is Atomic Semi, which was co-founded by chip industry figures Jim Keller and Sam Zeloof. The former was the chief architect of AMD's K8 and also took part in developing Apple's A4/A5 chips. Atomic Semi's goal is to simplify the chip manufacturing process and achieve rapid production in order to reduce chip costs. In January 2023, Atomic Semi completed a funding round from OpenAI at a valuation of 100 million US dollars.
However, Altman's chip-manufacturing plan still faces many difficult problems. One of them is where to build the new chip factories. If the US is preferred, the US government is expected to award billions of dollars in subsidies to TSMC and other manufacturers in the next few weeks. However, building a fab in the US is not only expensive but also prone to delays and worker shortages.
For example, TSMC has pointed out that its $40 billion project in Arizona has suffered from delays, a shortage of skilled workers, and high costs. Intel has likewise announced a delay at its $20 billion chip factory in Ohio, with production pushed back to the end of 2026.
In response to the report, an OpenAI spokesperson said, “OpenAI has had productive discussions on increasing the global infrastructure and supply chain for chips, energy, and data centers, which are critical for AI and other industries that depend on it. Given the importance of national priorities, we will continue to report the situation to the US Government and look forward to sharing more details later.”
The AI chip war intensifies this year, and computing power will become a “nuclear weapon” for AI models in the future
At the beginning of 2024, the AI chip war intensified as Nvidia chips were in short supply and expensive.
On February 5, Meta confirmed that, to support its AI business, it plans to deploy a new custom chip in its data centers this year. The chip, Meta's second-generation in-house AI chip, code-named Artemis, is expected to enter production in 2024 and to reduce the company's dependence on Nvidia chips.
Meta said the chip will work in tandem with the hundreds of thousands of off-the-shelf GPUs it has purchased: “We believe our internally developed accelerators will be highly complementary to commercially available GPUs to provide the best combination of performance and efficiency on Meta's specific workloads.”
Recently, Meta CEO Mark Zuckerberg stated that the primary requirement for building artificial general intelligence (AGI) is “world-class computing infrastructure.”
He revealed that by the end of this year Meta will have about 350,000 H100s and, if other GPUs are included, total computing power equivalent to 600,000 H100s.
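For a rough sense of what a fleet of that size costs, the H100 counts can be multiplied by the $25,000-$30,000 unit price range quoted earlier in this article. This is a back-of-the-envelope sketch, not a figure from Meta: it assumes every unit is bought at that quoted price with no volume discount, it ignores servers, networking, and power, and it prices the non-H100 GPUs in the 600,000 figure as if they were H100s. The function name is illustrative.

```python
# Figures quoted in this article; prices are in US dollars per chip.
H100_PRICE_RANGE_USD = (25_000, 30_000)
META_H100_COUNT = 350_000          # H100s Meta expects to have by year end
META_H100_EQUIVALENTS = 600_000    # total compute expressed in H100 equivalents


def fleet_cost_range(units: int, price_range: tuple[int, int] = H100_PRICE_RANGE_USD) -> tuple[int, int]:
    """Return the (low, high) purchase cost for `units` chips at the given price range."""
    low_price, high_price = price_range
    return units * low_price, units * high_price


h100_low, h100_high = fleet_cost_range(META_H100_COUNT)
equiv_low, equiv_high = fleet_cost_range(META_H100_EQUIVALENTS)

print(f"350,000 H100s: {h100_low / 1e9:.2f} to {h100_high / 1e9:.2f} billion dollars")
print(f"600,000 H100 equivalents: {equiv_low / 1e9:.2f} to {equiv_high / 1e9:.2f} billion dollars")
```

Under these assumptions the H100s alone come to roughly 8.75 to 10.5 billion dollars, which helps explain why even a modest per-chip saving matters at this scale.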
Dylan Patel, founder of the semiconductor research and consulting firm SemiAnalysis, said that successfully deploying its own chips at Meta's operating scale could cut the cost of each chip by one-third compared with Nvidia products, potentially saving hundreds of millions of dollars in energy costs and billions of dollars in chip procurement costs every year.
It's not just Meta. Amazon, Google, and Microsoft have all been developing their own application-specific integrated circuit (ASIC) chips, which can perform machine learning tasks faster and with less power than general-purpose chips.
Among them, Google has said that its in-house Cloud TPU v5p is currently its most powerful, scalable, and flexible AI accelerator. Compared with the TPU v4, it provides more than 2 times the FLOPS (floating point operations per second) and 3 times the high-bandwidth memory (HBM). It is also 4 times as scalable as the previous generation, providing more computing power.
Microsoft unveiled its first artificial intelligence chip, the Maia 100, in November 2023 to compete with Nvidia's GPUs and reduce its costly dependence on Nvidia. The Maia 100 uses a 5nm process and has 105 billion transistors. It can be used for large language model training and inference and will also support OpenAI. Meanwhile, Microsoft has also built an Arm-based Cobalt CPU for cloud computing. According to some sources, these two chips are expected to become available in 2024.
AMD and Intel are also actively building out their AI computing power offerings.
In December of last year, AMD released its new MI300X AI chip, which integrates 153 billion transistors, and claimed that its inference performance is superior to that of the Nvidia H100. AMD CEO Lisa Su recently said on an earnings call that once more production capacity comes online in the second half of 2024, total annual sales of AMD's AI chips will probably exceed 3.5 billion US dollars. Intel will release a new 5nm Gaudi3 AI chip this year, and its inference performance is also said to be excellent.
Rosenblatt Securities analyst Hans Mosesmann said, “AI computing power seems to be everywhere.”
Recently, Alipay CTO Chen Liang (alias: Junyi) told Titanium Media App and other outlets that large-scale AI applications still face many “bottlenecks” in deployment, including high computing power costs and hardware limitations. Although GPU cards are already very efficient, adapting them to different technology stacks (making them compatible with different technologies) remains an important challenge.
Looking at the domestic computing power market, Yunzhisheng Chairman and CTO Leung Ka-yan once told Titanium Media App and other outlets, “The best chip in the industry now is Nvidia's; the one that can take the lead domestically is Huawei's Ascend.”
Altman has revealed that OpenAI hopes to ensure an adequate supply of AI chips by 2030.