
Overnight, Microsoft, NVIDIA, and Amazon all integrated DeepSeek! Andrew Ng: China's AI is rising

New Intelligence Source ·  Jan 31 08:49

Microsoft, NVIDIA, and Amazon, along with other US cloud computing platforms, have embraced DeepSeek-R1. Andrew Ng and former Intel CEO Pat Gelsinger praise DeepSeek's capacity for innovation.

On the last day of January, the enthusiasm around DeepSeek shows no signs of waning.

Across the ocean in the USA, it is not only industry professionals who feel unprecedented pressure; even people who usually take no interest in AI have been struck by the impact of China's AI.

Anthropic's CEO is calling for the USA to tighten chip export controls, while OpenAI is seeking $40 billion, the largest single financing round in Silicon Valley history.

Netizens have taken advantage of the permissive open-source license to publish tutorials for replacing OpenAI's Operator with DeepSeek-R1, completely free and without the $200 subscription!

NVIDIA, which has shown great appreciation for DeepSeek from the start, just announced that DeepSeek-R1 has officially landed on NVIDIA NIM. According to reports, the full DeepSeek-R1 671B model can reach a processing speed of 3,872 tokens per second on a single NVIDIA HGX H200 system.
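For developers, NIM exposes an OpenAI-compatible chat endpoint, so the model can be called with the standard OpenAI client. Below is a minimal sketch of that workflow, assuming NVIDIA's hosted endpoint at integrate.api.nvidia.com and the model ID deepseek-ai/deepseek-r1; check build.nvidia.com for the actual identifiers and to obtain an API key.

```python
# Minimal sketch: querying DeepSeek-R1 through NVIDIA NIM's OpenAI-compatible API.
# The endpoint URL and model ID below are assumptions; verify them on build.nvidia.com.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="nvapi-...",  # your NVIDIA API key
)

response = client.chat.completions.create(
    model="deepseek-ai/deepseek-r1",
    messages=[{"role": "user", "content": "Explain why the sky is blue in two sentences."}],
    temperature=0.6,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```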

On the same day, Amazon made the DeepSeek-R1 model available on Amazon Bedrock and SageMaker AI.
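On AWS, one way to reach such a deployment is the Bedrock runtime's Converse API via boto3. The sketch below is illustrative only: the model identifier is a placeholder, since the exact ID or endpoint ARN depends on how R1 is enabled in a given account (for example through the Bedrock Marketplace or a custom model import).

```python
# Illustrative sketch: calling a DeepSeek-R1 deployment through Amazon Bedrock's
# Converse API. The modelId is a placeholder; replace it with the identifier or
# endpoint ARN of the R1 deployment in your own account and region.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

result = bedrock.converse(
    modelId="arn:aws:bedrock:us-east-1:111122223333:example-deepseek-r1-endpoint",  # placeholder
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize the DeepSeek-R1 technical report in three sentences."}],
    }],
    inferenceConfig={"maxTokens": 512, "temperature": 0.6},
)
print(result["output"]["message"]["content"][0]["text"])
```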

Microsoft, which had joined OpenAI in publicly questioning whether DeepSeek had "stolen" data, went so far as to deploy DeepSeek-R1 on its own Azure cloud service a day earlier.

Beyond the major technology companies, AI startups did not miss the opportunity either.

The Windsurf editor integrated both the DeepSeek-R1 and V3 models and, for the first time, enabled tool calling with R1 inside its programming agents.

Cerebras not only achieved an inference speed 57 times faster than GPUs but also reported that its deployed 70B model has a higher accuracy rate than GPT-4o and o1-mini.

AI in China is rising.

Andrew Ng, regarded by AI learners worldwide as a leading figure in AI education, believes that this week's heated discussion around DeepSeek has made many people see clearly several trends that have long been underway:

  1. The USA's lead in the GenAI field is rapidly being eroded by China, and the AI supply chain landscape will be reshaped.

  2. Open weight models are driving the commoditization of foundational model layers, bringing new opportunities for application developers.

  3. Scaling up is not the only way to achieve progress in AI. Although computing power is highly sought after, algorithmic innovation is rapidly lowering training costs.

China is catching up to the USA in the GenAI field.

When ChatGPT was launched in November 2022, the USA was clearly ahead of China in the GenAI field.

Because perceptions shift slowly, until recently Andrew Ng was still hearing plenty of claims that China remains far behind.

But in reality, the gap between the two sides has rapidly narrowed in the past two years.

With the launch of models such as Qwen (which Andrew Ng's team has been using for several months), Kimi, InternVL, and DeepSeek, China's gap in text models is shrinking, and in areas like video generation, China has even shown some leading advantages.

Now, DeepSeek has not only open-sourced the R1 model weights but also shared a technical report containing many details.

In contrast, by invoking hypothetical AI dangers such as human extinction, some American companies are lobbying for regulation that would shut down open sourcing.

It is undeniable that open-source/open-weight models are key parts of the AI supply chain—many companies are using them.

In this regard, Andrew Ng stated: If the USA continues to hinder open source, this link in the AI supply chain will be dominated by China.

Open-weight models are commoditizing the foundation-model layer.

The price of LLM tokens has been falling rapidly, and open-weight models have not only accelerated this trend but also given developers more options.

OpenAI's output price is $60 per million tokens, while DeepSeek-R1's is only $2.19. This nearly 30-fold difference has drawn attention to the trend of falling prices.
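To make the gap concrete, here is a quick back-of-the-envelope calculation using the two list prices quoted above (output tokens only; input-token pricing and caching discounts are ignored, and the monthly volume is an arbitrary example).

```python
# Back-of-the-envelope comparison of the per-million output-token prices quoted above.
# The 100M-token monthly volume is an arbitrary illustration, not a figure from the article.
openai_price = 60.00    # USD per 1M output tokens
deepseek_price = 2.19   # USD per 1M output tokens
monthly_output_tokens = 100_000_000

print(f"price ratio: {openai_price / deepseek_price:.1f}x")                       # ~27.4x
print(f"OpenAI:   ${openai_price * monthly_output_tokens / 1e6:,.2f} / month")    # $6,000.00
print(f"DeepSeek: ${deepseek_price * monthly_output_tokens / 1e6:,.2f} / month")  # $219.00
```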

Training foundation models and offering API services is a hard business, and many AI companies are still looking for ways to recoup the cost of model training.

Sequoia Capital's article "AI's $600B Question" articulates this challenge very well.

In contrast, there are excellent business opportunities in application development built on top of foundational models.

Companies have already invested billions of dollars to train these models, and you can access them for a small fee, then use them to build customer service chatbots, email summarization tools, AI doctors, legal document assistants, and many other applications.
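As one illustration of that pattern, an email-summarization helper can be little more than a thin wrapper around a hosted reasoning model. The sketch below assumes DeepSeek's own OpenAI-compatible API (base URL api.deepseek.com and the deepseek-reasoner model name); any of the other R1 endpoints mentioned above could be swapped in the same way.

```python
# Sketch of an email-summarization helper built on top of a hosted reasoning model.
# Assumes DeepSeek's OpenAI-compatible API and the "deepseek-reasoner" model name;
# any other provider's endpoint and model ID can be substituted.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")  # your API key

def summarize_email(email_text: str) -> str:
    """Return a three-bullet summary of an email."""
    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[
            {"role": "system", "content": "Summarize the user's email in three short bullet points."},
            {"role": "user", "content": email_text},
        ],
    )
    return response.choices[0].message.content

print(summarize_email("Hi team, the Q3 launch is slipping two weeks because of the vendor delay..."))
```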

Scaling up is not the only path to AI advancement.

There has been much heated discussion around driving progress through scaling up models, and even Andrew Ng was one of the early supporters.

Many companies have generated "hype" to raise billions of dollars on the premise that, with more funding, they can (1) scale up and (2) predictably drive improvements.

As a result, people began to focus excessively on scaling up while overlooking progress achieved in other ways.

Constrained by the USA's AI chip ban, the DeepSeek team had to run its models on the lower-performing H800 GPUs, which drove many optimization innovations. In the end, the model's training cost (excluding research costs) came to less than 6 million dollars.

Whether this truly reduces computing demand remains to be seen. Sometimes, a lower unit price for a commodity can actually lead to increased total spending on that commodity.

Andrew Ng believes, "In the long run, the demand for intelligence and computing power is virtually unlimited, so even as intelligence becomes cheaper, humanity will still use more of it."

On X, we can see many different interpretations of DeepSeek's progress. Like the "Rorschach inkblot test," it allows many people to project their understanding onto it.

While the geopolitical impact of DeepSeek-R1 remains to be clarified, it is indeed good news for developers of AI applications.

Andrew Ng's team has been brainstorming new ideas that are possible only because an open, advanced reasoning model is now easy to access.

Now is still a great time for creativity!

Three lessons from DeepSeek

The success of DeepSeek has even "blown away" veterans of the chip and computing industry, like former Intel CEO Pat Gelsinger.

As a highly experienced engineer, Gelsinger believes that the current reactions to DeepSeek overlook three important lessons from the past fifty years of computing.

First: Computing follows the "gas law"

Like a gas, computing expands to fill the space defined by the available resources: capital, power, cooling limits, and so on.

As CMOS, personal computers, multi-core processors, virtualization, mobile devices, and many other fields have shown, making computing widely available at extremely low prices drives explosive market expansion rather than contraction.

In the future, AI will be everywhere, but today, the cost of realizing this potential is still outrageously high.

Second: The essence of engineering is dealing with constraints

Clearly, the DeepSeek team faced numerous constraints, yet they found highly creative ways to deliver a world-class solution at a cost 10 to 50 times lower.

The USA's ban restricted the resources available, forcing engineers in China to get creative, and they did: hundreds of billions of dollars of hardware, the latest chips, and billion-dollar training budgets are no longer prerequisites.

Many years ago, Gelsinger interviewed Donald Knuth, one of the most famous computer scientists, who described in detail how the best work often gets done when resources are severely limited and deadlines are pressing.

Gelsinger stated that this insight was one of the most important revelations in his engineering management career.

Third: Openness will ultimately triumph

In recent years, it has been disappointing to see foundational model research becoming increasingly closed.

On this point, Gelsinger aligns more with Musk than with Altman: he really hopes, indeed insists, that AI research needs to become more open.

We need to know what the training datasets are and to study the algorithms, while thinking deeply about their correctness, ethics, and impact. Examples such as Linux, GCC, USB, and WiFi have made the value of openness abundantly clear.

In battles over law, spectrum, engineering, and adoption, openness is not easy and is always challenged by market forces. However, given a proper opportunity, 'openness' will prevail every time.

The importance of AI for the future of humanity is self-evident, thus a closed ecosystem must not be allowed to dominate this field exclusively.

DeepSeek is an incredible engineering feat – it will drive broader adoption of AI and help reshape the industry's view on open innovation.

It took a severely constrained team from China to remind us all of these fundamental lessons from the history of computing.

Reference Material:

https://x.com/AndrewYNg/status/1885033810552905814

https://www.linkedin.com/posts/patgelsinger_wisdom-learning-the-lessons-i-thought-i-activity-7289659541477113856-o1Qr/

Editor/Somer
