
Jensen Huang on the state of AI: "highly trustworthy" systems are still years away

Golden10 Data · Nov 25 11:25

Jensen Huang believes that we are still "years away" from a highly trustworthy AI system. The entire industry is also rethinking how to train models more effectively with limited data and resources.

Although artificial intelligence is developing rapidly, AI systems that can be highly trusted remain some way off. Jensen Huang stressed that over the next few years, continually increasing computational power and exploring new methods will be the key tasks. At the same time, the industry is rethinking how to train models more effectively with limited data and resources, in pursuit of more reliable and capable AI applications.

Nvidia (NVDA.O) CEO Jensen Huang recently said that today's artificial intelligence cannot yet provide the best answers, and that we are still "years away" from AI systems that can be "highly trusted."

"The answers we are currently getting are far from the best answers," Huang Renxun said in an interview at the Hong Kong University of Science and Technology. He pointed out that people should not doubt the answers of AI, such as whether they are 'illusionary' or 'reasonable'.

"We must reach a stage where you can generally trust the answers of AI... To achieve this, I think we still have a few years to go. During this period, we need to continuously improve computational power.

Limitations of large language models: hallucinations and data bottlenecks

Language models like those behind ChatGPT have advanced dramatically in recent years and can now answer complex questions, but they still have significant limitations. Chief among them are "hallucinations" - fabricated or nonexistent answers - which remain a persistent problem for AI chatbots.

For example, last year a radio host sued OpenAI after ChatGPT fabricated a legal complaint that falsely implicated him; OpenAI did not respond to the matter.

In addition, some AI companies face the question of how to keep advancing large language models (LLMs) with limited data resources. Jensen Huang pointed out that pre-training alone - that is, training models on large-scale, diverse datasets - is not enough to produce powerful AI.

"Pre-training - automatically discovering knowledge from all the data in the world - is not enough... Just like graduating from college is an important milestone, but it is not the end."

In recent years, technology companies such as OpenAI, Meta, and Google have focused on collecting vast amounts of data, assuming that more training data will lead to more intelligent and powerful models. However, this traditional approach is now being questioned.

Shifting mindset: beyond "blind scaling"

Research has shown that Transformer-based neural networks (the core technology behind LLMs) improve steadily and predictably as data volume and computing power increase - the empirical "scaling laws." However, industry leaders are beginning to worry about the limits of this strategy and are exploring alternative approaches.
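For context (this formula is background from the scaling-law literature, not from the article itself): Kaplan et al. (2020) reported that a language model's test loss $L$ falls roughly as a power law in parameter count $N$ and dataset size $D$:

$$
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D},
$$

with fitted exponents of roughly $\alpha_N \approx 0.076$ and $\alpha_D \approx 0.095$. The small exponents are precisely why scaling is "reliable but expensive": each fixed reduction in loss requires orders of magnitude more parameters, data, and compute - the cost-effectiveness concern raised by the executives quoted below.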

Alexandr Wang, CEO of Scale AI, has said that AI investment is largely premised on this "scaling law" assumption, but that it has now become "the biggest problem for the entire industry."

Aidan Gomez, CEO of Cohere, agrees that increasing computing power and model size does improve performance, but finds the approach rather mechanical. "This approach, while reliable, seems somewhat foolish," he said on a podcast. Gomez instead advocates smaller, more efficient models, an approach he favors for its cost-effectiveness.

Others are concerned that this approach may not achieve 'Artificial General Intelligence' (AGI), which is the theoretical form of AI that matches or exceeds human intelligence.

Richard Socher, a former Salesforce executive and CEO of the AI search engine You.com, argued that the way large language models are trained is too simplistic: they merely "predict the next token based on known tokens." A more effective approach, he believes, is to force the model to translate the question into computer code and derive the answer from the code's output. This can reduce hallucinations on quantitative questions and strengthen the model's capabilities, as the sketch below illustrates.
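The idea is straightforward to prototype. The following is a minimal illustrative sketch, assuming an OpenAI-style Python SDK; the model name, prompts, and helper function are our own assumptions, not Socher's or You.com's actual implementation:

```python
# Sketch of "answer via generated code": have the model write a program,
# then take the program's output as the answer. Illustrative only.
import subprocess
import sys
import tempfile

from openai import OpenAI  # assumption: openai>=1.0 Python SDK installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def answer_via_code(question: str) -> str:
    # 1. Ask the model for a program instead of a direct answer.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Translate the user's question into a short, "
                        "self-contained Python script that prints only "
                        "the final answer. Reply with code only."},
            {"role": "user", "content": question},
        ],
    )
    code = resp.choices[0].message.content.strip()
    code = code.removeprefix("```python").removesuffix("```").strip()

    # 2. Run the generated code (in production this would be sandboxed!)
    #    so the arithmetic is done by the interpreter, not by the model.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, text=True, timeout=30)
    return result.stdout.strip()


if __name__ == "__main__":
    print(answer_via_code("What is 17.5% of 2,348,112?"))
```

Because the quantitative step is delegated to a deterministic interpreter rather than predicted token by token, the numeric part of the answer cannot be hallucinated - which is exactly the benefit Socher describes for quantitative problems.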

Diverging views in the industry: has scaling peaked?

Not all industry leaders, however, believe that scaling up artificial intelligence has hit a wall.

Microsoft's Chief Technology Officer Kevin Scott holds a different view. In an interview in July, he stated: "Unlike others, we have not yet reached the point of diminishing marginal returns on scaling."

OpenAI is also working on improving existing large language models. For example, the o1 model released in September still relies on the token prediction mechanism mentioned by Socher, but excels in handling quantitative problems (such as programming and mathematics), distinguishing it from the more general-purpose ChatGPT.

Former Uber engineer Waleed Kadous drew an analogy between the two: "If we personify GPT-4, it is like a know-it-all friend who answers your question with an endless stream of talk and leaves you to sift out the valuable parts, whereas o1 is the friend who listens carefully, thinks for a moment, and then gives a concise and insightful answer."

The o1 model, however, demands more computing resources, which makes it slower and more expensive to run.
