OpenAI, Microsoft, Meta Advance New AI Tests As Transparency Concerns Grow
ChatGPT parent OpenAI, Microsoft Corp (NASDAQ:MSFT), and Meta Platforms Inc (NASDAQ:META) are facing challenges as the rapid development of artificial intelligence (AI) outpaces existing evaluation methods.
To address this, major tech firms have begun creating internal benchmarks to better test their AI models' capabilities. However, this approach has raised concerns within the industry about the lack of standardized public evaluations, which makes it difficult for businesses and consumers to assess advances in AI technology, the Financial Times reports.
Ahmad Al-Dahle, the head of generative AI at Meta, highlighted to the Financial Times the difficulty in measuring the capabilities of the latest AI systems. This has prompted companies like Meta, OpenAI, and Microsoft to develop proprietary evaluation methods. However, this move has drawn criticism for limiting the ability to compare different AI technologies.
Traditional public benchmarks, such as HellaSwag and MMLU, use multiple-choice questions to test common sense and general knowledge. However, researchers argue that these methods no longer effectively gauge the reasoning capabilities of advanced AI models.
For instance, Mark Chen, senior vice president of Research at OpenAI, told the Financial Times that human-designed tests are increasingly inadequate for measuring the true capabilities of these sophisticated systems. As a result, there is a growing push within the industry to create more complex tests that better reflect real-world challenges.
The shift towards private benchmarks has sparked debate over the transparency of AI testing. Dan Hendrycks, executive director of the Center for AI Safety, told the Financial Times that with publicly available benchmarks, it becomes easier for businesses and the general public to understand the actual progress being made in AI. The lack of transparency around private benchmarks may hinder efforts to accurately gauge how close AI models are to automating complex tasks.
Beyond internal benchmarks, external organizations have also started developing new evaluation methods. In September, Scale AI partnered with Hendrycks to launch "Humanity's Last Exam," a project that crowdsources complex questions requiring abstract reasoning from experts across various fields.
Additionally, FrontierMath, a new benchmark designed by expert mathematicians, challenges even the most advanced models, which have a completion rate of less than 2% on its hardest questions.
Wedbush analyst Dan Ives projected $1 trillion in AI capital expenditure by U.S. tech giants like Microsoft, Meta, Amazon.com Inc (NASDAQ:AMZN), and Alphabet Inc (NASDAQ:GOOG) (NASDAQ:GOOGL).
Price Action: MSFT stock was down 0.8% at $419.17 at last check Monday. META was down 1.36%.
Image created using artificial intelligence via Midjourney.