NVIDIA Launches Cosmos World Foundation Model Platform to Accelerate Physical AI Development
- New State-of-the-Art Models, Video Tokenizers and an Accelerated Data Processing Pipeline, Optimized for NVIDIA Data Center GPUs, Are Purpose-Built for Developing Robots and Autonomous Vehicles
- First Wave of Open Models Available Now to Developer Community
- Global Physical AI Leaders 1X, Agile Robots, Agility, Figure AI, Foretellix, Uber, Waabi and XPENG Among First to Adopt
LAS VEGAS, Jan. 06, 2025 (GLOBE NEWSWIRE) -- CES— NVIDIA today announced NVIDIA Cosmos, a platform comprising state-of-the-art generative world foundation models, advanced tokenizers, guardrails and an accelerated video processing pipeline built to advance the development of physical AI systems such as autonomous vehicles (AVs) and robots.
Physical AI models are costly to develop, and require vast amounts of real-world data and testing. Cosmos world foundation models, or WFMs, offer developers an easy way to generate massive amounts of photoreal, physics-based synthetic data to train and evaluate their existing models. Developers can also build custom models by fine-tuning Cosmos WFMs.
Cosmos models will be available under an open model license to accelerate the work of the robotics and AV community. Developers can preview the first models on the NVIDIA API catalog, or download the family of models and fine-tuning framework from the NVIDIA NGC catalog or Hugging Face.
Leading robotics and automotive companies, including 1X, Agile Robots, Agility, Figure AI, Foretellix, Fourier, Galbot, Hillbot, IntBot, Neura Robotics, Skild AI, Virtual Incision, Waabi and XPENG, along with ridesharing giant Uber, are among the first to adopt Cosmos.
"The ChatGPT moment for robotics is coming. Like large language models, world foundation models are fundamental to advancing robot and AV development, yet not all developers have the expertise and resources to train their own," said Jensen Huang, founder and CEO of NVIDIA. "We created Cosmos to democratize physical AI and put general robotics in reach of every developer."
Open World Foundation Models to Accelerate the Next Wave of AI
NVIDIA Cosmos' suite of open models means developers can customize the WFMs with datasets, such as video recordings of AV trips or robots navigating a warehouse, according to the needs of their target application.
Cosmos WFMs are purpose-built for physical AI research and development, and can generate physics-based videos from a combination of inputs, like text, image and video, as well as robot sensor or motion data. The models are built for physically based interactions, object permanence, and high-quality generation of simulated industrial environments — like warehouses or factories — and of driving environments, including various road conditions.
In his opening keynote at CES, NVIDIA founder and CEO Jensen Huang showcased ways physical AI developers can use Cosmos models, including for:
- Video search and understanding, enabling developers to easily find specific training scenarios, like snowy road conditions or warehouse congestion, from video data.
- Physics-based photoreal synthetic data generation, using Cosmos models to generate photoreal videos from controlled 3D scenarios developed in the NVIDIA Omniverse platform.
- Physical AI model development and evaluation, whether building a custom model on the foundation models, improving the models using Cosmos for reinforcement learning or testing how they perform given a specific simulated scenario.
- Foresight and "multiverse" simulation, using Cosmos and Omniverse to generate every possible future outcome an AI model could take to help it select the best and most accurate path.
Advanced World Model Development Tools
Building physical AI models requires petabytes of video data and tens of thousands of compute hours to process, curate and label that data. To help save enormous costs in data curation, training and model customization, Cosmos features:
- An NVIDIA AI and CUDA-accelerated data processing pipeline, powered by NVIDIA NeMo Curator, that enables developers to process, curate and label 20 million hours of videos in 14 days using the NVIDIA Blackwell platform, instead of over three years using a CPU-only pipeline.
- NVIDIA Cosmos Tokenizer, a state-of-the-art visual tokenizer for converting images and videos into tokens. It delivers 8x more total compression and 12x faster processing than today's leading tokenizers.
- The NVIDIA NeMo framework for highly efficient model training, customization and optimization.
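As a back-of-envelope illustration of the data processing figures above, the following sketch converts the quoted numbers into a throughput rate and a minimum speedup. It uses only the figures stated in this release (20 million hours of video, 14 days, "over three years"). The 365-day year and the function names are assumptions made for illustration, and because the CPU-only figure is given only as a lower bound, the computed speedup is a lower bound as well.

```python
# Back-of-envelope check of the pipeline figures quoted in this release.
# Function names and the 365-day year are illustrative assumptions.

VIDEO_HOURS = 20_000_000    # hours of video processed (from the release)
GPU_DAYS = 14               # wall-clock days on NVIDIA Blackwell (from the release)
CPU_YEARS_LOWER_BOUND = 3   # "over three years" on CPU only, so a lower bound
DAYS_PER_YEAR = 365         # assumption: non-leap years


def video_hours_per_wall_clock_hour(video_hours: float, days: float) -> float:
    """Hours of video processed per hour of wall-clock time."""
    return video_hours / (days * 24)


def min_speedup(cpu_years: float, gpu_days: float) -> float:
    """Lower bound on the speedup implied by the two durations."""
    return (cpu_years * DAYS_PER_YEAR) / gpu_days


if __name__ == "__main__":
    rate = video_hours_per_wall_clock_hour(VIDEO_HOURS, GPU_DAYS)
    speedup = min_speedup(CPU_YEARS_LOWER_BOUND, GPU_DAYS)
    print(f"Video hours processed per wall-clock hour: {rate:,.0f}")
    print(f"Implied speedup over CPU-only: at least {speedup:.1f}x")
```

Under these assumptions, the quoted figures work out to roughly 59,500 hours of video per wall-clock hour and a speedup of at least about 78x over the CPU-only pipeline.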
World's Largest Physical AI Industries Adopt Cosmos
Pioneers across the physical AI industry are already adopting Cosmos technologies.
1X, an AI and humanoid robot company, launched the 1X World Model Challenge dataset using Cosmos Tokenizer. XPENG will use Cosmos to accelerate the development of its humanoid robot. And Hillbot and Skild AI are using Cosmos to fast-track the development of their general-purpose robots.
"Data scarcity and variability are key challenges to successful learning in robot environments," said Pras Velagapudi, chief technology officer at Agility. "Cosmos' text-, image- and video-to-world capabilities allow us to generate and augment photorealistic scenarios for a variety of tasks that we can use to train models without needing as much expensive, real-world data capture."
Transportation leaders are also using Cosmos to build physical AI for AVs:
- Waabi, a company pioneering generative AI for the physical world starting with autonomous vehicles, is evaluating Cosmos in the context of data curation for AV software development and simulation.
- Wayve, which is developing AI foundation models for autonomous driving, is evaluating Cosmos as a tool to search for edge and corner case driving scenarios used for safety and validation.
- AV toolchain provider Foretellix will use Cosmos, alongside NVIDIA Omniverse Sensor RTX APIs, to evaluate and generate high-fidelity testing scenarios and training data at scale.
- Global ridesharing giant Uber is partnering with NVIDIA to accelerate autonomous mobility. Rich driving datasets from Uber, combined with the features of the Cosmos platform and NVIDIA DGX Cloud, can help AV partners build stronger AI models even more efficiently.
"Generative AI will power the future of mobility, requiring both rich data and very powerful compute," said Dara Khosrowshahi, CEO of Uber. "By working with NVIDIA, we are confident that we can help supercharge the timeline for safe and scalable autonomous driving solutions for the industry."
Developing Open, Safe and Responsible AI
NVIDIA Cosmos was developed in line with NVIDIA's trustworthy AI principles, which prioritize privacy, safety, security, transparency and reducing unwanted bias.
Trustworthy AI is essential for fostering innovation within the developer community and maintaining user trust. NVIDIA is committed to safe and trustworthy AI, in line with the White House's voluntary AI commitments and other global AI safety initiatives.
The open Cosmos platform includes guardrails designed to mitigate harmful text and images, and features a tool to enhance text prompts for accuracy. Videos generated with Cosmos autoregressive and diffusion models on the NVIDIA API catalog include invisible watermarks to identify AI-generated content, helping reduce the chances of misinformation and misattribution.
NVIDIA encourages developers to adopt trustworthy AI practices and further enhance guardrail and watermarking solutions for their applications.
Availability
Cosmos WFMs are now available under NVIDIA's open model license on Hugging Face and the NVIDIA NGC catalog. Cosmos models will soon be available as fully optimized NVIDIA NIM microservices.
Developers can access NVIDIA NeMo Curator for accelerated video processing and customize their own world models with NVIDIA NeMo. NVIDIA DGX Cloud offers a fast and easy way to deploy these models, with enterprise support available through the NVIDIA AI Enterprise software platform.
NVIDIA also announced new NVIDIA Llama Nemotron large language models and NVIDIA Cosmos Nemotron vision language models that developers can use for enterprise AI use cases in healthcare, financial services, manufacturing and more.
About NVIDIA
NVIDIA (NASDAQ: NVDA) is the world leader in accelerated computing.
For further information, contact:
Janette Ciborowski
Corporate Communications
NVIDIA Corporation
+1-734-330-8817
jciborowski@nvidia.com
Certain statements in this press release including, but not limited to, statements as to: the benefits, impact, performance and availability of NVIDIA's products, services, and technologies, including NVIDIA Cosmos, NVIDIA API catalog, NVIDIA Omniverse platform, NVIDIA AI, NVIDIA CUDA, NVIDIA NeMo Curator, NVIDIA Blackwell platform, NVIDIA Cosmos Tokenizer, NVIDIA NeMo framework, NVIDIA DGX Cloud, and NVIDIA AI Enterprise software platform; third parties adopting NVIDIA's products and technologies, and the benefit and impact thereof; and the ChatGPT moment for robotics coming are forward-looking statements that are subject to risks and uncertainties that could cause results to be materially different than expectations. Important factors that could cause actual results to differ materially include: global economic conditions; our reliance on third parties to manufacture, assemble, package and test our products; the impact of technological development and competition; development of new products and technologies or enhancements to our existing product and technologies; market acceptance of our products or our partners' products; design, manufacturing or software defects; changes in consumer preferences or demands; changes in industry standards and interfaces; unexpected loss of performance of our products or technologies when integrated into systems; as well as other factors detailed from time to time in the most recent reports NVIDIA files with the Securities and Exchange Commission, or SEC, including, but not limited to, its annual report on Form 10-K and quarterly reports on Form 10-Q. Copies of reports filed with the SEC are posted on the company's website and are available from NVIDIA without charge. These forward-looking statements are not guarantees of future performance and speak only as of the date hereof, and, except as required by law, NVIDIA disclaims any obligation to update these forward-looking statements to reflect future events or circumstances.
Many of the products and features described herein remain in various stages and will be offered on a when-and-if-available basis. The statements above are not intended to be, and should not be interpreted as a commitment, promise, or legal obligation, and the development, release, and timing of any features or functionalities described for our products is subject to change and remains at the sole discretion of NVIDIA. NVIDIA will have no liability for failure to deliver or delay in the delivery of any of the products, features or functions set forth herein.
© 2025 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, CUDA, DGX, NGC, NVIDIA Cosmos, NVIDIA NeMo, and NVIDIA Omniverse are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated. Features, pricing, availability and specifications are subject to change without notice.
A photo accompanying this announcement is available at