
StepFun's Jiang Daxin: Large Models Now Possess Both the Human Brain's Fast-Thinking and Slow-Thinking Abilities

cls.cn ·  Sep 19 13:12

① Through reinforcement learning, OpenAI's o1 model gives AI the human ability of slow thinking (actively reflecting and self-correcting during complex reasoning); the next key breakthroughs will be improving the generalization of reinforcement-learning-trained models and accelerating the integration of multimodal understanding and generation. ② GPT-4o marks progress in multimodal integration, which is the foundation for better modeling the physical world and fully realizing the "simulating the world" stage.

On September 19th, at the 2024 Yunqi Conference, Dr. Jiang Daxin, founder and CEO of StepFun, said that AI development is undergoing a critical technological paradigm shift: OpenAI's o1 model has found a way, through reinforcement learning, to give AI the human ability of slow thinking (actively reflecting and self-correcting during complex reasoning). The next key steps for AI technology are to improve the generalization of reinforcement-learning-trained models and to accelerate the integration of multimodal understanding and generation.


The person second from the left is Dr. Jiang Daxin, CEO of StepFun.

He disclosed that StepFun is actively exploring new technological paradigms and has already applied its reinforcement learning training methodology to trillion-parameter models, while continuing to improve the performance of its underlying large models and the end-user product experience. Recently, StepFun's self-developed Step-2 trillion-parameter MoE language model was integrated into its intelligent assistant Yuewen, delivering stronger instruction following, creative writing, and reasoning capabilities.

At the Yunqi Conference, OpenAI's recent release of the o1 model sparked discussion. The o1 model is seen as having found a way, through reinforcement learning, to give AI the human ability of slow thinking (actively reflecting, self-correcting, trying different strategies, and carrying out complex reasoning). This is the first time a large model has simultaneously possessed the human brain's System 1 ability (fast thinking, giving direct answers) and System 2 ability (slow thinking). Jiang Daxin believes this is a critical step toward large models being able to infer the world.

"We divide the path to AGI into three parallel stages: simulating the world, exploring the world, and inferring the world. In the past year there have been breakthrough advances in all three, and the pace of development can be described as 'a day in AI, a year on Earth.'" Beyond o1's progress in inferring the world, he said, GPT-4o marks progress in multimodal integration, which is the foundation for better modeling the physical world and fully realizing the simulating-the-world stage. As for exploring the world, Tesla's release of its fully autonomous driving system FSD V12 points the way for combining embodied intelligent devices with large models, moving exploration from the digital world into the physical world.

Climbing toward AGI requires strong foundation models to support reinforcement learning, multimodal understanding, and industry-specific models. This year StepFun has continued to iterate its trillion-parameter language model Step-2: in March it was the first in China to release a preview version, and in July the model was officially released. In testing, Step-2's overall capability improved by nearly 50% over the hundred-billion-parameter Step-1, with significant gains in logical reasoning, mathematics, programming, and knowledge.

Meanwhile, StepFun announced a comprehensive upgrade to its intelligent assistant Yuewen, offering users the Step-2 trillion-parameter MoE language model free for a limited time, and launched a multimodal photo-based search and Q&A feature, "Paizhaowen". Through image interaction, users can "snap a photo and ask", covering scenarios that are hard to describe efficiently through text or voice.


