RynnBrain introduces two core capabilities: spatiotemporal memory and physical world reasoning.

Author | Huang Yu
The long-standing 'intelligence barrier' in the field of embodied intelligence is gradually being dismantled.
On February 10, Alibaba's DAMO Academy officially released RynnBrain, a foundation model for embodied intelligence, and open-sourced all seven models in the series at once, including the industry's first embodied model built on a 30B Mixture of Experts (MoE) architecture.
This move represents a milestone. According to reports, RynnBrain gives robots spatiotemporal memory and spatial reasoning capabilities for the first time and sets new state-of-the-art (SOTA) results on 16 open-source embodied evaluation leaderboards, surpassing top-tier industry models such as Google's Gemini Robotics ER 1.5.
This indicates that the long-standing constraints of 'spatiotemporal forgetting' and 'physical hallucination' in embodied intelligence are being actively addressed, and robotic brains may evolve from simple command receivers into intelligent entities with a deep understanding of their environment.
For a long time, the limited intelligence of embodied models, and in particular their weak generalization, has been a major bottleneck that restricts the use of robots in complex physical scenarios.
To overcome this bottleneck, the industry has explored several technical approaches.
According to Wall Street News, one category focuses on action output through vision-language-action (VLA) models. These can directly manipulate the physical world, but they struggle to generalize across scenarios because high-quality robot data is scarce. Another category introduces 'brain' models such as vision-language models (VLMs), which have generalization potential but generally lack memory, have limited dynamic cognition, and often exhibit physical hallucinations, making it difficult to support complex mobile manipulation in humanoid robots.
This technological barrier, which stems from flaws in the intelligence architecture, leaves even seemingly advanced robots struggling with complex mobile manipulation tasks.
DAMO Academy designed RynnBrain specifically to dismantle this barrier.
It is reported that RynnBrain introduces two core capabilities, spatiotemporal memory and physical world reasoning, which are the two fundamental abilities robots need to interact deeply with their environment.
Spatiotemporal memory refers to a robot's ability to locate objects within its complete historical memory, trace back target areas, and even predict motion trajectories, thereby equipping the robot with global spatiotemporal recall capabilities.
Physical world reasoning differs from traditional text-only reasoning paradigms. RynnBrain employs a reasoning strategy that interleaves text with spatial localization, which keeps its reasoning process firmly rooted in the physical environment and greatly reduces hallucinations.
For example, if a robot running RynnBrain is interrupted during task A and asked to perform task B first, it can accurately remember the temporal and spatial state of task A and seamlessly resume work after completing task B. This 'long-term brain' memory mechanism addresses the long-standing 'instantaneous amnesia' problem in the field of embodied intelligence.
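The interrupt-and-resume behavior described above can be pictured with a toy sketch. This is only an illustration of the idea, not RynnBrain's actual mechanism: the class `EpisodicMemory`, its methods, and the coordinate format are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    t: float         # timestamp in seconds
    label: str       # object name (hypothetical labeling scheme)
    position: tuple  # (x, y, z) in a fixed world frame


@dataclass
class EpisodicMemory:
    """Toy store of past sightings plus a stack of paused tasks."""
    observations: list = field(default_factory=list)
    suspended: list = field(default_factory=list)

    def observe(self, t, label, position):
        self.observations.append(Observation(t, label, position))

    def locate(self, label):
        # Search the full history, newest first, for the object.
        for obs in reversed(self.observations):
            if obs.label == label:
                return obs
        return None

    def suspend(self, task, step):
        # Remember which task was interrupted and at which step.
        self.suspended.append((task, step))

    def resume(self):
        # Return the most recently paused task, or None if there is none.
        return self.suspended.pop() if self.suspended else None


memory = EpisodicMemory()
memory.observe(0.0, "cup", (1.0, 0.5, 0.8))
memory.suspend("task A", step=2)             # interrupted partway through task A
memory.observe(3.0, "cup", (1.2, 0.5, 0.8))  # the cup moved during task B
resumed = memory.resume()
last_seen = memory.locate("cup")
print(resumed, last_seen.position)
```

A real system would have to build such state from raw video rather than from labeled coordinates, which is the hard part the model is reported to address.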
In addition, according to Wall Street News, RynnBrain was trained on top of Qwen3-VL and optimized with DAMO Academy's self-developed RynnScale architecture, which doubles training speed on the same computational resources. The training dataset exceeds 20 million pairs.
This highly efficient training system is directly reflected in benchmark results: RynnBrain set new industry records on 16 benchmarks covering environmental perception, object reasoning, first-person visual question answering, spatial reasoning, and trajectory prediction. This is not merely a stacking of computing power but a successful reconstruction of the underlying architecture of embodied intelligence.
It is reported that RynnBrain is also highly extensible: it can be quickly post-trained into embodied models for navigation, planning, and action, making it a potential foundation model for the embodied intelligence industry.
In its ambition to create a foundational model for the embodied intelligence industry, DAMO Academy has chosen the open-source route.
It is reported that DAMO Academy has open-sourced the entire RynnBrain series of seven models, including full-sized base models and post-trained specialized models. Among them is the industry's first embodied model built on an MoE architecture: it has 30 billion parameters in total but activates only 3 billion at inference, yet it outperforms existing 72-billion-parameter models, allowing robots to act faster and more smoothly.
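The gap between total and activated parameters comes from sparse routing: in a Mixture of Experts layer, a router selects a few experts per input and the rest stay idle. The sketch below shows generic top-k routing under stated assumptions. The expert count, the value of `k`, and the scalar "experts" are illustrative only and are not RynnBrain's configuration; only the 3-billion and 30-billion figures come from the article.

```python
import math


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    routes = top_k_route(gate_logits, k)
    y = sum(w * experts[i](x) for i, w in routes)
    return y, routes


# Eight toy experts; expert i simply scales its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
gate_logits = [0.1, 2.0, -1.0, 0.5, 1.0, -0.3, 0.0, 0.2]  # made-up router scores

y, routes = moe_forward(1.0, experts, gate_logits, k=2)
used = [i for i, _ in routes]
print("experts used:", used, "of", len(experts))

# Parameter accounting with the figures quoted in the article.
active_fraction = 3e9 / 30e9  # share of parameters active per inference step
print(f"active share: {active_fraction:.0%}")
```

Because only the selected experts run, compute per step scales with the activated parameters rather than the total, which is consistent with the article's claim of faster action.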
At the same time, DAMO Academy has also open-sourced a new evaluation benchmark, RynnBrain-Bench, designed to assess fine-grained spatiotemporal embodied tasks, filling an industry gap.
Behind the large-scale open-source initiative by Alibaba's DAMO Academy lies a broader industry ambition: to accelerate the construction of an open and evolving embodied intelligence ecosystem.
From the perspective of global technological competition, embodied intelligence is at a critical inflection point, moving from the digital, virtual world to physical reality.
Zhao Deli, head of DAMO Academy’s Embodied Intelligence Lab, pointed out that RynnBrain has achieved deep understanding and reliable planning for the physical world for the first time, marking a crucial step towards general embodied intelligence under a hierarchical architecture of the brain. “We look forward to it accelerating AI’s deployment from the digital world into real-world physical scenarios.”
In 2017, on the occasion of Alibaba’s 18th anniversary, Jack Ma founded the DAMO Academy, dedicated to addressing technological and R&D issues that promote productivity. At that time, Alibaba also pledged to invest 100 billion yuan in the DAMO Academy over three years.
However, over the past three years, amid significant organizational changes within the Alibaba Group, the DAMO Academy has undergone multiple rounds of adjustments and reshuffles. The once extensive '4+X' research areas have now been streamlined to focus on 'intelligence + computing.' The intelligence direction includes medical AI, decision-making intelligence, video technology, embodied intelligence, and genetic intelligence, while the computing direction includes computational technologies and RISC-V.
Embodied intelligence is clearly one of the key areas where the DAMO Academy is heavily investing.
It is reported that in the field of embodied intelligence, the DAMO Academy is building a deployable, scalable, and evolvable embodied intelligence system. It has open-sourced several embodied models, including WorldVLA, which integrates world models and VLA models, as well as the world-understanding model RynnEC, along with the industry’s first robot context protocol, RynnRCP.
As the DAMO Academy focuses on embodied intelligence, the global humanoid robotics market is entering a critical phase of scaled development, with 2025 marking the starting point for shipments at scale.
According to IDC data, global shipments of humanoid robots approached 18,000 units in 2025, a year-on-year increase of approximately 508%, with sales amounting to about $440 million. Cumulative sales orders over the same period are expected to exceed 35,000 units.
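As a rough check on what the IDC figures above imply, the short calculation below backs out the prior-year shipment level and an average revenue per unit. Both results are approximations: the inputs are rounded in the article, and the per-unit figure assumes the sales total covers the same units as the shipment count, which the article does not state.

```python
shipments = 18_000     # units, "approached 18,000" per the article
growth = 5.08          # a 508% year-on-year increase
sales_usd = 440e6      # about $440 million

# A 508% increase means shipments are (1 + 5.08) times the prior year's level.
prior_year_units = shipments / (1 + growth)

# Average revenue per unit, assuming sales and shipments cover the same units.
avg_price_usd = sales_usd / shipments

print(f"implied prior-year shipments: about {prior_year_units:,.0f} units")
print(f"implied average revenue per unit: about ${avg_price_usd:,.0f}")
```

The implied base of roughly 3,000 units shows why the growth rate is so large: the market is scaling from a very small starting point.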
Although this field still faces numerous challenges such as the scarcity of real-world physical feedback data, generalization in unstructured environments, and deep hardware-software synergy, the open-sourcing of RynnBrain undoubtedly provides developers worldwide with a relatively mature 'brain template,' facilitating the accelerated industrialization of embodied intelligence.
For the industry, this is not only a release of code but also a redistribution of technical power. When top-tier models are no longer secret weapons confined to giant corporate laboratories, the embodied intelligence industry will enter a new cycle of accelerated iteration and collective evolution.
Editor/Doris