China's Alibaba Unveils AI Brains Designed To Power The Next Generation Of Robots

2026/06/18 11:00

by Tyler Durden

Authored by Jijo Malayil via Interesting Engineering,

Chinese firm Alibaba has launched its first embodied AI model family, which links large language models with real-world robotic actions.

The Qwen-Robot suite includes three distinct models, each targeting a different layer of physical intelligence. (Image: Unitree/YouTube)

The Qwen-Robot suite was developed by Alibaba's Tongyi Lab and is undergoing pilot testing with selected Alibaba Cloud enterprise clients.

The suite comprises three models focused on navigation, manipulation, and world modeling for robots operating in physical environments.

Alibaba said the models enable machines to perceive, reason, and interact with the real world, joining a growing global push to advance embodied AI beyond traditional chatbot applications.

Robots meet reasoning

Alibaba says its Qwen family of AI models has grown adept at understanding the physical world: the models can recognize objects, understand spatial relationships, follow complex visual instructions, and reason about real-world environments. For example, a model can understand a command such as, "Go to the kitchen, find the red cup, pick it up, and place it on the shelf."

However, understanding a task is different from actually performing it. While a vision-language model (VLM) can describe the steps needed to complete a task, it cannot directly control a robot's movements.

The challenge is connecting human language and visual understanding with the motor actions required to interact with the physical world.

This problem is difficult because robot training data is very different from internet data. Information collected from navigation systems, robotic arms, vehicles, and cameras comes in different formats and is expensive to gather. Simply combining all this data often creates conflicts rather than improving performance.

To address this, Alibaba developed the Qwen-Robot suite, which includes three specialized models. Qwen-RobotNav focuses on movement and navigation. It helps robots follow instructions, reach target locations, and track moving targets, and it also supports autonomous driving.

According to Alibaba's website, Qwen-RobotManip focuses on physical interaction. It enables robots to grasp, move, and manipulate objects, drawing on a large training dataset collected from different robotic systems. Qwen-RobotWorld acts as a world model, predicting how environments may change and helping robots understand the likely outcomes of their actions.

Together, these models aim to enable robots to understand instructions, interact with objects, navigate environments, and make decisions in the real world.

Physical AI accelerates

Alibaba showcased Qwen-RobotNav on a Unitree Go2 quadruped running on NVIDIA Jetson Thor hardware and equipped with a single low-resolution camera. The robot successfully navigated an unfamiliar apartment, following spoken instructions across multiple rooms without preloaded maps, while maintaining an inference latency of 196 milliseconds.
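To put the latency figure in perspective, here is a rough back-of-the-envelope sketch. It assumes each navigation decision waits for one full inference pass and that passes run strictly back to back, which Alibaba has not stated. The walking speed is a hypothetical value chosen for illustration, not a Go2 specification.

```python
# Back-of-the-envelope sketch: what a 196 ms inference latency implies.
# Assumption: one decision per inference pass, passes run sequentially.

LATENCY_S = 0.196          # inference latency reported by Alibaba
ASSUMED_SPEED_M_S = 1.0    # hypothetical walking speed, illustration only


def max_decision_rate_hz(latency_s: float) -> float:
    """Upper bound on decisions per second for sequential inference."""
    return 1.0 / latency_s


def distance_per_decision_m(latency_s: float, speed_m_s: float) -> float:
    """Distance the robot covers while one inference pass is in flight."""
    return latency_s * speed_m_s


if __name__ == "__main__":
    rate = max_decision_rate_hz(LATENCY_S)
    step = distance_per_decision_m(LATENCY_S, ASSUMED_SPEED_M_S)
    print(f"{rate:.1f} decisions per second")
    print(f"{step:.3f} m travelled per decision")
```

Under these assumptions the robot would update its plan roughly five times per second, moving about 20 centimeters between updates at the assumed speed.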

The company claims that Qwen-RobotManip, its robotic manipulation model, was trained on more than 38,000 hours of open-source data covering object handling and interaction tasks. According to Alibaba, the model recently achieved the highest score in the generalist category of the RoboChallenge real-world robotics benchmark, earning a process score of 59.83 and a task success rate of 45 percent.

The company also unveiled Qwen-RobotClaw, a robotics agent framework that enables Qwen models to use the Qwen-Robot suite as physical-world tools. In one demonstration, an agent searched for a restroom, identified an out-of-order sign, and independently rerouted to another location. Alibaba further open-sourced Chat2Robot, a browser-based platform for testing embodied AI interactions.
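The idea of an agent treating robot capabilities as tools can be sketched in a few lines. This is a minimal illustration of the general tool-calling pattern applied to the restroom scenario, not the Qwen-RobotClaw API. Every name below (`Agent`, `ToolResult`, `navigate`, `inspect`) is hypothetical, and the tools are stubs rather than real model calls.

```python
# Illustrative sketch only: all names are hypothetical and do not reflect
# the actual Qwen-RobotClaw interface. Tools are stubs, not real models.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ToolResult:
    ok: bool    # whether the step succeeded
    note: str   # human-readable detail


Tool = Callable[[str], ToolResult]


@dataclass
class Agent:
    tools: Dict[str, Tool] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def call(self, name: str, place: str) -> ToolResult:
        """Invoke a registered tool and record what happened."""
        result = self.tools[name](place)
        self.log.append(f"{name}({place}): {result.note}")
        return result

    def find_usable(self, candidates: List[str]) -> Optional[str]:
        """Try each candidate in order; move on if it is unreachable or blocked."""
        for place in candidates:
            if not self.call("navigate", place).ok:
                continue
            if self.call("inspect", place).ok:
                return place
        return None


def stub_navigate(place: str) -> ToolResult:
    # Stand-in for a navigation model; always succeeds in this sketch.
    return ToolResult(True, "arrived")


def make_stub_inspect(closed: set) -> Tool:
    # Stand-in for visual inspection; flags places listed in `closed`.
    def inspect(place: str) -> ToolResult:
        if place in closed:
            return ToolResult(False, "out-of-order sign detected")
        return ToolResult(True, "usable")
    return inspect


def build_demo_agent() -> Agent:
    return Agent(tools={
        "navigate": stub_navigate,
        "inspect": make_stub_inspect({"restroom A"}),
    })


if __name__ == "__main__":
    agent = build_demo_agent()
    print(agent.find_usable(["restroom A", "restroom B"]))
    print("\n".join(agent.log))
```

In this sketch the rerouting logic lives in the agent loop while perception and movement stay behind tool calls, which mirrors the separation the article describes between the agent framework and the models it drives.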

With the launch of its Qwen-Robot models, Alibaba has expanded its ambitions beyond language and multimodal software. The move reflects a broader industry shift toward AI systems capable of understanding and interacting with the physical world.

It also comes as competition in physical AI accelerates globally. In the US, Google DeepMind is advancing Gemini Robotics, while NVIDIA is expanding its robotics ecosystem through Cosmos, Isaac, and GR00T. Start-ups, including Physical Intelligence, Skild AI, and Figure AI, are also developing general-purpose robotic intelligence, according to the South China Morning Post.

China is strengthening its position by pairing its manufacturing advantages with growing investments in AI software for autonomous decision-making. The sector now spans AI developers, robotics firms, and EV makers. Companies such as Alibaba, Tencent, Unitree, AgiBot, UBTech, Galbot, Spirit AI, GigaAI, Xpeng, and Xiaomi are actively pursuing embodied AI technologies.
