
Robotera CEO Says Humanoid Robots Could Be in Homes Within Five Years as VLA Models Take Center Stage

By AI中国

Chen Jianyu, Founder of Robotera and Assistant Professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University

AsianFin: The debate over whether to focus on “world models” or broader vision-language-action (VLA) models in robotics is shifting, according to Chen Jianyu, founder and CEO of Robotera, who believes the next wave of AI breakthroughs will come from general-purpose humanoid robots.

Speaking at the main forum of the 2025 World Robot Conference on Aug. 12, Chen — also an assistant professor at Tsinghua University — said humanoid robots, with their mobility and manipulation skills, are poised to transform productivity and social services.

“The fastest path to a general-purpose humanoid robot is to learn directly from humans — the only true general embodied intelligence in the real world,” Chen told attendees. “By integrating a general-purpose brain with a general-purpose body, we can define a new paradigm for robotics.”

Chen described VLA as any system that combines visual perception, language understanding, and physical action — with world models being one subcategory. He expects the next generation of VLA will be defined less by a specific architecture and more by its ability to operate end-to-end in the real world.

While many in the field are focused on collecting more data, Chen argued that model innovation matters more right now. “The total amount of data will grow, but efficiency in using that data is what counts. If you have to focus on just one thing, focus on the model,” he said.

Founded in 2023, Robotera has closed three funding rounds in under two years, including a nearly $69 million Series A in July led by CDH VGC and Haier Capital. The company is developing both the “brains” and “bodies” for general-purpose robots.

Its flagship VLA model, ERA-42, unifies vision, comprehension, prediction, and action into one end-to-end system. Current prototypes can perform hundreds of tasks on voice command — from sorting objects and scanning barcodes to using screwdrivers and pipettes.

On the hardware side, Robotera’s humanoid robots are designed as modular “universal interfaces” for interacting with the physical world. The Stardust L7 is a full-sized biped optimized for logistics, while the Stardust Q5 targets service roles like retail assistance. The company has also developed the XHand 1, a dexterous robotic hand with 12 degrees of freedom and built-in tactile sensors.

Chen sees humanoids as the “ultimate form of embodied intelligence,” thanks to their ability to learn from human behavior and operate effectively in human environments. He predicts humanoid adoption will follow a B2B-to-B2C path — starting with industrial and service applications, before entering households at scale.

“In real industrial scenarios, our robots already operate at 70% of human efficiency. By next year, we expect 90%,” he said. “Give it time, and robots will match human capabilities.”

Despite recent hype around AI, Chen says robotics hasn’t reached “bubble” territory, noting valuations are still far below those of the autonomous vehicle sector. But once large-scale deployments by major players arrive, he expects a flood of investment.

“The killer app for humanoid robots will be in the home,” Chen said. “We’re not there yet — but within five years, the tipping point could come.”

