Microsoft Launches Vision-Language-Action Model for Robots

Share This Post

Microsoft introduced Rho-alpha, a new vision-language action model designed to make robots more adaptable, responsive, and capable of operating in real-world environments.

The tech giant revealed the generative AI vision-language-action (VLA) model in a blog post earlier this month. The model is derived from Microsoft’s Phi open model series.

Rho-alpha translates natural language commands into control signals for robots performing manipulation tasks. 

To train its model, Microsoft said it combined physical demonstrations and simulations together with a multistage reinforcement learning process built on the open Nvidia Isaac Sim framework.

For better perception, Microsoft also added tactile sensing capabilities, enabling robots to use touch to respond to their environment rather than solely relying on visual input. 

In future iterations, Microsoft said it plans to add force sensing and other modalities.

A video demonstration included in the blog post shows Rho-alpha interacting with BusyBox, a physical interaction benchmark recently introduced by Microsoft Research, using natural language instructions.

The Microsoft model release comes as more industries are starting to use robots, shifting from narrow, task-specific deployments to rollout across more dynamic, unstructured and often human-centered environments

Related:Serve Robotics Acquires Hospital Assistant Robot Company

The shift has led to increased focus on models that enable robots to reason and act with greater autonomy. 

In this context, Microsoft is positioning Rho-alpha as a more flexible and adaptable AI system for robots, enabling greater deployment opportunities across sectors than traditional models.

“The emergence of VLA models for physical systems is enabling systems to perceive, reason, and act with increasing autonomy alongside humans,” Ashley Llorens, corporate vice president and managing director at Microsoft’s Research Accelerator, said in a blog post introducing the model.

Rho-alpha is currently being evaluated on dual-arm robotic systems and humanoid robots, with Microsoft planning to publish a technical description of the model in the coming months.

The model will initially be available through an early access program, with broader availability planned in the Microsoft Foundry in the future.

 

 

 

 

 

 

 

Related Posts

BTC adds to weekend losses on Trump blockade order

Crypto prices are under further pressure during U.S. morning...

Bitcoin, Ether Near Levels That Could Signal Trend Reversal: Investor

Bitcoin and Ether aren’t far from levels that could...

There’s a Way to Make Bitcoin Safe From Quantum Without a Fork, Researchers Say

In brief A new proposal outlines a way to create...

Super PAC tied to Tether makes first ad buy from firm founded by Tether’s U.S. CEO

The crypto sector's new Fellowship political action committee disclosed...

Pepe Coin News: Cofounder Returns With Pepeto and Confirmed Listing

Share Share Share Share Email Canary Capital just filed the first spot PEPE ETF...