Nvidia on Monday unveiled a spate of new features and models to accelerate uptake and development of physical AI.
The burgeoning sector covers systems that enable machines to respond more intelligently to their physical environments.
Introduced at the vendor’s GTC conference in San Jose, the releases focus primarily on Nvidia’s Physical AI Data Factory, an open reference architecture designed to transform real-world data into large-scale training datasets.
Rev Lebaredian, vice president of Omniverse and simulation technology at Nvidia, said during a pre-briefing that the system uses the company’s Cosmos world models and coding agents.
Specifically, the system is built around three components: Cosmos Curator, which processes datasets; Cosmos Transfer, which generates scenarios to expand the datasets; and Cosmos Evaluator, which verifies generated data before it is used for training.
Together, the components automate the data generation process for robotics developers.
“It’s a data factory designed specifically for physical AI,” Lebaredian said. “Cosmos unifies and manages all three stages, reducing manual work so developers can focus on building models.”
The platform will initially be available on Microsoft’s Azure cloud platform. Nvidia said companies including Field AI, Hexagon Robotics, Milestone Systems, Skild AI, and TerraNine Robotics are early adopters.
Alongside the data architecture, Nvidia introduced Cosmos-3, a new world model that combines vision, reasoning and prediction to generate robot behaviors.
The new Cosmos platform also includes what Nvidia described as the largest open video dataset for physical AI, along with frameworks for curating and evaluating large-scale video data.
A typical challenge in physical AI, according to Lebaredian, is that real-world training data is difficult to collect at scale due to the unpredictability of physical environments.
“In the past, real-world data was the primary mode of training,” he said. “But the real world is diverse, unpredictable and full of edge cases. You simply cannot manually capture enough data to train for all of them.”
Instead, developers are increasingly turning to world models trained on internet-scale video and human demonstration data — enabling robotic training on a far greater scale than previously possible.
To stay ahead of this shift, Nvidia also rolled out early access to its AI-enabled video search and summarization tool, Metropolis VSS Blueprint. The system enables developers to build agents that analyze and act on massive streams of video data from edge to cloud.
Alongside the product launches, Nvidia announced a new partnership with T-Mobile. The companies are working to integrate physical AI applications into networks, bringing these agents to edge applications.
Looking Ahead
The moves reflect Nvidia’s growing drive into physical AI, which has become something of a buzzword as developers seek machines with elevated intelligence and perception capabilities.
“Autonomous vehicles represented the first wave of physical AI, but much more is coming down the pipeline,” Lebaredian said. “Soon we will have billions of AI agents running on billions of devices. The world’s industries will be transformed by physical AI and AI-driven physics.”
He identified the rise of humanoid robots as a key catalyst for market growth, with demand for physical AI systems anticipated to see a further upswing.
“Today, roughly three million robots power the world’s industries,” Lebaredian said. “But the next generation of humanoid robots is now arriving, with deployments expected to grow nearly tenfold by 2026.”
“In this context, our models and frameworks are designed to support both existing and future robot platforms,” he added. “These systems will be more accurate, more lightweight and easier to deploy.”

