
What is physical AI and why could it change the world?

Nvidia CEO Jensen Huang believes physical AI will be the next big trend. New robots will take many forms and all will be powered by AI.

Nvidia recently predicted a future where robots will be everywhere. Intelligent machines will be found in the kitchen, the factory, the doctor’s office, and on the highway – to name just a few of the areas where repetitive tasks will increasingly be handed off to them. And Jensen’s company will, of course, provide the AI software and hardware needed to train and run those AIs.

What is physical AI?

Jensen describes our current phase of AI as Pioneer AI, where the foundational models, and the tools needed to refine them for specific roles, are created. The next phase, already underway, is Enterprise AI, where chatbots and AI models improve the productivity of enterprise employees, partners, and customers. At the peak of this phase, everyone will have a personal AI assistant, or even a collection of AIs, to help them perform specific tasks.

In these first two phases, the AI tells or shows us things by generating the likely next word or token in a sequence. The third and final phase, according to Jensen, is physical AI, where the intelligence takes a physical form and interacts with the world around it. To do this well, it must integrate sensor inputs and manipulate objects in three-dimensional space.
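
To make the distinction concrete, here is a minimal sketch in plain Python contrasting the two modes: a model that only emits the likely next token in a sequence, versus a physical AI loop that senses the world, decides, and acts on it. The stand-in model, sensor, and actuator functions are illustrative assumptions, not any particular Nvidia component.

```python
# Minimal contrast between "token-generating" AI and physical AI.
# Everything here is a stand-in for illustration only.

import random

def next_token(prompt_tokens):
    """Stand-in language model: picks a plausible next token."""
    vocabulary = ["robot", "arm", "moves", "left", "right", "stops"]
    return random.choice(vocabulary)

def generate_text(prompt_tokens, max_new=5):
    """Pioneer/Enterprise AI: the model only emits the next token in a sequence."""
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        tokens.append(next_token(tokens))
    return tokens

def sense():
    """Stand-in sensor fusion: cameras and joint encoders reduced to a state."""
    return {"object_distance_m": random.uniform(0.1, 1.0)}

def act(command):
    """Stand-in actuator interface."""
    print(f"actuating: {command}")

def physical_ai_step():
    """Physical AI: perceive the 3D world, decide, then manipulate it."""
    state = sense()
    command = "close_gripper" if state["object_distance_m"] < 0.3 else "approach"
    act(command)

if __name__ == "__main__":
    print(generate_text(["pick", "up", "the"]))
    physical_ai_step()
```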

“Building base models for general humanoid robots is one of the most exciting problems to solve in AI today,” said Jensen Huang, founder and CEO of NVIDIA. “The necessary technologies are coming together to enable leading roboticists around the world to take big steps toward artificial general robotics.”

OK, so you need to design the robot and its brain. Clearly a job for AI. But how do you test the robot under an infinite number of circumstances it might encounter, many of which cannot be anticipated or perhaps reproduced in the physical world? And how will we control it? You guessed it: we will use AI to simulate the world the robot will live in and the myriad devices and creatures the robot will interact with.

“We will need three computers… one to create the AI, one to simulate the AI, and one to run the AI,” Jensen said.

The three-computer problem

Jensen is, of course, talking about Nvidia’s portfolio of hardware and software solutions. The process starts with Nvidia H100 and B100 servers to build the AI, continues with workstations and servers running Nvidia Omniverse on RTX GPUs to simulate and test the AI and its environment, and ends with Nvidia Jetson (soon with Blackwell GPUs) to provide integrated real-time sensing and control.

Nvidia has also introduced GR00T, which stands for Generalist Robot 00 Technology, a foundation model designed to understand and emulate movements by observing human actions. GR00T will learn coordination, dexterity and other skills to navigate, adapt and interact with the real world. In his GTC keynote, Huang demonstrated several such robots on stage.

Two new AI NIMs enable roboticists to develop simulation workflows for generative physical AI in NVIDIA Isaac Sim, a robot simulation reference application built on the NVIDIA Omniverse platform. First, the MimicGen NIM microservice generates synthetic motion data from teleoperated demonstrations recorded with spatial computing devices such as Apple Vision Pro. The Robocasa NIM microservice generates robot tasks and simulation-ready environments in OpenUSD, the universal scene description framework underlying Omniverse for developing and collaborating in 3D worlds.
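
NIM microservices ship as containers that expose network APIs, so a roboticist’s pipeline typically talks to them over HTTP. The sketch below shows roughly what such a request could look like; the host, port, endpoint path, and payload fields are assumptions for illustration, not the documented MimicGen interface.

```python
# Hedged sketch: requesting synthetic motion data from a NIM-style microservice
# over HTTP. The URL and payload schema below are illustrative assumptions,
# not the actual MimicGen NIM API.

import json
import urllib.request

NIM_URL = "http://localhost:8000/v1/generate"  # assumed local deployment

def request_synthetic_motions(recorded_demo_path, num_variations=100):
    """Ask the (assumed) service to expand one teleoperated demonstration
    into many synthetic trajectories for use in Isaac Sim."""
    payload = {
        "source_demonstration": recorded_demo_path,  # e.g. captured with Apple Vision Pro
        "num_variations": num_variations,
    }
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    result = request_synthetic_motions("demos/pick_place_001.json")
    print(f"received {len(result.get('trajectories', []))} synthetic trajectories")
```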

Finally, NVIDIA OSMO is a cloud-native managed service that enables users to orchestrate and scale complex robotics development workflows across distributed computing resources, whether on-premises or in the cloud.

OSMO simplifies the creation of robot training and simulation workflows, reducing deployment and development cycles from months to less than a week. Users can visualize and manage a range of tasks, such as generating synthetic data, training models, performing reinforcement learning, and testing at scale, for humanoids, autonomous mobile robots, and industrial manipulators.
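
To make “orchestrating a workflow” a little more tangible, here is a toy sketch that chains the stages listed above into one pipeline. It is purely conceptual: the stage functions and the in-process runner are assumptions for illustration, not the OSMO service, its configuration format, or its API, which handle this across distributed compute rather than in a single script.

```python
# Conceptual sketch only: a toy runner that chains the workflow stages the
# article lists. Stage names and artifacts are placeholders, not OSMO's API.

from typing import Callable, Dict, List

def generate_synthetic_data(ctx: Dict) -> Dict:
    ctx["dataset"] = "synthetic_motions_v1"          # placeholder artifact
    return ctx

def train_model(ctx: Dict) -> Dict:
    ctx["checkpoint"] = f"model_trained_on_{ctx['dataset']}"
    return ctx

def run_reinforcement_learning(ctx: Dict) -> Dict:
    ctx["checkpoint"] += "_rl_finetuned"
    return ctx

def test_at_scale(ctx: Dict) -> Dict:
    ctx["report"] = f"evaluation_of_{ctx['checkpoint']}"
    return ctx

WORKFLOW: List[Callable[[Dict], Dict]] = [
    generate_synthetic_data,
    train_model,
    run_reinforcement_learning,
    test_at_scale,
]

def run_workflow(stages: List[Callable[[Dict], Dict]]) -> Dict:
    """Run each stage in order, passing shared context along; a managed
    service would schedule and distribute this across many machines."""
    ctx: Dict = {}
    for stage in stages:
        print(f"running stage: {stage.__name__}")
        ctx = stage(ctx)
    return ctx

if __name__ == "__main__":
    print(run_workflow(WORKFLOW))
```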

So how do you design a robot that can grasp objects without crushing or dropping them? The Nvidia Isaac Manipulator, built on a collection of foundation models, brings cutting-edge dexterity and AI capabilities to robotic arms. Early ecosystem partners include Yaskawa, Universal Robots, a Teradyne company, PickNik Robotics, Solomon, READY Robotics and Franka Robotics.

OK, so how do you teach a robot to “see”? Isaac Perceptor offers multi-camera and 3D surround-view capabilities that are increasingly being used in autonomous mobile robots in manufacturing and order fulfillment to improve worker efficiency and safety while reducing error rates and costs. Early adopters include ArcBest, BYD and KION Group, who are looking to achieve new levels of autonomy in material handling operations and more.

To run the robots, the new Jetson Thor SoC includes a Blackwell GPU with Transformer Engine, delivering 800 teraflops of 8-bit floating-point AI performance to run multimodal generative AI models like GR00T. Equipped with a functional safety processor, a high-performance CPU cluster, and 100 gigabit Ethernet (100 GbE), it greatly simplifies design and integration efforts.
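
For a sense of what the on-robot computer actually does with that performance, here is a minimal sketch of a fixed-rate control loop: read sensors, run a policy model, command the actuators, and hold the deadline. The 50 Hz rate and the stub functions are illustrative assumptions; nothing in it is specific to Jetson Thor or GR00T.

```python
# Minimal sketch of an on-robot control loop at a fixed rate. The rate and
# the stub sensor/policy/actuator functions are assumptions for illustration.

import time

CONTROL_HZ = 50               # assumed control rate
PERIOD_S = 1.0 / CONTROL_HZ

def read_sensors():
    """Stand-in for camera frames, joint encoders, IMU readings, etc."""
    return {"image": None, "joint_angles": [0.0] * 7}

def run_policy(observation):
    """Stand-in for on-device inference with a multimodal policy model."""
    return {"joint_velocities": [0.01] * 7}

def send_commands(command):
    """Stand-in for writing commands to the robot's actuators."""
    pass

def control_loop(num_steps=500):
    for _ in range(num_steps):
        t_start = time.monotonic()
        obs = read_sensors()
        cmd = run_policy(obs)
        send_commands(cmd)
        # Sleep off whatever remains of the control period so the loop runs
        # at a steady rate; an overrun here would mean a missed deadline.
        elapsed = time.monotonic() - t_start
        time.sleep(max(0.0, PERIOD_S - elapsed))

if __name__ == "__main__":
    control_loop(num_steps=5)
```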

Conclusions

Just when you thought it was safe to get back in the water, da dum. Da dum. Da dum. Here come the robots. Jensen believes robots will need to take on human form because the factories and environments they will work in were all designed for human operators. It is far more economical to develop humanoid robots than to redesign the factories and spaces they will be used in.

Even if it’s just your kitchen.
