The Foundation Agent
One model that generalizes across three axes — the skills it can do, the bodies it can control, and the realities it can master.
This is Fan's flagship thesis (he says he 'proposed' the term the prior year). The field is converging toward a single embodied model that generalizes over three axes simultaneously: (1) skills, (2) embodiments or form factors, and (3) worlds or realities, both virtual and physical. It is the embodied analogue of how one LLM replaced a zoo of task-specific NLP pipelines, and it is the GEAR Lab's stated end goal.
- Generality lives on three axes at once: skills × embodiments × realities.
- Virtual and physical agents share one API: perception in, actions out.
- A single foundation agent subsumes both gaming AI and robotics.
Fan articulates this in the 'Exploring virtual worlds' section as the unifying frame behind GEAR's dual mandate: robotics (physical) and gaming agents (virtual) are the same problem under one model.
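The "perception in, actions out" idea can be illustrated with a toy sketch. Everything below is hypothetical and is not GEAR's code or any Nvidia API: the `Embodiment` interface, the `GameAvatar` and `RobotArm` classes, the skill names, and the hand-written rule inside `FoundationAgent` (which stands in for a learned model) are assumptions made only to show one agent driving two different bodies through the same two calls.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class Embodiment(Protocol):
    """Hypothetical shared interface: any body, in any world, exposes these."""
    name: str
    action_dim: int

    def observe(self) -> Sequence[float]: ...
    def act(self, action: Sequence[float]) -> None: ...


@dataclass
class GameAvatar:
    """Toy virtual body: a character moving along one axis."""
    name: str = "game_avatar"
    action_dim: int = 2
    position: float = 0.0

    def observe(self) -> Sequence[float]:
        return [self.position]

    def act(self, action: Sequence[float]) -> None:
        self.position += action[0]


@dataclass
class RobotArm:
    """Toy physical body (simulated here): a single joint angle."""
    name: str = "robot_arm"
    action_dim: int = 3
    joint_angle: float = 0.0

    def observe(self) -> Sequence[float]:
        return [self.joint_angle]

    def act(self, action: Sequence[float]) -> None:
        self.joint_angle += action[0]


class FoundationAgent:
    """Stand-in for one model covering many skills and bodies.

    A real system would be a learned policy; this placeholder rule just
    nudges the first observed value toward a skill-dependent target.
    """

    SKILL_TARGETS = {"advance": 1.0, "retreat": -1.0}  # assumed skill names

    def step(self, body: Embodiment, skill: str) -> list[float]:
        obs = body.observe()                      # perception in
        target = self.SKILL_TARGETS[skill]
        action = [0.0] * body.action_dim          # sized to this body
        action[0] = 0.5 * (target - obs[0])
        body.act(action)                          # actions out
        return action


agent = FoundationAgent()
avatar, arm = GameAvatar(), RobotArm()
avatar_action = agent.step(avatar, "advance")
arm_action = agent.step(arm, "retreat")
```

The same `agent.step` call handles both bodies and both skills; only the action vector's length changes with the embodiment. That is the sense in which the three axes (skills, embodiments, realities) sit behind one interface in this sketch.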
Source · PODCAST
Jim Fan on Nvidia's Embodied AI Lab and Jensen Huang's Prediction that All Robots will be Autonomous