Code as Action
Let the agent write code as its action space — then save what works to a skill library it authors itself.
The insight behind Voyager (built once GPT-4 arrived as 'the best coding model out there'): use code as the action interface. The 3D Minecraft world is converted to a text representation; the agent writes programs against an action API; a self-reflection loop feeds runtime errors back so the agent debugs its own code; correct programs are saved to a skill library — 'a codebase that the LLM interactively authored all by itself.' An automated curriculum (one directive: 'find as many novel items as possible') lets it propose its own next task, neither too hard nor too easy.
- Code is a more general action space than hand-coded primitives.
- A self-reflection loop turns runtime errors into self-correction.
- Verified programs become reusable skills in a self-authored library.
- An automated curriculum lets the agent set its own next task at the right difficulty.
Described in the 'Exploring virtual worlds' section as the second of Fan's Minecraft projects (after MineDojo), and the team's earliest foundation-model agent.