Specialized Generalists

Build the generalist first, then distill it down: a specialized generalist beats a native specialist almost every time.

Problem it solves

The temptation to chase narrow, easy-to-demo specialist systems that never compound into general capability.

Best for

Deciding generalist-vs-specialist when both look viable; resisting easy-demo pressure.

Not ideal for

Resource-constrained teams that must ship a narrow win now.

Overview

Why this framework exists

This is Jim Fan's strategic argument for why GEAR, Nvidia's embodied AI lab, pursues a general humanoid foundation model even though specialist systems show results faster.

The precedent comes from NLP. Before ChatGPT, the field was a zoo of task-specific pipelines, with separate systems for translation, math, and coding. GPT-3 and ChatGPT unified them into one generalist, which you then prompt, distill, and fine-tune back down to individual tasks. The result is the "specialized generalist."

In Fan's account, the specialized generalist has historically been far stronger than the original specialist, and it is easier to maintain because everything sits behind one API. The bet is that robotics will follow the same trajectory.

Core principles

Specialists are faster to demo but a dead end for general capability.
Unify into a generalist, then prompt/distill/fine-tune back to specialists.
The specialized generalist beats the native specialist and is cheaper to maintain.
Generalist-first is slower and harder, but it's where the future is.

Origin story

How this framework came to be

Fan makes this argument in the 'Specialized generalists' section of the podcast, responding to roboticists who think a general approach can't work and returning to the recurring theme of Sutton's 'Bitter Lesson'.

Source

Traced to primary

Source · PODCAST

Jim Fan on Nvidia's Embodied AI Lab and Jensen Huang's Prediction that All Robots will be Autonomous

Sequoia Capital (Training Data) · 2024
