INNOVATION · Weeks to result · 89% confidence

Rockets vs Bicycles of AI

Not all AI is the same — match resource consumption to use-case requirements

Problem it solves

Distinguishing AI products that externalize large costs from those that are genuinely efficient

Best for

Product teams and investors differentiating between AI products by structural resource consumption and regulatory exposure

Not ideal for

Predicting regulatory response timelines — Hao's analysis of when regulation arrives is vague

Overview

Why this framework exists

Hao proposes a transportation analogy: we have nuanced conversations about whether a use case requires a bicycle or a rocket. We should apply the same taxonomy to AI products. Frontier LLMs are rockets — they require extraordinary data, compute, energy, and labor, provide dramatic capability for some use cases, and externalize large costs onto communities (environmental), workers (annotation labor), and creators (intellectual property). Narrow-purpose AI is a bicycle — small curated datasets specific to a use case, significantly less compute, high specific utility, lower externalized costs.

The framework's practical power is in identifying which AI products are structurally exposed to regulatory and resource-constraint risk. The frontier labs cannot be bicycles because their business model requires generalizable capability that commands premium pricing across multiple industries — they must be rockets to justify their valuations. This creates a structural differentiation: bicycle AI products are insulated from the regulatory and resource pressures that threaten rocket AI products.

DeepMind's AlphaFold is the exemplar: it was trained on a small, curated dataset of known protein structures, earned its creators a share of the 2024 Nobel Prize in Chemistry, accelerates drug discovery, and required a fraction of the compute of a frontier LLM. The best AI outcomes in healthcare, Hao argues, come when the human expert retains agency and uses the AI as a tool, not as a replacement.

Core principles

5 total
  1. Resource consumption should be proportional to use-case requirements — bicycle use cases do not justify rocket resource expenditure.
  2. Frontier LLMs are structurally incapable of becoming bicycles because their business model requires generalizable capability across high-value industries.
  3. The most impressive narrow AI achievements (AlphaFold, precision diagnostics) required small curated datasets, not internet-scale data appropriation.
  4. Human expert plus AI tool consistently outperforms AI replacement in domains with high expertise density.
  5. Regulatory pressure will fall disproportionately on rocket AI because its externalized costs are visible and concentrated; bicycle AI's costs are distributed and small.
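The proportionality principle can be made concrete with a rough ratio of what a product consumes to what the use case needs. The following is a minimal sketch, not a method from Hao: the `Footprint` fields, the `classify` function, the 10x threshold, and every number are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Footprint:
    """Rough resource footprint in any consistent units (all hypothetical)."""
    training_examples: float  # number of training examples
    compute: float            # e.g. GPU-hours for training plus inference
    energy: float             # e.g. MWh


def overshoot(required: Footprint, actual: Footprint) -> dict:
    """Return actual / required for each resource."""
    ratios = {}
    for f in fields(Footprint):
        need = getattr(required, f.name)
        if need <= 0:
            raise ValueError(f"required {f.name} must be positive")
        ratios[f.name] = getattr(actual, f.name) / need
    return ratios


def classify(required: Footprint, actual: Footprint, threshold: float = 10.0) -> str:
    """Label a product 'rocket' if any resource exceeds the need by `threshold`x."""
    worst = max(overshoot(required, actual).values())
    return "rocket" if worst >= threshold else "bicycle"


# Illustrative numbers only; none of these come from the source.
need = Footprint(training_examples=1e5, compute=200.0, energy=5.0)
narrow_model = Footprint(training_examples=2e5, compute=300.0, energy=6.0)
general_model = Footprint(training_examples=1e9, compute=1e6, energy=1e4)

print(classify(need, narrow_model))   # bicycle
print(classify(need, general_model))  # rocket
```

Taking the worst ratio rather than an average is a deliberate choice in this sketch: one heavily overshot resource is enough to create the externalized costs the framework describes.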

Steps

5 steps
  1. Define the use case's actual capability requirements
    Before evaluating an AI product, specify what the use case actually needs: breadth of domain coverage, volume of training examples required for acceptable performance, latency requirements, and acceptable error rates. Many use cases that are marketed as requiring frontier models can be served adequately by narrow-purpose models.
    Pro tip: Ask what the smallest dataset is on which a model achieves acceptable performance for this specific task. That is your bicycle floor.
  2. Map the resource consumption stack
    For any AI product you are evaluating (as a builder, investor, or policy analyst), trace the full resource stack: training data volume and source, compute requirements for training and inference, energy and water consumption, and annotation labor requirements. Compare to the use-case requirement established in step 1.
    Warning: Companies rarely disclose full resource stacks. Use proxy indicators (number of parameters, data sources cited, facility size) to estimate them.
  3. Identify the externalized costs
    Map where costs that do not appear on the company's balance sheet are being absorbed: environmental costs (power draw, water usage, methane turbines), community costs (facility siting decisions), labor costs (piece-rate annotation at below-market rates), and intellectual property costs (training on copyrighted material without consent or compensation).
    Pro tip: The Memphis turbine discovery (community members smelling gas leaks) is the archetype: externalized costs eventually become visible to affected parties, creating political liability.
  4. Assess regulatory exposure by resource consumption
    Rocket AI products are exposed to regulatory risk through each of their externalized cost categories: environmental regulation (energy/water), copyright law (training data), labor law (annotation worker classification), and market concentration rules. Bicycle AI products with small curated datasets, consented data, and low compute are structurally insulated from most of these vectors.
    Pro tip: The 80% American consensus on AI regulation is the policy pressure signal. When regulation arrives, identify which pillar it targets first (data practices are the most legally mature vector).
  5. Evaluate positioning against the empire critique
    If you are building or investing in an AI product, assess explicitly whether it can be positioned as a bicycle alternative to rocket AI on each of the four empire dimensions: data (consented, curated vs. appropriated), labor (no displacement loop vs. double extraction), knowledge (open research vs. captured), and narrative (specific claims vs. AGI mythmaking). Products that genuinely differ on all four dimensions have a structural moat against regulatory risk.
    Pro tip: Venice's positioning as inference-only, no-training-data-extraction, and privacy-preserving is the operational example Hao's framework validates. It is structurally differentiated on all four empire dimensions.
    Warning: Positioning claims must be structurally grounded. Greenwashing-equivalent bicycle claims from rocket AI companies are common and will face increasing scrutiny.
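Steps 3 through 5 can be sketched as a small checklist in code. This is an illustrative reading of the steps, not an implementation from Hao: the `Product` class, the key names, and the pairing of each externalized cost with a regulatory vector are assumptions based on the wording of steps 3 and 4. Community costs and market concentration rules are left out of the mapping because the text does not pair them one-to-one.

```python
from dataclasses import dataclass, field

# Externalized cost category -> regulatory vector (assumed pairing of steps 3 and 4).
REGULATORY_VECTORS = {
    "environmental": "environmental regulation (energy/water)",
    "intellectual_property": "copyright law (training data)",
    "labor": "labor law (annotation worker classification)",
}

# The four empire dimensions named in step 5.
EMPIRE_DIMENSIONS = ("data", "labor", "knowledge", "narrative")


@dataclass
class Product:
    """Hypothetical record of one AI product under evaluation."""
    name: str
    externalized_costs: set = field(default_factory=set)
    # True means the product genuinely differs from rocket AI on that dimension.
    differs_on: dict = field(default_factory=dict)


def regulatory_exposure(product: Product) -> list:
    """List the regulatory vectors implied by the product's externalized costs."""
    return sorted(
        REGULATORY_VECTORS[cost]
        for cost in product.externalized_costs
        if cost in REGULATORY_VECTORS
    )


def is_structurally_differentiated(product: Product) -> bool:
    """True only if the product differs on all four empire dimensions."""
    return all(product.differs_on.get(dim, False) for dim in EMPIRE_DIMENSIONS)


# Made-up example products for illustration.
rocket = Product("general-model", {"environmental", "labor", "intellectual_property"})
bicycle = Product("narrow-model", set(), {dim: True for dim in EMPIRE_DIMENSIONS})

print(regulatory_exposure(rocket))
print(is_structurally_differentiated(bicycle))  # True
```

A missing dimension counts as not differentiated, which reflects the warning that positioning claims must be structurally grounded rather than assumed.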

Examples

2 cases
DeepMind AlphaFold — the bicycle exemplar

AlphaFold predicts 3D protein structure from amino acid sequences. It was trained on a curated database of known protein structures — a small, purpose-built dataset, not internet-scale data appropriation. It won the 2024 Nobel Prize in Chemistry and has materially accelerated drug discovery research globally. Its compute requirements are a fraction of frontier LLMs. It exemplifies what Hao calls the bicycle: small curated dataset, specific use case, high genuine utility, low externalized cost.

Outcome: AlphaFold is now cited by frontier labs as evidence that AI delivers scientific breakthroughs, despite being structurally opposite to their own development approach. The conflation is the tell.
OpenAI Stargate and Memphis Colossus — the rocket cost footprint

The OpenAI Stargate facility in Abilene, Texas will be the size of Central Park, run 1 million chips, and require power equivalent to more than 20% of New York City's consumption. Musk's xAI Colossus training cluster in Memphis used 35 methane gas turbines, which community members discovered through smell rather than through disclosure. These facilities represent the externalized resource consumption of rocket AI at full scale.

Outcome: Dozens of protests against AI data centers have emerged across the US. The environmental pillar of regulatory exposure is becoming politically active, validating the framework's prediction that externalized costs eventually create political liability.

Common mistakes

3 traps
Accepting frontier lab credit for narrow AI successes
When OpenAI or Google cite AlphaFold, protein folding, or medical imaging AI as justifications for frontier LLM development budgets, they are conflating structurally different systems. AlphaFold required a curated dataset of known protein structures and a targeted architecture, not internet-scale data appropriation. Accepting this conflation legitimizes rocket-scale resource consumption based on bicycle-scale achievements.
Assuming generalizable capability is always more valuable
The frontier labs' business model requires generalizable capability because narrow-purpose AI cannot command premium pricing across multiple industries. But for specific use cases, narrow AI with curated domain knowledge consistently outperforms general AI. Hao's point on radiologists: 'The best healthcare AI outcomes come from the radiologist having the AI model in their hands' — human expert plus specific tool, not general replacement.
Underestimating regulatory timeline because costs are not yet priced
Externalized costs that are not currently priced (environmental, labor, IP) will eventually be regulated or litigated. The Memphis community discovering the methane turbines through gas smell is the archetype — externalized costs become politically visible faster than companies anticipate once they reach residential scale.

Origin story

How this framework came to be

The analogy emerged from Hao's observation that public discourse collapses all AI into a single category, allowing frontier labs to claim credit for narrow-purpose AI successes while obscuring the costs of their own rocket-scale development. AlphaFold's success is used to justify GPT-scale training budgets despite the two systems having almost nothing in common architecturally or in resource requirements. The conflation is not accidental — it serves the narrative control pillar of the empire framework by allowing selective citation of AI benefits.

Source

Traced to primary
Source · PODCAST
AI Whistleblower: We Are Being Gaslit By AI Companies, They're Hiding The Truth!
Karen Hao · 2025