STRATEGY

Ongoing practice · 82% confidence

The Two-Tier AI Risk Stack

Separate near-term misuse risk from long-term AGI risk — they need different models

AI risk · mental models · timeline analysis · epistemics

Problem it solves

Conflating near-term AI misuse with long-term AGI speculation

Best for

Anyone needing a clear mental model for reasoning about AI risk across different time horizons without conflating near-term harms with existential speculation

Not ideal for

Specific policy timelines or investment entry/exit signals — the framework is conceptual, not operational

Overview

Why this framework exists

Harris distinguishes two fundamentally different species of AI risk that are related but demand separate mental models and levels of urgency. Tier 1 is near-term misuse risk (1–5 years): AI amplifies the already-broken information environment to the point where synthetic media floods the epistemic commons and shared reality collapses. One person with a sufficiently capable model could generate thousands of convincing fake journal articles, deepfake documentaries, or disinformation campaigns with near-zero friction cost.

Tier 2 is long-term AGI alignment risk (10–50 years): once a superhuman, self-improving system exists, the Dumber Party Problem becomes irreducible. Harris uses the analogy of dogs and humans: dogs have benefited from the relationship for 10,000 years but cannot conceive of everything humans do or why. At that scale of capability, misalignment does not require malevolence; it only requires divergence. In the space of all possible superhuman minds, there are vastly more ways to be misaligned than aligned.

The framework's core insight is that the field made a critical error by assuming a 'box moment' would exist — a point where AGI is contained and deliberated over before release. That moment never came. The systems were built in the wild, connected to the internet, with millions of users, before anyone could have the conversation. Harris argues we lost the moment to decide whether to hook our most powerful AI to everything.

Core principles

5 total

Near-term misuse risk and long-term AGI alignment risk require distinct mental models, urgency levels, and response strategies.
The friction cost of producing high-quality disinformation is approaching zero, making information environment collapse a 1–5 year risk, not a speculative future one.
In the space of all possible superhuman minds, misaligned minds vastly outnumber aligned ones — greater intelligence does not produce greater ethics by default.
The assumption that a controlled 'box moment' for deliberation about AGI release would exist was wrong; the systems were deployed at scale before the conversation could happen.
Processing speed asymmetry makes alignment negotiation with AGI structurally impossible: what is two weeks to humans could be 20,000 years of analogous progress to a superhuman system.
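
The speed asymmetry in the last principle is, at bottom, a unit conversion. The sketch below assumes a constant speedup factor and that progress scales linearly with subjective thinking time; both are simplifications made for this example, not claims from Harris. The function names and the million-fold value are illustrative assumptions.

```python
# Illustrative arithmetic only: the speedup factor is an assumed parameter,
# and "progress" is treated as linear in subjective thinking time.
WEEKS_PER_YEAR = 365.25 / 7  # about 52.18


def subjective_years(wall_clock_weeks: float, speedup: float) -> float:
    """Years of human-equivalent thinking done in the given wall-clock time."""
    return wall_clock_weeks * speedup / WEEKS_PER_YEAR


def implied_speedup(wall_clock_weeks: float, years: float) -> float:
    """Speedup factor needed to fit `years` of thinking into the wall-clock time."""
    return years * WEEKS_PER_YEAR / wall_clock_weeks


# Two weeks of wall-clock time at an assumed million-fold speedup.
print(round(subjective_years(2, 1_000_000)))

# Speedup implied by "two weeks is 20,000 years".
print(round(implied_speedup(2, 20_000)))
```

Under these assumptions, a million-fold speedup turns two weeks into roughly 38,000 subjective years, and fitting 20,000 years into two weeks needs a speedup of roughly 520,000. The figures agree to within a factor of two, which is all the argument requires: on either reading, real-time negotiation is out of reach.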

Steps

4 steps

Identify which tier the risk claim belongs to
When encountering any AI risk argument, first classify it as Tier 1 (near-term misuse, 1–5 years, current systems) or Tier 2 (long-term AGI alignment, 10–50 years, superhuman systems). The causal mechanisms are different and conflating them produces muddled reasoning.
Pro tip: Ask, 'Does this risk require superhuman intelligence or just current capability at scale?' If the latter, it is Tier 1.
Map Tier 1 risks to the information environment
Near-term misuse risk centers on synthetic media and the epistemic commons. Assess the specific threat: fake scientific literature, deepfake video, AI-generated disinformation at industrial scale. Each has a different friction-cost trajectory and a different detection horizon.
Warning: Underestimating the timeline is the dominant error. Harris argues a convincing AI-generated Holocaust-denial documentary was 18 months to 3 years away at the time of recording (2023).
Apply the Dumber Party Problem to Tier 2 reasoning
For AGI alignment arguments, run the Dumber Party analogy: the dumber species in the presence of the smarter species has a fundamental lack of insight into what the smarter party is doing, why it is doing it, and what it will do next. Capability asymmetry alone — not malevolence — is sufficient for danger.
Pro tip: Focus on processing speed as the key asymmetry. A million-fold speed advantage means alignment negotiation cannot happen in real time.
Assess what response is appropriate for each tier
Tier 1 calls for provenance infrastructure, detection tools, and epistemic inoculation. Tier 2 calls for alignment research, compute governance, and pausing capability scaling until safety is better understood. Applying Tier 2 responses to Tier 1 problems (or vice versa) wastes resources and misdirects urgency.
Warning: Do not let Tier 2 speculation crowd out Tier 1 urgency. Harris's personal reassessment moved toward greater near-term focus precisely because this substitution was happening in public discourse.
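
The four steps above can be sketched as a small decision procedure. This is one illustrative encoding, not a tool Harris proposes: the names `Tier`, `RiskClaim`, `classify`, and `recommended_responses` are hypothetical, and reducing the first step to a single yes/no answer is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    NEAR_TERM_MISUSE = 1  # Tier 1: current capability at scale, 1-5 years
    AGI_ALIGNMENT = 2     # Tier 2: superhuman systems, 10-50 years


# Responses named in the final step, keyed by tier.
RESPONSES = {
    Tier.NEAR_TERM_MISUSE: [
        "provenance infrastructure",
        "detection tools",
        "epistemic inoculation",
    ],
    Tier.AGI_ALIGNMENT: [
        "alignment research",
        "compute governance",
        "pausing capability scaling",
    ],
}


@dataclass
class RiskClaim:
    description: str
    # Answer to the first step's question: does the risk need superhuman
    # intelligence, or only current capability at scale?
    requires_superhuman_ai: bool


def classify(claim: RiskClaim) -> Tier:
    """Assign a claim to a tier before reasoning about it further."""
    if claim.requires_superhuman_ai:
        return Tier.AGI_ALIGNMENT
    return Tier.NEAR_TERM_MISUSE


def recommended_responses(claim: RiskClaim) -> list[str]:
    """Look up the response family that matches the claim's tier."""
    return RESPONSES[classify(claim)]


fake_papers = RiskClaim(
    "Mass-produced fake journal articles", requires_superhuman_ai=False
)
print(classify(fake_papers).name, recommended_responses(fake_papers))
```

Keeping the response lists in a single table keyed by tier makes the mismatch warned about above hard to commit by accident: a claim classified as Tier 1 cannot be handed a Tier 2 response.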

Checklist


Classify the AI risk claim as Tier 1 (near-term misuse) or Tier 2 (long-term AGI) before responding to it
Ask whether the risk requires superhuman AI or just current AI at industrial scale
For Tier 1 risks, assess the friction-cost trajectory of producing vs. detecting the harm
For Tier 2 risks, apply the Dumber Party Problem: capability asymmetry alone is sufficient for danger
Do not assume greater intelligence implies greater alignment with human values
Do not plan for a future deliberation window that current deployment patterns have already bypassed
Allocate urgency proportionally: Tier 1 is happening now, Tier 2 is speculative but structurally serious
Assess whether your response to one tier is inadvertently deprioritizing the other

Examples

2 cases

The Fake Journal Article Attack

Harris describes a scenario in which a single teenager with access to a sufficiently capable LLM generates a thousand fake journal articles arguing that mRNA vaccines cause cancer — complete with citations, statistical tables, and formatting indistinguishable from Nature or JAMA. The cost of this attack is near zero. The cost of refuting each article through the scientific process is orders of magnitude higher.

Outcome: This illustrates why Tier 1 risk is not theoretical: the asymmetry between generating disinformation and refuting it is structurally exploitable at scale with current systems, not future AGI.
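
The asymmetry in this scenario can be made concrete with a few lines of arithmetic. The cost figures below are placeholders in arbitrary units, chosen only to show the shape of the calculation; the source gives no numbers, and `campaign_costs` is a hypothetical helper.

```python
def campaign_costs(n_articles: int, generate_cost: int, refute_cost: int) -> tuple[int, int]:
    """Return (attacker_total, defender_total) in arbitrary cost units."""
    attacker_total = n_articles * generate_cost
    defender_total = n_articles * refute_cost
    return attacker_total, defender_total


# Placeholder values, not measurements: 1 unit to generate an article,
# 1,000 units to refute one (three orders of magnitude higher).
attacker, defender = campaign_costs(n_articles=1_000, generate_cost=1, refute_cost=1_000)
print(attacker, defender, defender // attacker)
```

Whatever the true per-article figures are, the defender's total grows with the same article count as the attacker's, so the per-article ratio carries over to the whole campaign.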

The Holocaust Denial Documentary

Harris describes a near future (18 months to 3 years out at the time of recording) in which someone produces a 45-minute AI-generated documentary arguing the Holocaust never happened, complete with archival-style imagery, Hitler speaking German with appropriate translations, and the aesthetic of a Ken Burns production. The production cost collapses from millions of dollars and years of specialized skill to hours of prompting.

Outcome: Demonstrates the deepfake video threat as a specific, near-term, technically achievable Tier 1 risk that does not require AGI, only multimodal AI of sufficient quality, which Harris believed was imminent.

Common mistakes

4 traps

Treating AI risk as a single undifferentiated category

The most common error Harris identifies: lumping near-term misuse harms and long-term AGI alignment risks into one 'AI risk' bucket produces confused policy thinking, misallocated research funding, and an inability to prioritize what is actually urgent now.

Assuming greater intelligence produces greater ethics

There is no principled basis for the assumption that a superintelligent system will converge on human values. In the space of all possible minds, ethical alignment with human interests is one configuration among an astronomically larger set. Betting on this without alignment work is, in Harris's framing, quite a gamble.

Expecting a deliberate 'box moment' before AGI deployment

The field collectively assumed there would be a controlled laboratory phase during which humanity could deliberate about AGI release. That phase never materialized. Current systems were connected to hundreds of millions of users before alignment was solved. Planning for a future box moment that has already been missed is a strategic error.

Underestimating the timeline for information environment collapse

Most discourse treats deepfake video and AI-generated scientific literature at industrial scale as 5–10 year risks. Harris argues the friction cost is collapsing faster than that, and epistemic infrastructure is not being built at a matching pace. The asymmetry between destructive and defensive capability timelines is systematically underappreciated.

Origin story

How this framework came to be

Harris developed this framing after years of focusing primarily on long-term AGI existential risk — in the tradition of Nick Bostrom and the Machine Intelligence Research Institute — before reassessing his emphasis in light of rapidly advancing large language models and the near-term information environment collapse he observed from 2022 onward. The framework emerged from his public writing, his Making Sense podcast, and his ongoing engagement with AI alignment researchers.

The Two-Tier structure is a deliberate corrective to what Harris views as a common error in AI discourse: treating 'AI risk' as a monolithic category when the relevant causal mechanisms, timelines, and required responses differ substantially. The Dumber Party Problem analogy was refined over multiple public conversations and represents his most accessible framing of the alignment concern.

Source

Traced to primary

Source · PODCAST

WARNING: ChatGPT Could Be The Start Of The End!

Sam Harris · 2023
