STRATEGY

Months to result · 85% confidence

Proof of Safety Standard

Require AI developers to prove safety the way we require it of nuclear plants and aircraft

regulation safety-standards governance policy risk-management

Problem it solves

Absence of structural safety requirements that make safety non-optional for AI developers

Best for

Evaluating AI regulatory proposals; identifying which safety claims are substantive vs. performative; designing AI governance frameworks with structural teeth

Not ideal for

Individual model evaluation or near-term product purchasing decisions

Overview

Why this framework exists

The Proof of Safety Standard applies the regulatory logic of nuclear power and commercial aviation to AI development: before a technology with extinction-capable risk is deployed, its developers must prove it is safe to a specified confidence level. This is not a prohibition on AI development — it is a precondition for deployment, exactly as aircraft must pass certification before carrying passengers.

Russell's argument is that the AI industry's response to this proposal — 'we don't know how to prove safety, so you can't have that rule' — is structurally identical to a nuclear plant operator saying 'we don't know how to prove containment, so you can't require it.' It is the industry asserting that its inability to meet a safety standard is a valid reason to not have the standard, rather than a reason to not deploy until the standard can be met.

The framework distinguishes between prohibition and precondition. Russell is not arguing for a ban on AI. He is arguing for a deployment precondition: prove safety at the confidence levels we require of nuclear plants. If you cannot prove it, do not deploy. The industry's resistance to this standard reveals the gap between safety claims and safety capability — companies are claiming their systems are safe while simultaneously admitting they cannot prove it.

Core principles

5 total

Technologies with catastrophic failure modes require proof of safety before deployment, not after incidents.
An industry's inability to prove its technology is safe is a reason to delay deployment, not a reason to waive the safety requirement.
Voluntary safety commitments are structurally insufficient when the economic incentive to deploy is $15 quadrillion.
Proof of safety must be certified at the same confidence levels required of nuclear plants and commercial aircraft.
Claiming safety while being unable to prove safety is a form of regulatory capture — the burden of proof has been silently inverted.

Steps

5 steps

Map the failure mode to existing regulatory analogues
Identify the closest regulatory precedent for the failure mode in question. For AI with extinction-capable risk, the analogues are nuclear power and commercial aviation — both have established proof-of-safety frameworks that can be adapted.
Pro tip: The closer the failure mode analogue, the more directly the existing regulatory framework can be applied. AI self-preservation leading to human harm maps to nuclear containment failure.
Identify the safety claim vs. safety proof gap
For any AI system with claimed safety properties, determine whether the claim is backed by proof to a specified confidence level or by testing and assertion. The gap between these is the regulatory exposure. Safety by assertion is not equivalent to safety by proof.
Pro tip: Ask: 'What would it take to falsify this safety claim?' If the answer is 'we'd have to run it and see what happens,' the claim is assertion, not proof.
Apply the cannot-prove test
If a developer says they cannot prove their system is safe to the required standard, evaluate this as a deployment precondition failure rather than a reason to lower the standard. The correct response is: do not deploy until you can prove it.
Pro tip: Russell's formulation: 'They are literally saying humanity has no right to protect itself from us.' This reframes the cannot-prove response as an assertion of right rather than a technical limitation.
Warning: The cannot-prove response is frequently deployed to shift the burden of proof from the developer (prove it is safe) to the regulator (prove it is dangerous).
Distinguish prohibition from precondition
When evaluating regulatory proposals, determine whether they are prohibitions (you cannot build this technology) or preconditions (you cannot deploy until you meet this standard). Russell's proposal is a precondition — pro-innovation, pro-development, but requiring safety proof before deployment.
Pro tip: Precondition framing neutralizes the 'this will kill innovation' counter-argument: the standard is meant to accelerate the development of safe AI, not to suppress AI.
Evaluate whether safety culture is structural or performative
Determine whether the organization's safety practices are structural (required by external standards with enforcement mechanisms) or performative (internal commitments with no external verification). Structural safety cultures survive economic pressure; performative ones do not.
Pro tip: The test: would safety practices persist if the economic incentive to abandon them increased by 10x? If no, the culture is performative.
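
The steps above can be read as a small decision procedure. The following Python sketch is illustrative only: `SafetyCase`, `Evidence`, `precondition_met`, and `safety_culture` are hypothetical names invented for this example, not part of Russell's proposal or of any regulatory scheme. It encodes two of the tests, the cannot-prove test and the structural-versus-performative check.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    PROOF = "proof"          # shown to a specified confidence level
    ASSERTION = "assertion"  # testing, red-teaming, or public statements only


@dataclass(frozen=True)
class SafetyCase:
    """Hypothetical record of one developer's safety claim (illustration only)."""
    evidence: Evidence
    developer_says_cannot_prove: bool
    externally_enforced: bool


def precondition_met(case: SafetyCase) -> bool:
    """Deployment precondition: safety must be backed by proof, not assertion."""
    # Cannot-prove test: an inability to prove safety never lowers the bar.
    # It is treated as a precondition failure, so the answer is "do not deploy".
    if case.developer_says_cannot_prove:
        return False
    return case.evidence is Evidence.PROOF


def safety_culture(case: SafetyCase) -> str:
    """Structural if an external body enforces the practices, else performative."""
    return "structural" if case.externally_enforced else "performative"
```

Under these assumptions, a case such as `SafetyCase(Evidence.ASSERTION, developer_says_cannot_prove=True, externally_enforced=False)` fails the precondition and is classed as performative. The sketch deliberately has no branch in which "cannot prove" leads to deployment, which is the point of the cannot-prove test.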

Checklist

Identify the closest regulatory analogue for the failure mode (nuclear, aviation, pharmaceutical)
Identify the gap between safety claim and safety proof: is safety backed by proof or by assertion?
Apply the cannot-prove test: does inability to prove safety justify waiving the standard?
Distinguish prohibition from precondition in regulatory proposals
Evaluate whether safety culture is structural (external enforcement) or performative (internal commitment)
Verify that safety standards require falsifiable proof, not just testing-and-assertion
Confirm who bears the burden of proof: developer (prove it is safe) or regulator (prove it is dangerous)

Examples

2 cases

Nuclear power regulatory model

Nuclear power plants operate under rigorous proof-of-safety requirements: containment must be proven to specified confidence levels before operation, and failure modes must be characterized and bounded. The industry did not develop this culture voluntarily; it was imposed after near-miss events and the recognition that self-regulation was insufficient for technologies with catastrophic failure modes.

Outcome: Demonstrates that proof-of-safety standards are achievable for complex, high-risk technologies and that they do not prohibit the technology; they create the conditions under which the technology can be deployed responsibly.

Industry's cannot-prove response

When regulators have proposed requiring AI developers to prove their systems cannot escape human control or develop self-preservation objectives that override human interests, the industry response has been that they do not know how to prove this. Russell's interpretation: this is an admission that the safety claim is assertion, not proof — and therefore a deployment precondition has not been met.

Outcome: Reveals the gap between safety claims (public messaging) and safety capability (what can actually be proven). Companies claiming their systems are safe while admitting they cannot prove it are making a non-falsifiable safety claim.

Common mistakes

3 traps

Accepting cannot-prove as a valid reason to waive safety requirements

The AI industry's response to proof-of-safety requirements — 'we don't know how to do that' — is frequently accepted as a valid objection. It should be treated as a deployment precondition failure: you cannot deploy what you cannot prove is safe.

Conflating safety effort with safety proof

Running extensive safety evaluations, red-teaming, and releasing safety reports is safety effort. It is not safety proof to a specified confidence level. The distinction is the same as the difference between stress-testing an aircraft component and certifying it airworthy.
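
One way to see why testing alone falls short of a specified confidence level is a standard statistical bound: after n independent failure-free trials, the upper confidence bound on the per-trial failure probability is 1 - α^(1/n), where α is one minus the confidence level. The sketch below is an illustration under stated assumptions, not Russell's method and not a certification procedure. The function names are hypothetical, and the trials are assumed to be independent and representative of deployment, which real evaluations may not be.

```python
import math


def failure_rate_upper_bound(trials: int, confidence: float = 0.95) -> float:
    """Upper bound on per-trial failure probability after `trials`
    independent trials with zero observed failures."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / trials)


def trials_needed(target_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of failure-free trials for which the bound above
    is at or below `target_rate`."""
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - alpha**(1/n) <= target_rate for n.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-target_rate))
```

With 1,000 failure-free trials, `failure_rate_upper_bound(1000)` is close to 0.003, the familiar "rule of three". For an illustrative target of one failure in a billion (`trials_needed(1e-9)`), the required number of failure-free trials is on the order of three billion. That is why high-assurance fields rely on analysis and design arguments in addition to testing, and why "we tested it a lot and it seems fine" does not amount to proof at that level.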

Treating voluntary commitments as structural safety

Industry safety pledges, voluntary frameworks, and CEO commitments are performative safety mechanisms. They do not survive economic pressure on the scale of the $15 quadrillion incentive to deploy. Structural safety requires external enforcement with deployment consequences.

Origin story

How this framework came to be

Russell has advocated for nuclear/aviation-parallel safety standards for AI throughout his safety research, building on the observation that aviation and nuclear power — both technologies with catastrophic failure modes — developed robust safety cultures and regulatory frameworks precisely because deployment without safety proof was not permitted. The aviation analogy is particularly sharp: an aircraft manufacturer cannot certify a plane is airworthy by saying 'we tested it a lot and it seems fine.' They must prove it to a specified standard. Russell applies this directly to AI deployment.

Source

Traced to primary

Source · PODCAST

An AI Expert Warning: 6 People Are Quietly Deciding Humanity's Future!

Stuart Russell · 2025
