Proof of Safety Standard
Require AI developers to prove safety the way we require it of nuclear plants and aircraft
The Proof of Safety Standard applies the regulatory logic of nuclear power and commercial aviation to AI development: before a technology with extinction-capable risk is deployed, its developers must prove it is safe to a specified confidence level. This is not a prohibition on AI development — it is a precondition for deployment, exactly as aircraft must pass certification before carrying passengers.
Russell's argument is that the AI industry's response to this proposal — 'we don't know how to prove safety, so you can't have that rule' — is structurally identical to a nuclear plant operator saying 'we don't know how to prove containment, so you can't require it.' It is the industry asserting that its inability to meet a safety standard is a valid reason to not have the standard, rather than a reason to not deploy until the standard can be met.
The framework distinguishes between prohibition and precondition. Russell is not arguing for a ban on AI. He is arguing for a deployment precondition: prove safety at the confidence levels we require of nuclear plants. If you cannot prove it, do not deploy. The industry's resistance to this standard reveals the gap between safety claims and safety capability — companies are claiming their systems are safe while simultaneously admitting they cannot prove it.
- Technologies with catastrophic failure modes require proof of safety before deployment, not after incidents.
- An industry's inability to prove its technology is safe is a reason to delay deployment, not a reason to waive the safety requirement.
- Voluntary safety commitments are structurally insufficient when the economic incentive to deploy is estimated at $15 quadrillion.
- Proof of safety must be certified at the same confidence levels required of nuclear plants and commercial aircraft.
- Claiming safety while being unable to prove safety is a form of regulatory capture — the burden of proof has been silently inverted.
- Map the failure mode to existing regulatory analogues
  Identify the closest regulatory precedent for the failure mode in question. For AI with extinction-capable risk, the analogues are nuclear power and commercial aviation; both have established proof-of-safety frameworks that can be adapted.
  Pro tip: The closer the failure mode analogue, the more directly the existing regulatory framework can be applied. AI self-preservation leading to human harm maps to nuclear containment failure.
- Identify the safety claim vs. safety proof gap
  For any AI system with claimed safety properties, determine whether the claim is backed by proof to a specified confidence level or by testing and assertion. The gap between these is the regulatory exposure. Safety by assertion is not equivalent to safety by proof.
  Pro tip: Ask, 'What would it take to falsify this safety claim?' If the answer is 'we'd have to run it and see what happens,' the claim is assertion, not proof.
- Apply the cannot-prove test
  If a developer says they cannot prove their system is safe to the required standard, treat this as a failure to meet the deployment precondition rather than a reason to lower the standard. The correct response is: do not deploy until you can prove it.
  Pro tip: Russell's formulation: 'They are literally saying humanity has no right to protect itself from us.' This reframes the cannot-prove response as an assertion of right rather than a technical limitation.
  Warning: The cannot-prove response is frequently used to shift the burden of proof from the developer (prove it is safe) to the regulator (prove it is dangerous).
- Distinguish prohibition from precondition
  When evaluating regulatory proposals, determine whether they are prohibitions (you cannot build this technology) or preconditions (you cannot deploy until you meet this standard). Russell's proposal is a precondition: pro-innovation and pro-development, but requiring safety proof before deployment.
  Pro tip: Precondition framing neutralizes the 'this will kill innovation' counter-argument, because the standard accelerates the development of safe AI rather than suppressing AI.
- Evaluate whether safety culture is structural or performative
  Determine whether the organization's safety practices are structural (required by external standards with enforcement mechanisms) or performative (internal commitments with no external verification). Structural safety cultures survive economic pressure; performative ones do not.
  Pro tip: The test is whether safety practices would persist if the economic incentive to abandon them increased by 10x. If not, the culture is performative.
Nuclear power plants operate under rigorous proof-of-safety requirements: containment must be proven to specified confidence levels before operation, and failure modes must be characterized and bounded. The industry did not develop this culture voluntarily. It was imposed after near-miss events and the recognition that self-regulation was insufficient for technologies with catastrophic failure modes.
When regulators have proposed requiring AI developers to prove their systems cannot escape human control or develop self-preservation objectives that override human interests, the industry response has been that they do not know how to prove this. Russell's interpretation: this is an admission that the safety claim is assertion, not proof — and therefore a deployment precondition has not been met.
Russell has advocated for nuclear/aviation-parallel safety standards for AI throughout his safety research, building on the observation that aviation and nuclear power — both technologies with catastrophic failure modes — developed robust safety cultures and regulatory frameworks precisely because deployment without safety proof was not permitted. The aviation analogy is particularly sharp: an aircraft manufacturer cannot certify a plane is airworthy by saying 'we tested it a lot and it seems fine.' They must prove it to a specified standard. Russell applies this directly to AI deployment.
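The gap between 'we tested it a lot' and proof to a specified standard can be made concrete with a small calculation. The sketch below is an illustration, not part of Russell's proposal. It assumes trials are independent, that every trial is failure-free, and that the target is a per-trial failure probability. The function name `failure_free_trials_needed` and the example target of 1e-9 per operating hour are assumptions chosen for illustration, not figures taken from the text above.

```python
import math


def failure_free_trials_needed(max_failure_prob: float, confidence: float) -> int:
    """Smallest number of consecutive failure-free independent trials needed to
    claim, at the given confidence, that the per-trial failure probability is
    below max_failure_prob.

    Reasoning: if the true failure probability were p, the chance of seeing
    n failure-free trials would be (1 - p)**n. We need that chance to be at
    most (1 - confidence), so n >= ln(1 - confidence) / ln(1 - p).
    """
    if not (0.0 < max_failure_prob < 1.0):
        raise ValueError("max_failure_prob must be strictly between 0 and 1")
    if not (0.0 < confidence < 1.0):
        raise ValueError("confidence must be strictly between 0 and 1")
    # log1p(-p) computes ln(1 - p) accurately even when p is tiny.
    n = math.log(1.0 - confidence) / math.log1p(-max_failure_prob)
    return math.ceil(n)


if __name__ == "__main__":
    # Illustrative targets only (hypothetical values, not from the source).
    for target in (1e-2, 1e-6, 1e-9):
        trials = failure_free_trials_needed(target, confidence=0.95)
        print(f"target failure probability {target:g}: {trials:,} failure-free trials")
```

Under these assumptions the required number of failure-free trials grows roughly as 3 divided by the target failure probability at 95% confidence, so a target around one in a billion per hour would need on the order of billions of failure-free hours. This is one way to see why testing and assertion alone cannot stand in for proof when the required confidence level is very high.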