Spatiotemporal Flexibility

Shift demand in time and space to unlock stranded capacity instead of building more supply.

Problem it solves

Constrained shared infrastructure where adding supply is slow and expensive but average utilisation is far below peak, leaving stranded capacity unused.

Best for

Operators facing constrained shared infrastructure where peak demand lasts only a few hours per period and average utilisation is far below capacity.

Not ideal for

Workloads that are entirely latency-sensitive and non-relocatable, and that cannot tolerate any pause, slowdown, or geographic shift.

Overview

Why this framework exists

Spatiotemporal Flexibility is a demand-side operating framework for matching elastic workloads to constrained shared infrastructure. Instead of treating supply as the only lever, it asks: how flexible can demand be in *when* it runs (temporal) and *where* it runs (spatial)? The insight is that most shared systems—power grids, networks, compute clusters—are sized for rare peak events, leaving roughly half their capacity unused on average. If even a small fraction of demand can shift, vast stranded capacity becomes usable without new build-out.

The framework decomposes flexibility into two dimensions. Temporal flexibility classifies workloads as batchable (training, simulations, deep research) versus real-time, then pauses or slows the batchable portion when the system is stressed and accelerates it when capacity is abundant. Spatial flexibility recognises that some real-time workloads, such as a chatbot query, cannot pause but can be routed across a network almost instantly to wherever capacity is currently free.
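
To make the two dimensions concrete, here is a minimal Python sketch of the per-workload decision. Everything named in it (`Profile`, `Workload`, `choose_action`, and the sites `site_a` and `site_b`) is hypothetical and chosen for illustration; the source does not describe an implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Profile(Enum):
    BATCHABLE = "batchable"      # can pause or slow (training, simulations)
    RELOCATABLE = "relocatable"  # real-time, but can run at another site
    FIXED = "fixed"              # must run here and now


@dataclass
class Workload:
    name: str
    profile: Profile


def choose_action(
    workload: Workload,
    local_stressed: bool,
    site_headroom: dict[str, float],  # assumed: free-capacity fraction per site
    local_site: str,
) -> str:
    """Return a human-readable action for one workload."""
    # Temporal flexibility: batchable work yields under stress, sprints otherwise.
    if workload.profile is Profile.BATCHABLE:
        return "slow or pause" if local_stressed else "accelerate"
    # Spatial flexibility: relocatable work moves to the site with most headroom.
    if workload.profile is Profile.RELOCATABLE and local_stressed:
        best = max(site_headroom, key=site_headroom.get)
        if best != local_site:
            return f"route to {best}"
    # Fixed work, or no stress: leave it where it is.
    return "run locally"


headroom = {"site_a": 0.05, "site_b": 0.40}
print(choose_action(Workload("chat", Profile.RELOCATABLE), True, headroom, "site_a"))
```

In this sketch, fixed workloads are never touched, which is why the framework suits portfolios where at least some demand is batchable or relocatable.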

Operationally, you need three layers: a classifier that tags each unit of demand with its flexibility profile, a signal feed from the constrained system telling you when stress is imminent, and an orchestrator that pauses, slows, or relocates work to match. Done well, modest flexibility (single-digit percent of the year) unlocks order-of-magnitude capacity gains and lets cheap, intermittent supply integrate without destabilising the system.
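
One way to picture how the three layers hand off to each other is the sketch below. It assumes the classifier's output is a simple `flexible` flag on each job, that the signal feed delivers a requested fractional power cut, and that the orchestrator spreads that cut proportionally across flexible jobs. The names `Job` and `plan_reduction` and the kilowatt figures are illustrative assumptions, not details from the source.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    power_kw: float  # hypothetical current draw
    flexible: bool   # output of the classifier layer


def plan_reduction(jobs: list[Job], reduction_fraction: float) -> dict[str, float]:
    """Return a new power level per job that cuts total draw by the
    requested fraction, taking the whole cut from flexible jobs."""
    total = sum(j.power_kw for j in jobs)
    flexible_total = sum(j.power_kw for j in jobs if j.flexible)
    cut = total * reduction_fraction
    if cut > flexible_total:
        raise ValueError("flexible load is too small to cover the requested cut")
    # Scale every flexible job by the same factor; inflexible jobs are untouched.
    scale = 1.0 - cut / flexible_total if flexible_total else 1.0
    return {j.name: j.power_kw * scale if j.flexible else j.power_kw for j in jobs}


jobs = [Job("training", 60.0, True), Job("chat", 40.0, False)]
plan = plan_reduction(jobs, 0.25)  # signal layer asks for a 25% cut
print(plan)
```

Proportional scaling is only one possible policy; an orchestrator could just as well pause the lowest-priority jobs outright or relocate them.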

Core principles

Treat demand, not just supply, as the primary lever when infrastructure is constrained—half of most shared systems' capacity sits idle on average.
Classify every workload by its flexibility profile (batchable, shiftable in space, or fixed) before deciding how to schedule or route it.
Use temporal flexibility to ride out short, predictable peaks by pausing or slowing non-urgent work, then sprinting when headroom returns.
Use spatial flexibility to relocate real-time work to wherever capacity is currently abundant, exploiting fast networks as virtual transmission.
Modest flexibility unlocks disproportionate capacity—being flexible less than 2% of the year can free up enough headroom to multiply usable supply.
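
For a sense of scale, the "less than 2% of the year" figure can be turned into hours with a short calculation. The function name `flexible_hours` is illustrative, and the year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def flexible_hours(percent_of_year: float) -> float:
    """Hours per year during which a workload must be willing to flex."""
    return percent_of_year * HOURS_PER_YEAR / 100


hours = flexible_hours(2)
print(f"{hours:.1f} hours per year")
```

At 2% this comes to about 175 hours a year, which is the upper bound on how long a flexible workload would be asked to pause, slow, or move.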

Checklist

Audit your demand portfolio and tag each workload as batchable, shiftable in space, or fully inflexible, with acceptable performance thresholds for each.
Wire a real-time stress signal from the constrained system (utility, network, cluster) into your scheduler so flexibility is triggered automatically, not manually.
Build or adopt an orchestration layer that pauses, slows, or relocates flexible work the moment a stress signal fires and ramps it back when the signal clears.
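
The checklist items can be combined into a small throttle controller, sketched here under stated assumptions: the acceptable performance threshold from the audit is expressed as a `floor` on throttle level, a stress signal drops flexible work to that floor at once, and clearing the signal ramps work back in fixed steps. The gradual ramp and the names `next_throttle`, `run`, and `ramp_step` are assumptions for illustration, not part of the source.

```python
def next_throttle(current: float, stressed: bool, floor: float,
                  ramp_step: float = 0.25) -> float:
    """Return the next throttle level, where 1.0 means full speed.

    Drops to `floor` immediately under stress; otherwise climbs by
    `ramp_step` per tick until it reaches 1.0.
    """
    if not 0.0 <= floor <= 1.0:
        raise ValueError("floor must be between 0 and 1")
    if stressed:
        return floor
    return min(1.0, current + ramp_step)


def run(signal: list[bool], floor: float) -> list[float]:
    """Replay a sequence of stress readings and record the throttle level."""
    level = 1.0
    history = []
    for stressed in signal:
        level = next_throttle(level, stressed, floor)
        history.append(level)
    return history


print(run([False, True, True, False, False, False], floor=0.5))
# → [1.0, 0.5, 0.5, 0.75, 1.0, 1.0]
```

Because the signal drives the controller directly, no manual step sits between a stress event and the response, which is the point of the second checklist item.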

Origin story

How this framework came to be

Developed by Varun Sivaram and the Emerald AI team and demonstrated in May 2025 in Phoenix, Arizona, where a 256-GPU cluster at an Oracle data centre cut its power draw by 25% for three hours during peak grid demand while still meeting AI workload performance thresholds.

Source

Traced to primary

Source · PODCAST

How AI Can Solve Its Own Energy Crisis

Varun Sivaram
