Spatiotemporal Flexibility
Shift demand in time and space to unlock stranded capacity instead of building more supply.
Spatiotemporal Flexibility is a demand-side operating framework for matching elastic workloads to constrained shared infrastructure. Instead of treating supply as the only lever, it asks: how flexible can demand be in *when* it runs (temporal) and *where* it runs (spatial)? The insight is that most shared systems—power grids, networks, compute clusters—are sized for rare peak events, leaving roughly half their capacity unused on average. If even a small fraction of demand can shift, vast stranded capacity becomes usable without new build-out.
The framework decomposes flexibility into two dimensions. Temporal flexibility classifies workloads as batchable (training, simulations, deep research) or real-time, then pauses or slows the batchable portion when the system is stressed and accelerates it when capacity is abundant. Spatial flexibility recognises that some real-time workloads, such as a chatbot query, cannot pause but can be routed over a fast network, typically within tens of milliseconds, to wherever capacity is currently free.
Operationally, you need three layers: a classifier that tags each unit of demand with its flexibility profile, a signal feed from the constrained system telling you when stress is imminent, and an orchestrator that pauses, slows, or relocates work to match. Done well, modest flexibility (single-digit percent of the year) unlocks order-of-magnitude capacity gains and lets cheap, intermittent supply integrate without destabilising the system.
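The three layers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the framework's actual implementation: the names `Profile`, `Workload`, `Decision`, `PROFILE_BY_KIND`, `classify`, `orchestrate` and `site_power` are all hypothetical, the classifier is a plain lookup table, the signal feed is reduced to a dictionary of per-site stress flags, and "shiftable" is read here as "real-time but relocatable to another site".

```python
from dataclasses import dataclass
from enum import Enum


class Profile(Enum):
    BATCHABLE = "batchable"  # can pause or slow: training, simulations
    SHIFTABLE = "shiftable"  # real-time, but can be routed to another site
    FIXED = "fixed"          # must run where and when it was requested


@dataclass(frozen=True)
class Workload:
    name: str
    kind: str        # hypothetical label, e.g. "training"
    site: str
    power_kw: float


@dataclass(frozen=True)
class Decision:
    name: str
    action: str      # "run", "slow", or "relocate"
    site: str
    power_kw: float


# Hypothetical lookup table standing in for the classifier layer.
PROFILE_BY_KIND = {
    "training": Profile.BATCHABLE,
    "simulation": Profile.BATCHABLE,
    "chat_inference": Profile.SHIFTABLE,
}


def classify(workload: Workload) -> Profile:
    """Layer 1: tag a workload. Unknown kinds default to FIXED, the safe choice."""
    return PROFILE_BY_KIND.get(workload.kind, Profile.FIXED)


def orchestrate(workloads, site_stressed, slow_factor=0.5):
    """Layer 3: decide what each workload does, given the layer 2 stress signal.

    site_stressed maps site name -> True if that site's feed reports stress.
    """
    free_sites = sorted(s for s, stressed in site_stressed.items() if not stressed)
    decisions = []
    for w in workloads:
        profile = classify(w)
        stressed = site_stressed.get(w.site, False)
        if stressed and profile is Profile.BATCHABLE:
            # Temporal flexibility: slow the job rather than stopping it outright.
            decisions.append(Decision(w.name, "slow", w.site, w.power_kw * slow_factor))
        elif stressed and profile is Profile.SHIFTABLE and free_sites:
            # Spatial flexibility: route to the first unstressed site.
            decisions.append(Decision(w.name, "relocate", free_sites[0], w.power_kw))
        else:
            # Unstressed site, fixed work, or nowhere to move: run as requested.
            decisions.append(Decision(w.name, "run", w.site, w.power_kw))
    return decisions


def site_power(decisions, site):
    """Total power drawn at one site under a set of decisions."""
    return sum(d.power_kw for d in decisions if d.site == site)


# Made-up workloads and sites, for illustration only.
jobs = [
    Workload("train-1", "training", "site_a", 100.0),
    Workload("chat-1", "chat_inference", "site_a", 40.0),
    Workload("ledger-1", "billing", "site_a", 10.0),
]
plan = orchestrate(jobs, {"site_a": True, "site_b": False})
print(site_power(plan, "site_a"))  # 60.0
```

In this toy run the stressed site drops from 150 kW to 60 kW: the training job is slowed, the chat query moves to the unstressed site, and the unclassified job runs untouched. The sketch deliberately leaves out things a real orchestrator would need, such as checking that the destination site has room, and speeding batch work back up once the stress signal clears.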
- Treat demand, not just supply, as the primary lever when infrastructure is constrained—half of most shared systems' capacity sits idle on average.
- Classify every workload by its flexibility profile (batchable in time, shiftable in space, or fixed) before deciding how to schedule or route it.
- Use temporal flexibility to ride out short, predictable peaks by pausing or slowing non-urgent work, then sprinting when headroom returns.
- Use spatial flexibility to relocate real-time work to wherever capacity is currently abundant, exploiting fast networks as virtual transmission.
- Modest flexibility unlocks disproportionate capacity—being flexible less than 2% of the year can free up enough headroom to multiply usable supply.
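The arithmetic behind the last point can be illustrated with a toy load profile. The sketch below uses synthetic numbers, not measured data, and the function names `peak_headroom` and `curtailed_fraction` are hypothetical. It compares how much new load fits if that load can never curtail against how often a larger load would have to back off.

```python
def peak_headroom(hourly_load, capacity):
    """New always-on load that fits without ever exceeding capacity."""
    return capacity - max(hourly_load)


def curtailed_fraction(hourly_load, capacity, new_load):
    """Fraction of hours in which new_load would have to curtail fully or partly."""
    over = sum(1 for load in hourly_load if load + new_load > capacity)
    return over / len(hourly_load)


# Synthetic profile, not measured data: 98 ordinary hours and 2 peak hours.
profile = [50.0] * 98 + [95.0] * 2
capacity = 100.0

print(peak_headroom(profile, capacity))             # 5.0
print(curtailed_fraction(profile, capacity, 40.0))  # 0.02
```

On this made-up system, which averages about 51% utilisation, an inflexible newcomer can add only 5 units, while a newcomer willing to curtail in 2% of hours can add 40. Real systems have smoother load curves, so the ratio will differ, but the shape of the argument is the same.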
The framework was developed by Varun Sivaram and the Emerald AI team and demonstrated in May 2025 in Phoenix, Arizona, where a 256-GPU cluster at an Oracle data centre cut its power draw by 25% for three hours during peak grid demand while still meeting AI workload performance thresholds.