FINANCE

Ongoing practice · 87% confidence

The Long-Run Data Minimum

Thirty years of data tells you almost nothing — you need a century across many countries

data requirements · historical returns · long-term investing · financial research · statistical validity

Problem it solves

Calibrating return expectations from datasets that are too short to distinguish signal from noise

Best for

Investors and analysts evaluating any claimed long-run return series, regulators setting evidence-based return assumptions, researchers designing financial datasets

Not ideal for

Day traders and short-term momentum investors for whom 125-year evidence is irrelevant to their holding period

Overview

Why this framework exists

The central methodological claim of the DMS (Dimson, Marsh, and Staunton) research program is that short-run financial data, even data spanning 30-40 years, is insufficient to form reliable conclusions about structural asset-class relationships like the equity risk premium. The reason is not just statistical: short windows coincide with specific structural regimes that may not recur. A 40-year window starting in 1980 captures the entire decline of interest rates from 16% to near-zero, an unrepeatable tailwind for bond prices that inflates absolute bond returns and flatters bonds in any comparison with equities.

Staunton uses the example of a 'guess the weight of the cake' competition to make the epistemological point: without intrinsic capacity to estimate, the best strategy is to average all available evidence — not to anchor on the most recent data point. For financial return estimation, 'all available evidence' means 125+ years across 35+ countries with explicit survivorship correction. Anything less gives undue weight to recent structural regimes.

The practical implication is a data-quality threshold below which return estimates should not be treated as reliable anchors. Bitcoin fails this threshold not because of any ideological objection to cryptocurrency, but because even several decades of data is insufficient to separate structural signal from noise — and adding it based on strong recent performance introduces success bias. The same logic applies to hedge fund data (several decades is not enough), sector rotation strategies (most lack 100-year evidence across countries), and any asset class where the data series was started because it was doing well.

Core principles

5 total

A 30-40 year return series typically spans only one structural regime and tells you almost nothing about long-run asset-class behavior.
A credible long-run return estimate requires at least 100 years of data across multiple countries to average out country-specific noise.
Any return series started because an asset was performing well is contaminated by success bias from the outset.
Structural tailwinds embedded in short windows — like 40 years of declining interest rates — cannot be extrapolated and must be stripped out.
The appropriate data minimum for evaluating alternative asset classes is not shorter than for traditional asset classes — it is longer, because the data is noisier.
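
The purely statistical part of these principles can be sketched with the standard error of a sample mean. The snippet below is a minimal illustration, assuming independent annual returns and a hypothetical 20% annual volatility (both are assumptions, not figures from the source). It ignores the regime problem entirely, which the framework treats as the more serious issue.

```python
import math

ASSUMED_VOLATILITY = 0.20  # hypothetical annual standard deviation, not a DMS figure


def standard_error_of_mean(annual_volatility: float, years: int) -> float:
    """Standard error of the average of `years` independent annual returns."""
    return annual_volatility / math.sqrt(years)


for years in (30, 40, 125):
    se = standard_error_of_mean(ASSUMED_VOLATILITY, years)
    # 1.96 standard errors gives an approximate 95% band around the sample mean.
    print(f"{years:>3} years: mean return known to within +/- {1.96 * se:.1%}")
```

Under these assumptions the band is roughly 7 percentage points wide on each side at 30 years and still about 3.5 points at 125 years, which is why the framework also leans on cross-country evidence rather than on length alone.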

Steps

5 steps

Identify the effective start date of any return series
For every return series you use, determine when the data actually begins and why. If the series was started or back-filled because the asset was doing well, or if the start date coincides with an unusual structural regime, the series is unreliable as a forward-looking anchor.
Pro tip: The DMS standard: series must start at 1900 or as early as surviving records allow, regardless of whether that starting point flatters or hurts the asset class.
Warning: Many widely-cited alternative asset return series begin in the 1990s or 2000s, precisely when the assets in question started performing well. This is selection bias encoded in the data start date.
Check whether the data window spans multiple structural regimes
A valid long-run return series should include at least two complete interest rate cycles, at least one major deflationary and inflationary episode, and at least one period of financial crisis severe enough to eliminate some market participants. If the window does not include all of these, it is not long enough to anchor expected returns.
Pro tip: The DMS 125-year series covers two world wars, multiple inflationary and deflationary episodes, the Great Depression, and the 2008 financial crisis. That is the minimum credible set of stress events.
Require cross-country evidence before treating a return as structural
A return pattern visible in one country may reflect that country's specific history, not a universal structural relationship. Before treating any asset-class return as a reliable forward-looking anchor, verify it holds across multiple countries with different institutional arrangements.
Pro tip: The equity risk premium holds across the DMS 35-country dataset. That cross-country validation is what makes it a structural claim rather than a US-specific observation.
Warning: Country-specific return records, even long ones, are unreliable guides to forward returns in that country, let alone globally.
Apply a structural-regime discount to short-window estimates
When forced to use a shorter data window — because 100-year data is unavailable — explicitly identify the dominant structural regime and discount accordingly. If interest rates were falling throughout the window, discount bond returns. If the market was in a secular bull run, discount equity returns. State the discount and its rationale.
Warning: Regulators typically resist structural discounts because they reduce the allowable returns for the utilities they oversee, but the intellectual case for the discount is strong regardless of the political resistance.
Default to the world-index estimate when country data is insufficient
When country-specific data is too short or too volatile to be reliable, use the DMS world-index estimate as the base case and apply a country-specific adjustment only if there is strong evidence for a structural difference. This is conservative, robust, and consistent with the principle that diversification is the dominant strategy.
Pro tip: The world-index real equity return (~5.6% annualized) is the most survivorship-corrected, multi-country, long-run estimate available. It should be the prior from which all country-specific adjustments are made.
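
The last two steps can be sketched as a pair of small helpers. This is an illustrative sketch, not a DMS procedure: the names `Estimate`, `discounted`, and `base_case` are hypothetical, the size of any discount or adjustment is left to the analyst, and only the ~5.6% world-index figure comes from the text above.

```python
from dataclasses import dataclass

WORLD_REAL_EQUITY_RETURN = 0.056  # ~5.6% annualized, as quoted in the final step


@dataclass
class Estimate:
    value: float     # annualized real return used as the forward anchor
    rationale: str   # stated reason, since the discount must be made explicit


def discounted(raw_return: float, regime_discount: float, regime: str) -> Estimate:
    """Subtract an explicit regime discount from a short-window estimate."""
    note = f"raw {raw_return:.1%} less {regime_discount:.1%} for {regime}"
    return Estimate(raw_return - regime_discount, note)


def base_case(adjustment: float = 0.0, strong_evidence: bool = False) -> Estimate:
    """Start from the world-index prior; adjust only on strong structural evidence."""
    if strong_evidence:
        return Estimate(WORLD_REAL_EQUITY_RETURN + adjustment,
                        "world-index prior plus documented country adjustment")
    return Estimate(WORLD_REAL_EQUITY_RETURN, "world-index prior, no adjustment")


# Hypothetical inputs: an 8% short-window return and a 3-point discount.
print(discounted(0.08, 0.03, "falling interest rates"))
print(base_case(adjustment=0.01))  # adjustment ignored without strong evidence
```

Returning the rationale alongside the number keeps the discount visible to anyone reviewing the assumption later.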

Checklist

Does the return series I am using extend back at least 50 years, and ideally 100 years?
Does the window include at least one period of high inflation and one period of deflation?
Does the window include at least one major financial crisis that eliminated significant market participants?
Was the series started at a neutral point in the asset class's history, or when performance was already strong?
Is there cross-country evidence for this return pattern, or is it based on a single country's experience?
Have I identified and discounted the structural tailwinds embedded in the historical window?
Am I using the world-index estimate as a starting prior before applying country-specific adjustments?
Would a financial historian reviewing this dataset describe it as survivorship-corrected and structurally comprehensive?
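
The mechanical items on this checklist can be encoded as a simple screen. The sketch below is one possible encoding, assuming a hypothetical `ReturnSeries` record whose fields the analyst fills in by hand. The judgment-based items (regime discounts, the historian's review) are deliberately left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ReturnSeries:
    name: str
    start_year: int
    end_year: int
    covers_high_inflation: bool
    covers_deflation: bool
    covers_major_crisis: bool
    started_at_neutral_point: bool
    countries: int


def failed_checks(series: ReturnSeries) -> list[str]:
    """Return the checklist items this series fails (empty list means it passes)."""
    failures = []
    if series.end_year - series.start_year < 50:
        failures.append("shorter than 50 years")
    if not series.covers_high_inflation:
        failures.append("no high-inflation period")
    if not series.covers_deflation:
        failures.append("no deflationary period")
    if not series.covers_major_crisis:
        failures.append("no major financial crisis")
    if not series.started_at_neutral_point:
        failures.append("series started when performance was already strong")
    if series.countries < 2:
        failures.append("single-country evidence only")
    return failures


# Hypothetical example: a one-country index launched in 1995 after a strong run.
example = ReturnSeries("hypothetical alternative-asset index", 1995, 2025,
                       False, False, True, False, 1)
for failure in failed_checks(example):
    print("FAIL:", failure)
```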

Examples

4 cases

Bitcoin exclusion from the DMS database

When asked whether Bitcoin would eventually join the DMS asset classes, Marsh gives a two-part answer: (1) the data series is too short — even several decades would not be enough; (2) adding it now based on strong recent performance would be success bias. The discipline of the framework requires ignoring recent performance as a criterion for inclusion.

Outcome: This is the purest illustration of the data-minimum principle. A genuinely novel asset class cannot be evaluated for expected-return purposes without multi-decade, multi-regime evidence, and acquiring that evidence takes time that cannot be shortened by recent performance.

UK government bond returns 1980-2020 as a misleading baseline

An investor who started tracking UK gilt returns in 1980, when yields were approximately 16%, would have recorded spectacular bond performance through 2020 as yields fell to near-zero. That entire return series captures a single structural regime — declining rates — not long-run bond behavior.

Outcome: Any forward projection anchored to this window overstates expected bond returns by a wide margin, because the tailwind cannot repeat. The long-run DMS bond data, which includes the inflationary periods when bonds were 'a terrible thing for the average country', gives a much more conservative and accurate forward anchor.
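
A rough way to see why the tailwind cannot repeat is to reprice a single bond at a lower yield. The sketch below uses a hypothetical 20-year annual-coupon bond issued at par at 16% and an instantaneous fall in yields; these are illustrative parameters, not actual gilt data, and the function name `bond_price` is an assumption.

```python
def bond_price(coupon_rate: float, market_yield: float, years: int,
               face: float = 100.0) -> float:
    """Present value of an annual-coupon bond, discounting at `market_yield`."""
    coupon = coupon_rate * face
    coupons_pv = sum(coupon / (1 + market_yield) ** t for t in range(1, years + 1))
    return coupons_pv + face / (1 + market_yield) ** years


at_issue = bond_price(0.16, 0.16, 20)    # priced at par when yield equals coupon
after_fall = bond_price(0.16, 0.08, 20)  # same bond if yields halve overnight
print(f"price at 16% yield: {at_issue:.1f}")
print(f"price at  8% yield: {after_fall:.1f}")
print(f"one-off capital gain: {after_fall / at_issue - 1:.0%}")
```

The capital gain comes entirely from the change in yield. Once yields are near zero there is no comparable fall left to harvest, so a return series dominated by that gain says little about returns from the new starting point.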

The Second Bank of the United States omission

Pre-1926 US equity data series omitted the Second Bank of the United States — a company that grew to ~30% of market capitalization before failing. Its omission made early US equity returns look systematically better than they were. The correction, once identified, shifted the long-run return estimate downward.

Outcome: This example illustrates that even well-researched long-run series can contain errors of omission that bias estimates. The DMS team's active data correction program is the institutional response to this vulnerability: it treats the dataset as continuously improvable rather than fixed.

Austria's copper-plate archive research

To construct a valid Austrian equity series beginning in 1900, the DMS team needed data that had never been digitized. A PhD student climbed ladders in a Viennese stock exchange archive to transcribe handwritten price records from books stored at height. The result was a data series that could not have been constructed remotely — and which captures Austrian market performance across the Austro-Hungarian collapse, two world wars, and hyperinflation.

Outcome: The 'financial archaeology' required to build credible long-run data series is expensive and slow. This is why the DMS 35-country dataset is unique: the barrier to entry is not analytical but archival, and the team has been building it for 40 years.

Common mistakes

5 traps

Using the most recent 30-40 years as representative of long-run returns

The 1980-2020 period embedded a once-in-a-generation structural tailwind from falling interest rates. Bond investors made money not because bonds are reliably valuable over the long run, but because rates fell from 16% to near-zero. Using this window to set expected bond returns dramatically overstates forward-looking bond performance.

Starting a return series when an asset begins performing well

Adding Bitcoin, Lego, or any alternative asset to a dataset when it has already performed strongly is success bias. The DMS approach requires series to start before performance is known — ideally at 1900 or equivalent. Any series that starts at the beginning of a strong run is not a long-run return series; it is a performance track record.
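
Success bias can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions: every simulated asset has the same true mean return (5%) and volatility (20%), and a series is only "started" for assets whose first ten years looked strong. All parameters and the function name `simulate_success_bias` are hypothetical.

```python
import random
import statistics


def simulate_success_bias(n_assets: int = 2000, years: int = 20,
                          true_mean: float = 0.05, vol: float = 0.20,
                          seed: int = 0) -> tuple[float, float]:
    """Return (mean of back-filled histories, mean of post-selection years)."""
    rng = random.Random(seed)
    half = years // 2
    backfilled, later = [], []
    for _ in range(n_assets):
        returns = [rng.gauss(true_mean, vol) for _ in range(years)]
        # Only assets that already look strong get noticed and added.
        if statistics.fmean(returns[:half]) > true_mean + 0.05:
            backfilled.append(statistics.fmean(returns))     # includes the lucky run
            later.append(statistics.fmean(returns[half:]))   # after being noticed
    return statistics.fmean(backfilled), statistics.fmean(later)


backfilled_mean, later_mean = simulate_success_bias()
print(f"true mean return:            {0.05:.1%}")
print(f"back-filled selected series: {backfilled_mean:.1%}")
print(f"years after selection:       {later_mean:.1%}")
```

The selected assets have no real edge, yet their recorded history overstates the true mean because the lucky early years were the reason they entered the dataset.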

Treating single-country evidence as universal

The US equity premium is the most-studied in the world but represents the best-performing major market of the 20th century. Using US-only evidence to set global return expectations overstates the structural equity premium by several percentage points.

Ignoring the pre-1900 evidence on equity premia

Data back to 1800 for the US and UK shows a smaller equity premium than the 20th century, suggesting the 20th century was a structural anomaly. Ignoring this longer-run evidence leads to overestimation of the forward-looking premium.

Assuming that more recent data is more relevant than older data

The intuition that 'the world has changed since 1900' is partially correct but systematically misapplied. Modern institutional structures, central banks, and financial regulation have changed the context — but the fundamental risk-return relationship captured in 125 years of data is more robust than any 30-year window, regardless of which era it covers.

Origin story

How this framework came to be

Marsh explains the data-minimum principle most directly in response to the question of how long a return series Bitcoin would need before DMS would consider adding it. His answer: 'even for hedge funds, several decades is really not enough.' He illustrates with the UK government bond example: 30-40 years of data looks good because it captures a declining-rate tailwind, but it tells you almost nothing about what long-term bond returns will be when that structural tailwind has run out. The principle is a generalization of the survivorship-bias insight: not only must the data cover failures, it must cover full structural cycles — multiple interest rate regimes, different economic institutions, and different geopolitical configurations.

Source

Traced to primary

Source · PODCAST

The Golden Age of Returns is Over

Mike Staunton & Paul Marsh · 2025
