The Long-Run Data Minimum
Thirty years of data tells you almost nothing — you need a century across many countries
The central methodological claim of the DMS research program is that short-run financial data — even data spanning 30-40 years — is insufficient to form reliable conclusions about structural asset-class relationships like the equity risk premium. The reason is not just statistical: it is that short windows coincide with specific structural regimes that may not recur. A 40-year window starting in 1980 captures the entire decline of interest rates from 16% to near-zero — an unrepeatable tailwind for bond prices that inflates both absolute bond returns and relative bond-equity comparisons.
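The purely statistical part of this argument can be made concrete with the standard error of a mean annual return, which shrinks only with the square root of the number of years. The sketch below assumes independent annual returns and a 20% annual standard deviation. Both are illustrative assumptions chosen for the example, not DMS figures.

```python
import math


def standard_error_of_mean(annual_volatility: float, years: int) -> float:
    """Standard error of the average annual return, assuming independent years."""
    return annual_volatility / math.sqrt(years)


ASSUMED_VOLATILITY = 0.20  # illustrative assumption, not a DMS figure

for years in (30, 40, 125):
    se = standard_error_of_mean(ASSUMED_VOLATILITY, years)
    # A rough 95% band is about two standard errors either side of the mean.
    print(f"{years:>3} years: standard error {se:.1%}, rough band +/- {2 * se:.1%}")
```

Under these assumptions the uncertainty band around a 30-year average is several percentage points wide, which is of the same order as the premium being estimated. A longer series narrows the band, but it does nothing about the regime problem described above, which is why length alone is not the whole requirement.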
Staunton uses the example of a 'guess the weight of the cake' competition to make the epistemological point: without intrinsic capacity to estimate, the best strategy is to average all available evidence — not to anchor on the most recent data point. For financial return estimation, 'all available evidence' means 125+ years across 35+ countries with explicit survivorship correction. Anything less gives undue weight to recent structural regimes.
The practical implication is a data-quality threshold below which return estimates should not be treated as reliable anchors. Bitcoin fails this threshold not because of any ideological objection to cryptocurrency, but because its history is far shorter than the several decades that are themselves too few to separate structural signal from noise, and because adding it on the strength of recent performance would introduce success bias. The same logic applies to hedge fund data (several decades is not enough), sector rotation strategies (most lack 100-year evidence across countries), and any asset class whose data series was started because the asset was doing well.
- A 30-40 year return series typically spans only one structural regime and tells you almost nothing about long-run asset-class behavior.
- A credible long-run return estimate requires at least 100 years of data across multiple countries to average out country-specific noise.
- Any return series started because an asset was performing well is contaminated by success bias from the outset.
- Structural tailwinds embedded in short windows — like 40 years of declining interest rates — cannot be extrapolated and must be stripped out.
- The appropriate data minimum for evaluating alternative asset classes is not shorter than for traditional asset classes — it is longer, because the data is noisier.
- Identify the effective start date of any return series.
  For every return series you use, determine when the data actually begins and why. If the series was started or back-filled because the asset was doing well, or if the start date coincides with an unusual structural regime, the series is unreliable as a forward-looking anchor.
  Pro tip: The DMS standard is that a series must start at 1900 or as early as surviving records allow, regardless of whether that starting point flatters or hurts the asset class.
  Warning: Many widely-cited alternative asset return series begin in the 1990s or 2000s, precisely when the assets in question started performing well. This is selection bias encoded in the data start date.
- Check whether the data window spans multiple structural regimes.
  A valid long-run return series should include at least two complete interest rate cycles, at least one major deflationary and inflationary episode, and at least one period of financial crisis severe enough to eliminate some market participants. If the window does not include all of these, it is not long enough to anchor expected returns.
  Pro tip: The DMS 125-year series covers two world wars, multiple inflationary and deflationary episodes, the Great Depression, and the 2008 financial crisis. That is the minimum credible set of stress events.
- Require cross-country evidence before treating a return as structural.
  A return pattern visible in one country may reflect that country's specific history, not a universal structural relationship. Before treating any asset-class return as a reliable forward-looking anchor, verify it holds across multiple countries with different institutional arrangements.
  Pro tip: The equity risk premium holds across the DMS 35-country dataset. That cross-country validation is what makes it a structural claim rather than a US-specific observation.
  Warning: Country-specific return records, even long ones, are unreliable guides to forward returns in that country, let alone globally.
- Apply a structural-regime discount to short-window estimates.
  When forced to use a shorter data window because 100-year data is unavailable, explicitly identify the dominant structural regime and discount accordingly. If interest rates were falling throughout the window, discount bond returns. If the market was in a secular bull run, discount equity returns. State the discount and its rationale.
  Warning: Regulators typically resist structural discounts because they reduce the allowable returns for the utilities they oversee, but the intellectual case for the discount is strong regardless of the political resistance.
- Default to the world-index estimate when country data is insufficient.
  When country-specific data is too short or too volatile to be reliable, use the DMS world-index estimate as the base case and apply a country-specific adjustment only if there is strong evidence for a structural difference. This is conservative, robust, and consistent with the principle that diversification is the dominant strategy.
  Pro tip: The world-index real equity return (~5.6% annualized) is the most survivorship-corrected, multi-country, long-run estimate available. It should be the prior from which all country-specific adjustments are made.
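The screening checks in the steps above can be sketched as a small checklist function. Everything here is hypothetical scaffolding for illustration: the `ReturnSeries` fields, the `screen` function, and the two-country minimum are assumptions of this sketch, not part of any DMS tooling. The 100-year and two-rate-cycle thresholds are taken from the text.

```python
from dataclasses import dataclass
from typing import List

MIN_YEARS = 100        # from the text: at least 100 years of data
MIN_RATE_CYCLES = 2    # from the text: two complete interest rate cycles
MIN_COUNTRIES = 2      # assumption: "multiple countries" read as at least two


@dataclass
class ReturnSeries:
    """Hypothetical description of a return series (all fields are assumptions)."""
    name: str
    start_year: int
    end_year: int
    countries: int
    started_after_strong_performance: bool
    rate_cycles_covered: int
    has_inflation_and_deflation: bool
    has_severe_crisis: bool


def screen(series: ReturnSeries) -> List[str]:
    """Return the reasons a series fails the screen (empty list if it passes)."""
    problems = []
    if series.end_year - series.start_year < MIN_YEARS:
        problems.append("window shorter than 100 years")
    if series.started_after_strong_performance:
        problems.append("start date reflects success bias")
    if series.rate_cycles_covered < MIN_RATE_CYCLES:
        problems.append("fewer than two complete interest rate cycles")
    if not series.has_inflation_and_deflation:
        problems.append("missing an inflationary or deflationary episode")
    if not series.has_severe_crisis:
        problems.append("no severe financial crisis in the window")
    if series.countries < MIN_COUNTRIES:
        problems.append("no cross-country evidence")
    return problems


# Two made-up series: one long and broad, one short and started after a boom.
long_series = ReturnSeries("long multi-country", 1900, 2024, 35, False, 2, True, True)
short_series = ReturnSeries("short single-market", 2010, 2024, 1, True, 0, False, False)

for s in (long_series, short_series):
    issues = screen(s)
    print(s.name, "->", "passes" if not issues else "; ".join(issues))
```

Returning the list of failed checks, rather than a single pass/fail flag, matches the instruction to state each discount and its rationale explicitly when a shorter window has to be used.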
When asked whether Bitcoin would eventually join the DMS asset classes, Marsh gives a two-part answer: (1) the data series is too short — even several decades would not be enough; (2) adding it now based on strong recent performance would be success bias. The discipline of the framework requires ignoring recent performance as a criterion for inclusion.
An investor who started tracking UK gilt returns in 1980, when yields were approximately 16%, would have recorded spectacular bond performance through 2020 as yields fell to near-zero. That entire return series captures a single structural regime — declining rates — not long-run bond behavior.
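The mechanism behind that tailwind is ordinary bond arithmetic: when yields fall, the fixed coupons of an existing bond are discounted at a lower rate and its price rises. The sketch below is a deliberately stylized illustration, not a reconstruction of gilt history. It assumes a hypothetical 20-year bond with annual coupons, issued at par with a 16% coupon, and reprices it at a 1% yield in one step.

```python
def bond_price(face: float, coupon_rate: float, yield_rate: float, years: int) -> float:
    """Present value of annual coupons plus face value, discounted at yield_rate."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + yield_rate) ** t for t in range(1, years + 1))
    pv_face = face / (1 + yield_rate) ** years
    return pv_coupons + pv_face


# Hypothetical bond: 20 years, 16% annual coupon, face value 100.
price_at_issue = bond_price(100, 0.16, 0.16, 20)   # yield equals coupon, so priced at par
price_after_fall = bond_price(100, 0.16, 0.01, 20)  # same cash flows, much lower yield

print(f"price at 16% yield: {price_at_issue:.1f}")
print(f"price at  1% yield: {price_after_fall:.1f}")
```

In this stylized case the repriced bond is worth more than three times par. A gain of that kind can only be earned once: with yields already near zero there is no comparable fall left to repeat it, which is why the window cannot be extrapolated.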
US equity data series covering the years before 1926 omitted the Second Bank of the United States, a company that grew to ~30% of market capitalization before failing. Its omission made early US equity returns look systematically better than they were. The correction, once identified, shifted the long-run return estimate downward.
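A small worked example shows why omitting one large failure matters so much for a capitalization-weighted return. The numbers below are hypothetical: the ~30% weight comes from the text, while the 8% return on surviving stocks and the total loss on the failed company are assumptions made only for illustration.

```python
from typing import List, Tuple


def cap_weighted_return(holdings: List[Tuple[float, float]]) -> float:
    """Capitalization-weighted return from (weight, return) pairs.

    Weights are renormalized, so a subset of the market can be passed in.
    """
    total_weight = sum(weight for weight, _ in holdings)
    return sum(weight * ret for weight, ret in holdings) / total_weight


survivors = (0.70, 0.08)     # assumed: surviving stocks return +8%
failed_bank = (0.30, -1.00)  # weight from the text; total loss is an assumption

with_failure = cap_weighted_return([survivors, failed_bank])
survivors_only = cap_weighted_return([survivors])

print(f"index including the failure: {with_failure:.1%}")
print(f"index of survivors only:     {survivors_only:.1%}")
```

With these assumed inputs the survivors-only index shows a gain in a period when the full market lost roughly a quarter of its value. The true historical figures differ, but the direction of the bias is the same.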
To construct a valid Austrian equity series beginning in 1900, the DMS team needed data that had never been digitized. A PhD student climbed ladders in a Viennese stock exchange archive to transcribe handwritten price records from books stored at height. The result was a data series that could not have been constructed remotely — and which captures Austrian market performance across the Austro-Hungarian collapse, two world wars, and hyperinflation.
Marsh explains the data-minimum principle most directly in response to the question of how long a return series Bitcoin would need before DMS would consider adding it. His answer: 'even for hedge funds, several decades is really not enough.' He illustrates with the UK government bond example: 30-40 years of data looks good because it captures a declining-rate tailwind, but it tells you almost nothing about what long-term bond returns will be when that structural tailwind has run out. The principle is a generalization of the survivorship-bias insight: not only must the data cover failures, it must cover full structural cycles — multiple interest rate regimes, different economic institutions, and different geopolitical configurations.