
Sharpe ratio confidence intervals: how to quantify the noise around a Sharpe estimate
A reported Sharpe ratio of 1.0 looks definite. Statistically, it is a point estimate from a finite sample, and the true underlying Sharpe could plausibly lie anywhere within a wide confidence interval around the reported figure. Computing the interval is straightforward and tells the investor how much weight the headline figure deserves.
What Sharpe confidence intervals are
A confidence interval on the Sharpe ratio is a range constructed to contain the strategy's true underlying Sharpe at a stated confidence level. A 95% confidence interval for a reported Sharpe of 1.0 might be [0.3, 1.7]: the procedure that produced it captures the true Sharpe in 95% of repeated samples, and the observed value of 1.0 is statistically consistent with a wide range of underlying performance levels.
The interval depends on three inputs: the point estimate of the Sharpe, the sample size (number of return observations), and the higher moments of the return distribution (skewness and excess kurtosis, which affect the precision of the estimate beyond what the standard Gaussian assumption would give). Larger samples and more normal distributions both produce tighter intervals.
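The standard formula for the asymptotic standard error of the Sharpe ratio (Lo, 2002, extended to non-normal returns by Mertens, 2002) is SE(SR) = √((1 + ½ × SR² − γ × SR + (κ − 3) × SR² / 4) / n), where SR is the observed Sharpe measured at the same frequency as the observations (a monthly Sharpe for monthly returns, not the annualised figure), γ is skewness, κ is kurtosis (3 for a normal distribution), and n is the sample size. The 95% confidence interval is the point estimate plus or minus 1.96 × SE.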
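The formula translates directly into a few lines of Python. The sketch below is illustrative only: the function names `sharpe_standard_error` and `sharpe_confidence_interval` are hypothetical, and it assumes the Sharpe is expressed per observation period and that kurtosis is the raw value (3 for a normal distribution).

```python
import math


def sharpe_standard_error(sr, n, skew=0.0, kurt=3.0):
    """Asymptotic standard error of a per-period Sharpe ratio.

    sr   -- observed Sharpe, measured at the observation frequency
    n    -- number of return observations
    skew -- skewness (gamma), 0 for a normal distribution
    kurt -- raw kurtosis (kappa), 3 for a normal distribution
    """
    variance = (1.0 + 0.5 * sr**2 - skew * sr + (kurt - 3.0) * sr**2 / 4.0) / n
    return math.sqrt(variance)


def sharpe_confidence_interval(sr, n, skew=0.0, kurt=3.0, z=1.96):
    """Point estimate plus or minus z standard errors (z = 1.96 for 95%)."""
    se = sharpe_standard_error(sr, n, skew, kurt)
    return sr - z * se, sr + z * se


low, high = sharpe_confidence_interval(1.0, 36)
print(f"SE = {sharpe_standard_error(1.0, 36):.3f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Keeping the standard error and the interval as separate functions makes it easy to swap the 1.96 multiplier for another confidence level.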
How it works
For a strategy with an observed per-period Sharpe of 1.0, n = 36 monthly observations, and near-normal returns (γ ≈ 0, κ ≈ 3), the standard error is √(1.5 / 36) ≈ 0.20. The 95% confidence interval is approximately [0.6, 1.4]. The same observed Sharpe of 1.0 from n = 360 monthly observations produces SE ≈ 0.065 and a 95% CI of approximately [0.87, 1.13]. The longer sample makes the interval narrower by a factor of √(360/36) = √10 ≈ 3.2.
Negative skewness and excess kurtosis both inflate the standard error. A strategy with the same observed Sharpe of 1.0 and n = 36 but γ = −1.0 and κ = 6 has SE ≈ √((1 + 0.5 + 1.0 + 0.75) / 36) ≈ 0.30, an interval roughly 47% wider than the same Sharpe from a near-normal series. The tail behaviour of the return distribution affects the precision of the Sharpe estimate even when the headline number is identical.
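The two effects described above, sample length and tail shape, can be checked side by side. This is a minimal sketch using the same asymptotic formula; the helper name `sharpe_se` and the case labels are illustrative.

```python
import math


def sharpe_se(sr, n, skew=0.0, kurt=3.0):
    # Asymptotic standard error; kurt is raw kurtosis (3 = normal).
    return math.sqrt((1 + 0.5 * sr**2 - skew * sr + (kurt - 3) * sr**2 / 4) / n)


cases = {
    "near-normal": {"skew": 0.0, "kurt": 3.0},
    "neg. skew, fat tails": {"skew": -1.0, "kurt": 6.0},
}

for label, moments in cases.items():
    for n in (36, 360):
        se = sharpe_se(1.0, n, **moments)
        print(f"{label:>22}  n={n:>3}  SE={se:.3f}  95% half-width={1.96 * se:.2f}")

# Ratio of the two standard errors at n = 36: how much wider the interval gets.
widening = sharpe_se(1.0, 36, skew=-1.0, kurt=6.0) / sharpe_se(1.0, 36)
print(f"widening factor: {widening:.2f}")
```

The widening factor does not depend on n, because the sample size cancels in the ratio of the two standard errors.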
The interval can be used in two ways. First, descriptively: it tells the investor how much trust to place in the headline figure (a Sharpe of 1.0 with a 95% CI of [0.3, 1.7] is consistent with a true Sharpe anywhere from underwhelming to impressive). Second, inferentially: it allows tests like "is the strategy's Sharpe statistically distinguishable from 0.5?"—yes if 0.5 lies outside the interval, no if it lies inside.
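The inferential use reduces to checking whether a target value falls inside the interval. The sketch below assumes the asymptotic formula and a 95% level; `distinguishable_from` is a hypothetical helper name.

```python
import math


def sharpe_ci(sr, n, skew=0.0, kurt=3.0, z=1.96):
    # Interval from the asymptotic standard error; kurt is raw kurtosis.
    se = math.sqrt((1 + 0.5 * sr**2 - skew * sr + (kurt - 3) * sr**2 / 4) / n)
    return sr - z * se, sr + z * se


def distinguishable_from(target, sr, n, **moments):
    """True if `target` lies outside the 95% interval around the observed Sharpe."""
    low, high = sharpe_ci(sr, n, **moments)
    return not (low <= target <= high)


# Near-normal returns: is an observed 1.0 distinguishable from 0.5?
print(distinguishable_from(0.5, sr=1.0, n=36))
# Same question with negative skew and fat tails.
print(distinguishable_from(0.5, sr=1.0, n=36, skew=-1.0, kurt=6.0))
```

With the same observed Sharpe and sample size, the answer can flip once the higher moments are taken into account, because the wider interval now covers the target.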
What the evidence shows
The implication of confidence intervals for backtested Sharpe ratios is sobering. A backtest of three years of monthly data (36 observations) cannot statistically distinguish an annualised Sharpe of 1.5 from one of 0.5: with near-normal returns the annualised standard error is roughly 0.6, so the 95% intervals, about [0.3, 2.7] and [−0.6, 1.6], overlap substantially. A backtest of even ten years (120 observations) cannot reliably distinguish similar strategies whose true Sharpe ratios differ by less than 0.3.
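A rough way to check the overlap is to apply the standard-error formula at the monthly frequency and then scale back to annual terms. The sketch below assumes the quoted Sharpe ratios are annualised, that returns are near-normal and independent across months, and that annualisation uses the usual √12 scaling; `annualised_sharpe_ci` is a hypothetical helper.

```python
import math


def annualised_sharpe_ci(sr_annual, n_obs, periods_per_year=12, z=1.96):
    """95% interval for an annualised Sharpe, assuming near-normal IID returns."""
    sr_period = sr_annual / math.sqrt(periods_per_year)    # de-annualise
    se_period = math.sqrt((1 + 0.5 * sr_period**2) / n_obs)
    se_annual = se_period * math.sqrt(periods_per_year)    # re-annualise
    return sr_annual - z * se_annual, sr_annual + z * se_annual


low_a, high_a = annualised_sharpe_ci(1.5, 36)
low_b, high_b = annualised_sharpe_ci(0.5, 36)
overlap = low_a <= high_b and low_b <= high_a

print(f"Sharpe 1.5: [{low_a:.2f}, {high_a:.2f}]")
print(f"Sharpe 0.5: [{low_b:.2f}, {high_b:.2f}]")
print(f"intervals overlap: {overlap}")
```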
The widely-quoted finding from López de Prado and Bailey (2014) is that to reliably claim a Sharpe ratio above 1.0 (a 95% CI whose lower bound sits above 1.0), an investor needs roughly 100 or more monthly observations, depending on the higher moments of the return distribution. For strategies with less-favourable distributional properties, such as negative skew and fat tails, the required sample is even larger.
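The required sample size can be approximated by inverting the standard-error formula: the lower bound SR − z × SE exceeds a target once n > V × (z / (SR − target))², where V is the bracketed term of the formula. The sketch below is not the cited authors' calculation; `min_observations` is a hypothetical helper, it takes per-period Sharpe ratios, and the example inputs (an observed annualised Sharpe of 1.5 against a threshold of 1.0) are chosen purely for illustration.

```python
import math


def min_observations(sr, target, skew=0.0, kurt=3.0, z=1.96):
    """Smallest n for which the lower CI bound, sr - z * SE, exceeds `target`.

    sr and target are per-period Sharpe ratios (same frequency as the data).
    """
    if sr <= target:
        raise ValueError("observed Sharpe must exceed the target")
    variance_term = 1 + 0.5 * sr**2 - skew * sr + (kurt - 3) * sr**2 / 4
    return math.floor(variance_term * (z / (sr - target)) ** 2) + 1


# Illustrative inputs: annualised 1.5 observed vs. a 1.0 threshold, monthly data.
sr_monthly = 1.5 / math.sqrt(12)
target_monthly = 1.0 / math.sqrt(12)

n_normal = min_observations(sr_monthly, target_monthly)
n_fat_tails = min_observations(sr_monthly, target_monthly, skew=-1.0, kurt=6.0)
print(f"near-normal: {n_normal} months, negative skew and fat tails: {n_fat_tails} months")
```

The result is very sensitive to the gap between the observed Sharpe and the threshold, since the gap enters squared in the denominator.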
The practical implication is that headline Sharpe ratios from short backtests should be heavily discounted. The point estimate may be impressive, but the underlying evidence often cannot rule out a true Sharpe no better than that of a reasonable alternative.
Limitations and trade-offs
The confidence interval depends on the assumption that the return-generating process is stationary. In practice, regimes shift, market structure changes, and the strategy's underlying edge may decay or disappear over the sample window. The confidence interval describes the precision of the estimate within the sample; it does not address the question of whether the true Sharpe is the same in the future as it was in the past.
The asymptotic formula also breaks down for very small samples. For n < 20, the standard normal approximation underlying the 1.96 multiplier is no longer accurate, and a small-sample t-distribution adjustment is more appropriate. For n < 10, even the t-distribution adjustment leaves so much noise that the confidence interval is rarely informative.
For comparing strategies, the confidence interval on the Sharpe difference is more informative than the confidence intervals on each Sharpe separately. Two strategies whose individual confidence intervals overlap may still have a statistically significant Sharpe difference if their realised returns are positively correlated, a common pattern in factor strategies, because the shared noise cancels in the difference. The pairwise comparison is the right test, not the eyeball assessment of overlapping intervals.
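One way to obtain an interval on the Sharpe difference without a closed-form expression is a paired bootstrap, which is a substitute technique rather than the asymptotic formula used elsewhere in this article. The sketch below resamples the two return series jointly, assumes returns are independent across time, and runs on synthetic data; all names and parameter values are illustrative.

```python
import math
import random


def sharpe(returns):
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var)


def sharpe_diff_ci(returns_a, returns_b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for Sharpe(a) - Sharpe(b).

    Resamples time indices jointly, so the correlation between the two
    series is preserved in every bootstrap sample.
    """
    rng = random.Random(seed)
    n = len(returns_a)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a = [returns_a[i] for i in idx]
        b = [returns_b[i] for i in idx]
        diffs.append(sharpe(a) - sharpe(b))
    diffs.sort()
    lo = diffs[int((1 - level) / 2 * n_boot)]
    hi = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Synthetic data for illustration only: two strategies sharing a common factor.
rng = random.Random(42)
common = [rng.gauss(0.0, 0.04) for _ in range(120)]
strat_a = [0.010 + c + rng.gauss(0.0, 0.005) for c in common]
strat_b = [0.004 + c + rng.gauss(0.0, 0.005) for c in common]

print(sharpe_diff_ci(strat_a, strat_b))
```

Because the two synthetic series share most of their variance, the interval on the difference should come out much narrower than an interval on either Sharpe alone.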
Sharpe confidence intervals in pfolio
Confidence intervals on the Sharpe ratio are not currently displayed in pfolio Insights. The standard Sharpe ratio and the underlying return series are visible; investors who want to compute confidence intervals can do so externally using the sample size, skewness, and kurtosis of the return distribution.
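As a minimal sketch of such an external calculation, the function below takes a list of periodic returns and derives the sample size, per-period Sharpe, skewness, and kurtosis before applying the asymptotic formula. The function name, the constant risk-free rate, and the synthetic return series are assumptions made for illustration; nothing here is a pfolio export format or API.

```python
import math
import random


def sharpe_ci_from_returns(returns, risk_free=0.0, z=1.96):
    """Per-period Sharpe and its 95% interval from a list of periodic returns.

    risk_free is a constant per-period rate (a simplifying assumption).
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    mean = sum(excess) / n
    m2 = sum((x - mean) ** 2 for x in excess) / n
    m3 = sum((x - mean) ** 3 for x in excess) / n
    m4 = sum((x - mean) ** 4 for x in excess) / n
    std = math.sqrt(m2 * n / (n - 1))      # sample standard deviation
    skew = m3 / m2**1.5                    # gamma
    kurt = m4 / m2**2                      # kappa (raw, normal = 3)
    sr = mean / std
    se = math.sqrt((1 + 0.5 * sr**2 - skew * sr + (kurt - 3) * sr**2 / 4) / n)
    return sr, (sr - z * se, sr + z * se)


# Synthetic monthly returns, for illustration only.
rng = random.Random(1)
returns = [rng.gauss(0.008, 0.03) for _ in range(60)]

sr, (low, high) = sharpe_ci_from_returns(returns)
print(f"monthly Sharpe = {sr:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

The result is in per-period units; multiplying the Sharpe and both interval bounds by √12 gives an annualised view under the usual scaling assumption.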
Related articles
- Sharpe ratio explained: measuring risk-adjusted portfolio returns
- Probabilistic Sharpe ratio: testing whether a Sharpe is statistically distinguishable from a target
- Modified Sharpe ratio: Pezier-White adjustment for skewness and kurtosis
- Annualisation in investing: how monthly and daily figures are scaled to annual rates

