08 May 2026 | Reading time: 2 minutes | Backtest

Why Backtests Lie

A practical checklist for avoiding overfit crypto trading research.

Backtests are useful, but they are also excellent at telling traders exactly what they want to hear. A clean equity curve can hide overfitting, missing fees, weak sample design, market-regime luck, or a signal that worked only because it memorized noise in the historical data.

This note is research only. It is not financial advice, not an instruction to trade, and backtests are not live performance.

The first problem is selection

A strategy can look strong because the research process selected the best window, the best pair, the best timeframe, or the best parameter set after seeing history. That does not mean the strategy found durable market behavior. It may only mean the researcher searched enough combinations to find a curve that looked convincing.

The antidote is not one more optimization run. The antidote is a gate that forces the candidate to survive data it did not tune against.
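
As a rough illustration of such a gate, the sketch below tunes a parameter on the early part of a return series and then scores the winner on the later part it never saw. Everything here is an assumption made for the example: the function names, the toy rule in follow_through_score, the 50/50 split, and the pass threshold are hypothetical and are not ProBitForge's actual implementation.

```python
from typing import Callable, Sequence


def chronological_split(rows: Sequence[float], in_sample_fraction: float = 0.7):
    """Split time-ordered rows into (in_sample, out_of_sample) without shuffling."""
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * in_sample_fraction)
    return rows[:cut], rows[cut:]


def follow_through_score(returns: Sequence[float], threshold: float) -> float:
    """Toy rule (hypothetical): hold the next bar only when the previous
    bar's return exceeded `threshold`; the score is the sum of held returns."""
    return sum(
        returns[i] for i in range(1, len(returns)) if returns[i - 1] > threshold
    )


def out_of_sample_gate(
    rows: Sequence[float],
    candidates: Sequence[float],
    score: Callable[[Sequence[float], float], float],
    in_sample_fraction: float = 0.7,
    min_oos_score: float = 0.0,
) -> dict:
    """Pick the best parameter on in-sample data only, then judge it on
    the out-of-sample window that played no part in the selection."""
    in_sample, out_of_sample = chronological_split(rows, in_sample_fraction)
    best = max(candidates, key=lambda params: score(in_sample, params))
    oos_score = score(out_of_sample, best)
    return {
        "params": best,
        "in_sample_score": score(in_sample, best),
        "oos_score": oos_score,
        "passed": oos_score > min_oos_score,
    }


# Made-up per-bar returns: the first half trends, the second half chops.
returns = [0.02, 0.03, 0.01, 0.02, 0.01, -0.02, 0.01, -0.03]
result = out_of_sample_gate(
    returns, [0.0, 0.015], follow_through_score, in_sample_fraction=0.5
)
print(result["params"], result["passed"])
```

With this made-up series, the threshold that wins in-sample loses money on the held-out half, so the gate blocks it. The split is chronological on purpose: shuffling the rows would leak later data into the tuning window.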

The second problem is cost

Crypto strategies often look better before fees, spread, funding, slippage, latency, and failed execution are included. A strategy with many small wins can turn fragile once realistic costs are added. A backtest that ignores costs is not conservative evidence; it is a draft hypothesis.

Every serious test should state the fee model, the assumed spread or slippage, and whether funding or exchange mechanics matter for the market being tested.
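
A minimal sketch of what stating the cost model can look like in code. The rates below are placeholder assumptions, not any exchange's real fee schedule, and CostModel and net_returns are hypothetical names. Funding is treated as a pure cost for simplicity, although on perpetual contracts it can be paid or received depending on position side.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class CostModel:
    """Explicit cost assumptions (all values are illustrative placeholders)."""
    fee_rate: float = 0.001        # assumed fee per side, as a fraction of notional
    slippage_rate: float = 0.0005  # assumed slippage or half-spread per side
    funding_rate: float = 0.0      # assumed funding per period, treated as a cost

    def round_trip_cost(self, funding_periods: int = 0) -> float:
        # Entry and exit each pay fee plus slippage; funding accrues per period held.
        per_side = self.fee_rate + self.slippage_rate
        return 2 * per_side + self.funding_rate * funding_periods


def net_returns(
    gross_returns: Sequence[float], costs: CostModel, funding_periods: int = 0
) -> List[float]:
    """Subtract the round-trip cost from every trade's gross return."""
    cost = costs.round_trip_cost(funding_periods)
    return [gross - cost for gross in gross_returns]


# Made-up trades: many small wins that look fine before costs.
gross = [0.004, 0.003, -0.002, 0.004, 0.003]
net = net_returns(gross, CostModel())
print(f"gross total: {sum(gross):+.4f}")
print(f"net total:   {sum(net):+.4f}")
```

In this example the gross total is positive and the net total is negative, which is the fragile many-small-wins case described above. Keeping the assumptions in one object also makes them easy to record next to the result.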

The third problem is regime luck

Some signals only work in one kind of market. A long-biased strategy can look brilliant during a broad rally. A mean-reversion rule can look stable until volatility expands. A breakout system can look dead until trend conditions return.

The right question is not only whether the full-period result is positive. The better question is which market regimes produced the result, and which regimes damaged it.
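
One way to ask that question is to group trade results by a regime label and summarize each group separately. The sketch below assumes every trade already carries a label; how regimes are actually defined (trend filter, volatility bucket, or something else) is left open, and the labels and numbers are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_by_regime(trades: Iterable[Tuple[str, float]]) -> Dict[str, dict]:
    """Group (regime_label, pnl) pairs and report per-regime statistics."""
    buckets = defaultdict(list)
    for regime, pnl in trades:
        buckets[regime].append(pnl)
    return {
        regime: {
            "trades": len(pnls),
            "total_pnl": sum(pnls),
            "win_rate": sum(pnl > 0 for pnl in pnls) / len(pnls),
        }
        for regime, pnls in buckets.items()
    }


# Made-up trades for a long-biased rule.
trades = [
    ("rally", 5.0),
    ("rally", 3.0),
    ("range", -2.0),
    ("range", -4.0),
    ("rally", -1.0),
]
summary = summarize_by_regime(trades)
for regime, stats in summary.items():
    print(regime, stats)
```

Here the full-period total is positive (+1.0), but all of the profit comes from the rally trades while the range trades lose. A single headline number would hide that dependence.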

A practical checklist

Split research into in-sample and out-of-sample windows.
Keep a fresh window that is not used for parameter selection.
Count trades; do not trust a result built on a tiny sample.
Check fees, slippage, spread, and funding assumptions.
Slice results by pair, timeframe, direction, volatility, and trend regime.
Inspect drawdown shape, not only final profit.
Compare against a simple baseline.
Record why a candidate was blocked, not only why it looked promising.
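
To make the drawdown item concrete, here is a small sketch that computes maximum drawdown from an equity curve. The two curves are invented and end at the same value, which shows why final profit alone is not enough. The function assumes a non-empty curve with strictly positive values.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak.

    Assumes a non-empty curve with strictly positive values.
    """
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Two made-up curves with the same start and the same final equity.
steady = [100, 105, 110, 115, 120]
violent = [100, 140, 70, 90, 120]

print(f"steady:  final={steady[-1]}, max drawdown={max_drawdown(steady):.0%}")
print(f"violent: final={violent[-1]}, max drawdown={max_drawdown(violent):.0%}")
```

Both curves gain 20 percent overall, but the second one loses half its peak value along the way. A candidate like that deserves a much closer look before it moves forward.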

What a blocked candidate teaches

A blocked candidate is not a failure if it leaves a clear reason behind. It can teach that the signal is regime-specific, that fees dominate the edge, that the sample is too small, or that a filter is removing the very trades that carried the result.

That is why ProBitForge treats risk logs as first-class research output. As explained in "What ProBitForge Is Building," the system boundary matters because research notes should support validation, not bypass it. A blocked idea with a clean failure reason is more useful than a lucky backtest that cannot explain itself.

Operating rule

Backtest performance is not live performance. A candidate should move forward only after it survives out-of-sample checks, realistic costs, drawdown review, and a risk gate that can block it before it reaches execution.
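
The operating rule can be sketched as a gate that returns every reason a candidate was blocked rather than a bare pass or fail. This is an illustrative sketch only: CandidateReport, GateLimits, the field names, and the thresholds are all hypothetical and do not describe ProBitForge's actual gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateReport:
    """Summary numbers for one candidate (hypothetical fields)."""
    name: str
    trade_count: int
    oos_net_return: float       # out-of-sample return after costs
    max_drawdown: float         # fraction of peak equity, e.g. 0.25 for 25%
    baseline_net_return: float  # same window, simple baseline


@dataclass(frozen=True)
class GateLimits:
    """Illustrative thresholds; real limits depend on the market and strategy."""
    min_trades: int = 100
    max_drawdown: float = 0.25


def risk_gate(report: CandidateReport, limits: GateLimits = GateLimits()) -> dict:
    """Collect every reason to block so the failure is explainable later."""
    reasons = []
    if report.trade_count < limits.min_trades:
        reasons.append(
            f"sample too small: {report.trade_count} trades < {limits.min_trades}"
        )
    if report.oos_net_return <= 0:
        reasons.append("out-of-sample return after costs is not positive")
    if report.oos_net_return <= report.baseline_net_return:
        reasons.append("did not beat the simple baseline")
    if report.max_drawdown > limits.max_drawdown:
        reasons.append(
            f"drawdown {report.max_drawdown:.0%} exceeds limit {limits.max_drawdown:.0%}"
        )
    return {
        "candidate": report.name,
        "passed": not reasons,
        "blocked_reasons": reasons,
    }


# Made-up candidate: profitable, but thin sample and worse than the baseline.
decision = risk_gate(
    CandidateReport(
        name="demo-candidate",
        trade_count=40,
        oos_net_return=0.03,
        max_drawdown=0.10,
        baseline_net_return=0.05,
    )
)
print(decision["passed"])
for reason in decision["blocked_reasons"]:
    print("blocked:", reason)
```

The example candidate is blocked for two separate reasons, and both are recorded. Writing that list to a log is one way to leave behind the clean failure reason described earlier.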

Research only. Not financial advice. Not an instruction to trade.