Backtesting & Validation
This skill provides the complete backtesting and strategy validation pipeline — from historical data collection through statistical validation of results. Backtesting is the process of testing a trading strategy on historical data before risking real capital. Done correctly, it provides evidence of a strategy’s edge. Done incorrectly, it produces dangerously misleading results due to overfitting, look-ahead bias, and other pitfalls. This skill teaches rigorous backtesting methodology, including walk-forward analysis, Freqtrade integration, paper trading, and the statistical tests needed to distinguish genuine edge from random chance.
When to Use This Skill
- When testing a new trading strategy before deploying capital
- When evaluating the historical performance of any strategy
- When determining if backtest results are statistically significant
- When checking for overfitting in optimized strategy parameters
- When transitioning from backtest to paper trading to live trading
- When configuring Freqtrade or other backtesting frameworks
- When calculating performance metrics (Sharpe, Sortino, max drawdown, etc.)
- When comparing multiple strategy variants to select the best one
- When a user presents backtest results and wants them validated
What This Skill Does
- Backtesting Pipeline: Guides through data collection, strategy coding, execution simulation, and performance analysis
- Bias Detection: Identifies look-ahead bias, survivorship bias, overfitting, and other common pitfalls
- Walk-Forward Analysis: Implements in-sample optimization with out-of-sample validation on rolling windows
- Freqtrade Integration: Provides strategy templates, backtesting commands, and hyperopt configuration
- Paper Trading: Defines methodology for transitioning from backtest to paper to live
- Performance Metrics: Calculates and interprets Sharpe, Sortino, Calmar, max drawdown, win rate, profit factor, expectancy
- Statistical Significance: Determines minimum sample size, bootstrap confidence intervals, and Monte Carlo analysis
- Overfitting Detection: Compares in-sample vs. out-of-sample degradation and parameter sensitivity
How to Use
Run a Backtest
Backtest a momentum crossover strategy (EMA 9/21) on BTC daily data for the past 2 years
Test my mean-reversion strategy on ETH: buy when z-score < -2, sell at -0.5
Validate Results
I got a Sharpe of 2.5 on my backtest -- is this realistic or am I overfitting?
Run walk-forward validation on my strategy -- 6-month in-sample, 2-month out-of-sample
Freqtrade Setup
Create a Freqtrade strategy template for a Bollinger Band reversion trade
Configure hyperopt for my Freqtrade strategy
Paper Trading
My backtest looks good. What's the paper trading plan before going live?
Data Sources
With MCP/CLI tools connected:
- Freqtrade CLI — Strategy backtesting, hyperopt optimization, paper trading, live trading
- Empyrical MCP — Performance metrics calculation (Sharpe, Sortino, max drawdown, VaR, etc.)
- yFinance MCPs (tooyipjee, maxscheijen, Adity-star) — Historical price and volume data
- Binance MCP — Historical crypto data, OHLCV candles
- OpenBB CLI — Comprehensive financial data, backtesting frameworks
Without tool access: Ask the user to provide:
- Strategy rules (entry, exit, position sizing)
- Historical data or source
- Time period for the test
- Initial capital and commission assumptions
- Any existing backtest results to validate
Proceed with methodology guidance and manual analysis of provided results.
Methodology
Step 1: The Backtesting Pipeline
COMPLETE BACKTESTING WORKFLOW:
┌──────────────────────────────────────────────────────────────┐
│ Phase 1: DATA COLLECTION │
│ > Gather clean OHLCV data for target asset(s) │
│ > Check for data quality: gaps, splits, survivorship │
│ > Minimum: 2 years daily data (500+ bars) or equivalent │
├──────────────────────────────────────────────────────────────┤
│ Phase 2: STRATEGY CODING │
│ > Define exact entry/exit rules (no ambiguity) │
│ > Define position sizing rules │
│ > Include commission and slippage modeling │
│ > Include realistic execution assumptions │
├──────────────────────────────────────────────────────────────┤
│ Phase 3: IN-SAMPLE BACKTEST │
│ > Run on training data (60-70% of total) │
│ > Optimize parameters if needed │
│ > Record ALL metrics │
├──────────────────────────────────────────────────────────────┤
│ Phase 4: OUT-OF-SAMPLE VALIDATION │
│ > Run EXACT same strategy (no re-optimization) on held-out data│
│ > Compare IS vs OOS metrics │
│ > Acceptable degradation: < 30% decline in key metrics │
├──────────────────────────────────────────────────────────────┤
│ Phase 5: WALK-FORWARD ANALYSIS │
│ > Repeat IS/OOS on rolling windows │
│ > Chain OOS results to create a realistic equity curve │
├──────────────────────────────────────────────────────────────┤
│ Phase 6: STATISTICAL VALIDATION │
│ > Minimum sample size check │
│ > Bootstrap confidence intervals │
│ > Monte Carlo simulation for worst-case drawdown │
├──────────────────────────────────────────────────────────────┤
│ Phase 7: PAPER TRADING │
│ > Minimum 30 trades or 3 months (whichever is longer) │
│ > Compare paper results to backtest expectations │
│ > If within tolerance → proceed to live │
├──────────────────────────────────────────────────────────────┤
│ Phase 8: LIVE DEPLOYMENT (at reduced size) │
│ > Start at 25-50% of target size │
│ > Scale up over 3-6 months if results confirm │
└──────────────────────────────────────────────────────────────┘
Step 2: Bias Detection
Look-Ahead Bias
LOOK-AHEAD BIAS: Using future information in trading decisions.
COMMON SOURCES:
1. Using close price to make decisions that execute at close price
FIX: Use close for signals, execute at next bar's open
2. Using data that wasn't available at the time (revised earnings, restated GDP)
FIX: Use point-in-time data only
3. Indicator calculation using future data (centered moving averages)
FIX: Use only trailing indicators
4. Filling missing data with future values (forward fill from future)
FIX: Use only backward fill or drop missing periods
5. Using survivorship-bias-free universes retroactively
FIX: Use constituent lists as of each historical date
DETECTION CHECKLIST:
- [ ] Signals generated BEFORE the bar closes, execution at NEXT bar open?
- [ ] All data used was available at the time of the signal?
- [ ] No forward-looking indicators (centered averages, etc.)?
- [ ] Universe of assets reflects what was available at each point in time?
- [ ] Dividend adjustments applied correctly (backward, not forward)?
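The first two checklist items can be made mechanical. A minimal pandas sketch (the column names and the EMA crossover rule are illustrative, not part of the checklist): the signal uses only the closed bar, and the position earns the following bar's move.

```python
import pandas as pd

def lagged_signal_returns(df: pd.DataFrame) -> pd.Series:
    """Look-ahead-safe return stream for a long-only EMA crossover.

    The signal is computed on bar N's close, the position is taken at
    bar N+1's open, and the trade earns bar N+1's open-to-close move.
    (The close-to-next-open gap is ignored for simplicity.)
    """
    fast = df["close"].ewm(span=9, adjust=False).mean()
    slow = df["close"].ewm(span=21, adjust=False).mean()
    signal = (fast > slow).astype(int)        # known only once bar N closes
    position = signal.shift(1).fillna(0)      # acted on at bar N+1's open
    bar_return = df["close"] / df["open"] - 1  # open-to-close return per bar
    return position * bar_return
```

The `shift(1)` is the entire fix: deleting it reintroduces the classic same-bar look-ahead error.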
Survivorship Bias
SURVIVORSHIP BIAS: Only testing on assets that survived to the present.
IMPACT: Overstates returns by excluding bankruptcies, delistings, and failed projects.
EXAMPLES:
Stocks: Testing momentum on today's S&P 500 members ignores stocks that were
removed due to decline. Adds ~1-2% annual bias.
Crypto: Testing on today's top 50 coins ignores hundreds of projects that went
to zero. Adds potentially 5-10%+ annual bias.
MITIGATION:
1. Use survivorship-bias-free databases (CRSP for stocks, CoinGecko full history)
2. Include delisted assets in the backtest universe
3. For crypto: Account for tokens that went to zero
4. Clearly note if survivorship bias cannot be eliminated
Overfitting
OVERFITTING: Strategy is tuned to historical noise, not signal.
WARNING SIGNS:
- Very specific parameters (e.g., EMA(17) instead of EMA(20))
- Many parameters (> 5 free parameters for a simple strategy)
- Strategy only works on one asset or one time period
- In-sample Sharpe >> out-of-sample Sharpe
- Strategy fails on similar but different data
- Equity curve is unrealistically smooth
- Win rate > 80% (suspicious unless strategy has very tight stops)
OVERFITTING DETECTION:
Metric Degradation Test:
OOS_Sharpe / IS_Sharpe > 0.7 → Likely robust
OOS_Sharpe / IS_Sharpe = 0.5-0.7 → Possible overfitting
OOS_Sharpe / IS_Sharpe < 0.5 → Likely overfit
Parameter Stability Test:
Vary each parameter ±20%
If performance collapses → overfit to exact parameter
If performance degrades gracefully → more likely robust
Cross-Asset Test:
Run the same strategy on similar assets (e.g., BTC strategy on ETH)
If it works → strategy captures a general pattern
If it fails → may be overfit to that specific asset
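The parameter stability test above can be automated as a sweep. In this sketch, `toy_score` is a stand-in performance surface used only for illustration; a real `score_fn` would run the backtest and return the out-of-sample Sharpe for the given parameters.

```python
def parameter_stability(score_fn, base_params: dict, pct: float = 0.20) -> dict:
    """Evaluate score_fn at the base parameters and with each parameter
    perturbed by ±pct while holding the others fixed.

    Integer parameters are truncated after scaling. A robust strategy
    degrades gracefully across the perturbations; a collapse at any
    neighbor suggests overfitting to the exact base value.
    """
    results = {"base": score_fn(base_params)}
    for name, value in base_params.items():
        for direction in (-1, +1):
            perturbed = dict(base_params)
            perturbed[name] = type(value)(value * (1 + direction * pct))
            key = f"{name}{'+' if direction > 0 else '-'}{int(pct * 100)}%"
            results[key] = score_fn(perturbed)
    return results

# Stand-in score surface (peak at EMA 9/21) -- replace with a real backtest.
def toy_score(p):
    return 1.0 - 0.001 * (p["ema_fast"] - 9) ** 2 - 0.001 * (p["ema_slow"] - 21) ** 2

grid = parameter_stability(toy_score, {"ema_fast": 9, "ema_slow": 21})
```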
Step 3: Walk-Forward Analysis
WALK-FORWARD METHOD:
CONCEPT: Repeatedly optimize on a window, validate on the next window,
then roll forward. Chain the OOS results for a realistic performance estimate.
WINDOW CONFIGURATION:
| Strategy Frequency | IS Window | OOS Window | Step Size |
|-------------------|-------------|-------------|-------------|
| Daily trades | 6 months | 2 months | 2 months |
| Weekly trades | 12 months | 3 months | 3 months |
| Monthly trades | 24 months | 6 months | 6 months |
STEP-BY-STEP:
Window 1: IS = months 1-6, OOS = months 7-8 → Record OOS metrics
Window 2: IS = months 3-8, OOS = months 9-10 → Record OOS metrics
Window 3: IS = months 5-10, OOS = months 11-12 → Record OOS metrics
...continue rolling forward...
Final equity curve = chain of ALL OOS results (no IS data in the curve)
VALIDATION CRITERIA:
Walk-forward efficiency = Average(OOS_Metric) / Average(IS_Metric)
WFE > 0.85: Excellent → very robust strategy
WFE > 0.7: Good → strategy is robust
WFE > 0.5: Acceptable → strategy has a real edge
WFE 0.3-0.5: Marginal → weak evidence, proceed with caution
WFE < 0.3: Poor → strategy is likely overfit, do NOT deploy
MINIMUM WINDOWS:
At least 5 walk-forward windows for statistical relevance
Preferably 8-12 windows covering different market conditions
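The rolling window arithmetic above is easy to get wrong by hand. A small generator sketch (bar counts are illustrative; when `step` equals the OOS length, the chained OOS slices are contiguous and non-overlapping, which is what the final equity curve requires):

```python
def walk_forward_windows(n_bars: int, is_len: int, oos_len: int, step: int):
    """Yield (is_slice, oos_slice) index pairs for rolling walk-forward.

    Each window optimizes on is_len bars, validates on the oos_len bars
    immediately after, then rolls forward by step bars.
    """
    start = 0
    while start + is_len + oos_len <= n_bars:
        yield (slice(start, start + is_len),
               slice(start + is_len, start + is_len + oos_len))
        start += step

# ~12 months of daily bars: 6-month IS, 2-month OOS, 2-month step
windows = list(walk_forward_windows(n_bars=252, is_len=126, oos_len=42, step=42))
```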
Step 4: Freqtrade Integration
Strategy Template
```python
# Freqtrade strategy template
# Save as: user_data/strategies/MyStrategy.py
from freqtrade.strategy import IStrategy, merge_informative_pair
from pandas import DataFrame
import talib.abstract as ta


class MyStrategy(IStrategy):
    # Strategy parameters
    INTERFACE_VERSION = 3
    timeframe = '1h'

    # Position management
    stoploss = -0.05  # 5% stop loss
    trailing_stop = True
    trailing_stop_positive = 0.01
    trailing_stop_positive_offset = 0.03
    trailing_only_offset_is_reached = True

    # ROI table (take profit at these levels, keyed by minutes in trade)
    minimal_roi = {
        "0": 0.10,    # 10% immediate target
        "30": 0.05,   # 5% after 30 minutes
        "60": 0.03,   # 3% after 60 minutes
        "120": 0.01,  # 1% after 120 minutes
    }

    def populate_indicators(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
        # Add your indicators here
        dataframe['ema_9'] = ta.EMA(dataframe, timeperiod=9)
        dataframe['ema_21'] = ta.EMA(dataframe, timeperiod=21)
        dataframe['rsi'] = ta.RSI(dataframe, timeperiod=14)
        dataframe['adx'] = ta.ADX(dataframe, timeperiod=14)
        dataframe['volume_sma'] = ta.SMA(dataframe['volume'], timeperiod=20)
        return dataframe

    def populate_entry_trend(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
        dataframe.loc[
            (
                (dataframe['ema_9'] > dataframe['ema_21']) &        # EMA crossover
                (dataframe['adx'] > 25) &                           # Trend confirmed
                (dataframe['rsi'] < 70) &                           # Not overbought
                (dataframe['volume'] > dataframe['volume_sma'])     # Volume confirmation
            ),
            'enter_long'] = 1
        return dataframe

    def populate_exit_trend(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
        dataframe.loc[
            (
                (dataframe['ema_9'] < dataframe['ema_21']) |  # EMA cross back
                (dataframe['rsi'] > 80)                       # Overbought
            ),
            'exit_long'] = 1
        return dataframe
```
Backtesting Commands
```bash
# Download historical data
freqtrade download-data --exchange binance --pairs BTC/USDT ETH/USDT SOL/USDT \
    --timeframe 1h --days 730

# Run backtest
freqtrade backtesting --strategy MyStrategy \
    --timeframe 1h \
    --timerange 20240101-20250101 \
    --enable-protections

# Run with detailed trade list
freqtrade backtesting --strategy MyStrategy \
    --timeframe 1h \
    --timerange 20240101-20250101 \
    --export trades --export-filename user_data/backtest_results/my_strategy.json

# Run hyperopt (parameter optimization)
freqtrade hyperopt --strategy MyStrategy \
    --hyperopt-loss SharpeHyperOptLoss \
    --spaces buy sell roi stoploss \
    --epochs 500 \
    --timerange 20240101-20241001  # IS period only!

# Paper trading (dry run)
freqtrade trade --strategy MyStrategy \
    --config user_data/config.json \
    --dry-run
```
Hyperopt Configuration
```python
# Add to the strategy class for hyperopt
from freqtrade.strategy import DecimalParameter, IntParameter


class MyStrategy(IStrategy):
    # Hyperopt parameters -- keep these to a minimum (< 6)
    buy_ema_short = IntParameter(5, 15, default=9, space='buy')
    buy_ema_long = IntParameter(15, 30, default=21, space='buy')
    buy_adx_threshold = IntParameter(20, 35, default=25, space='buy')
    sell_rsi_threshold = IntParameter(70, 85, default=80, space='sell')

    # WARNING: More parameters = higher overfitting risk
    # Rule of thumb: max parameters = sqrt(number_of_trades) / 2
    #   100 trades → max 5 parameters
    #   400 trades → max 10 parameters
```
Step 5: Performance Metrics
ESSENTIAL PERFORMANCE METRICS:
┌──────────────────────────────────────────────────────────────────┐
│ Metric │ Formula │ Benchmark │
├──────────────────────────────────────────────────────────────────┤
│ Sharpe Ratio │ (Avg Return - Rf) / StdDev │ > 1.0 acceptable│
│ │ Annualized: SR × √252 │ > 1.5 good │
│ │ │ > 2.0 excellent │
│ │ │ > 3.0 suspicious │
├──────────────────────────────────────────────────────────────────┤
│ Sortino Ratio │ (Avg Return - Rf) / DownDev │ > 1.5 acceptable│
│ │ (Uses only downside deviation)│ > 2.0 good │
├──────────────────────────────────────────────────────────────────┤
│ Calmar Ratio │ Annual Return / Max Drawdown │ > 1.0 acceptable│
│ │ │ > 2.0 good │
├──────────────────────────────────────────────────────────────────┤
│ Max Drawdown │ Largest peak-to-trough decline│ < 20% acceptable│
│ │ │ < 10% good │
├──────────────────────────────────────────────────────────────────┤
│ Win Rate │ Winning trades / Total trades │ Depends on R:R │
│ │ │ 40%+ for 2:1 R:R│
│ │ │ 55%+ for 1:1 R:R│
├──────────────────────────────────────────────────────────────────┤
│ Profit Factor │ Gross Profits / Gross Losses │ > 1.5 acceptable│
│ │ │ > 2.0 good │
├──────────────────────────────────────────────────────────────────┤
│ Expectancy │ (Win% × AvgWin) - (Loss% × AvgLoss) │ Must be > 0│
│ │ Per trade expected value │ │
├──────────────────────────────────────────────────────────────────┤
│ Risk of Ruin │ P(account drawdown > X%) │ < 5% acceptable │
│ │ Based on win rate, payoff, risk│ │
├──────────────────────────────────────────────────────────────────┤
│ Recovery Factor │ Net Profit / Max Drawdown │ > 3.0 acceptable│
└──────────────────────────────────────────────────────────────────┘
METRIC RED FLAGS:
Sharpe > 3.0 → Almost certainly overfit or biased
Win rate > 80% with tight stops → Likely look-ahead bias
Max drawdown < 5% on a 2-year test → Unrealistic
No losing months → Extreme red flag
Profit factor > 5.0 → Likely overfit
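The core metrics in the table can be computed directly from a series of per-period returns. A minimal NumPy sketch (assumes simple returns, a zero risk-free rate, and at least one win and one loss; a dedicated library such as empyrical provides hardened versions):

```python
import numpy as np

def performance_metrics(returns: np.ndarray, periods_per_year: int = 252) -> dict:
    """Core backtest metrics from an array of per-period simple returns."""
    mean, std = returns.mean(), returns.std(ddof=1)
    downside = returns[returns < 0].std(ddof=1)       # downside deviation only
    equity = np.cumprod(1 + returns)                  # compounded equity curve
    peak = np.maximum.accumulate(equity)
    max_dd = ((equity - peak) / peak).min()           # largest peak-to-trough decline
    wins, losses = returns[returns > 0], returns[returns < 0]
    return {
        "sharpe": mean / std * np.sqrt(periods_per_year),
        "sortino": mean / downside * np.sqrt(periods_per_year),
        "max_drawdown": max_dd,
        "win_rate": len(wins) / len(returns),
        "profit_factor": wins.sum() / abs(losses.sum()),
        "expectancy": mean,  # expected value per period/trade
    }
```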
Step 6: Statistical Significance
MINIMUM TRADE COUNT:
The minimum number of trades needed depends on the strategy's win rate
(win rates near 50% are the hardest to distinguish from coin-flip chance,
so they require the largest samples):
| Win Rate | Min Trades (95% confidence) | Min Trades (99% confidence) |
|----------|-----------------------------|-----------------------------|
| 40%      | ~60 trades                  | ~100 trades                 |
| 50%      | ~100 trades                 | ~150 trades                 |
| 60%      | ~60 trades                  | ~100 trades                 |
Rule of thumb: MINIMUM 100 trades for any strategy validation
For parameter optimization: Min trades = 20 × number_of_parameters
T-TEST FOR STRATEGY EDGE:
H0: Average return per trade = 0 (no edge)
H1: Average return per trade > 0 (strategy has edge)
t = (Mean_Return × √n) / StdDev_Return
p-value < 0.05 → Strategy has statistically significant edge
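The one-sided t-test above maps directly onto SciPy (the `alternative="greater"` form matches H1: mean return per trade > 0):

```python
import numpy as np
from scipy import stats

def edge_t_test(trade_returns: np.ndarray):
    """One-sided t-test of H0: mean return per trade = 0.

    Returns (t_statistic, p_value); p < 0.05 suggests a statistically
    significant edge, assuming roughly independent trades.
    """
    t_stat, p_value = stats.ttest_1samp(trade_returns, 0.0, alternative="greater")
    return t_stat, p_value
```

Note the independence assumption: overlapping or strongly autocorrelated trades inflate the effective sample size and make p-values optimistic.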
BOOTSTRAP CONFIDENCE INTERVAL:
1. Resample trades with replacement (10,000 iterations)
2. Calculate metric (Sharpe, expectancy) for each resample
3. Sort results, take 2.5th and 97.5th percentile → 95% CI
If 95% CI for Sharpe includes 0 → strategy may NOT have real edge
If 95% CI for Sharpe is entirely > 0 → strategy likely has edge
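The three bootstrap steps can be sketched in a few lines of NumPy (the `metric` callable is whatever statistic you want the interval for, e.g. mean return per trade or per-trade Sharpe):

```python
import numpy as np

def bootstrap_ci(trade_returns, metric, n_boot=10_000, ci=0.95, seed=42):
    """Percentile bootstrap confidence interval for a per-trade metric.

    Resamples trades with replacement n_boot times and returns the
    (lower, upper) percentile interval of the metric across resamples.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns)
    samples = np.array([metric(rng.choice(r, size=len(r), replace=True))
                        for _ in range(n_boot)])
    lo, hi = np.percentile(samples, [(1 - ci) / 2 * 100, (1 + ci) / 2 * 100])
    return lo, hi

# Example: 95% CI for per-trade Sharpe
# lo, hi = bootstrap_ci(returns, lambda x: x.mean() / x.std(ddof=1))
```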
MONTE CARLO WORST-CASE DRAWDOWN:
1. Randomly shuffle trade sequence (10,000 iterations)
2. Calculate max drawdown for each sequence
3. 95th percentile of drawdowns = expected worst-case drawdown
Ensure your risk tolerance can handle the Monte Carlo worst-case drawdown
If worst-case drawdown > 40% → strategy needs better risk management
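The Monte Carlo procedure above is a short loop in NumPy. Drawdowns are signed negative here, so the 95th percentile of drawdown *magnitudes* corresponds to the 5th percentile of the signed values:

```python
import numpy as np

def monte_carlo_drawdown(trade_returns, n_sims=10_000, percentile=95, seed=42):
    """Estimate worst-case max drawdown by reshuffling the trade sequence.

    Returns a negative number: the (100 - percentile)th percentile of
    signed max drawdowns across n_sims random trade orderings.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns)
    worst = np.empty(n_sims)
    for i in range(n_sims):
        equity = np.cumprod(1 + rng.permutation(r))   # random trade order
        peak = np.maximum.accumulate(equity)
        worst[i] = ((equity - peak) / peak).min()     # max drawdown (negative)
    return np.percentile(worst, 100 - percentile)
```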
Step 7: Paper Trading Protocol
PAPER TRADING TRANSITION:
REQUIREMENTS TO START PAPER TRADING:
- Backtest passes walk-forward validation (WFE > 0.5)
- Minimum 100 trades in backtest
- Statistical significance confirmed (p < 0.05)
- Sharpe ratio > 1.0 (out-of-sample)
- Max drawdown within tolerance (see risk-management)
- Strategy coded with no manual overrides
PAPER TRADING DURATION:
Minimum: 30 trades OR 3 months, whichever is LONGER
Preferred: 50+ trades or 6 months
PAPER TRADING VALIDATION:
Compare paper results to backtest expectations:
| Metric | Tolerance vs Backtest |
|----------------|---------------------------|
| Sharpe Ratio | > 50% of backtest Sharpe |
| Win Rate | Within ±10% absolute |
| Avg Win/Loss | Within ±20% relative |
| Max Drawdown | Within 1.5× backtest DD |
| Trade Frequency| Within ±30% of expected |
If ALL metrics within tolerance → APPROVED for live
If 1-2 metrics outside tolerance → Investigate cause, extend paper period
If 3+ metrics outside → Strategy likely overfit, return to development
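The tolerance table can be encoded as a simple checker. A sketch assuming a flat dict of metrics per run (key names are illustrative); the length of the returned failure list maps onto the three outcomes above:

```python
def paper_vs_backtest(paper: dict, backtest: dict) -> list:
    """Return the list of metrics where paper trading fell outside
    the tolerance table relative to the backtest.

    Expected keys: sharpe, win_rate, avg_win_loss, max_drawdown, trade_freq.
    """
    failures = []
    if paper["sharpe"] < 0.5 * backtest["sharpe"]:
        failures.append("sharpe")                      # > 50% of backtest Sharpe
    if abs(paper["win_rate"] - backtest["win_rate"]) > 0.10:
        failures.append("win_rate")                    # within ±10% absolute
    if abs(paper["avg_win_loss"] / backtest["avg_win_loss"] - 1) > 0.20:
        failures.append("avg_win_loss")                # within ±20% relative
    if abs(paper["max_drawdown"]) > 1.5 * abs(backtest["max_drawdown"]):
        failures.append("max_drawdown")                # within 1.5x backtest DD
    if abs(paper["trade_freq"] / backtest["trade_freq"] - 1) > 0.30:
        failures.append("trade_freq")                  # within ±30% of expected
    return failures
```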
LIVE DEPLOYMENT SCHEDULE:
Month 1: 25% of target size
Month 2: 50% of target size (if month 1 is within tolerance)
Month 3: 75% of target size
Month 4+: 100% of target size
At any point: If performance falls outside tolerance → reduce to 25% and reassess
Step 8: Overfitting Detection Deep Dive
OVERFITTING SCORECARD:
Test Score Result
─────────────────────────────────────────────────────
IS vs OOS Sharpe ratio > 0.7 0-20 ___
Parameter stability (±20% test) 0-20 ___
Cross-asset validation 0-20 ___
Walk-forward efficiency > 0.5 0-20 ___
Number of parameters ≤ 5 0-10 ___
Trade count > 100 0-10 ___
─────────────────────────────────────────────────────
TOTAL /100 ___
INTERPRETATION:
80-100: Low overfitting risk → proceed to paper trading
60-79: Moderate risk → simplify strategy, re-test
40-59: High risk → significant overfitting suspected
< 40: Very high risk → strategy is almost certainly overfit
DEFLATED SHARPE RATIO (DSR):
Accounts for the number of strategy variations tried. A simplified
deflation heuristic:
DSR ≈ Sharpe × (1 - N_trials / (4 × Sharpe² × T))
Where: N_trials = number of strategy variations tested
T = number of return observations
If DSR < 0 → the observed Sharpe can be explained by random trials alone
(The full Bailey & López de Prado DSR also adjusts for return skew and
kurtosis; the heuristic above captures only the multiple-testing penalty.)
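The heuristic above is one line of code. This implements the document's shorthand only, not the full Bailey/López de Prado estimator:

```python
def deflated_sharpe(sharpe: float, n_trials: int, n_obs: int) -> float:
    """Simplified multiple-testing deflation of an observed Sharpe ratio.

    sharpe   : observed (annualized) Sharpe of the selected strategy
    n_trials : number of strategy variations tested to find it
    n_obs    : number of return observations (T)
    A result <= 0 means the Sharpe is explainable by random trials alone.
    """
    return sharpe * (1 - n_trials / (4 * sharpe ** 2 * n_obs))
```

For example, a Sharpe of 1.5 found after 50 variations on 252 daily observations deflates only slightly, but the penalty grows linearly with the number of variations tried.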
Anti-Patterns
DO NOT do these — they produce misleading backtest results:
- Optimizing on the full dataset: If you use all data for optimization, there is no unseen data to validate on. Always split IS/OOS BEFORE any optimization.
- Peeking at OOS data during development: Once you look at OOS results and then modify the strategy, the OOS data is contaminated. It becomes IS data. Reserve OOS strictly.
- Ignoring transaction costs and slippage: A strategy that makes 0.1% per trade looks great until you realize 0.05% goes to commissions and 0.03% to slippage. Model costs realistically.
- Assuming perfect fills: In reality, limit orders may not fill, market orders have slippage, and illiquid assets have wide spreads. Add 1-2 ticks of slippage per trade.
- Testing on a single asset and time period: A strategy that works on BTC 2023-2024 may fail on BTC 2022 or on ETH. Test across assets and periods.
- Too many parameters: Every parameter you add multiplies the search space. More than 5 parameters for a simple strategy is almost guaranteed overfitting.
- Confusing backtest Sharpe with live Sharpe: Backtests overstate Sharpe by 30-50% on average. If your backtest Sharpe is 1.5, expect 0.75-1.0 live.
- Deploying without paper trading: Skipping paper trading means you discover data issues, execution problems, and strategy flaws with real money. Always paper trade first.
Validation Checkpoints
Input Validation
- Strategy rules fully defined (no ambiguity in entry, exit, sizing)
- Historical data quality verified (no gaps, correct adjustments)
- Data period includes different market regimes (bull, bear, sideways)
- Transaction costs and slippage modeled realistically
- Survivorship bias addressed or acknowledged
- Look-ahead bias checked with the detection checklist
Output Validation
- Minimum 100 trades in backtest
- IS/OOS split performed BEFORE optimization
- Walk-forward analysis completed (WFE > 0.5)
- All performance metrics calculated and within acceptable ranges
- Statistical significance confirmed (p < 0.05 or bootstrap CI excludes 0)
- Overfitting scorecard completed (score > 60)
- Monte Carlo worst-case drawdown within risk tolerance
- Risk management validated — per Risk Management
- Paper trading plan defined with specific pass/fail criteria
- Comparison to simple benchmark (buy-and-hold) documented
Example
User: “I built a momentum strategy for BTC — EMA 9/21 crossover with ADX filter. The backtest shows Sharpe 2.8 and 75% win rate over 2 years. Is this good?”
Analysis:
## Backtest Validation: BTC EMA 9/21 + ADX Strategy
### Initial Assessment -- Red Flags Check
- Sharpe 2.8: SUSPICIOUS -- backtested Sharpe > 2.5 is almost always overfit
- Win rate 75%: SUSPICIOUS -- momentum strategies typically have 35-50% win rates
- These metrics together suggest possible overfitting or look-ahead bias
### Bias Checklist
- [ ] Look-ahead bias: Was entry on the bar AFTER the crossover signal?
→ Common error: entering on the same bar the EMA crosses
→ FIX: Signal on bar N, enter at open of bar N+1
- [ ] Survivorship bias: N/A (single asset BTC)
- [ ] Slippage modeled? If no → add 0.1% per trade for crypto
- [ ] Commissions modeled? If no → add 0.1% per trade (0.05% each side)
### What Realistic Metrics Look Like
For an EMA crossover momentum strategy on BTC:
| Metric | Suspicious (reported) | Realistic range |
|-------------|----------------------|------------------|
| Sharpe | 2.8 | 0.8 - 1.5 |
| Win Rate | 75% | 35-50% |
| Profit Factor| (unknown) | 1.3 - 2.0 |
| Max Drawdown | (unknown) | 15-35% |
### Recommended Validation Steps
1. FIX ENTRY TIMING: Ensure execution is at next bar open, not signal bar close
2. ADD COSTS: Apply 0.1% round-trip (0.05% entry + 0.05% exit) + 0.05% slippage
3. SPLIT DATA: Use 2022-01 to 2024-06 as IS, 2024-07 to 2025-01 as OOS
4. RUN WALK-FORWARD: 6-month IS, 2-month OOS, rolling
5. CROSS-ASSET TEST: Run same strategy on ETH and SOL
6. PARAMETER SENSITIVITY: Test EMA(7/19), EMA(9/21), EMA(11/23) -- if results
collapse for nearby parameters, the strategy is overfit to EMA(9/21)
### Expected Outcome After Corrections
After fixing look-ahead bias and adding costs, expect:
- Sharpe: 0.8-1.2 (down from 2.8)
- Win rate: 40-50% (down from 75%)
- Max drawdown: 20-30%
- If OOS Sharpe > 0.7 after walk-forward → strategy may have real edge
- If OOS Sharpe < 0.5 → strategy is likely overfit
### Verdict
DO NOT deploy this strategy until the validation steps above are completed.
The reported metrics are almost certainly inflated. The strategy may still
have a real edge, but it needs rigorous validation to prove it.