WOLFX Research · Updated 2026-04-24
Every strategy WOLFX trades has cleared a walk-forward 70/15/15 backtest with hard gates. Every strategy WOLFX considers but rejects is published here — the rejections are how you know the filter is real.
Running score: 4 PASS / 26 rounds (15 %). One documented near-miss (Round 13). FIVE LIVE strategies gauntleted post-deployment, ALL FIVE rejected (Rounds 21, 22, 23, 24, 25). Live-strategy validation campaign COMPLETE — zero gauntlet-validated live alpha.
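The 70/15/15 walk-forward split mentioned above can be sketched as a purely chronological cut. This is a minimal illustration, not the WOLFX harness: the function name `walk_forward_split` and the integer-percentage parameters are assumptions, and the real harness may slice by date or by trade rather than by row.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def walk_forward_split(
    rows: Sequence[T], train_pct: int = 70, val_pct: int = 15
) -> Tuple[Sequence[T], Sequence[T], Sequence[T]]:
    """Cut time-ordered rows into train / validation / test, oldest first.

    No shuffling: the test slice is always the most recent data, so it is
    never seen while the strategy is being fitted or tuned.
    """
    if train_pct <= 0 or val_pct <= 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must be positive and leave a test slice")
    n = len(rows)
    # Integer arithmetic avoids float rounding at the slice boundaries.
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


# Hypothetical input: 1,000 consecutive bars, already sorted by time.
bars = list(range(1000))
train_set, val_set, test_set = walk_forward_split(bars)
print(len(train_set), len(val_set), len(test_set))
```

The point of the design is ordering: every gate in the tables below is judged on the last slice, which the fitting step never touched.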
| Round | Strategy | Test Sharpe | Shipped |
|---|---|---|---|
| 7 | Cross-Asset Futures Trend (ES/NQ/RTY/YM/6E/6J, 12-1 skip-month) | 0.895 | V167 · 2026-04-24 · whitepaper |
| 8 | VIX Contango Carry (short VXX / long VXZ, regime-gated) | 1.41 | V169 · 2026-04-24 · whitepaper |
| 14 | Overnight Drift Reversal (SPY MOC→MOO, 5d-intraday filter) | 1.229 | V174 · 2026-04-25 · whitepaper |
| 19 | Intraday VWAP Breakout (top-50 SP500, 50bp threshold, long+short) | 1.83 test / 0.47 full | V182 · 2026-04-28 · whitepaper TBD · GUARDED PASS — slice trajectory train 0.40 → val -0.60 → test 1.83 is regime-flip-shaped, not steady-edge-shaped. PF clears gate by 1bp (1.21 vs 1.20). Canary at 0.5% NAV (not spec'd 5%) + rolling-Sharpe kill-switch when 20-trade rolling Sharpe < 0 for 30 trades. |
(Round 15 intentionally skipped — FOMC Pre-Announcement Drift only fires 8 times/year, can't clear the v5 ≥50-trade gate. Proposed as a sizing multiplier on Round 14, not standalone.)
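The rolling-Sharpe kill-switch attached to Round 19 in the table above could look roughly like the sketch below. It rests on one reading of the rule, which is an assumption: the Sharpe ratio is computed per trade over the last 20 trades, and the switch trips once that value has been negative for 30 consecutive trades. The class name and its fields are hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class RollingSharpeKillSwitch:
    """Trip when the rolling per-trade Sharpe stays below zero for too long."""

    def __init__(self, window: int = 20, patience: int = 30) -> None:
        self.window = window          # trades in the rolling Sharpe
        self.patience = patience      # consecutive negative readings allowed
        self.returns = deque(maxlen=window)
        self.negative_streak = 0
        self.tripped = False

    def update(self, trade_return: float) -> bool:
        """Record one closed trade; return True once the switch has tripped."""
        self.returns.append(trade_return)
        if len(self.returns) == self.window:
            mu, sd = mean(self.returns), stdev(self.returns)
            if sd > 0:
                sharpe = mu / sd
            else:
                # Identical returns: keep the sign of the mean.
                sharpe = float("-inf") if mu < 0 else 0.0
            self.negative_streak = self.negative_streak + 1 if sharpe < 0 else 0
            if self.negative_streak >= self.patience:
                self.tripped = True   # latched: stays tripped until reset by hand
        return self.tripped


# Hypothetical losing run: the switch needs 20 trades to fill the window,
# then 30 negative readings in a row.
switch = RollingSharpeKillSwitch()
losing_run = [-0.01 if i % 2 else -0.02 for i in range(60)]
trip_states = [switch.update(r) for r in losing_run]
print("tripped after trade", trip_states.index(True) + 1)
```

Under this reading the earliest possible trip is the 49th trade, so the switch complements the small 0.5% NAV canary size rather than replacing it.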
Both strategies are in a 30-day paper-shadow canary via the V170 scheduler. The flag flips to live execution only after rolling Sharpe reaches ≥ 0.5 with no monthly drawdown > 3 %.
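A promotion check in the spirit of that rule might be sketched as follows. Several details are assumptions rather than WOLFX specifics: returns are daily, Sharpe is annualized with 252 periods, and "monthly drawdown" is read as the worst peak-to-trough loss inside each calendar month. The function names are hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby
from math import sqrt
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def annualized_sharpe(returns: Sequence[float], periods_per_year: int = 252) -> float:
    sd = stdev(returns)
    return 0.0 if sd == 0 else mean(returns) / sd * sqrt(periods_per_year)


def max_drawdown(returns: Sequence[float]) -> float:
    """Worst peak-to-trough loss as a positive fraction (0.03 means 3 %)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def ready_for_live(
    daily: List[Tuple[date, float]],
    min_sharpe: float = 0.5,        # threshold from the text
    max_monthly_dd: float = 0.03,   # threshold from the text
) -> bool:
    daily = sorted(daily)           # groupby needs chronological order
    rets = [r for _, r in daily]
    if len(rets) < 2 or annualized_sharpe(rets) < min_sharpe:
        return False
    by_month = groupby(daily, key=lambda row: (row[0].year, row[0].month))
    return all(
        max_drawdown([r for _, r in rows]) <= max_monthly_dd
        for _, rows in by_month
    )


# Hypothetical 30-day shadow logs: one calm, one with a single -3.5 % day.
start = date(2026, 4, 1)
calm = [(start + timedelta(days=i), 0.003 if i % 2 else 0.002) for i in range(30)]
shock = [(d, -0.035 if i == 10 else r) for i, (d, r) in enumerate(calm)]
print(ready_for_live(calm), ready_for_live(shock))
```

The second log still has a comfortable Sharpe, so it is the drawdown condition alone that blocks the flag flip.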
| Round | Strategy | Test Sharpe | Verdict | Finding |
|---|---|---|---|---|
| 1 | Overnight Gap Continuation | -8.11 | NO-GO | Signal evaporated when realistic fill costs applied. Infrastructure gap on premarket data. |
| 2 | Crypto Funding Arbitrage v1 | 3.15 IS / -5.68 OOS | NO-GO | Textbook overfit. Great in-sample, dead out-of-sample. |
| 3 | Crypto Funding v2 (long-only, z < -3) | 7.22 (spurious) | NO-GO | Forensic finding: the claimed 71.4 % WR was implicitly bundled with a "price at 20-day low" filter. The filter, not the funding signal, was doing the work. |
| 4 | Cointegration Pairs Trading | -1.51 | NO-GO | Mega-cap dispersion in 2025-2026 broke the cointegration assumptions that made this work in 2015. |
| 5 | Momentum + VIX Long/Short | -1.23 | NO-GO | Benign-VIX regimes produce short-squeeze spikes that shred the short leg. MaxDD 25 %. |
| 6 | Momentum long-only (decomposed) | +0.88 | NO-GO-drawdown | Sharpe ok, but MaxDD 12.85 % eats the risk budget. |
| — | PEAD — Post-Earnings Drift | -2.77 | HARD NO-GO | THE SIGNAL HAS INVERTED. A positive earnings surprise now predicts -0.61 % forward return. Classical decades-old anomaly is now anti-signal. |
| 9 | G10 FX Trend+Carry | 0.164 | NO-GO | Strategy lost money over 9 years. AUD/USD + USD/JPY profitable legs couldn't offset EUR/USD, GBP/USD, CHF, etc. |
| 10 | Commodity Basis Carry | -1.58 | NO-GO | Proxy rejected, not the underlying premium. Inverted-momentum-as-basis shorted the 2024-2026 gold/palladium rally. Real term-structure data required. |
| 11 | Treasury Curve Carry 2s10s (Yahoo futures ratio) | 0.54 test / -1.08 train | NO-GO | Test slice looked fine (3 of 4 gates pass) but strategy lost 18 % over the full 10-year window. Only worked post-QE. A regime bet, not a carry premium. |
| 12 | Treasury Curve Carry 2s10s (FRED daily yields, IEF/SHY) | -3.34 | NO-GO | Retested Round 11 with real yield data to falsify "maybe the proxy was the problem." Result: real yields were worse than the proxy. Full-window Sharpe -0.99, final NAV down 36 %. One trade alone (Oct 2023 short SHY at 3.78× weight) lost $44.8K when 2Y yields fell into the rate-cut cycle. The underlying signal — not the proxy — is wrong for the 2022-2026 hiking/cutting regime. |
| 13 | DXY Regime Switch (long-only, Variant B) | 1.20 test / 0.61 full | NEAR-MISS / NO-GO | Sharpe 1.20, PF 2.95, MaxDD -1.47 %, full-window Sharpe positive — every substantive gate clears with margin. Fails only on trade count (2 vs gate 20). The 20-trade gate is miscalibrated for a regime classifier that fires ~3 times per test slice by design. Honest verdict under strict rules: NO-GO. Under the same gate-calibration argument that Round 7 (Trend) accepted, this would be a PASS — that's a calibration decision, not a statistical one. Flagged for re-evaluation if the trade-count gate is recalibrated per signal class. |
| 16 | RRP-Driven Treasury Carry Reversal (FRED RRPONTSYD → SHY) | -0.89 test / 0.24 full | NO-GO | Test slice fails 3 of 5 gates (Sharpe -0.89, PF 0.88, only 29 trades). Walk-forward: train -0.03, val +1.81, test -0.89 — textbook in-sample-fit / out-of-sample-collapse. The validation slice caught the late-2023 RRP drain wave during the Fed pivot; the test slice is in a post-RRP-trough regime where the facility sits near zero and meaningful drains stop happening. Two side findings: (1) the v5 proposal had a units bug — RRPONTSYD is in billions, not millions; harness corrected; (2) the premium, if it existed, has likely been arbitraged away in the four years since Copeland-Duffie-Yang published. |
| 17 | Speculator Crowding Reversal (CFTC COT, 12 commodities cross-sectional) | 0.11 test / 0.13 full | NO-GO | Test PF 1.03 (gate 1.2). 444 legs — huge sample, so the near-zero Sharpe is statistically firm, not noise. Train slice ran -35% MaxDD during 2021-22 commodity supercycle when speculators stayed crowded long AND prices kept rising — exactly the regime Boons-Prado warn breaks the reversion. Recommendation was retest at 4-week hold (paper's documented 4-8w reversion window). |
| 17b | Speculator Crowding Reversal — 4w / 6w / 8w hold retest | -0.37 / -0.79 / -0.39 test | PERMANENT NO-GO | Tested at every horizon the paper documents. ALL THREE produce negative test-slice Sharpe (PF 0.93, 0.86, 0.93 — all below 1.0). Sample is 408-432 legs each — not noise. Full-window Sharpes 0.31-0.73 are positive (train+val carried edge), but the held-out test slice (~late-2024 → Apr 2026) flipped sign. Strategy is genuinely dead at retail-data scale. Two cleanly-separated explanations: (a) premium decay since paper's 1986-2018 sample (RFS publication + COT-factor ETFs likely arbitraged); (b) Yahoo continuous-futures roll noise. Action identical regardless: permanent shelf. Five horizons tested (1w/2w/4w/6w/8w) — strategy gets removed from future v7 alpha pipelines. |
| 18 | HY-OAS-Gated Put Credit Spreads (Israelov-Klein 2024, SPY/QQQ/IWM weekly) | -0.14 test (BS-on-RV) / +0.63 test (with +20% IV-VRP uplift) | NO-GO formal · R18b pending data | Formal NO-GO at Black-Scholes-on-realized-vol pricing (PF 0.94, full-window Sharpe -0.39). But agent's sensitivity test under defensible IV-over-RV pricing (Bakshi-Kapadia 2003, Israelov 2017) flips ALL 5 gates to pass: Sharpe 0.63, PF 1.30, full-window 0.31. The "failure" is a modeling artifact — BS-on-RV erases the very variance-risk premium the strategy harvests by construction. Reversible (distinct from R17b permanent shelf). Two data blockers: FRED API key (free, lifts HY OAS cap from 3yr to 25yr), historical options chains (Polygon $199/mo or OptionAlpha $99 one-time). Promote to R18b once procured. |
| 20 | GP/A Quality Long-Short (Novy-Marx 2013, sector-neutral SP500 quintiles) | -2.12 test / -0.42 full-window | NO-GO | The factor has inverted in 2024-2026. Train +0.12 → val -0.57 → test -2.12 is monotonic DECAY — the literal opposite of the v7-hypothesised monotonic improvement. New v7 trajectory gate (gate #8, added because of R19 regime-flip lesson) caught this — train alone looked benign, but val and test progression was the textbook overfit/regime-decay shape. Mechanism: top-quintile names (mag-7 quality leaders) kept rallying through 2024-25 while the SHORT LEG (bottom quintile distressed industrials / capital-heavy energy) bounced harder in the 2024-26 reflation. Within-sector neutralisation didn't save it. Joins PEAD and Momentum+VIX in the "classical equity factor that has inverted" pile. Permanent shelf as a single-factor signal; agent recommends regime-filtered variant for v8 (only fade junk when long-junk underperforming long-quality 6mo). Coverage bias: 314/503 SP500 had clean fundamentals (financials excluded — banks don't report COGS). |
| 21 | sniper_mean_reversion (LIVE WORKHORSE validation) | -0.34 test / -0.27 full-window | NO-GO — production is regime luck | Walk-forward over 2,973 trades (vs production's 48) showed the live engine's main alpha source is statistically noise + regime survivorship. Train Sharpe -0.41, Val +0.53, Test -0.34 — classic lucky-middle-slice. Production +$3,677 from PF 3.05 reflects an Oct-2023 → Apr-2024 mean-reversion-friendly regime; the 10-year backtest finds 49% WR, PF 0.91, -30% NAV. The 1.5 ATR stop = 1.5 ATR target with 49% WR is structurally a money-loser even before costs. Concrete actions: tighten conf gate 0.50→0.75 (extreme-only path), add per-strategy kill-switch on rolling 30-trade PF < 1.5, change R:R from 1:1 to 1:2, prune universe per-ticker. V186 ships the kill-switch + tightened conf gate. |
| 22 | news_alpha (LIVE strategy validation) | -1.47 test / -0.72 full-window | NO-GO — 6-trade production sample is noise | Walk-forward over 1,123 events (earnings as news-proxy) shows the strategy carries no edge. Random-direction arm: 40.3% WR, PF 0.87, -18% NAV. Momentum-direction arm: 38.6% WR, PF 0.80, -28% NAV. Production "PF 99.90" / 100% WR was 6 coin flips heads in a row (p ≈ 1.6%). The 3% target / 2% stop asymmetry needs >40% WR + sentiment oracle margin > 15bps cost — both untestable from 6 live trades. The exit mechanics carry no inherent edge around news events. Pattern-matches R21 sniper_mean_reversion exposure: live strategies with tiny samples reflect regime + selection bias, not structural edge. V189 adds news_alpha kill-switch. |
| 23 | wolf_quantum_convergence (LIVE strategy validation) | +0.36 test / +0.38 full-window | NO-GO — regime-stable but edge-too-small after costs | Walk-forward over 2,471 trades shows the convergence mechanic produces a stable, weakly-positive distribution: train +0.45, val +0.03, test +0.36 — all same-sign. Different rejection class from R21/R22 (those were lucky-middle-slice patterns). PF 1.07 across the full window — real but tiny structural drift. A 46.5% win rate with a 2:1 R:R bracket would print PF ~1.7 if every win reached its target; it does not, because the 5% target rarely hits in 5 days while losses cap at -2.5% stops. Mean trade +$8 on $10K notional after 10bps round-trip — statistical noise dominates. Critical caveat: 5 of 11 source weights (news/social/congress/insider/GDELT cascade) are non-replayable. If live edge exists, it lives entirely in those 5 — and we cannot validate it. Production "PF 99.9" / 4 trades is a 0.465^4 = 4.7% probability outcome under the actual distribution. V190 ships kill-switch + tightens V143 quantum bypass conf gate from 0.80 → 0.95. |
| 24 | forex_trend (LIVE OANDA validation) | +0.33 test / -0.39 full-window | NO-GO — anti-edge | The worst rejection yet. Walk-forward over 1,754 trades, 21 instruments, 2018-2026: full-window total return -99.3%, Sharpe -0.39, 8 of 9 years negative (counting 2026 YTD). 2018 -54.6%, 2019 -60.5%, 2020 -13.5%, 2021 -55.4%, 2022 -69.5%, 2023 -44.3%, 2024 -52.4%, 2025 +115.2% (the lone winner — single-regime tail event on USD weakness + gold blow-off), 2026 YTD -44.9%. Per-pair: only 6 of 21 instruments PF > 1.0; JPY-crosses + equity indices are toxic (JP225 PF 0.64, GBP_JPY 0.60). 42% WR at 1.6 R:R clears the ~38.5% gross breakeven only narrowly; transaction costs push it under. ADX>20 floor admits whipsaw regimes. Recommendation: disable forex_trend immediately. V191 ships kill-switch (WOLFX_FOREX_TREND_ENABLED). The 2025 P&L that justified deployment was tail-luck on regime persistence. |
| 25 | forex_reversion (LIVE OANDA validation, 5/5 strategies tested) | +0.87 test / +0.32 full-window | NO-GO — short-vol blowup | The closest call of all 26 rounds. Walk-forward over 955 trades 2018-2026: passed 6 of 7 v7 gates including Sharpe 0.87, PF 1.42, +0.32 full-window. The single fail: Test MaxDD 37.4% (gate 15%), full-window MaxDD 76%. Cause: 2020 COVID year produced -61.3% over 169 trades — RSI<30 + lower-band signals kept firing into the vertical drop. Strategy is structurally short-volatility — sells tails, tails eventually get paid. Different rejection class: real per-pair edge does exist (EUR_GBP PF 3.55, EUR_JPY 1.72, XAG_USD 1.67, GBP_JPY 1.59). Mean reversion works on JPY crosses and silver, fails on gold/SPX/NZD/USD_CHF. Cannot be deployed without volatility regime filters + dynamic sizing — the live system has neither. MAJOR IMPLICATION: ALL FIVE live strategies gauntleted, ALL FIVE rejected. Zero gauntlet-validated live alpha. Recommendation per agent: halt live execution; convert engine to research/shadow harness until a strategy passes the gauntlet from scratch. The four shadow strategies (Trend, VIX Carry, Overnight Drift, VWAP Breakout) remain the only credible pipeline. |
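
Several findings in the table above rest on simple win-rate, reward-to-risk, and compounding arithmetic. The sketch below reproduces those figures under an idealized assumption: every win pays the full target and every loss costs the full stop, before costs. The helper names are hypothetical; the inputs are the numbers quoted in Rounds 21 to 24.

```python
from typing import Iterable


def profit_factor(win_rate: float, reward_to_risk: float) -> float:
    """Gross PF when each win pays `reward_to_risk` units and each loss costs 1."""
    return win_rate * reward_to_risk / (1.0 - win_rate)


def breakeven_win_rate(reward_to_risk: float) -> float:
    """Win rate at which gross expectancy is exactly zero."""
    return 1.0 / (1.0 + reward_to_risk)


def streak_probability(p_win: float, n: int) -> float:
    """Chance of n straight wins if trades are independent."""
    return p_win ** n


def compound(returns: Iterable[float]) -> float:
    """Total return from a sequence of period returns."""
    nav = 1.0
    for r in returns:
        nav *= 1.0 + r
    return nav - 1.0


# Round 21: 1.5 ATR stop vs 1.5 ATR target (1:1) at 49% WR.
print(f"R21 gross PF: {profit_factor(0.49, 1.0):.2f}")
# Round 22: 3% target / 2% stop, and six wins in a row from a coin flip.
print(f"R22 breakeven WR: {breakeven_win_rate(3 / 2):.1%}")
print(f"R22 six-heads probability: {streak_probability(0.5, 6):.1%}")
# Round 23: 2:1 bracket at 46.5% WR, and four wins in a row.
print(f"R23 ideal PF: {profit_factor(0.465, 2.0):.2f}")
print(f"R23 four-win probability: {streak_probability(0.465, 4):.1%}")
# Round 24: 1.6 R:R breakeven, and the yearly returns compounded 2018 to 2026 YTD.
r24_years = [-0.546, -0.605, -0.135, -0.554, -0.695, -0.443, -0.524, 1.152, -0.449]
print(f"R24 breakeven WR: {breakeven_win_rate(1.6):.1%}")
print(f"R24 compounded return: {compound(r24_years):.1%}")
```

Run as written, the Round 24 yearly figures compound to roughly -99.3%, matching the full-window total return quoted in that row. The gap between the ideal Round 23 PF and the realized 1.07 is the part the idealized model cannot explain, which is what that row attributes to targets not being reached.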
Round 14, Variant B (mean-reversion-filtered overnight long): test Sharpe 1.229, MaxDD -8.36 %, PF 1.25, 163 trades, full-window Sharpe 0.447. All five v5 gates cleared. Train/Val/Test Sharpe 0.51 / 1.35 / 1.23: a clean OOS pattern with no overfit signature. Variant A (always-on overnight long, no filter) is a NO-GO at PF 1.09, which confirms the filter is doing real work.
The strategy fills the high-frequency gap that v4 was structurally unable to test. Diversifies cleanly from Trend (monthly futures momentum) and VIX Carry (monthly vol selling).
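The five-gate check described for Variant B above can be pictured as a set of independent threshold tests that must all pass. The sketch below is illustrative only. The PF gate (1.20) and the trade-count gate (50) are quoted elsewhere on this page; the test-Sharpe threshold, the 15 % drawdown cap, and the positive full-window Sharpe requirement are assumptions made for this example and may not match the actual v5 gate set.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class SliceStats:
    test_sharpe: float
    profit_factor: float
    max_drawdown: float        # positive fraction, 0.0836 means 8.36 %
    trades: int
    full_window_sharpe: float


@dataclass(frozen=True)
class Gates:
    min_test_sharpe: float = 0.5          # ASSUMPTION: not stated on the page
    min_profit_factor: float = 1.20       # quoted in the Round 19 row
    max_drawdown: float = 0.15            # ASSUMPTION: borrowed from the Round 25 row
    min_trades: int = 50                  # quoted in the Round 15 note
    min_full_window_sharpe: float = 0.0   # ASSUMPTION: "positive" per Round 13


def evaluate(stats: SliceStats, gates: Gates = Gates()) -> Dict[str, bool]:
    """Return one pass/fail flag per gate; a strategy ships only if all pass."""
    return {
        "test_sharpe": stats.test_sharpe >= gates.min_test_sharpe,
        "profit_factor": stats.profit_factor >= gates.min_profit_factor,
        "max_drawdown": stats.max_drawdown <= gates.max_drawdown,
        "trades": stats.trades >= gates.min_trades,
        "full_window_sharpe": stats.full_window_sharpe > gates.min_full_window_sharpe,
    }


# Variant B figures as published above.
variant_b = SliceStats(1.229, 1.25, 0.0836, 163, 0.447)
results = evaluate(variant_b)
verdict = "PASS" if all(results.values()) else "NO-GO"
print(verdict, results)

# Only Variant A's PF (1.09) is published; reusing Variant B's other
# figures here is purely illustrative, to show a single gate failing.
variant_a_like = replace(variant_b, profit_factor=1.09)
print(evaluate(variant_a_like))
```

Keeping each gate as its own flag makes a verdict such as "fails only on profit factor" fall straight out of the result, rather than being buried in a single boolean.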
---
WOLFX publishes every signal and every realized fill. Past performance, including walk-forward backtest performance, is not predictive of future results. Every strategy here is informational — nothing is investment advice.