Phase 5 — eight strategies, all "no edge"

The autonomous-window phase (2026-05-15) attempted three strategy spikes on the 5-ticker × 3y window, then five more variants across the day. All eight ended in one of three categories.

The strategies attempted

Spike | Strategy | Honest mean Sharpe | Verdict
--- | --- | --- | ---
S20 | refined-news (density × tone) | -3.90 ± 1.63 (21d), +2.55 ± 0.20 (180d) | scale-artifact post-L24, collapsed to ~0
S21 | quality-tilted low-volatility | +0.76 ± 4.96 (21d), +0.46 ± 1.86 (180d) | scale-invariance pathology of negative_sharpe
S26 | PEAD (text-confirmed earnings surprise) | -0.06 ± 0.38 (21d), +1.34 (180d post-L24) | window-position artifact; 21d = 0, 180d = +1.34
S22 | pure-price momentum diagnostic | mixed sign, single-seed | shuffled_target FAIL, harness regime trade-off

The three failure modes

Scale artifact. The strategy reported a strong Sharpe, but it came from a feature-magnitude mismatch: some columns were 1000× larger than others, and the model latched onto magnitude rather than signal. S20 post-L24 was the canonical example: once features were properly standardised, the honest Sharpe collapsed from +2.55 to ~0, with mean_pnl = 0 to 7 decimals.
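As a minimal sketch of the standardisation step described above, assuming a simple per-column z-score fitted on the training rows only: the helper names (fit_standardiser, standardise) and the two example features are hypothetical and are not taken from the harness.

```python
import statistics

def fit_standardiser(train_rows):
    """Return per-column (mean, std), computed on the training rows only."""
    stats = []
    for col in zip(*train_rows):
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)
        stats.append((mean, std if std > 0 else 1.0))  # guard constant columns
    return stats

def standardise(rows, stats):
    """Apply the fitted (mean, std) pairs to any set of rows."""
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in rows]

# Hypothetical features: a tone score near 0.1 and a volume-like column near 1e5.
train = [[0.10, 120_000.0], [0.30, 95_000.0], [-0.20, 143_000.0], [0.05, 101_000.0]]
test = [[0.15, 110_000.0]]

stats = fit_standardiser(train)
train_z = standardise(train, stats)
test_z = standardise(test, stats)  # reuses training statistics, never refits on test data

# Ratio of the largest magnitude in column 1 to the largest in column 0.
raw_ratio = max(abs(r[1]) for r in train) / max(abs(r[0]) for r in train)
z_ratio = max(abs(r[1]) for r in train_z) / max(abs(r[0]) for r in train_z)
print(f"raw magnitude ratio: {raw_ratio:.0f}")
print(f"standardised magnitude ratio: {z_ratio:.2f}")
```

Fitting the statistics on the training rows and reusing them for later rows is one way to avoid introducing look-ahead through the scaler itself; whether the harness does this is not stated here.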

Scale-invariance pathology. The strategy reported a non-zero Sharpe, but its actual P&L was zero to seven decimal places. Positions had collapsed to near-zero. negative_sharpe is scale-invariant, so the Sharpe was numerically defined but meaningless. S21 was the canonical example. Fixed structurally in Phase 6.E via position-aware losses (L32).
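The scale-invariance point can be illustrated with a small sketch. It assumes a plain per-period Sharpe (mean over sample standard deviation, not annualised); the returns, the signal and the scaled_pnl helper are hypothetical illustration values, not harness output.

```python
import statistics

def sharpe(pnl):
    """Per-period Sharpe ratio: mean over sample standard deviation."""
    return statistics.fmean(pnl) / statistics.stdev(pnl)

# Hypothetical per-period asset returns and a fixed directional signal.
returns = [0.012, -0.008, 0.005, 0.010, -0.003, 0.007]
signal = [1.0, -1.0, 1.0, 1.0, 1.0, -1.0]

def scaled_pnl(scale):
    """P&L when every position is the signal multiplied by `scale`."""
    return [scale * s * r for s, r in zip(signal, returns)]

# Shrinking the positions drives mean P&L towards zero but leaves Sharpe unchanged.
for scale in (1.0, 1e-3, 1e-8):
    pnl = scaled_pnl(scale)
    print(f"scale={scale:g}  mean_pnl={statistics.fmean(pnl):.7f}  sharpe={sharpe(pnl):.4f}")
```

Because the scale factor cancels between numerator and denominator, a loss built only on Sharpe gives the model no reason to keep positions away from zero, which is why reporting mean P&L alongside Sharpe exposes the problem.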

Window-position artifact. S26 looked great on a 180-day evaluation window (Sharpe +1.34) but produced near-zero Sharpe on a 21-day window over the same data (Sharpe -0.06 ± 0.38). A real edge must be consistent across nested windows. This finding became the motivation for the window-stability test in Phase 6.B (L54 later confirmed the pattern recurs).
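A nested-window consistency check along these lines could be sketched as follows. This is not the Phase 6.B test itself: the stability criterion (same sign and a bounded spread), the tolerance value and the synthetic P&L series are all assumptions made for illustration, and Sharpe is per-period rather than annualised.

```python
import statistics

def sharpe(pnl):
    """Per-period Sharpe ratio; returns 0.0 for a constant series."""
    sd = statistics.stdev(pnl)
    return statistics.fmean(pnl) / sd if sd > 0 else 0.0

def nested_window_sharpes(daily_pnl, window_lengths):
    """Sharpe over the trailing n days for each n, all ending on the same day."""
    return {n: sharpe(daily_pnl[-n:]) for n in window_lengths if n <= len(daily_pnl)}

def is_window_stable(sharpes, tolerance):
    """Hypothetical criterion: same sign in every window and spread within tolerance."""
    values = list(sharpes.values())
    same_sign = all(v > 0 for v in values) or all(v < 0 for v in values)
    return same_sign and (max(values) - min(values)) <= tolerance

# Synthetic series: positive drift early on, roughly flat over the final 21 days.
early = [0.003 if i % 2 == 0 else -0.001 for i in range(159)]
late = [-0.001 if i % 2 == 0 else 0.001 for i in range(21)]
daily_pnl = early + late

sharpes = nested_window_sharpes(daily_pnl, (21, 180))
for n, value in sharpes.items():
    print(f"{n}d Sharpe: {value:+.3f}")
print("stable:", is_window_stable(sharpes, tolerance=0.25))
```

In this synthetic case the 180-day figure is carried entirely by the early part of the window, so the short window disagrees with it and the check reports the series as unstable.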

The harness regime trade-off

The whole leakage-test machinery had a regime problem no strategy escaped:

Configuration | shuffled_target | look_ahead | future_news
--- | --- | --- | ---
Standardised features (S20/S21/S26 post-L24) | PASS | FAIL (false neg.) | FAIL (false neg.)
Raw features (S22) | FAIL (single-seed noise or real leak) | PASS | PASS

No configuration produced all-4-PASS for any strategy. This re-framed the deliverable from "no measurable edge at this scale" to "leakage-test methodology needs Phase 6 recalibration". The fixes shipped in Phase 6.A-E.

What it meant

Three honest "no edge" verdicts, but the honest read was also that the harness itself had visible gaps that prevented confident scoring. The autonomous window's deliverable was the diagnosis, not a working strategy — which set up Phase 6 as the harness recalibration phase.

OpenBracket v0.6 — methodology release-ready; v1 forecaster in active build.