Phase 5 — eight strategies, all "no edge"

The autonomous-window phase (2026-05-15) attempted three strategy spikes on the 5-ticker × 3y window, then five more variants across the day. All eight ended in one of three categories.

The strategies attempted

Spike	Strategy	Honest mean Sharpe	Verdict
S20	refined-news (density × tone)	-3.90 ± 1.63 (21d), +2.55 ± 0.20 (180d)	scale-artifact post-L24 — collapsed to ~0
S21	quality-tilted low-volatility	+0.76 ± 4.96 (21d), +0.46 ± 1.86 (180d)	scale-invariance pathology of `negative_sharpe`
S26	PEAD (text-confirmed earnings surprise)	-0.06 ± 0.38 (21d), +1.34 (180d post-L24)	window-position artifact; 21d = 0, 180d = +1.34
S22	pure-price momentum diagnostic	mixed sign, single-seed	`shuffled_target` FAIL — harness regime trade-off

The three failure modes

Scale artifact. The strategy reported a strong Sharpe, but it came from a feature-magnitude mismatch: some columns were 1000× larger than others, and the model latched onto magnitude rather than signal. S20 post-L24 was the canonical example: once features were properly standardised, honest Sharpe collapsed from +2.55 to ~0, with `mean_pnl` equal to 0 to seven decimal places.
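As an illustration of what "properly standardised" can mean, here is a minimal z-score sketch. It is not the project's pipeline: the function names `fit_scaler` and `apply_scaler`, the feature names, and the numbers are all hypothetical. The one design point it assumes is that scaling parameters are estimated on the training window only and then reused, so that standardisation does not itself look ahead.

```python
import statistics

def fit_scaler(train_columns):
    """Estimate per-column mean and std on the training window only."""
    return {
        name: (statistics.fmean(values), statistics.pstdev(values))
        for name, values in train_columns.items()
    }

def apply_scaler(columns, params):
    """Z-score each column with previously fitted parameters."""
    out = {}
    for name, values in columns.items():
        mu, sigma = params[name]
        # A constant training column carries no information; map it to zeros.
        out[name] = [0.0 if sigma == 0 else (v - mu) / sigma for v in values]
    return out

# Hypothetical features whose raw magnitudes differ by roughly 1000x.
train = {
    "news_density": [1200.0, 3400.0, 2100.0, 800.0],
    "tone": [0.2, -0.1, 0.4, 0.3],
}
params = fit_scaler(train)
train_scaled = apply_scaler(train, params)

for name, values in train_scaled.items():
    # After scaling, every training column has mean ~0 and std ~1.
    print(name, round(statistics.fmean(values), 6), round(statistics.pstdev(values), 6))
```

After this step no column can dominate purely through its units, which is the condition under which the S20 result disappeared.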

Scale-invariance pathology. The strategy reported a non-zero Sharpe, but its actual P&L was zero to seven decimal places because positions had collapsed to near-zero. `negative_sharpe` is scale-invariant: multiplying every position by the same constant leaves the ratio of mean to standard deviation unchanged, so the Sharpe was numerically defined but meaningless. S21 was the canonical example. Fixed structurally in Phase 6.E via position-aware losses (L32).
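The invariance is easy to reproduce. The sketch below uses a plain per-period Sharpe (mean over standard deviation, no annualisation) and made-up returns and signals. It is not the project's `negative_sharpe` implementation, only a demonstration of the property that loss relies on.

```python
import statistics

def sharpe(pnl):
    """Per-period Sharpe: mean P&L divided by its standard deviation."""
    sd = statistics.pstdev(pnl)
    return statistics.fmean(pnl) / sd if sd > 0 else 0.0

# Hypothetical daily returns and a hypothetical long/short signal.
returns = [0.010, -0.020, 0.015, 0.005, -0.010]
signal = [1.0, 1.0, 1.0, -1.0, -1.0]

def pnl_for(scale):
    """P&L when every position is multiplied by the same constant."""
    return [scale * s * r for s, r in zip(signal, returns)]

full = pnl_for(1.0)    # positions of normal size
tiny = pnl_for(1e-9)   # positions collapsed to near-zero

print("Sharpe, full positions:", sharpe(full))
print("Sharpe, tiny positions:", sharpe(tiny))
print("mean P&L, tiny positions (7 dp):", round(statistics.fmean(tiny), 7))
```

Both Sharpe values agree up to floating-point error, while the mean P&L of the collapsed book rounds to zero. A loss built only on Sharpe cannot tell the two cases apart.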

Window-position artifact. S26 looked great on a 180-day evaluation window (Sharpe +1.34) but produced near-zero Sharpe on a 21-day window over the same data (Sharpe -0.06 ± 0.38). A real edge must be consistent across nested windows. This finding became the motivation for the window-stability test in Phase 6.B (L54 later confirmed the pattern recurs).
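A nested-window check of this kind can be sketched as follows. This is an assumption-laden illustration, not the Phase 6.B test: the helper names, the use of trailing windows, the threshold `min_sharpe`, and the synthetic P&L series are all hypothetical.

```python
import statistics

def sharpe(pnl):
    """Per-period Sharpe: mean P&L divided by its standard deviation."""
    sd = statistics.pstdev(pnl)
    return statistics.fmean(pnl) / sd if sd > 0 else 0.0

def nested_window_sharpes(pnl, window_lengths):
    """Sharpe over the trailing n periods, so shorter windows nest in longer ones."""
    return {n: sharpe(pnl[-n:]) for n in window_lengths if n <= len(pnl)}

def is_window_stable(sharpes, min_sharpe):
    """Hypothetical rule: every nested window must clear the same threshold."""
    values = list(sharpes.values())
    return bool(values) and all(v >= min_sharpe for v in values)

# Synthetic 180-day P&L: positive drift in the first 159 days,
# driftless noise in the final 21 days.
early = [0.003 if i % 2 == 0 else -0.001 for i in range(159)]
late = [0.001 if i % 2 == 0 else -0.001 for i in range(21)]
pnl = early + late

sharpes = nested_window_sharpes(pnl, [21, 180])
print(sharpes)
print("stable:", is_window_stable(sharpes, min_sharpe=0.2))
```

In this synthetic series the 180-day figure is carried entirely by the early days, so the 21-day window fails the threshold and the strategy is flagged as unstable, which mirrors the S26 pattern in shape only.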

The harness regime trade-off

The whole leakage-test machinery had a regime problem no strategy escaped:

Configuration	`shuffled_target`	`look_ahead`	`future_news`
Standardised features (S20/S21/S26 post-L24)	PASS	FAIL (false neg.)	FAIL (false neg.)
Raw features (S22)	FAIL (single-seed noise or real leak)	PASS	PASS

No configuration produced all-4-PASS for any strategy. This re-framed the deliverable from "no measurable edge at this scale" to "leakage-test methodology needs Phase 6 recalibration". The fixes shipped in Phase 6.A-E.

What it meant

Three honest "no edge" verdicts, but the honest read was also that the harness itself had visible gaps that prevented confident scoring. The autonomous window's deliverable was the diagnosis, not a working strategy — which set up Phase 6 as the harness recalibration phase.
