Project goal
stock-core is a research project for differentiable trading on Indian large-caps. The goal is to train a single neural model end-to-end against a profit-and-loss objective — with gradients flowing through the backtest itself — and produce three personality-tuned strategies: return-max, Sharpe-max, drawdown-averse. Each strategy emits next-day positions for a small universe of stocks.
The training signal is literally "did this position make money?", and gradients flow backward through the entire backtest into the model's weights. There are no human-written trading rules and no separation between feature extraction and execution: the whole pipeline is one end-to-end differentiable system.
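As a rough illustration of what "a profit-and-loss objective" can mean for the three strategy personalities, the sketch below computes a post-cost PnL series for one asset and scores it three ways. It is plain Python showing only the forward computation; in an autodiff framework the same arithmetic would carry gradients back to the positions. All names (`backtest_pnl`, `drawdown_averse`), the linear cost model, and the default `cost_per_turnover` and `penalty` values are assumptions for illustration, not the project's actual code.

```python
import math

def backtest_pnl(positions, returns, cost_per_turnover=0.001):
    """Daily post-cost PnL: position times that day's return, minus a
    linear cost on the change in position (hypothetical cost model)."""
    pnl, prev = [], 0.0
    for pos, ret in zip(positions, returns):
        pnl.append(pos * ret - cost_per_turnover * abs(pos - prev))
        prev = pos
    return pnl

def mean_return(pnl):
    """Return-max objective: average daily PnL."""
    return sum(pnl) / len(pnl)

def sharpe(pnl, periods_per_year=252):
    """Sharpe-max objective: annualized mean over sample standard deviation.
    Needs at least two observations; returns 0.0 for a constant series."""
    mu = mean_return(pnl)
    var = sum((x - mu) ** 2 for x in pnl) / (len(pnl) - 1)
    if var == 0.0:
        return 0.0
    return mu / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(pnl):
    """Largest peak-to-trough drop of the cumulative (additive) equity curve,
    with the curve starting at zero."""
    equity = peak = worst = 0.0
    for x in pnl:
        equity += x
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def drawdown_averse(pnl, penalty=1.0):
    """Drawdown-averse objective: mean return minus a drawdown penalty."""
    return mean_return(pnl) - penalty * max_drawdown(pnl)

# Toy example: long for two days, flat on the third.
pnl = backtest_pnl([1.0, 1.0, 0.0], [0.01, -0.02, 0.03])
scores = (mean_return(pnl), sharpe(pnl), drawdown_averse(pnl))
```

The `abs` and `max` operations are only piecewise differentiable, which is one reason a real differentiable backtest may need smoothed variants of the cost and drawdown terms.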
Universe
The seed universe was five large NSE-listed stocks chosen for liquidity and news coverage:
RELIANCE.NS, TCS.NS, HDFCBANK.NS, INFY.NS, ITC.NS
In Phase 7 the universe was expanded to 46 Nifty 50 constituents (Nifty 50 minus 4 with missing OHLCV or zero news matches) to test the "memorisation-as-blocker" hypothesis flagged by L41-L43.
The honest expectation — the realistic Sharpe ceiling
A realistic post-cost Sharpe at this scale, given the published academic literature on Indian-equity factors, is between 0.4 and 0.7 measured over 18+ years of data. Anything above 1.0 is suspicious and should be investigated as a bug, not celebrated as a result. That bar is baked into every test in this repo as L23.
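A minimal sketch of how such a bar could be expressed as a reusable check is shown below. The thresholds come from the paragraph above; the function name `classify_sharpe` and the label strings are hypothetical and not taken from the repo's test suite.

```python
SHARPE_CEILING = 1.0          # above this, treat the result as a likely bug
EXPECTED_BAND = (0.4, 0.7)    # realistic post-cost range cited in the text

def classify_sharpe(sharpe, ceiling=SHARPE_CEILING, band=EXPECTED_BAND):
    """Label a post-cost Sharpe against the expectations stated above."""
    if sharpe > ceiling:
        return "suspicious"   # investigate for leakage or harness gaps
    if band[0] <= sharpe <= band[1]:
        return "expected"
    return "outside-band"     # below 0.4, or between 0.7 and the ceiling

labels = [classify_sharpe(s) for s in (1.8, 0.55, -0.4)]
print(labels)  # ['suspicious', 'expected', 'outside-band']
```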
The standard error of the Sharpe estimate at the 5-ticker × 21-day-holdout scale is approximately 1.55, more than twice the 0.4 to 0.7 edge reported in the literature, so a single holdout cannot resolve an edge of that size even when one exists. This is one reason the methodology (leakage-clean walk-forward evaluation with bit-exact reproducibility) is itself the deliverable, not any single Sharpe number.
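The 1.55 figure is consistent with the usual large-sample approximation for the standard error of an annualized Sharpe ratio under i.i.d. returns, if the 5 × 21 = 105 ticker-days are treated as independent daily observations. That pooling is an assumption made here to reproduce the number; the text does not say how the figure was derived. The function name and the 252-day annualization are likewise assumptions.

```python
import math

def annualized_sharpe_se(n_obs, sharpe_annual=0.0, periods_per_year=252):
    """Approximate standard error of an annualized Sharpe ratio.

    Uses the i.i.d. large-sample result SE(SR_period) ~ sqrt((1 + SR_period^2 / 2) / n),
    then scales by sqrt(periods_per_year) to annualize.
    """
    sr_period = sharpe_annual / math.sqrt(periods_per_year)
    se_period = math.sqrt((1.0 + 0.5 * sr_period ** 2) / n_obs)
    return se_period * math.sqrt(periods_per_year)

# 5 tickers x 21 holdout days, pooled as 105 observations (assumption).
se = annualized_sharpe_se(n_obs=5 * 21)
print(round(se, 2))  # 1.55
```

Under the same assumptions, 18 years of daily data (about 4,536 observations) gives a standard error of roughly 0.24, which is why an edge of 0.4 to 0.7 is detectable over that horizon but not over a 21-day holdout.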
What "success" would have looked like
- Three different strategies (return-max, Sharpe-max, drawdown-averse)
- Each passing all six leakage tests on real data
- Each showing post-cost Sharpe between 0.4 and 0.7
- All three reproducible bit-for-bit with the same seed
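One way to check the bit-for-bit criterion is to hash the raw bytes of a run's output rather than compare floats with a tolerance. The sketch below uses a seeded random series as a stand-in for a training run; `run_experiment` and `fingerprint` are hypothetical names, and the project's actual reproducibility harness may work differently.

```python
import hashlib
import random
import struct

def run_experiment(seed, n_days=21):
    """Stand-in for a deterministic run: a seeded series of daily returns."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n_days)]

def fingerprint(values):
    """SHA-256 over the little-endian IEEE-754 bytes of each float, so any
    single-bit difference in any value changes the digest."""
    payload = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(payload).hexdigest()

same_seed_matches = fingerprint(run_experiment(7)) == fingerprint(run_experiment(7))
different_seed_differs = fingerprint(run_experiment(7)) != fingerprint(run_experiment(8))
```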
What we got instead
No measurable time-edge at 5 tickers × 3 years or 46 tickers × 3 years, across 367 deterministic experiments, 12 SPIKE_DEFAULTS configurations, 8 loss-function variants, and many hyperparameter variants. Every result above the L23 ceiling traced to either a harness gap (since fixed; see Harness evolution) or a scale artifact (since caught; see What didn't work).
The two leakage-clean results that survive every test sit in a statistical noise band: Sharpe between -0.51 and -0.35. That negative result is research output. See Best candidates.