Playbook

What this system does, the process a trade flows through, the hard rules that govern it, and the honest read on whether the edge is real. This is the one-pager — start here.

The process — one trade, six stages

The same loop an institutional quant pod runs — signal → conviction → sizing → execution → attribution → recalibration — compressed onto one machine.

1
FIND
Daemons scan ~50 AI-infra tickers + options flow, dealer gamma, IV rank, technicals.
Typed signals (GEX_FLIP, UW_FLOW, IV_RANK, Triple-Confirm) — not opinions.
scout_plays.py · setup_hunter.py · watchlist_indicators.py
2
JUDGE
Multi-agent tribunal: macro / fundamentals / technicals specialists, then bull-vs-bear, then a coordinator.
A conviction read. Council weights shift with the market regime.
tribunal-* agents · regime-conditional weights
3
SIZE
Risk-based position sizing caps exposure; concentration (HHI) + greeks checked.
A dollar-risk-bounded position, not a gut-sized one.
/risk veto layer · sizing on /scout
4
ALERT
Deterministic sanitizer strips bloat/hallucination; result is a copy-pasteable Telegram PLAY card.
A 10-second decision. The SHELL sends, not the LLM.
sanitize_alert.py · run.sh delivery
5
GRADE
Every TAKE/HIGH suggestion is auto-logged to the shadow book (paper). After ≥7d it's eligible for analysis.
Win rate, expectancy, profit factor, slippage-net P&L, MAE/MFE.
shadow-manager (every 15min) · /journal
6
LEARN
Reflection layer suggests per-signal threshold changes from realized outcomes. Losing signals get retired.
Nothing auto-applies — Eric approves. GEX_WALL was retired this way.
threshold calibration · /signals

Is the edge real? (SPY baseline)

We have no formal backtest engine: we can't replay signals we never logged historically. What we can do honestly: benchmark every closed trade against whether SPY rose over its actual holding window. If our win rate beats the tape, the edge is real. If not, we're riding AI beta. · refreshed 4h 25m ago

Mostly beta, not yet proven alpha
2026-05-03 to 2026-05-29 · n=86
Our win rate: 70.9%
SPY up-rate (same windows): 79.1%
Win-rate alpha: -8.2pp
Wins while SPY flat/down: 10 (16.4%)

In this window SPY was up in 79.1% of our holding periods — more often than we won (70.9%). On hit-rate, our directional picking did not beat simply being long. The dollar expectancy stays positive (options leverage), but leverage amplifies both ways. The real test is a flat/down tape, and we only have 10 wins there. Treat the paper win rate as cheap until proven in a non-bull window.
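A minimal sketch of the benchmark described above, assuming each closed trade is logged with its own P&L and SPY's return over the same holding window. The `ClosedTrade` record and its field names are hypothetical, not the shadow book's actual schema.

```python
from dataclasses import dataclass


@dataclass
class ClosedTrade:
    pnl: float         # realized P&L of the trade, in dollars
    spy_return: float  # SPY return over the same holding window (assumed field)


def spy_baseline(trades):
    """Return (win_rate, spy_up_rate, alpha_pp, wins_while_spy_flat_or_down)."""
    n = len(trades)
    if n == 0:
        raise ValueError("no closed trades to benchmark")
    wins = [t for t in trades if t.pnl > 0]
    win_rate = len(wins) / n
    spy_up_rate = sum(t.spy_return > 0 for t in trades) / n
    # Alpha in percentage points: how much more often we won than SPY rose.
    alpha_pp = (win_rate - spy_up_rate) * 100
    # Wins that could not have come from simply being long the tape.
    wins_flat_down = sum(t.spy_return <= 0 for t in wins)
    return win_rate, spy_up_rate, alpha_pp, wins_flat_down
```

With the figures on this page, a 70.9% win rate against a 79.1% SPY up-rate gives the -8.2pp shown above.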

Active-book expectancy (for context): +$678.69/trade · PF 2.77 · win 68.6% (n=86). Dollar edge ≠ hit-rate edge.
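Why dollar edge and hit-rate edge can diverge, as a small sketch. It assumes the usual definitions (expectancy = mean P&L per trade, profit factor = gross profit / gross loss); the page does not spell out its exact formulas, and the function name is illustrative.

```python
def expectancy_and_profit_factor(pnls):
    """Return (expectancy, profit_factor, win_rate) for a list of trade P&Ls."""
    n = len(pnls)
    if n == 0:
        raise ValueError("no trades")
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)  # expressed as a positive number
    expectancy = sum(pnls) / n
    # With no losing trades the ratio is undefined; report infinity.
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else float("inf")
    win_rate = sum(p > 0 for p in pnls) / n
    return expectancy, profit_factor, win_rate
```

For example, `[600, -100, -100, -100]` wins only one trade in four yet has positive expectancy and a profit factor of 2.0. The payoff asymmetry carries the dollar result, not the hit rate.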

$10K live sleeve

A ring-fenced $10K real-money sleeve, separate from the main $223K account, to pressure-test the system with real stakes. Real money only follows signals that clear the deploy gate. · refreshed 4h 14m ago

Status
Live · funded 2026-05-29
Funded and live. Awaiting first reported entry. Real capital may only follow signals in 'cleared for real capital' below — today that is GEX_FLIP only.
Deployed $0 · available $10,000
Hard limits
  • Max risk / trade: $1,000
  • Max concurrent: 5
  • Daily loss limit: $1,000
  • Sleeve drawdown halt: $7,500
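A sketch of how these four limits could be checked before a new entry. The names are hypothetical, and one assumption is loud: the page does not say whether the $7,500 halt is a loss amount or an equity floor, and this sketch treats it as a cumulative loss in dollars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleeveLimits:
    max_risk_per_trade: float = 1_000.0
    max_concurrent: int = 5
    daily_loss_limit: float = 1_000.0
    drawdown_halt: float = 7_500.0  # ASSUMPTION: cumulative loss, not an equity floor


def entry_blockers(limits, trade_risk, open_positions, loss_today, sleeve_drawdown):
    """Return the reasons a new entry is blocked; an empty list means allowed."""
    reasons = []
    if trade_risk > limits.max_risk_per_trade:
        reasons.append("trade risk exceeds max risk / trade")
    if open_positions >= limits.max_concurrent:
        reasons.append("max concurrent positions reached")
    if loss_today >= limits.daily_loss_limit:
        reasons.append("daily loss limit hit")
    if sleeve_drawdown >= limits.drawdown_halt:
        reasons.append("sleeve drawdown halt")
    return reasons
```

Returning every reason rather than the first failure keeps the veto explainable in an alert.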
Cleared for real capital (1)
GEX_FLIP · n=65 · win 67.7% · exp $729.02 · PF 2.52 · WLB 55.6%
Blocked from real capital (12)
UW_FLOW · n=7 · sample n=7 < 30 (not established)
IV_RANK · n=6 · sample n=6 < 30 (not established)
intraday_scan · n=3 · sample n=3 < 30 (not established)
MANUAL · n=3 · sample n=3 < 30 (not established)
CC_INCOME · n=3 · sample n=3 < 30 (not established)
earnings_5_7_AMC · n=1 · sample n=1 < 30 (not established)
NVDA_corning_optical_halo · n=1 · sample n=1 < 30 (not established)
theta_decay_12dte · n=1 · sample n=1 < 30 (not established)
post_earnings_winner_lock · n=1 · sample n=1 < 30 (not established)
decision_zone_loss_cap · n=1 · sample n=1 < 30 (not established)
no_recovery_signal · n=1 · sample n=1 < 30 (not established)
CSP_INCOME · n=1 · sample n=1 < 30 (not established)

Discipline rules

The hard rules the system enforces — the part that matters more than any single trade.

Determinism boundary
LLMs own reasoning (which trade, why). Shell scripts own side-effects (sizing math, sanitizing, sending). Validators sit between agent output and any send. A hallucinated ticker physically cannot reach an order.
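One way such a validator could look, as a sketch only. The `play` dict shape and the `watchlist` argument are assumptions for illustration, not the actual interface between the agents and run.sh.

```python
import re

# Plain US equity symbols: 1 to 5 uppercase letters (an assumed format).
TICKER_RE = re.compile(r"[A-Z]{1,5}")


def validate_play(play, watchlist):
    """Return a list of errors; anything non-empty must block the send."""
    errors = []
    ticker = play.get("ticker")
    if not isinstance(ticker, str) or not TICKER_RE.fullmatch(ticker):
        errors.append("missing or malformed ticker")
    elif ticker not in watchlist:
        # A symbol the scanners never produced is treated as hallucinated.
        errors.append(f"ticker {ticker} is not on the watchlist")
    return errors
```

The check is deterministic and runs outside the LLM, so a hallucinated symbol fails the same way every time.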
Deploy gate before real capital
A signal must clear: established sample (n≥30) + positive expectancy + Wilson 95% lower bound > 55% + profit factor > 1.2. Paper can trade anything; $10K cannot.
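The gate above can be sketched with the standard Wilson score lower bound. Function names are illustrative, and the thresholds are the ones stated in the rule.

```python
import math


def wilson_lower_bound(wins, n, z=1.96):
    """Lower bound of the Wilson score interval for a win rate (z=1.96 is ~95%)."""
    if n == 0:
        return 0.0
    p = wins / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom


def clears_deploy_gate(n, wins, expectancy, profit_factor):
    """True only if all four conditions of the deploy gate hold."""
    return (
        n >= 30
        and expectancy > 0
        and wilson_lower_bound(wins, n) > 0.55
        and profit_factor > 1.2
    )
```

For GEX_FLIP, taking 44 wins out of 65 (inferred from the reported 67.7%), the lower bound comes out near 0.556, which matches the WLB 55.6% shown for that signal and only just clears the 55% bar. A signal with n=7 fails on sample size alone, whatever its win rate.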
Symmetric stops/targets
Long options: cut at −50%, trim/BTC at +50%. CSP/CC income: BTC at 50% profit — never hold to expiry chasing the last dollar (the AAOI lesson).
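The two rules expressed as code, as a sketch under stated assumptions: the function names and return labels are hypothetical, and premiums are per-contract prices.

```python
def long_option_action(entry_premium, current_premium):
    """Long options: cut at -50%, trim at +50%, otherwise hold."""
    change = (current_premium - entry_premium) / entry_premium
    if change <= -0.50:
        return "CUT"
    if change >= 0.50:
        return "TRIM"
    return "HOLD"


def short_premium_action(credit_received, cost_to_close):
    """CSP/CC income: buy to close once 50% of the credit has been captured."""
    captured = (credit_received - cost_to_close) / credit_received
    return "BTC" if captured >= 0.50 else "HOLD"
```

A long option bought at 2.00 is cut at 1.00 and trimmed at 3.00. A put sold for a 1.00 credit is bought back once it can be closed for 0.50 or less.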
Short-dated options cap
Aggregate sub-30-DTE long-option cost basis capped at $10K — separate from the $50K options budget.
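A sketch of the aggregate check, assuming each long-option position carries its cost basis and days to expiry. The `LongOption` record is hypothetical.

```python
from dataclasses import dataclass

SHORT_DATED_CAP = 10_000.0  # dollars of cost basis, from the rule above
SHORT_DATED_DTE = 30        # positions with fewer days to expiry count


@dataclass
class LongOption:
    cost_basis: float  # total dollars paid for the position
    dte: int           # days to expiry


def short_dated_cost_basis(positions):
    """Total cost basis of long options with fewer than 30 DTE."""
    return sum(p.cost_basis for p in positions if p.dte < SHORT_DATED_DTE)


def fits_short_dated_cap(positions, new_position):
    """True if adding new_position keeps the sub-30-DTE total within the cap."""
    if new_position.dte >= SHORT_DATED_DTE:
        return True  # not short-dated, so this cap does not apply
    total = short_dated_cost_basis(positions) + new_position.cost_basis
    return total <= SHORT_DATED_CAP
```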
Retire losers, keep the evidence
A proven losing signal is excluded everywhere user-facing but its trades stay in the all-time data, so the honest all-time P&L still shows what kept-trading-garbage would have cost.
Freshness or nothing
Every derived table shows 'refreshed Xm ago' and flags amber when stale. A dead daemon can't masquerade as a live signal.
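A sketch of the freshness stamp. The one-hour staleness threshold is an assumption, since the page does not state when a table turns amber, and the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone


def freshness_label(refreshed_at, now, stale_after=timedelta(hours=1)):
    """Return (text, status), where status is 'amber' once the data is stale."""
    age = now - refreshed_at
    minutes = int(age.total_seconds() // 60)
    hours, mins = divmod(minutes, 60)
    if hours:
        text = f"refreshed {hours}h {mins}m ago"
    else:
        text = f"refreshed {mins}m ago"
    # ASSUMPTION: one global threshold; a real table may need its own cadence.
    status = "amber" if age > stale_after else "ok"
    return text, status
```

Under that assumed threshold, a table last written 4h 25m ago renders as `refreshed 4h 25m ago` and is flagged amber.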

What changes when we leave paper

The flaws that only bite with real money. Read before funding the sleeve.

Slippage is real now
Paper assumes bid/ask + commission fills. Real single-name option fills in size are worse. The 68.6% paper win rate will compress — plan for it.
We may be selling beta as skill
Over May, SPY was up in 79% of our holding windows; our win rate (71%) did NOT beat the tape on hit-rate. In a bull run, winning is cheap. The real test is a flat/down tape, and we have only 10 wins in such windows.
One thesis, one correlated bet
Everything is AI-infra. A semis/AI-capex drawdown hits every position at once. The HHI monitor exists because this is the top risk.
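For reference, the Herfindahl-Hirschman index is the sum of squared exposure weights. The sketch below assumes exposures arrive as a dict of dollar amounts per bucket; how the real monitor buckets positions is not stated here.

```python
def hhi(exposures):
    """Herfindahl-Hirschman index of a book: 1/N if evenly spread, 1.0 if all in one bucket."""
    total = sum(abs(v) for v in exposures.values())
    if total == 0:
        return 0.0  # empty book: no concentration to report
    return sum((abs(v) / total) ** 2 for v in exposures.values())
```

Two equal positions give 0.5, and 1/HHI reads as the effective number of independent bets. Per-ticker HHI understates the risk described above: bucket the same book by theme instead of by ticker and an all-AI-infra book scores 1.0.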
Thin samples on the 'best' signals
UW_FLOW/IV_RANK look spectacular at n=6–7 — too small to stake real money. Only GEX_FLIP (n=65) has earned the gate. Don't let a hot small sample pull real capital early.
Psychology isn't modeled
A paper loss costs nothing. A real −50% leg tempts you to abandon the rules. The whole point of the $10K sleeve is to feel that at survivable stakes.
One vendor, one machine
If UW's API changes or the Mac sleeps through a session, the desk goes dark. No redundancy yet.