Playbook

What this system does, the process a trade flows through, the hard rules that govern it, and the honest read on whether the edge is real. This is the one-pager — start here.


The process — one trade, six stages

The same loop an institutional quant pod runs — signal → conviction → sizing → execution → attribution → recalibration — compressed onto one machine.

FIND

Daemons scan ~50 AI-infra tickers + options flow, dealer gamma, IV rank, technicals.

→ Typed signals (GEX_FLIP, UW_FLOW, IV_RANK, Triple-Confirm) — not opinions.

scout_plays.py · setup_hunter.py · watchlist_indicators.py

JUDGE

Multi-agent tribunal: macro / fundamentals / technicals specialists, then bull-vs-bear, then a coordinator.

→ A conviction read. Council weights shift with the market regime.

tribunal-* agents · regime-conditional weights

SIZE

Risk-based position sizing caps exposure; concentration (HHI) and Greeks are checked.

→ A dollar-risk-bounded position, not a gut-sized one.

/risk veto layer · sizing on /scout
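A minimal sketch of what the SIZE stage describes, not the desk's actual code: risk-bounded contract sizing plus an HHI concentration reading. The function names, the 100-share multiplier, and the 50% stop fraction (borrowed from the symmetric-stop rule) are illustrative assumptions.

```python
import math

def contracts_for_risk(max_risk_usd: float, premium_per_share: float,
                       stop_loss_frac: float = 0.5, multiplier: int = 100) -> int:
    """Largest whole number of long-option contracts whose loss at the stop
    stays within max_risk_usd. Hypothetical helper, assumed 100x multiplier."""
    risk_per_contract = premium_per_share * multiplier * stop_loss_frac
    if risk_per_contract <= 0:
        raise ValueError("premium and stop fraction must be positive")
    return math.floor(max_risk_usd / risk_per_contract)

def hhi(exposures: dict[str, float]) -> float:
    """Herfindahl-Hirschman index of absolute exposure shares.
    Ranges from 1/N (evenly spread) to 1.0 (one name holds everything)."""
    total = sum(abs(v) for v in exposures.values())
    if total == 0:
        return 0.0
    return sum((abs(v) / total) ** 2 for v in exposures.values())

# A $4.00 premium risks $200 per contract at a 50% stop,
# so a $1,000 risk budget allows 5 contracts.
size = contracts_for_risk(1000, 4.00)
concentration = hhi({"NVDA": 50_000, "AAOI": 50_000})  # two equal names
```

Sizing from the stop distance rather than from account size is what makes the position dollar-risk-bounded: the budget fixes the worst planned loss, and the contract count falls out of it.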

ALERT

Deterministic sanitizer strips bloat/hallucination; result is a copy-pasteable Telegram PLAY card.

→ A 10-second decision. The SHELL sends, not the LLM.

sanitize_alert.py · run.sh delivery

GRADE

Every TAKE/HIGH suggestion is auto-logged to the shadow book (paper). After ≥7d it's eligible for analysis.

→ Win rate, expectancy, profit factor, slippage-net P&L, MAE/MFE.

shadow-manager (every 15min) · /journal
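For reference, MAE/MFE can be read off the marks observed while a position was open. This is a hedged sketch for a long position; `mae_mfe` and its inputs are hypothetical names, not the shadow-manager's interface.

```python
def mae_mfe(entry_price: float, marks: list[float]) -> tuple[float, float]:
    """Maximum adverse / favorable excursion for a LONG position.

    marks: prices observed while the position was open.
    Returns (mae, mfe) in price units; mae <= 0 <= mfe.
    """
    if not marks:
        raise ValueError("need at least one mark")
    excursions = [m - entry_price for m in marks]
    mae = min(0.0, min(excursions))  # worst unrealized drawdown
    mfe = max(0.0, max(excursions))  # best unrealized gain
    return mae, mfe

# Bought at 4.00; the position dipped to 3.50 and peaked at 5.00.
worst, best = mae_mfe(4.0, [3.5, 4.5, 5.0, 4.2])
```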

LEARN

Reflection layer suggests per-signal threshold changes from realized outcomes. Losing signals get retired.

→ Nothing auto-applies — Eric approves. GEX_WALL was retired this way.

threshold calibration · /signals

Is the edge real? (SPY baseline)

We have no formal backtest engine: we can't replay signals we never logged historically. What we can do honestly is benchmark every closed trade against whether SPY rose over its actual holding window. If our win rate beats the tape, the edge is real. If not, we're riding AI beta. · refreshed 4h 25m ago

Mostly beta, not yet proven alpha

2026-05-03 → 2026-05-29 · n=86

Our win rate

70.9%

SPY up-rate (same windows)

79.1%

Win-rate alpha

-8.2pp

Wins while SPY flat/down

10 (16.4%)

In this window SPY was up in 79.1% of our holding periods — more often than we won (70.9%). On hit-rate, our directional picking did not beat simply being long. The dollar expectancy stays positive (options leverage), but leverage amplifies both ways. The real test is a flat/down tape, and we only have 10 wins there. Treat the paper win rate as cheap until proven in a non-bull window.
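The SPY baseline comparison can be sketched as below. This is an illustration under assumptions, not the dashboard's code: `ClosedTrade`, its fields, and `baseline_report` are hypothetical, and a "win" is assumed to mean positive realized P&L.

```python
from dataclasses import dataclass

@dataclass
class ClosedTrade:
    pnl_usd: float     # realized P&L of the trade
    spy_return: float  # SPY return over the same holding window

def baseline_report(trades: list[ClosedTrade]) -> dict:
    """Compare our hit rate with how often SPY simply rose in the same windows."""
    n = len(trades)
    if n == 0:
        raise ValueError("no closed trades")
    wins = [t for t in trades if t.pnl_usd > 0]
    spy_up = sum(1 for t in trades if t.spy_return > 0)
    wins_flat_down = sum(1 for t in wins if t.spy_return <= 0)
    win_rate = len(wins) / n
    spy_up_rate = spy_up / n
    return {
        "n": n,
        "win_rate": win_rate,
        "spy_up_rate": spy_up_rate,
        "alpha_pp": (win_rate - spy_up_rate) * 100,  # percentage points
        "wins_spy_flat_down": wins_flat_down,
        "share_of_wins": wins_flat_down / len(wins) if wins else 0.0,
    }

# Toy data only: three wins, one of them in a down window for SPY.
report = baseline_report([
    ClosedTrade(250.0, 0.012),
    ClosedTrade(-120.0, 0.004),
    ClosedTrade(90.0, -0.006),
    ClosedTrade(310.0, 0.021),
])
```

A negative `alpha_pp` means the tape rose more often than the book won. The `wins_spy_flat_down` count is the only part of the win rate that cannot be explained by being long in a rising market.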

Active-book expectancy (for context): +$678.69/trade · PF 2.77 · win 68.6% (n=86). Dollar edge ≠ hit-rate edge.
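For readers less familiar with the two dollar metrics, here is a minimal sketch using their standard definitions (the function names are illustrative). Expectancy is mean P&L per trade; profit factor is gross profit divided by gross loss.

```python
import math

def expectancy(pnls: list[float]) -> float:
    """Average realized P&L per trade."""
    if not pnls:
        raise ValueError("no trades")
    return sum(pnls) / len(pnls)

def profit_factor(pnls: list[float]) -> float:
    """Gross profit divided by gross loss (inf if there are no losses)."""
    gross_win = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    if gross_loss == 0:
        return math.inf if gross_win > 0 else 0.0
    return gross_win / gross_loss

# Toy book: a 25% win rate can still carry a positive dollar edge
# when the winners are large relative to the losers.
toy = [900.0, -100.0, -100.0, -100.0]
```

In the toy book above, three of four trades lose, yet expectancy is +$150 and profit factor is 3.0. That is the sense in which dollar edge and hit-rate edge are different measurements.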

$10K live sleeve

A ring-fenced $10K real-money sleeve, separate from the main $223K account, to pressure-test the system with real stakes. Real money only follows signals that clear the deploy gate. · refreshed 4h 14m ago

Status

Live · funded 2026-05-29

Funded and live. Awaiting first reported entry. Real capital may only follow signals in 'cleared for real capital' below — today that is GEX_FLIP only.

Deployed $0 · available $10,000

Hard limits

Max risk / trade: $1,000
Max concurrent: 5
Daily loss limit: $1,000
Sleeve drawdown halt: $7,500
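One way these limits could be enforced before an entry is sketched below. It rests on stated assumptions: the class and function names are hypothetical, losses are passed as positive dollar amounts, and the drawdown halt is read as a cumulative loss figure (the page does not say whether $7,500 is a loss amount or an equity floor).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SleeveLimits:
    max_risk_per_trade: float = 1_000.0
    max_concurrent: int = 5
    daily_loss_limit: float = 1_000.0
    drawdown_halt: float = 7_500.0  # ASSUMPTION: cumulative loss, not equity floor

def entry_violations(limits: SleeveLimits, trade_risk: float, open_positions: int,
                     loss_today: float, sleeve_drawdown: float) -> list[str]:
    """Return every limit a proposed entry would breach (empty list = allowed)."""
    problems = []
    if trade_risk > limits.max_risk_per_trade:
        problems.append("risk per trade exceeds limit")
    if open_positions >= limits.max_concurrent:
        problems.append("max concurrent positions reached")
    if loss_today >= limits.daily_loss_limit:
        problems.append("daily loss limit hit")
    if sleeve_drawdown >= limits.drawdown_halt:
        problems.append("sleeve drawdown halt")
    return problems

ok = entry_violations(SleeveLimits(), trade_risk=800, open_positions=2,
                      loss_today=0, sleeve_drawdown=0)
```

Returning all violations instead of stopping at the first makes the veto message self-explanatory on the alert card.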

Cleared for real capital (1)

GEX_FLIP · n=65 · win 67.7% · exp $729.02 · PF 2.52 · WLB 55.6%

Blocked from real capital (12)

UW_FLOW · n=7 · sample n=7 < 30 (not established)

IV_RANK · n=6 · sample n=6 < 30 (not established)

intraday_scan · n=3 · sample n=3 < 30 (not established)

MANUAL · n=3 · sample n=3 < 30 (not established)

CC_INCOME · n=3 · sample n=3 < 30 (not established)

earnings_5_7_AMC · n=1 · sample n=1 < 30 (not established)

NVDA_corning_optical_halo · n=1 · sample n=1 < 30 (not established)

theta_decay_12dte · n=1 · sample n=1 < 30 (not established)

post_earnings_winner_lock · n=1 · sample n=1 < 30 (not established)

decision_zone_loss_cap · n=1 · sample n=1 < 30 (not established)

no_recovery_signal · n=1 · sample n=1 < 30 (not established)

CSP_INCOME · n=1 · sample n=1 < 30 (not established)

Discipline rules

The hard rules the system enforces — the part that matters more than any single trade.

Determinism boundary

LLMs own reasoning (which trade, why). Shell scripts own side-effects (sizing math, sanitizing, sending). Validators sit between agent output and any send. A hallucinated ticker physically cannot reach an order.
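The validator step might look roughly like this. It is a sketch under assumptions: the allowlists, field names, and `validate_play` are hypothetical stand-ins for whatever the real validators check.

```python
import re

# ASSUMPTION: illustrative allowlists; the real watchlist has ~50 tickers.
ALLOWED_TICKERS = frozenset({"NVDA", "AAOI"})
KNOWN_SIGNALS = frozenset({"GEX_FLIP", "UW_FLOW", "IV_RANK"})
TICKER_RE = re.compile(r"[A-Z]{1,5}")

def validate_play(play: dict) -> dict:
    """Reject agent output that names an unknown ticker or signal type.

    Runs as plain deterministic code between the LLM and any send, so a
    made-up symbol raises here instead of reaching delivery.
    """
    ticker = str(play.get("ticker", "")).strip().upper()
    signal = str(play.get("signal", "")).strip()
    if not TICKER_RE.fullmatch(ticker) or ticker not in ALLOWED_TICKERS:
        raise ValueError(f"ticker not on watchlist: {ticker!r}")
    if signal not in KNOWN_SIGNALS:
        raise ValueError(f"unknown signal type: {signal!r}")
    return {"ticker": ticker, "signal": signal}

clean = validate_play({"ticker": " nvda ", "signal": "GEX_FLIP"})
```

The point of the design is that the check is a set-membership test, not a judgment call: the model can propose anything, but only allowlisted values survive.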

Deploy gate before real capital

A signal must clear: established sample (n≥30) + positive expectancy + Wilson 95% lower bound > 55% + profit factor > 1.2. Paper can trade anything; $10K cannot.
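The gate can be sketched with the standard Wilson score interval. The function names are illustrative, and the 44 wins used in the example are inferred from the GEX_FLIP row (67.7% of n=65), not read from the system.

```python
import math

def wilson_lower_bound(wins: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a win rate (z=1.96 ~ 95%)."""
    if n == 0:
        return 0.0
    p = wins / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

def clears_deploy_gate(n: int, wins: int, expectancy_usd: float,
                       profit_factor: float) -> bool:
    """All four conditions from the rule must hold."""
    return (n >= 30
            and expectancy_usd > 0
            and wilson_lower_bound(wins, n) > 0.55
            and profit_factor > 1.2)

# GEX_FLIP as shown on this page, with wins inferred as 44 of 65.
gex_wlb = wilson_lower_bound(44, 65)
gex_ok = clears_deploy_gate(n=65, wins=44, expectancy_usd=729.02, profit_factor=2.52)
```

Under that inference the lower bound comes out near 0.556, matching the displayed WLB of 55.6% and clearing the 55% threshold by a narrow margin. A perfect 7-for-7 record still fails because the sample test (n≥30) is checked independently of the win rate.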

Symmetric stops/targets

Long options: cut at −50%, trim/BTC at +50%. CSP/CC income: BTC at 50% profit — never hold to expiry chasing the last dollar (the AAOI lesson).
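Expressed as code, the rule reduces to two threshold checks. The sketch below uses hypothetical function names and action labels; prices are per-share premiums.

```python
def long_option_action(entry_premium: float, current_premium: float) -> str:
    """Long options: cut at -50%, trim at +50%, otherwise hold."""
    change = (current_premium - entry_premium) / entry_premium
    if change <= -0.5:
        return "CUT"
    if change >= 0.5:
        return "TRIM"
    return "HOLD"

def short_premium_action(credit_received: float, cost_to_close: float) -> str:
    """CSP/CC income: buy to close once 50% of the credit is captured."""
    captured = (credit_received - cost_to_close) / credit_received
    return "BTC" if captured >= 0.5 else "HOLD"

# A call bought at 4.00 and now marked at 2.00 has hit the stop.
action = long_option_action(4.0, 2.0)
```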

Short-dated options cap

Aggregate sub-30-DTE long-option cost basis capped at $10K — separate from the $50K options budget.
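A hedged sketch of the cap check, assuming "sub-30-DTE" means fewer than 30 calendar days to expiry. `LongOption` and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class LongOption:
    cost_basis: float  # dollars paid for the position
    expiry: date

def short_dated_basis(positions: list[LongOption], today: date,
                      dte_cutoff: int = 30) -> float:
    """Total cost basis of long options with fewer than dte_cutoff days left."""
    return sum(p.cost_basis for p in positions
               if (p.expiry - today).days < dte_cutoff)

def fits_short_dated_cap(positions: list[LongOption], candidate: LongOption,
                         today: date, cap: float = 10_000.0,
                         dte_cutoff: int = 30) -> bool:
    """True if adding the candidate keeps short-dated basis within the cap."""
    if (candidate.expiry - today).days >= dte_cutoff:
        return True  # longer-dated positions do not count toward this cap
    return short_dated_basis(positions, today, dte_cutoff) + candidate.cost_basis <= cap

today = date(2026, 5, 29)
book = [LongOption(6_000.0, date(2026, 6, 12)),   # 14 DTE, counts
        LongOption(9_000.0, date(2026, 9, 18))]   # longer-dated, ignored
```

Because the cap is evaluated on days to expiry as of today, a position bought with more than 30 days left starts counting toward the cap once it ages below the cutoff.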

Retire losers, keep the evidence

A proven losing signal is excluded everywhere user-facing but its trades stay in the all-time data, so the honest all-time P&L still shows what kept-trading-garbage would have cost.

Freshness or nothing

Every derived table shows 'refreshed Xm ago' and flags amber when stale. A dead daemon can't masquerade as a live signal.
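The freshness stamp could be produced along these lines. This is a sketch: the 30-minute staleness threshold and the `freshness_label` name are assumptions, since the page does not state when a table turns amber.

```python
from datetime import datetime, timedelta, timezone

def freshness_label(refreshed_at: datetime, now: datetime,
                    stale_after: timedelta = timedelta(minutes=30)) -> tuple[str, str]:
    """Return ('refreshed Xh Ym ago', status), where status is 'ok' or 'amber'.

    stale_after is an ASSUMED threshold, not the dashboard's real setting.
    """
    age = now - refreshed_at
    minutes = int(age.total_seconds() // 60)
    hours, mins = divmod(minutes, 60)
    text = f"refreshed {hours}h {mins}m ago" if hours else f"refreshed {mins}m ago"
    status = "amber" if age > stale_after else "ok"
    return text, status

now = datetime(2026, 5, 29, 16, 0, tzinfo=timezone.utc)
label = freshness_label(now - timedelta(hours=4, minutes=25), now)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the label deterministic and easy to test.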

What changes when we leave paper

The flaws that only bite with real money. Read before funding the sleeve.

Slippage is real now

Paper assumes bid/ask + commission fills. Real single-name option fills in size are worse. The 68.6% paper win rate will compress — plan for it.

We may be selling beta as skill

Over May, SPY was up in 79% of our holding windows; our win rate (71%) did NOT beat the tape on hit-rate. In a bull run, winning is cheap. The real test is a flat/down tape, where we have only 10 wins.

One thesis, one correlated bet

Everything is AI-infra. A semis/AI-capex drawdown hits every position at once. The HHI monitor exists because this is the top risk.

Thin samples on the 'best' signals

UW_FLOW/IV_RANK look spectacular at n=6–7 — too small to stake real money. Only GEX_FLIP (n=65) has earned the gate. Don't let a hot small sample pull real capital early.

Psychology isn't modeled

A paper loss costs nothing. A real −50% leg tempts you to abandon the rules. The whole point of the $10K sleeve is to feel that at survivable stakes.

One vendor, one machine

If UW's API changes or the Mac sleeps through a session, the desk goes dark. No redundancy yet.