04 — Positioning & Flows: The Legitimate "What Are Real Traders Doing"
Doctrine. Scraping Twitter to figure out what real traders are doing is cargo-cult. It is also legally fraught (ToS violations, ToS-derivative data products) and statistically empty (selection bias, survivorship bias, the loudest accounts are the worst traders). The grown-up version of "I want to know how the smart money is positioned" is public regulatory data and exchange-published flows. Every serious bank desk, real-money fund, and CTA reads this stuff every Monday morning. It is free, structured, machine-readable, and, unlike Twitter, has produced peer-reviewed evidence of edge.
This document is the reference for sourcing, parsing, interpreting, and combining that data. Target audience: a builder who wants to spin up a positioning + flow pipeline in a week and have an honest backtestable signal at the end of it.
1. CFTC Commitments of Traders (COT) reports
The COT is the single most-cited "real positioning" dataset in macro. It is a weekly snapshot, mandated by the Commodity Exchange Act, of every reportable position in US futures markets. Published Fridays at 15:30 ET, reflecting the prior Tuesday's open interest — a 3-day lag.
1.1 Four report flavors (pick the right one)
The CFTC publishes four parallel reports across the same Tuesday snapshot. Most retail traders default to the Legacy report; that is usually the wrong call.
| Report | Categories | Best for |
|---|---|---|
| Legacy (since 1986) | Commercial / Non-commercial / Non-reportable | Historical backtests needing >20yr data |
| Disaggregated (published since 2009, history back to 2006) | Producer/Merchant, Swap Dealer, Managed Money, Other Reportable | Physical commodities — oil, gold, ags |
| Traders in Financial Futures (TFF) (published since 2010, history back to 2006) | Dealer, Asset Manager, Leveraged Funds, Other Reportable | FX, rates, equity index futures; this is the one for FX |
| CIT (Commodity Index Traders, since 2006) | Adds index-fund category to ags | Niche; tracking commodity index unwinds |
For FX work you want TFF. The reason: Legacy lumps real-money asset managers (long-only, sticky, often the actual flow driver) into the same "non-commercial" bucket as fast-money hedge funds, even though it is the fast-money positioning that actually predicts reversals. TFF separates them: Asset Managers ≈ real money, Leveraged Funds ≈ macro / CTA / fast money. The Leveraged Funds line is the line you trade against at extremes.
CFTC explanatory notes: https://www.cftc.gov/sites/default/files/idc/groups/public/@commitmentsoftraders/documents/file/tfmexplanatorynotes.pdf
1.2 How to actually read it
Reading the COT well comes down to three habits:
- Look at net position, not gross. Net = Longs − Shorts for the category you care about. A category can hold a very large gross long and still have a small net position if its shorts are of similar size.
- Normalize. Raw contract counts mean nothing — open interest grows and shrinks. The two standard normalizations:
  - % of open interest: net / OI. Compresses cycles to roughly ±40%.
  - Z-score / percentile vs trailing 3yr (156 weeks): the standard "extreme positioning" lens.
- Watch the change, not the level. A reversal is a flow event. Speculators going from +60k to +20k net long in two weeks is a flow you can trade with; the same +60k position sitting still is information that's already in price.
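The three habits above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production parser: the function name `normalize_positioning` and the weekly figures are hypothetical, and the z-score uses whatever history is available up to the 156-week window.

```python
from statistics import mean, stdev

def normalize_positioning(longs, shorts, open_interest, window=156):
    """Per-week net position, net as a share of OI, trailing z-score, 2-week change."""
    nets, rows = [], []
    for i, (lg, sh, oi) in enumerate(zip(longs, shorts, open_interest)):
        net = lg - sh
        nets.append(net)
        hist = nets[max(0, i - window + 1): i + 1]  # trailing window, this week included
        sd = stdev(hist) if len(hist) >= 3 else 0.0
        z = (net - mean(hist)) / sd if sd > 0 else None  # None during warm-up
        chg_2w = net - nets[i - 2] if i >= 2 else None   # the flow, not the level
        rows.append({"net": net, "pct_oi": net / oi, "z": z, "chg_2w": chg_2w})
    return rows

# Hypothetical weekly data, in contracts
longs = [120_000, 125_000, 130_000, 110_000, 95_000]
shorts = [60_000, 62_000, 70_000, 80_000, 75_000]
open_interest = [600_000, 610_000, 620_000, 615_000, 600_000]

rows = normalize_positioning(longs, shorts, open_interest)
latest = rows[-1]
print(latest["net"], round(latest["pct_oi"], 4), latest["chg_2w"])
```

In this toy series the level is still net long, but the two-week change is sharply negative, which is the flow event the third habit is looking for.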
The classic interpretation: commercials hedge, specs speculate. Commercials are typically counter-trend (they short into strength because they have inventory to deliver). Non-commercials / Leveraged Funds are trend-following. When Leveraged Funds reach a multi-year extreme, you are by definition late to the move, and the marginal flow needed to extend it has to come from buyers (or sellers) who are not there yet. That's the mean-reversion setup.
1.3 The "extreme positioning" signal
The canonical rule of thumb on sell-side desks:
Leveraged Funds net position in the >80th percentile vs trailing 156-week window → momentum stretched, fade-the-extreme bias on a 4–8 week horizon. <20th percentile → contrarian long bias.
This is not a standalone signal. It is a filter that biases other signals (price exhaustion, divergent macro data, news event risk).
1.4 Documented edge — what the literature actually says
The academic record on COT is mixed and worth being honest about:
- Wang (2001, 2003): Sentiment index from large-trader positions forecasts price continuations (i.e. specs are right on the trend), while large hedger sentiment predicts reversals. Small-trader sentiment is noise. Published in Journal of Futures Markets. https://ideas.repec.org/p/pra/mprapa/36425.html
- Sanders, Irwin & Merrin (2009) "Smart Money? The Forecasting Ability of CFTC Large Traders in Agricultural Futures Markets": Limited unconditional edge from raw positioning. https://ideas.repec.org/a/ags/jlaare/54547.html
- Klitgaard & Weir (NY Fed, 2004) "Exchange Rate Changes and Net Positions of Speculators in the Futures Market": Speculators correctly track 75% of weekly EUR/USD direction contemporaneously — but positions do not forecast future moves on their own. Coincident, not predictive. https://www.newyorkfed.org/medialibrary/media/research/staff_reports/sr262.pdf
Honest read: COT is not a get-rich-quick signal on its own. It is a conditioning variable. The papers that find edge filter COT by extremes, combine it with price, or use it in agricultural markets where the commercial/speculator distinction is cleaner. For FX, the value is regime detection, not standalone alpha.
1.5 FX-specific reading by pair
The five core FX futures contracts, all CME-listed, plus two useful cross-checks:
- EUR (6E) — by far the deepest. Leveraged Funds extremes typically run ±200k contracts net. Watch for divergence vs DXY composite.
- JPY (6J) — most reliable mean-reversion candidate because the carry-trade structure creates persistent crowding. JPY positioning at multi-year shorts (carry crowded) → vol shock = forced cover.
- GBP (6B) — thinner, noisier. Politics overwhelms positioning around elections / BoE events.
- AUD (6A) — China/commodity proxy. Read with copper and iron-ore positioning.
- CHF (6S) — small and dominated by the SNB. Treat with caution.
- MXN, CAD — useful for "risk-off" cross-checks.
1.6 Free sources
| Source | Format | Notes |
|---|---|---|
| CFTC.gov direct | TXT / CSV / Excel | Authoritative, ugly. https://www.cftc.gov/MarketReports/CommitmentsofTraders/index.htm |
| CFTC Public Reporting (Socrata) | JSON API | Machine-friendly. https://publicreporting.cftc.gov |
| Tradingster.com | HTML tables, charts | Best free UI for COT |
| Barchart COT | Charts + history | https://www.barchart.com/futures/commitment-of-traders |
| FinViz futures | Quick scan | Aggregated extremes |
| Saxo Bank weekly note | Commentary | Published Mondays |
| MrTopStep / Hedgopia | Curated extremes | Useful sanity checks |
1.7 Pulling COT programmatically (Socrata)
The CFTC Public Reporting Environment exposes Socrata endpoints — no token required for normal use:
# pip install sodapy pandas
from sodapy import Socrata
import pandas as pd

# TFF Futures Only: the right report for FX
DOMAIN = "publicreporting.cftc.gov"
TFF_FUTURES = "gpe5-46if"
client = Socrata(DOMAIN, None)  # None = anonymous; rate-limited but adequate

# Pull ~5 years so the 156-week rolling window has real history behind it
start = (pd.Timestamp.today() - pd.Timedelta(weeks=260)).strftime("%Y-%m-%d")
results = client.get(
    TFF_FUTURES,
    where="contract_market_name like '%EURO FX%' "
          f"AND report_date_as_yyyy_mm_dd > '{start}'",
    limit=2000,
    order="report_date_as_yyyy_mm_dd DESC",
)
df = pd.DataFrame.from_records(results)
# Caution: the LIKE pattern may also match euro cross-rate contracts. Check
# df["contract_market_name"].unique() and keep a single contract before ranking.
df["report_date"] = pd.to_datetime(df["report_date_as_yyyy_mm_dd"])
df = df.set_index("report_date").sort_index()

# Socrata returns strings; convert the numeric columns
num_cols = [c for c in df.columns if "positions" in c or "open_interest" in c]
df[num_cols] = df[num_cols].apply(pd.to_numeric, errors="coerce")

# Leveraged Funds net position (confirm column names against the dataset's column list)
df["lev_net"] = df["lev_money_positions_long_all"] - df["lev_money_positions_short_all"]

# 156-week percentile rank: the "extreme" lens
df["lev_net_pct_rank"] = df["lev_net"].rolling(156, min_periods=52).rank(pct=True)

# Signal: fade extremes (extreme long = short bias, extreme short = long bias)
df["signal"] = 0
df.loc[df["lev_net_pct_rank"] > 0.85, "signal"] = -1  # short bias
df.loc[df["lev_net_pct_rank"] < 0.15, "signal"] = +1  # long bias
The four endpoints:
- Legacy Futures Only: 6dca-aqww
- Disaggregated Futures Only: 72hh-3qpy
- Disaggregated Combined (futures+options): kh3c-gbw2
- TFF Futures Only: gpe5-46if
Full API foundry: https://dev.socrata.com/foundry/publicreporting.cftc.gov/gpe5-46if
2. ETF flows as positioning proxy
When non-US investors want USD exposure (or the reverse), they often route through ETFs because they don't have a futures account. ETF flows therefore proxy the retail and RIA channel of currency flow — which is precisely the slow, sticky, sentiment-driven money that lags moves and overshoots.
2.1 The currency ETF universe
| ETF | Exposure | Issuer | AUM (approx, mid-2026) | Notes |
|---|---|---|---|---|
| UUP | Long DXY (bullish USD) | Invesco | ~$280M | Most-traded USD ETF |
| UDN | Short DXY (bearish USD) | Invesco | small | Thin but useful for divergence checks |
| FXE | Long EUR vs USD | Invesco CurrencyShares | ~$525M | Deep options market |
| EUO | -2× EUR | ProShares | small | Leveraged — flow noisy |
| ULE | +2× EUR | ProShares | small | |
| FXY | Long JPY vs USD | Invesco CurrencyShares | small | |
| YCS | -2× JPY | ProShares | small | Crowded carry-trade proxy |
| YCL | +2× JPY | ProShares | small | |
| FXB | Long GBP vs USD | Invesco | thin | |
| FXA | Long AUD vs USD | Invesco | thin | |
| GLD | Gold | SPDR | >$70B | Anti-USD proxy; flows enormous |
| IAU | Gold | iShares | >$30B | Lower-fee competitor |
2.2 Where to get flow data
- etf.com — free daily flow tables. https://www.etf.com/etf-flows/daily
- ETFdb.com (VettaFi) — flow histories, free tier. https://etfdb.com
- ETF.com Fund Flows API — paid.
- Bloomberg FLOW — gold standard but $$$.
- Issuer fact sheets — Invesco, iShares, and ProShares post daily NAV and shares outstanding; the day-over-day change in shares outstanding × NAV approximates net flow. Free if you scrape.
- SEC N-PORT filings — monthly portfolio composition with a 60-day lag. Too slow to trade on, but auditable.
2.3 The "smart money vs dumb money" framing
Be skeptical of the framing. Larry Swedroe summarizing the literature: ETF inflows are largely "dumb money" — they chase recent returns, the momentum effect explains the flow, and the marginal buyer's outperformance fades within months. https://larryswedroe.substack.com/p/are-etf-flows-smart-money
The useful version is not "follow ETF flows" but "fade ETF flow extremes." Two patterns worth tracking:
- Flow vs price divergence. Price rising, flows leaving (or vice versa). Means existing holders are distributing to chasers. Classic late-stage signal. Hartzmark & Sussman (2019) and the broader fund-flow literature support that retail-driven flows tend to be performance-chasing rather than informationally efficient. https://onlinelibrary.wiley.com/doi/10.1111/jofi.12841
- Flow magnitude z-score. Same percentile-rank trick as COT. Flows >95th percentile = capitulation buying. Pair with a price-momentum exhaustion read.
2.4 Pulling ETF flows
For a free pipeline, the cleanest approach:
# Sketch: estimate creation/redemption flow for FXE from issuer-published data
# Shares-outstanding change × prior NAV ≈ creation/redemption flow
import pandas as pd

# Issuer page to scrape; in practice pull from ETFdb's JSON or SEC's N-PORT XML
URL = "https://www.invesco.com/us/financial-products/etfs/holdings?audienceType=Investor&ticker=FXE"

def estimate_flows(df: pd.DataFrame, window: int = 252) -> pd.DataFrame:
    """df: one row per trading day, oldest first, with columns shares_out and nav."""
    out = df.copy()
    out["mkt_cap"] = out["shares_out"] * out["nav"]
    out["flow"] = out["shares_out"].diff() * out["nav"].shift(1)  # creation = inflow
    rolling = out["flow"].rolling(window)
    out["flow_z"] = (out["flow"] - rolling.mean()) / rolling.std()
    return out
3. SEC 13F filings — institutional positions
Form 13F is required of every institutional manager with >$100M in 13(f) securities. Filed quarterly within 45 days of quarter-end.
3.1 What 13F actually contains
- US-listed long equity positions, including ETFs and ADRs.
- Long calls and puts (notional).
- Convertible debt.
3.2 What it does NOT contain — critical for FX
- Short positions.
- Cash.
- Currency forwards, FX swaps, NDFs, cross-currency basis swaps.
- Most fixed income.
- Private investments.
- Non-US-listed securities.
This is why "tracking Bridgewater's FX book via 13F" is largely a myth. You see their equity book — which is often a small part of their actual exposure. For pure FX-positioning intelligence, 13F is not the right tool.
3.3 What 13F IS useful for
- Tracking long positions in FX ETFs (UUP, FXE, FXY, GLD). When a real-money fund builds a $100M FXE position, it shows up. This is the legitimate "what are the funds doing in currencies" signal that 13F supports.
- Identifying macro funds with disclosed equity-proxy currency bets — gold miners (GDX), EM ETFs (EEM), Japan ETFs (EWJ, DXJ — the latter is currency-hedged), Europe ETFs (EZU, HEDJ).
- Tracking the option positions disclosed — e.g. macro funds frequently disclose large FXE put or FXY call positions which signal direction even if delta is uncertain.
3.4 Free sources
- SEC EDGAR full-text search: https://efts.sec.gov/LATEST/search-index?q=%22FXE%22&forms=13F-HR
- WhaleWisdom: https://whalewisdom.com — free tier, queryable.
- Stockcircle / HedgeFollow / Dataroma — free aggregators.
- 13F.info — clean comparisons.
3.5 Pulling 13F programmatically
import requests

# Find the latest 13F-HR filings for a given CIK via the EDGAR submissions API
cik = "0001350694"  # example: Bridgewater
url = f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"
headers = {"User-Agent": "Your Name your@email.com"}  # SEC requires a contact User-Agent
resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()
filings = resp.json()["filings"]["recent"]

# Keep the four most recent 13F-HR filings as (filing_date, accession_number)
recent_13f = [
    (d, a)
    for d, a, f in zip(filings["filingDate"], filings["accessionNumber"], filings["form"])
    if f == "13F-HR"
][:4]
# Each filing has an information table XML: parse holdings
# Look for nameOfIssuer matching "INVESCO CURRENCYSHARES" etc
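The holdings-parsing step mentioned in the comments can be sketched with the standard library alone. Assumptions to flag: the namespace URI and element names below follow the 13F information table layout as I understand it, the sample XML (including the placeholder CUSIPs and amounts) is invented for illustration, and the helper name `parse_info_table` is hypothetical. Check the units of `value` for the filing period you are reading, since they have not been constant across filing vintages.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the 13F information table schema
NS = {"it": "http://www.sec.gov/edgar/document/thirteenf/informationtable"}

# Invented two-row sample; real filings carry more fields per row
SAMPLE_XML = """<informationTable xmlns="http://www.sec.gov/edgar/document/thirteenf/informationtable">
  <infoTable>
    <nameOfIssuer>INVESCO CURRENCYSHARES EURO TR</nameOfIssuer>
    <titleOfClass>EURO SHS</titleOfClass>
    <cusip>000000000</cusip>
    <value>1000000</value>
    <shrsOrPrnAmt><sshPrnamt>10000</sshPrnamt><sshPrnamtType>SH</sshPrnamtType></shrsOrPrnAmt>
    <putCall>Put</putCall>
  </infoTable>
  <infoTable>
    <nameOfIssuer>SPDR GOLD TR</nameOfIssuer>
    <titleOfClass>GOLD SHS</titleOfClass>
    <cusip>111111111</cusip>
    <value>5000000</value>
    <shrsOrPrnAmt><sshPrnamt>25000</sshPrnamt><sshPrnamtType>SH</sshPrnamtType></shrsOrPrnAmt>
  </infoTable>
</informationTable>"""

def parse_info_table(xml_text, issuer_substring):
    """Return holdings whose issuer name contains issuer_substring (case-insensitive)."""
    root = ET.fromstring(xml_text)
    hits = []
    for row in root.findall("it:infoTable", NS):
        name = row.findtext("it:nameOfIssuer", default="", namespaces=NS)
        if issuer_substring.upper() not in name.upper():
            continue
        hits.append({
            "issuer": name,
            "value": int(row.findtext("it:value", default="0", namespaces=NS)),
            "shares": int(row.findtext("it:shrsOrPrnAmt/it:sshPrnamt", default="0", namespaces=NS)),
            "put_call": row.findtext("it:putCall", default=None, namespaces=NS),  # None = outright shares
        })
    return hits

fx_holdings = parse_info_table(SAMPLE_XML, "CURRENCYSHARES")
gold_holdings = parse_info_table(SAMPLE_XML, "GOLD")
print(fx_holdings)
```

Keeping `put_call` as a separate field matters for the use case in 3.3: a put position and an outright long in the same ETF point in opposite directions.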
4. Options market positioning
The options market is structurally more informative than the spot market for one reason: option prices encode a probability distribution, not just a point estimate. Skew, smile, and risk-reversal pricing reveal what hedgers and speculators are scrambling to buy.
4.1 Put/call ratio for currency ETFs
CBOE publishes daily put/call ratios for individual symbols including the major FX ETFs. The vanilla interpretation: extreme readings are contrarian. Total market P/C > 1.2 → bearish capitulation. < 0.7 → bullish complacency.
- CBOE daily statistics: https://www.cboe.com/us/options/market_statistics/daily/
- CBOE historical options data: https://www.cboe.com/us/options/market_statistics/historical_data/
- Barchart per-symbol P/C history: e.g. https://www.barchart.com/etfs-funds/quotes/FXE/options
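As a minimal sketch of the contrarian reading, the function below classifies a put/call ratio against the 1.2 and 0.7 thresholds quoted above. Those thresholds are stated for the total market, so treating them as defaults for a single ETF is an assumption; per-symbol cutoffs would need their own calibration. The function name and the volumes are hypothetical.

```python
def put_call_signal(put_volume, call_volume, high=1.2, low=0.7):
    """Classify a put/call volume ratio; extremes are read as contrarian."""
    if call_volume <= 0:
        raise ValueError("call volume must be positive")
    ratio = put_volume / call_volume
    if ratio > high:
        return ratio, "put-heavy extreme (contrarian bullish)"
    if ratio < low:
        return ratio, "call-heavy extreme (contrarian bearish)"
    return ratio, "neutral"

# Hypothetical daily volumes
print(put_call_signal(1300, 1000))
print(put_call_signal(600, 1000))
print(put_call_signal(900, 1000))
```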
4.2 Implied volatility skew
For a given expiry, plot IV against strike (or delta). The shape of that curve is the skew. In FX:
- Symmetric smile = market expects two-tailed move.
- Skewed to OTM puts = downside hedging demand. Real money is paying up for protection.
- Skewed to OTM calls = speculative upside chasing.
A change in skew often precedes a change in spot. The classic finding is that skew carries predictive information for short-horizon (roughly 1-week) returns, especially during stress regimes; see Doran, Peterson & Tarrant on the information content of the volatility skew in index options, and related work extending the idea to foreign exchange. The signal is strongest when negative news hits the market.
4.3 25-delta risk reversal
The 25-delta risk reversal (RR25) is the standard FX-skew benchmark on every interbank dealer screen:
RR25 = IV(25Δ call) − IV(25Δ put)
A positive RR25 means calls are bid (the market is paying for upside). Negative = puts bid (downside protection demand).
For each major FX pair, RR25 oscillates with risk sentiment. Persistent negative RR25 on EUR/USD means EUR puts stay bid: real money is paying for protection against EUR weakness (USD strength). A sharp swing in RR25 without a corresponding spot move is the early warning sign.
The 25-delta risk reversal is the cleanest single-number summary of FX options positioning. https://flyonthewall.ai/25-delta-risk-reversal/
4.4 Where to access
| Source | Coverage | Cost |
|---|---|---|
| CBOE DataShop | Historical options chains | Paid |
| ORATS | Implied vols, skew time series | Paid, reasonable |
| MarketChameleon | Options analytics for ETFs | Free tier exists |
| Bloomberg OVDV | FX vol surface | $$$$ |
| Refinitiv FXVL | Same | $$$$ |
| CME DataMine | Settlement vols | Free historical |
For a free pipeline, the realistic move: scrape end-of-day option chains for FXE, FXY, UUP, GLD from Yahoo or CBOE, compute skew yourself.
import yfinance as yf

tkr = yf.Ticker("FXE")
expiry = tkr.options[0]  # nearest listed expiry; pick one ~30 days out for a 1-month read
chain = tkr.option_chain(expiry)
calls, puts = chain.calls, chain.puts
spot = tkr.history(period="1d")["Close"].iloc[-1]

# Approximate 25-delta strikes; a full implementation needs a proper delta solve
call_25d_strike = spot * 1.02  # rough proxy; replace with proper delta solve
put_25d_strike = spot * 0.98

# Take the listed strike nearest each proxy and read its implied vol
iv_call = calls.loc[(calls.strike - call_25d_strike).abs().idxmin(), "impliedVolatility"]
iv_put = puts.loc[(puts.strike - put_25d_strike).abs().idxmin(), "impliedVolatility"]
rr25 = iv_call - iv_put
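One way to replace the ±2% strike proxy is to compute a Black-Scholes delta for every quoted strike and interpolate implied vol at the 25-delta point. The sketch below does that under loud assumptions: zero rates and no dividends by default, linear interpolation in delta space, and an invented option chain. The helper names (`bs_delta`, `iv_at_delta`) are hypothetical, not part of any library.

```python
from math import erf, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_delta(spot, strike, iv, t_years, rate=0.0, is_call=True):
    """Black-Scholes delta, ignoring dividends (an assumption)."""
    d1 = (log(spot / strike) + (rate + 0.5 * iv * iv) * t_years) / (iv * sqrt(t_years))
    return norm_cdf(d1) if is_call else norm_cdf(d1) - 1.0

def iv_at_delta(quotes, spot, t_years, target, is_call, rate=0.0):
    """quotes: list of (strike, iv). Linearly interpolate IV at the target delta."""
    pts = sorted((bs_delta(spot, k, iv, t_years, rate, is_call), iv) for k, iv in quotes)
    for (d_lo, iv_lo), (d_hi, iv_hi) in zip(pts, pts[1:]):
        if d_lo <= target <= d_hi:
            w = (target - d_lo) / (d_hi - d_lo) if d_hi > d_lo else 0.0
            return iv_lo + w * (iv_hi - iv_lo)
    raise ValueError("target delta not bracketed by the quoted strikes")

# Invented chain: spot 100, 30 days to expiry, put wing richer than call wing
spot, t_years = 100.0, 30 / 365
call_quotes = [(100, 0.080), (101, 0.081), (102, 0.083), (103, 0.086), (104, 0.090)]
put_quotes = [(100, 0.080), (99, 0.084), (98, 0.089), (97, 0.095), (96, 0.100)]

iv_call_25 = iv_at_delta(call_quotes, spot, t_years, 0.25, is_call=True)
iv_put_25 = iv_at_delta(put_quotes, spot, t_years, -0.25, is_call=False)
rr25 = iv_call_25 - iv_put_25
print(round(rr25, 4))
```

With the put wing priced richer than the call wing, RR25 comes out negative, matching the "puts bid" reading in 4.3.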
5. Real-money order-flow indicators
These are not "positioning" in the COT sense; they are structural plumbing indicators that move because real money has to do something. They are far harder to fake and far less crowded as signals.
5.1 DXY composition vs Fed trade-weighted
There are two dollar indexes and they are not the same:
- ICE DXY — the futures-tradable DXY. Six fixed weights, all developed-market: EUR 57.6%, JPY 13.6%, GBP 11.9%, CAD 9.1%, SEK 4.2%, CHF 3.6%. Launched in 1973; the basket has been revised only once, when the euro replaced its legacy currencies in 1999. Doesn't include CNY or MXN.
- Fed Broad Dollar Index (DTWEXBGS) — annually re-weighted by bilateral trade volume, includes EM currencies, more economically meaningful. https://fred.stlouisfed.org/series/DTWEXBGS
When the Fed Broad is rising and ICE DXY is flat, emerging-market currencies are weakening while EUR, JPY, and GBP are stable. This divergence is itself a flow signal: USD strength is being driven by EM.
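To make the fixed-weight point concrete, the sketch below rebuilds a DXY level from six spot rates using the standard geometric formula (the scaling constant 50.14348112 and the sign of each exponent follow the published index methodology as I recall it; the weights are the ones listed above). It then measures the divergence against a broad index as a simple difference in percentage changes. The spot rates and index levels are invented, and `index_divergence` is a hypothetical helper.

```python
DXY_CONSTANT = 50.14348112
# Negative exponent where USD is the quote currency (EURUSD, GBPUSD)
DXY_EXPONENTS = {
    "EURUSD": -0.576, "USDJPY": 0.136, "GBPUSD": -0.119,
    "USDCAD": 0.091, "USDSEK": 0.042, "USDCHF": 0.036,
}

def dxy_from_spot(rates):
    """Geometric weighted average of six spot rates, scaled to the index base."""
    level = DXY_CONSTANT
    for pair, exponent in DXY_EXPONENTS.items():
        level *= rates[pair] ** exponent
    return level

def index_divergence(broad_start, broad_end, dxy_start, dxy_end):
    """Broad-index % change minus DXY % change over the same window."""
    return (broad_end / broad_start - 1.0) - (dxy_end / dxy_start - 1.0)

# Invented spot rates
rates = {"EURUSD": 1.08, "USDJPY": 150.0, "GBPUSD": 1.27,
         "USDCAD": 1.36, "USDSEK": 10.5, "USDCHF": 0.88}
dxy_level = dxy_from_spot(rates)

# Invented window: broad index up 2.5%, DXY roughly flat
gap = index_divergence(120.0, 123.0, 104.0, 104.2)
print(round(dxy_level, 2), round(gap, 4))
```

A persistently positive gap is the pattern described above: the dollar is strengthening mainly against currencies that DXY does not contain.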
5.2 Money market spreads
Real-time indicators of short-term USD funding stress:
- SOFR (FRED: SOFR) — replaces LIBOR, secured overnight.
- OIS (overnight index swap) — derived from Fed Funds expectations.
- SOFR–OIS spread — credit/funding risk component.
- EFFR–IORB — interbank stress proxy.
- Commercial paper (CP) – OIS spread — corporate USD funding.
The post-LIBOR equivalents of the old "TED spread" still drive FX. When USD funding gets expensive, foreign holders unwind hedges → USD demand → spot moves.
5.3 Cross-currency basis swaps
The cross-currency basis is the smoking gun for USD funding stress:
Basis = (USD interest rate from cash market) − (USD interest rate implied by FX swap)
A negative basis means it is more expensive to borrow USD via FX swaps than cash. This signals foreign demand for USD funding is outstripping supply.
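A worked example helps pin down the sign. The sketch below backs out the USD rate implied by an FX swap from spot, forward, and the foreign cash rate, then reports the basis in basis points using the convention in which a negative number means the swap route is the more expensive way to raise dollars. Simple (non-compounded) interest and the sample quotes are assumptions, and the function names are hypothetical.

```python
def implied_usd_rate(spot, forward, foreign_rate, t_years):
    """USD rate implied by borrowing foreign currency and swapping it into USD.

    spot and forward are quoted as USD per unit of foreign currency.
    """
    return ((forward / spot) * (1.0 + foreign_rate * t_years) - 1.0) / t_years

def basis_bp(usd_cash_rate, spot, forward, foreign_rate, t_years):
    """Cash USD rate minus swap-implied USD rate, in basis points."""
    return (usd_cash_rate - implied_usd_rate(spot, forward, foreign_rate, t_years)) * 1e4

# Invented 3-month EUR/USD quotes
spot, t_years = 1.1000, 0.25
usd_cash, eur_cash = 0.05, 0.03

# Forward consistent with covered interest parity: basis should be ~0
cip_forward = spot * (1.0 + usd_cash * t_years) / (1.0 + eur_cash * t_years)
zero_basis = basis_bp(usd_cash, spot, cip_forward, eur_cash, t_years)

# Observed forward above the parity level: dollars via the swap cost more
stressed_basis = basis_bp(usd_cash, spot, 1.1060, eur_cash, t_years)
print(round(zero_basis, 6), round(stressed_basis, 2))
```

In the stressed case the swap-implied dollar rate is about 5.20% against a 5.00% cash rate, so the basis is roughly −20bp.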
Documented behavior — see BIS Quarterly Review "Understanding the Cross-Currency Basis" (Borio et al, 2016) https://www.bis.org/publ/qtrpdf/r_qt1609e.pdf and Sushko et al "What Drives the Cross-Currency Basis" https://www.bis.org/resman_march2017/sushko.pdf:
- Pre-2008: basis ≈ 0.
- Post-2008: persistent negative basis on EUR/USD, JPY/USD, reflecting structural USD shortage among non-US banks.
- March 2020 COVID shock: basis blew out → Fed activated swap lines → basis normalized within days.
- Quarter-end and year-end: predictable basis widening as banks shrink balance sheets.
This is one of the few FX signals where you can write a one-line trading rule that has worked for 15 years: basis blows wide → fade USD strength on a 2-4 week horizon, conditional on Fed not letting it persist.
Series IDs to watch (Bloomberg-style; FRED has partial coverage):
- EUR-USD 3M basis (BBG: EUR003M Curncy basis, ~negative typically)
- JPY-USD 3M basis (most negative structurally)
- BIS publishes quarterly summary tables: https://www.bis.org/statistics/derstats.htm
6. Sentiment indicators that aren't Twitter
The criticism of Twitter scraping is not "sentiment doesn't matter." It is "the population sampled by Twitter is unrepresentative, the data is unstructured, the signal is dominated by survivorship, and there are several better, audited, structured surveys you could use instead."
6.1 The surveys worth tracking
| Survey | Frequency | Sample | Best read as |
|---|---|---|---|
| AAII Investor Sentiment | Weekly (Thu) | US retail | Contrarian at extremes (>1σ) |
| BofA Global Fund Manager Survey | Monthly | ~200 institutions | Crowded-trade detector |
| Investors Intelligence Bull/Bear | Weekly | Newsletter writers | Contrarian, smoother than AAII |
| NAAIM Exposure Index | Weekly | Active managers' net exposure | Positioning, not sentiment |
| Sentix | Weekly | 5,500+, mostly German-speaking | EU-centric, EUR-relevant |
| Daily Sentiment Index (DSI) | Daily | Retail futures traders | Daily contrarian on futures (paid) |
| TD Ameritrade IMX | Monthly | TDA brokerage clients | Retail US equity positioning |
6.2 What the literature supports
AAII as a contrarian indicator: When bullish sentiment exceeds the historical mean by >1σ, subsequent 4-week S&P returns are below average. Conversely, extreme bearish readings (more than 1σ below the mean) have preceded above-average 6- and 12-month gains. The effect is asymmetric: bearish extremes are more reliable contrarian buys than bullish extremes are sells. https://www.aaii.com/journal/article/feature-investor-sentiment-as-a-contrarian-indicator
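The ±1σ rule can be sketched as a comparison of the latest reading against its own history. The numbers below are invented bullish-percentage readings, not AAII data, and `classify_reading` is a hypothetical helper; a real version would use the full survey history.

```python
from statistics import mean, stdev

def classify_reading(history, latest, threshold=1.0):
    """Z-score the latest reading against prior history and label extremes."""
    mu, sigma = mean(history), stdev(history)
    z = (latest - mu) / sigma
    if z > threshold:
        return z, "bullish extreme: contrarian caution"
    if z < -threshold:
        return z, "bearish extreme: contrarian buy bias"
    return z, "neutral"

# Invented history of weekly bullish % readings
history = [38, 35, 41, 30, 44, 37, 33, 40, 36, 39]

print(classify_reading(history, 52))
print(classify_reading(history, 25))
print(classify_reading(history, 38))
```

Only prior readings go into the mean and standard deviation, so the label for the latest week does not peek at itself.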
BofA FMS as crowded-trade detector: The "most crowded trade" question in the monthly survey has flagged: long FAANG (Q4 2021, preceded 2022 selloff), long USD (late 2022, preceded reversal), long gold (Q1 2026). Direct quote from a recent survey: "Just 4% of fund managers see a hard landing" — peak optimism. https://atranicapital.substack.com/p/february-2026-bank-of-america-global
DSI: Bernstein's claim — and the rough academic confirmation — is that readings >85 or <15 mark turning points in futures markets. The data isn't free (~$300/yr for the daily product). https://www.thestreet.com/dictionary/daily-sentiment-index-dsi
6.3 Why these beat Twitter
- Sampled, not self-selected. A random sample of fund managers ≠ the loudest accounts on Twitter.
- Structured questions, longitudinal. You can compare today's reading to 2008, 2011, 2020. You cannot do that with tweet sentiment because the underlying population, platform, and norms have changed.
- No legal grey area. AAII data is published. BofA FMS results are reported by Reuters and Bloomberg.
- Backtestable. Every one of these has decades of history. Twitter scrape has noisy, biased, recent history at best.
- Cited by the desks you'd be trading against. The flows you want to anticipate are driven by people reading BofA FMS, not by people reading Crypto Twitter.
6.4 Free sources
- AAII: https://www.aaii.com/sentimentsurvey
- Investors Intelligence: paid, ~$300/yr.
- NAAIM: https://www.naaim.org/programs/naaim-exposure-index/
- Sentix: https://www.sentix.de (free EZ headline, paid detail)
- BofA FMS: not directly published; covered by Reuters and ZeroHedge with 1-day lag.
7. Implementation — building the pipeline
A pragmatic free-tier ingest. SQLite is enough; resist the urge to over-engineer.
7.1 Data inventory
| Source | Update | Free tier | Best API |
|---|---|---|---|
| COT (CFTC) | Weekly Fri 15:30 ET | Yes | Socrata publicreporting.cftc.gov |
| FRED (broad dollar index, SOFR, EFFR, IORB) | Daily | Yes | fredapi Python lib, free key |
| ETF flows (issuer scrape) | Daily | Yes | requests + lxml; or N-PORT |
| 13F | Quarterly (45d lag) | Yes | SEC EDGAR JSON |
| Option chains (FXE/FXY/UUP) | Daily | Yes | yfinance or CBOE EOD |
| AAII | Weekly Thu | Yes | Direct CSV download |
| BofA FMS | Monthly | Indirect | Reuters/news scrape |
| Cross-currency basis | Daily | Limited | BIS quarterly + manual EOD |
7.2 Storage
-- SQLite schema
CREATE TABLE cot (
report_date DATE,
market TEXT,
report_type TEXT, -- 'legacy' | 'tff' | 'disagg'
category TEXT, -- 'lev_funds' | 'asset_mgr' | 'commercial' | ...
net_position INTEGER,
open_interest INTEGER,
pct_oi REAL,
PRIMARY KEY (report_date, market, report_type, category)
);
CREATE TABLE etf_flows (
trade_date DATE,
ticker TEXT,
shares_out INTEGER,
nav REAL,
est_flow REAL,
PRIMARY KEY (trade_date, ticker)
);
CREATE TABLE skew (
trade_date DATE,
ticker TEXT,
expiry_days INTEGER,
rr25 REAL,
atm_iv REAL,
PRIMARY KEY (trade_date, ticker, expiry_days)
);
CREATE TABLE sentiment (
observation_date DATE,
survey TEXT, -- 'aaii' | 'bofa_fms' | 'naaim' | 'sentix'
series TEXT, -- 'bullish_pct', 'bearish_pct', etc
value REAL,
PRIMARY KEY (observation_date, survey, series)
);
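A short sketch of how the cot table might be loaded from Python with the standard sqlite3 module. The composite primary key makes `INSERT OR REPLACE` behave as an idempotent upsert, so re-running the weekly ingest does not duplicate rows. The helper name `upsert_cot` and the sample rows are hypothetical.

```python
import sqlite3

COT_SCHEMA = """
CREATE TABLE IF NOT EXISTS cot (
    report_date DATE,
    market TEXT,
    report_type TEXT,
    category TEXT,
    net_position INTEGER,
    open_interest INTEGER,
    pct_oi REAL,
    PRIMARY KEY (report_date, market, report_type, category)
)
"""

def upsert_cot(conn, rows):
    """rows: (report_date, market, report_type, category, net_position, open_interest)."""
    payload = [(*r, r[4] / r[5]) for r in rows]  # append pct_oi = net / OI
    conn.executemany("INSERT OR REPLACE INTO cot VALUES (?, ?, ?, ?, ?, ?, ?)", payload)
    conn.commit()

conn = sqlite3.connect(":memory:")  # swap for a file path in a real pipeline
conn.execute(COT_SCHEMA)

# Invented rows: the second call simulates a re-run with a revised figure
upsert_cot(conn, [("2024-01-02", "EUR", "tff", "lev_funds", 20_000, 600_000)])
upsert_cot(conn, [("2024-01-02", "EUR", "tff", "lev_funds", 21_000, 600_000)])

count, net = conn.execute("SELECT COUNT(*), MAX(net_position) FROM cot").fetchone()
print(count, net)
```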
7.3 The daily brief
A 5-minute morning routine, generated by a single script:
- COT extremes — any category × market where percentile rank crossed an 85/15 threshold this week.
- ETF flow z-scores — any FX ETF with |z| > 2 over last 5 days.
- Risk reversal shifts — RR25 1-week change > 1.5σ for FXE / FXY / UUP / GLD.
- Funding indicators — SOFR-OIS, cross-currency basis, any move > 5bp from prior week.
- Sentiment extremes — AAII bull-bear spread >1σ; latest BofA FMS crowded trade.
Output: markdown report, max one page, three actionable lines at the top. The discipline is that anything below the top three is research, not signal.
8. Three signals to implement first
Ranked by expected edge × ease of implementation.
8.1 COT Leveraged-Fund Extreme (TFF report)
- Signal: When Leveraged Funds net position in EUR / JPY / GBP / AUD futures (TFF report) exceeds the 85th percentile of its trailing 156-week distribution, lean against the position over a 4-8 week horizon.
- Refinement: Trigger only when the percentile flips from extreme back toward neutral — i.e. the unwind is starting. Avoids the "extreme keeps getting more extreme" trap.
- Expected Sharpe contribution: 0.2–0.4 standalone, higher when filtered by price exhaustion. Wang (2003) shows the edge exists but is conditional.
- Backtest difficulty: Low. Free data, weekly cadence, clean rules. Watch for category renaming in 2010 (TFF launch); use post-2010 data only.
- Honest caveat: Edge is statistical, not deterministic. You will see strings of losers when positioning extends further. Position sizing must reflect that.
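The refinement (trigger on the flip back from the extreme, not on the extreme itself) can be sketched as a crossing detector on the percentile-rank series. The 0.85 / 0.15 thresholds come from the rule above; the function name and the sample ranks are hypothetical.

```python
def unwind_trigger(pct_ranks, hi=0.85, lo=0.15):
    """Per-week signal: -1 when rank falls back through hi, +1 when it rises back through lo."""
    signals = [0]  # no prior week to compare against
    for prev, cur in zip(pct_ranks, pct_ranks[1:]):
        if prev is None or cur is None:      # warm-up weeks without a rank
            signals.append(0)
        elif prev > hi and cur <= hi:        # crowded long starting to unwind
            signals.append(-1)
        elif prev < lo and cur >= lo:        # crowded short starting to unwind
            signals.append(1)
        else:
            signals.append(0)
    return signals

# Invented weekly percentile ranks
ranks = [0.70, 0.88, 0.93, 0.97, 0.84, 0.60, 0.12, 0.08, 0.20]
signals = unwind_trigger(ranks)
print(signals)
```

Note that the three weeks spent above 0.85 produce no signal; the short bias fires only in the week the rank drops back to 0.84.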
8.2 ETF Flow Divergence vs Price
- Signal: For each major FX/macro ETF (UUP, FXE, FXY, GLD), compute (a) 20-day cumulative net flow z-score, (b) 20-day price return z-score. When the two diverge by > 2σ (price up + flows out, or price down + flows in), bias against the price direction over 2-4 weeks.
- Why it works: Flows lagging price = late retail chasing or stale real money distributing. Both have empirical precedent in Hartzmark and Frazzini/Lamont flow literature.
- Expected Sharpe: 0.15–0.30. Lower than COT extreme because ETF channel is a smaller share of FX flow than futures.
- Backtest difficulty: Medium. ETF flow data is partially free (issuer scrape, ETFdb) but historical depth varies. Need ≥5 years for any conclusion.
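A minimal sketch of the divergence test, assuming the two 20-day z-scores have already been computed. The gap threshold of 2σ comes from the rule above; requiring the two z-scores to have opposite signs is my reading of "price up + flows out, or price down + flows in". The function name and inputs are hypothetical.

```python
def divergence_bias(flow_z, price_z, min_gap=2.0):
    """Return -1 / +1 to lean against price when flows disagree by more than min_gap, else 0."""
    gap = price_z - flow_z
    if price_z > 0 > flow_z and gap > min_gap:
        return -1  # price up, flows out: lean against the rally
    if price_z < 0 < flow_z and -gap > min_gap:
        return 1   # price down, flows in: lean against the sell-off
    return 0

# Invented z-score pairs: (flow_z, price_z)
print(divergence_bias(-1.2, 1.1))
print(divergence_bias(1.5, -0.9))
print(divergence_bias(0.5, 1.8))
print(divergence_bias(-0.4, 0.6))
```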
8.3 Risk Reversal Skew Shift
- Signal: For FXE and FXY, compute daily 1-month 25-delta risk reversal. When 5-day change in RR25 exceeds 1.5σ of trailing 1yr volatility without a corresponding spot move >1%, treat as early warning that the spot move is coming in the direction of the skew shift.
- Why it works: Options market is faster than spot for hedging demand. Real-money flow shows up in skew before it shows up in price.
- Expected Sharpe: 0.3–0.5 in stress regimes, near zero in calm regimes. Conditional signal.
- Backtest difficulty: High. Free option chain data (yfinance) is only reliable for recent history and the 25-delta interpolation is approximate. Production version wants ORATS or CBOE EOD data (paid). Plan for $50–200/month if you go production.
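The rule in this section can be sketched as a function over two aligned daily series. Assumptions: σ is taken as the standard deviation of past 5-day RR25 changes over the trailing year, the "corresponding spot move" is measured over the same 5 days, and the series below are synthetic. The function name `skew_shift_alert` is hypothetical.

```python
from statistics import stdev

def skew_shift_alert(rr25, spot, horizon=5, lookback=252, sigma_mult=1.5, max_spot_move=0.01):
    """rr25 and spot are equal-length daily lists, oldest first.

    Returns +1 / -1 for the direction of a skew shift not yet confirmed by spot, else 0.
    """
    changes = [rr25[i] - rr25[i - horizon] for i in range(horizon, len(rr25))]
    if len(changes) < 3:
        return 0
    latest, history = changes[-1], changes[:-1][-lookback:]
    sigma = stdev(history)
    spot_move = spot[-1] / spot[-1 - horizon] - 1.0
    if sigma > 0 and abs(latest) > sigma_mult * sigma and abs(spot_move) <= max_spot_move:
        return 1 if latest > 0 else -1
    return 0

# Synthetic data: a quiet oscillating RR25 followed by one sharp jump
base = [[0.0, 0.1, 0.0, -0.1][i % 4] for i in range(60)]
rr25 = base + [1.0]
flat_spot = [100.0] * 61
moved_spot = [100.0] * 60 + [102.0]

print(skew_shift_alert(rr25, flat_spot))   # skew jumped, spot did not
print(skew_shift_alert(rr25, moved_spot))  # spot already moved 2%
print(skew_shift_alert(base, [100.0] * 60))
```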
8.4 Combining
The cleanest combination is an N-of-3 vote that doubles as a size scalar:
- 0 of 3 → no trade.
- 1 of 3 → research note, no trade.
- 2 of 3 → 0.5× size.
- 3 of 3 → 1.0× size.
The point of three uncorrelated positioning signals is not to "find more trades." It is to gate the trades you already had ideas about. The signals filter; the discretion (or systematic price/momentum overlay) trades.
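The vote table translates directly into a lookup. One thing the table does not specify is what to do when signals disagree in direction; the sketch below nets opposing votes against each other, which is an assumption rather than part of the rule. The function name `combined_size` is hypothetical.

```python
SIZE_BY_VOTES = {0: 0.0, 1: 0.0, 2: 0.5, 3: 1.0}

def combined_size(cot_signal, flow_signal, skew_signal):
    """Each input is -1, 0, or +1. Returns a signed size multiplier."""
    net_votes = cot_signal + flow_signal + skew_signal  # opposing votes cancel (assumption)
    size = SIZE_BY_VOTES[abs(net_votes)]
    if size == 0.0:
        return 0.0  # 0 or 1 net votes: research note at most, no trade
    return size if net_votes > 0 else -size

print(combined_size(1, 1, 1))
print(combined_size(1, 1, 0))
print(combined_size(-1, -1, 0))
print(combined_size(1, 1, -1))
```

Under the netting assumption, two longs and one short collapse to a single net vote and therefore no trade, which keeps the gate conservative.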
Closing doctrine
The reason this document exists is that "scrape Twitter" is a category error. The question "what are real traders doing" has a real answer, and the answer is structured, regulatory, free, and audited. Banks, funds, and central banks read this data Monday morning. You can too.
The grown-up edge over the Twitter-scraper crowd is not better natural language processing on tweets. It is getting structured positioning data into a structured pipeline and turning the discipline of reading it weekly into a habit.
Two final guardrails:
- Positioning is a filter, not a thesis. No single one of these signals will make you money on its own. They make the trades you already had ideas about better-sized and better-timed.
- Honor the lag. COT is 3 days stale by publication. ETF flows are 1 day stale. 13F is 45 days stale. The signals work because of the lag — late participants are still adjusting. Trying to beat the lag with private data is a different game (sell-side desk view of customer flow) and not free.
Appendix: canonical URLs
- CFTC COT main: https://www.cftc.gov/MarketReports/CommitmentsofTraders/index.htm
- CFTC Public Reporting (Socrata home): https://publicreporting.cftc.gov
- TFF explanatory notes: https://www.cftc.gov/sites/default/files/idc/groups/public/@commitmentsoftraders/documents/file/tfmexplanatorynotes.pdf
- Socrata TFF Futures: https://dev.socrata.com/foundry/publicreporting.cftc.gov/gpe5-46if
- Socrata Disaggregated Futures: https://dev.socrata.com/foundry/publicreporting.cftc.gov/72hh-3qpy
- Socrata Legacy Futures: https://dev.socrata.com/foundry/publicreporting.cftc.gov/6dca-aqww
- SEC EDGAR full-text: https://efts.sec.gov/LATEST/search-index
- WhaleWisdom 13F: https://whalewisdom.com
- Fed H.10 (currency weights): https://www.federalreserve.gov/releases/h10/weights/
- FRED Broad Dollar (nominal): https://fred.stlouisfed.org/series/DTWEXBGS
- FRED SOFR: https://fred.stlouisfed.org/series/SOFR
- BIS Cross-Currency Basis QR: https://www.bis.org/publ/qtrpdf/r_qt1609e.pdf
- BIS What Drives the Basis (Sushko et al): https://www.bis.org/resman_march2017/sushko.pdf
- NY Fed Klitgaard & Weir on speculators: https://www.newyorkfed.org/medialibrary/media/research/staff_reports/sr262.pdf
- AAII Sentiment Survey: https://www.aaii.com/sentimentsurvey
- AAII Contrarian Article: https://www.aaii.com/journal/article/feature-investor-sentiment-as-a-contrarian-indicator
- NAAIM Exposure Index: https://www.naaim.org/programs/naaim-exposure-index/
- CBOE Daily Stats: https://www.cboe.com/us/options/market_statistics/daily/
- CBOE Historical Data: https://www.cboe.com/us/options/market_statistics/historical_data/
- ETF.com Flows: https://www.etf.com/etf-flows/daily
- Sentix: https://www.sentix.de
- Wang (2001) on speculator sentiment: https://ideas.repec.org/p/pra/mprapa/36425.html
- Hartzmark & Sussman (2019) JoF: https://onlinelibrary.wiley.com/doi/10.1111/jofi.12841