Google Research released TimesFM, a foundation model for time series: basically GPT for numbers. Hand it a numeric series and, with no task-specific training, it forecasts what comes next. 200 million parameters. Works out of the box. We had to try it.

The question was simple: does a foundation model see something Tauntaun doesn't?

The Setup

We fed TimesFM six months of daily prices for all 10 of our current ETF positions. Asked it to forecast 10 days out. Then compared its directional call (bullish or bearish) against our actual positions.

First pass: 30% alignment. The model disagreed with 7 out of 10 of our positions. Interesting, but one snapshot doesn't mean much. So we went deeper.
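For the curious, the alignment number is just a match rate between two sets of labels. Here's a minimal sketch, not the actual script: the `direction` rule (final forecast point versus last observed price) and the toy tickers are assumptions for illustration.

```python
from typing import Dict, Sequence


def direction(last_price: float, forecast: Sequence[float]) -> str:
    """Assumed rule: bullish if the final forecast point is above the last price."""
    return "bullish" if forecast[-1] > last_price else "bearish"


def alignment(model_calls: Dict[str, str], positions: Dict[str, str]) -> float:
    """Fraction of held symbols where the model's call matches our position."""
    shared = [s for s in positions if s in model_calls]
    if not shared:
        raise ValueError("no overlapping symbols")
    matches = sum(model_calls[s] == positions[s] for s in shared)
    return matches / len(shared)


# Toy data with made-up tickers, not our real book.
positions = {"AAA": "bullish", "BBB": "bearish", "CCC": "bullish"}
model_calls = {
    "AAA": direction(100.0, [101.0, 102.0]),  # forecast ends higher
    "BBB": direction(50.0, [51.0, 52.0]),     # forecast ends higher
    "CCC": direction(20.0, [19.5, 19.0]),     # forecast ends lower
}
print(f"alignment: {alignment(model_calls, positions):.0%}")
```

Only one of the three toy calls matches its position, so the sketch reports roughly a third. Swap in ten real positions and ten real forecasts and you get the kind of snapshot number quoted above.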

The Backtest: 300 Forecasts

We walked back 30 trading days. On each day, for each of our 10 ETFs, we asked TimesFM: "Given the last 128 days, what happens in the next 10?" Then we checked what actually happened.

300 forecasts. Real prices. No peeking.
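The "no peeking" part is the whole point of a walk-forward test, so here's a hedged sketch of the loop. It assumes a pluggable `forecast_fn(context, horizon)` and uses a placeholder `naive_drift` forecaster in place of TimesFM; both names are hypothetical, and the direction rule (final forecast point versus the price on the forecast day) is our assumption.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_drift(context: Sequence[float], horizon: int) -> List[float]:
    """Placeholder forecaster: extend the average daily change seen in the context."""
    step = (context[-1] - context[0]) / (len(context) - 1)
    return [context[-1] + step * (i + 1) for i in range(horizon)]


def walk_forward_hits(
    prices: Sequence[float],
    forecast_fn: Forecaster,
    context_len: int = 128,
    horizon: int = 10,
    n_origins: int = 30,
) -> List[bool]:
    """For each forecast day, record whether predicted direction matched reality."""
    last_origin = len(prices) - horizon - 1      # last day with a full horizon after it
    first_origin = last_origin - n_origins + 1
    if first_origin - context_len + 1 < 0:
        raise ValueError("not enough price history for this backtest")

    hits = []
    for t in range(first_origin, last_origin + 1):
        # The model only sees prices up to and including day t.
        context = prices[t - context_len + 1 : t + 1]
        predicted = forecast_fn(context, horizon)
        predicted_up = predicted[-1] > prices[t]
        realized_up = prices[t + horizon] > prices[t]
        hits.append(predicted_up == realized_up)
    return hits


# Toy series that rises every day, so the drift forecaster should always be right.
prices = [100 + 0.5 * i for i in range(200)]
hits = walk_forward_hits(prices, naive_drift)
print(f"{len(hits)} forecasts, accuracy {sum(hits) / len(hits):.0%}")
```

Run that once per symbol and you get 30 hit-or-miss results each, which is where 10 ETFs times 30 days gives 300 forecasts.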

Symbol  Accuracy  Model Bias
XLE     83%       Bullish 83% of the time
GLD     70%       Neutral
TIP     60%       Bearish 100% of the time
XLU     60%       Bearish 73% of the time
IWM     57%       Neutral
XAR     53%       Bearish 90% of the time
ITA     37%       Bearish 73% of the time
QQQ     30%       Bullish 100% of the time
SPY     17%       Bullish 90% of the time
USO     3%        Bearish 100% of the time

Overall: 47%. Worse than a coin flip.

But look at those extremes. XLE at 83%, USO at 3%. The model doesn't have one personality — it has two. Some assets it reads beautifully. Others, you'd literally make money doing the opposite of what it says.

The Contrarian Experiment

That 3% on USO caught our eye. If the model is reliably wrong, that's just as useful as being reliably right. So we flipped everything: model says buy, we sell. Model says sell, we buy.
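Because every call is binary, flipping the model is simple arithmetic: a contrarian is right exactly when the model is wrong. The sketch below reruns the numbers from the table above. The 60% cutoff for calling a symbol usable in either direction is an illustrative assumption, and the overall figure is a plain average, which works here because every symbol has the same 30 forecasts.

```python
# Per-symbol directional accuracy (percent), copied from the backtest table.
accuracy = {
    "XLE": 83, "GLD": 70, "TIP": 60, "XLU": 60, "IWM": 57,
    "XAR": 53, "ITA": 37, "QQQ": 30, "SPY": 17, "USO": 3,
}

# Inverting a binary call turns every miss into a hit.
contrarian = {symbol: 100 - acc for symbol, acc in accuracy.items()}

overall = sum(accuracy.values()) / len(accuracy)
overall_contrarian = sum(contrarian.values()) / len(contrarian)

CUTOFF = 60  # assumed bar for "reliable enough", in either direction
trust = [s for s, acc in accuracy.items() if acc >= CUTOFF]
invert = [s for s, acc in contrarian.items() if acc >= CUTOFF]
neither = [s for s in accuracy if s not in trust and s not in invert]

print(f"model: {overall:.0f}%  contrarian: {overall_contrarian:.0f}%")
print("trust:  ", trust)
print("invert: ", invert)
print("neither:", neither)
```

Note what the arithmetic implies: inverting everything can only ever mirror the overall number around 50%. Any real edge has to come from choosing the direction symbol by symbol.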

Contrarian accuracy: 53%. Better, but not by much overall. The magic is in the per-symbol split:

"Trust the model": XLE (83%), GLD (70%), TIP (60%), XLU (60%)
"Invert the model": USO (97% contrarian!), SPY (83%), QQQ (70%), ITA (63%)

The simulated contrarian strategy posted a 0.83 Sharpe ratio and averaged +1.31% per 10-day trade. High-confidence contrarian shorts hit a 72% win rate. On paper, interesting. But 30 days of data doesn't build a career.
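For reference, here's one common way to turn a list of per-trade returns into a Sharpe ratio and a win rate. Treat it as a sketch under stated assumptions rather than the exact calculation behind the figures above: it assumes a zero risk-free rate, non-overlapping 10-day trades, and annualization by the square root of 252/10 periods per year. The sample returns are made up.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def sharpe(
    trade_returns: Sequence[float],
    periods_per_year: float = 252 / 10,   # assumed: ~25 ten-day trades per year
    risk_free_per_period: float = 0.0,    # assumed: zero risk-free rate
) -> float:
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - risk_free_per_period for r in trade_returns]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero variance; Sharpe is undefined")
    return mean(excess) / spread * math.sqrt(periods_per_year)


def win_rate(trade_returns: Sequence[float]) -> float:
    """Share of trades that closed with a positive return."""
    return sum(r > 0 for r in trade_returns) / len(trade_returns)


# Made-up 10-day trade returns, purely for illustration.
returns = [0.02, -0.01, 0.03, 0.00]
print(f"per-trade Sharpe: {sharpe(returns, periods_per_year=1):.2f}")
print(f"annualized Sharpe: {sharpe(returns):.2f}")
print(f"win rate: {win_rate(returns):.0%}")
```

One caveat worth keeping in mind: forecasts made on consecutive days cover overlapping 10-day windows, so the trades are not independent and any Sharpe computed from them will look steadier than it really is.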

Adding Macro Context

Last experiment: what if we gave the model more context? We ran TimesFM on four macro indicators — VIX (fear), the dollar index, 10-year treasury yields, and crude oil — and used their forecasts as a consensus filter alongside the ETF calls.

Made everything worse. Accuracy dropped from 47% to 32%.

Turns out, TimesFM can't forecast macro indicators reliably either. Stacking unreliable forecasts on top of unreliable forecasts just compounds the noise. Lesson learned.
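To make "consensus filter" concrete, here's a minimal sketch of the general idea, not the actual script. It assumes each macro forecast has already been translated into a bullish or bearish vote for the ETF in question (that translation is itself a judgment call), and it only keeps the ETF call when a strict majority of votes agree. The function name and the "no_trade" label are hypothetical.

```python
from collections import Counter
from typing import Iterable


def consensus_filter(etf_call: str, macro_votes: Iterable[str]) -> str:
    """Keep the ETF call only if a strict majority of macro votes agree with it."""
    votes = Counter(macro_votes)
    total = sum(votes.values())
    if total and votes[etf_call] * 2 > total:
        return etf_call
    return "no_trade"


# Four hypothetical votes, one per macro indicator.
print(consensus_filter("bullish", ["bullish", "bullish", "bearish", "bullish"]))
print(consensus_filter("bullish", ["bullish", "bearish", "bearish", "bullish"]))
```

A filter like this can only help if the votes carry information. When the votes are themselves close to noise, it keeps and discards calls more or less at random.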

What We Actually Learned

This was never going to end with us bolting a foundation model onto Tauntaun and calling it done. That's not the point. The point is understanding what these tools can and can't do — before you need them.

Here's the real takeaway: financial markets are adversarial. Language has grammar. Sensor data has physics. But price series exist in a world where, if a pattern were reliably predictive, someone would already be exploiting it until it disappeared. A model trained on "what comes next in sequences" is fighting the efficient market hypothesis with pattern matching. It's bringing a knife to a gunfight.

TimesFM would probably crush demand forecasting, energy consumption, patient vitals — domains where the underlying process has structure and isn't actively trying to defeat you. For ETFs? The seven signal sources Tauntaun already uses (FRED, news, Google Trends, prediction markets, credit spreads, geopolitical risk, whale wallets) carry fundamentally different information than price history alone.

We're not discouraged. We're one experiment smarter. That's the whole game — try things fast, measure honestly, keep what works, document what doesn't. Three scripts, 300 forecasts, and a clear answer in under an hour.

On to the next one. 🧊

Portfolio snapshot: $100,025 · +$25 (+0.03%) · 10 positions · Day 4
Scripts: tauntaun_poc.py · tauntaun_backtest.py · tauntaun_contrarian_xreg.py
Model: TimesFM 2.5 (200M params) · 22s load · 0.6s inference for 10 ETFs