Yesterday we ran TimesFM against 10 US ETFs and published the results. 47% directional accuracy against actual market outcomes. Worse than a coin flip. We thought the chapter was closed. Then a reader suggested something we hadn't considered: what about the markets that trade while ours are closed?
The idea was simple — track indices from major exchanges that finish their trading day before the US opens, and use that overnight sentiment to adjust risk. If Asian markets are all down, tighten your stops. If they're up, maybe give positions more room to breathe.
It stopped us cold. Not because it was complicated, but because it was so obviously right that we should have thought of it first.
Here's the thing: the Nikkei closes at 1:00 AM Eastern. Shanghai at 3:00 AM. Hong Kong at 4:00 AM. By the time the NYSE opens at 9:30 AM, entire trading sessions have already finished on the other side of the planet. That's not a prediction. That's information. And we weren't using it.
What We Built
We pulled a year of daily closes from 11 international indices and aligned them to US trading days. The key decision: Asian indices get same-day data (their session is complete before our open), European indices get prior-day data (they're mid-session when we open, so we use yesterday's close to be conservative).
| Region | Index | Closes Before US Open? | Data Treatment |
|---|---|---|---|
| Japan | Nikkei 225 | Yes (1:00 AM ET) | Same-day |
| China | Shanghai / Shenzhen | Yes (3:00 AM ET) | Same-day |
| Hong Kong | Hang Seng | Yes (4:00 AM ET) | Same-day |
| South Korea | KOSPI | Yes (2:00 AM ET) | Same-day |
| Australia | ASX 200 | Yes (2:00 AM ET) | Same-day |
| UK | FTSE 100 | Mid-session | Prior-day |
| Germany | DAX | Mid-session | Prior-day |
| France | CAC 40 | Mid-session | Prior-day |
| Europe | Euro Stoxx 50 | Mid-session | Prior-day |
| India | SENSEX | Overlaps EU | Prior-day |
250 US trading days. 100% data coverage on every index. Forward-filled across holiday gaps. The dataset came out clean.
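A rough sketch of that alignment, for anyone who wants to reproduce it. This is not the contents of global_indices_data.py. The function name `align_to_us_days`, the `prior_day` flag, and the dict-of-closes input shape are all assumptions made for illustration.

```python
from datetime import date

def align_to_us_days(index_closes, us_days, prior_day=False):
    """Map each US trading day to the latest foreign close known before the US open.

    index_closes: {date: close} for one foreign index (hypothetical input shape).
    us_days: sorted list of US trading dates.
    prior_day: False for Asian indices (same-day close is already final),
               True for European indices (only closes dated before the US day count).
    """
    foreign_days = sorted(index_closes)
    aligned = {}
    i = 0              # pointer into foreign_days
    last_close = None  # stays None until the first usable close appears
    for us_day in us_days:
        # Consume every foreign close that is usable for this US day.
        while i < len(foreign_days) and (
            foreign_days[i] < us_day
            or (not prior_day and foreign_days[i] == us_day)
        ):
            last_close = index_closes[foreign_days[i]]
            i += 1
        # Forward-fill: on a foreign holiday we simply reuse the most recent close.
        aligned[us_day] = last_close
    return aligned
```

With a toy series that has a holiday gap, the same-day treatment carries the last close across the gap, while the prior-day treatment lags it by one session.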
The First Signal: It's Already in the Averages
Before any model, before any rules, we just looked at what happens to SPY on days when Asian markets had a bad night.
The result: a 65 basis point spread in SPY's average return between good and bad Asian nights. Over 250 days, Asian sentiment was below -1% on 24 days and above +1% on 36 days. Those are small samples, but the gap doesn't look like noise. The question is whether you can trade it.
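The bucketing behind that comparison can be sketched as below. One assumption to flag: we treat "Asian sentiment" as a single daily percentage (for example, the mean same-day return across the Asian indices), since the exact definition isn't spelled out here. The function names and the toy numbers in the usage note are illustrative, not our results.

```python
from statistics import mean

def bucket_spy_returns(asia_sentiment, spy_returns, threshold=1.0):
    """Group SPY daily returns (in %) by overnight Asian sentiment (in %)."""
    buckets = {"bad": [], "neutral": [], "good": []}
    for sentiment, spy_ret in zip(asia_sentiment, spy_returns):
        if sentiment < -threshold:
            buckets["bad"].append(spy_ret)
        elif sentiment > threshold:
            buckets["good"].append(spy_ret)
        else:
            buckets["neutral"].append(spy_ret)
    return buckets

def spread_bps(buckets):
    """Average SPY return after good nights minus bad nights, in basis points."""
    # Returns are in percent, and 1 percentage point = 100 basis points.
    return (mean(buckets["good"]) - mean(buckets["bad"])) * 100
```

Calling `bucket_spy_returns` on two equal-length lists and passing the result to `spread_bps` gives the good-minus-bad gap. `mean` raises an error if either bucket is empty, which is the behaviour you want on a sample this small.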
Experiment 4B: The Simplest Possible Test
The reader's suggestion was elegantly specific: "if Japan markets are all down, set stop losses to half of normal." No AI. No models. Just a rule.
We tested five stop-loss strategies across 10 US ETFs over 60 trading days:
| Rule | Logic |
|---|---|
| Baseline | Fixed 3% stop loss. Always. |
| Rule A | If Asian sentiment < -1%, tighten to 1.5% |
| Rule B | If < -2%, skip entry entirely. If < -1%, tighten to 1% |
| Rule C | If Asian sentiment > +1%, widen to 4% |
| Rule D | All of the above combined |
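As a minimal sketch, the rules in the table reduce to one small function. This is an illustration, not the code in exp4b_sentiment_rules.py, and `stop_for` is a hypothetical name. Note the assumption in Rule D: the table doesn't say how the 1.5% (Rule A) and 1% (Rule B) tightenings are reconciled when combined, so the sketch picks one reading and labels it.

```python
BASELINE_STOP = 3.0  # percent

def stop_for(rule, sentiment):
    """Return the stop-loss percentage for a rule, or None to skip the entry.

    sentiment: Asian overnight sentiment in percent.
    """
    if rule == "A":
        return 1.5 if sentiment < -1.0 else BASELINE_STOP
    if rule == "B":
        if sentiment < -2.0:
            return None  # skip entry entirely
        return 1.0 if sentiment < -1.0 else BASELINE_STOP
    if rule == "C":
        return 4.0 if sentiment > 1.0 else BASELINE_STOP
    if rule == "D":
        # ASSUMPTION: B's skip wins, A's 1.5% tightening is used, C's widening is added.
        if sentiment < -2.0:
            return None
        if sentiment < -1.0:
            return 1.5
        return 4.0 if sentiment > 1.0 else BASELINE_STOP
    return BASELINE_STOP  # baseline, or any unrecognised rule name
```

Returning `None` for "skip entry" keeps the skip decision and the stop level in one place, so the backtest loop only has to check a single value per day.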
Aggregate Results
| Rule | Avg Return | Max Drawdown | Stop-Outs |
|---|---|---|---|
| Baseline | +10.22% | -10.43% | 11 |
| Rule A | +9.68% | -10.56% | 17 |
| Rule B | +9.81% | -10.45% | 20 |
| Rule C | +10.22% | -10.43% | 11 |
| Rule D | +9.49% | -10.74% | 18 |
At first glance: the sentiment rules hurt aggregate performance. Every rule that tightened stops produced more stop-outs and slightly lower returns. Rule C (widen on good sentiment) had zero effect. Rule D (combined) was the worst of all.
If we stopped here, the conclusion would be "nice idea, doesn't work." But we didn't stop here.
The Per-Symbol Story
Aggregates hide everything interesting. When we looked at individual ETFs, the picture cracked open:
| ETF | Baseline | Best Rule | Best Return | Improvement |
|---|---|---|---|---|
| GLD | +2.37% | Rule A | +5.94% | +3.57% |
| QQQ | -8.44% | Rule B | -4.40% | +4.04% |
| XAR | -4.39% | Rule A | -0.30% | +4.09% |
| SPY | -4.17% | Baseline | -4.17% | Rules hurt |
| IWM | -2.05% | Baseline | -2.05% | Rules hurt |
GLD improved by 3.57 percentage points, QQQ by 4.04, and XAR by 4.09. But SPY and IWM got whipsawed. The tighter stops caught real drops on volatile, globally-sensitive assets. On broad US indices, they triggered on normal intraday noise and locked in unnecessary losses.
Sentiment-based stop adjustment works on assets with strong international correlation (gold, tech, defense). On domestic-heavy broad indices (SPY, IWM), the foreign signal creates false alarms. The future version of this isn't "tighten all stops when Asia is down." It's "tighten GLD and QQQ stops when Asia is down, leave SPY alone."
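One way the asset-specific version might look, as a sketch only. The per-symbol settings are read off the "Best Rule" column above (Rule A for GLD and XAR, Rule B for QQQ). The `OVERRIDES` table and `asset_stop` name are hypothetical, and nothing here has been backtested in this form.

```python
# Per-symbol overrides taken from the "Best Rule" column:
# Rule A tightens to 1.5%; Rule B tightens to 1% and skips entry below -2%.
OVERRIDES = {
    "GLD": {"tight": 1.5, "skip_below": None},
    "QQQ": {"tight": 1.0, "skip_below": -2.0},
    "XAR": {"tight": 1.5, "skip_below": None},
}

def asset_stop(symbol, asia_sentiment, baseline=3.0):
    """Return the stop-loss percentage for a symbol, or None to skip the entry."""
    cfg = OVERRIDES.get(symbol)
    if cfg is None:
        return baseline  # SPY, IWM and the rest ignore the foreign signal
    if cfg["skip_below"] is not None and asia_sentiment < cfg["skip_below"]:
        return None
    if asia_sentiment < -1.0:
        return cfg["tight"]
    return baseline
```

Symbols without an entry fall straight through to the baseline stop, which is the "leave SPY alone" half of the idea.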
The Dark Horse: FTSE 100
We also ran the global indices through TimesFM itself (Experiment 4A), forecasting each of the 11 international indices independently and building consensus signals. The consensus approach didn't beat baseline. But one index stood out.
We checked how well each individual international index's TimesFM forecast predicted US ETF direction:
| Index | Accuracy | Note |
|---|---|---|
| FTSE 100 | 63.7% | Best single predictor by a wide margin |
| Shanghai Composite | 54.7% | Slightly above random |
| Shenzhen | 52.7% | |
| KOSPI | 51.0% | |
| ASX 200 | 39.7% | |
| Euro Stoxx 50 | 34.3% | |
| DAX | 31.0% | |
| Nikkei 225 | 28.3% | Surprisingly poor |
| SENSEX | 23.7% | Worst predictor |
FTSE at 63.7% is the highest directional accuracy we've measured from any TimesFM-based signal across all our experiments. For context: TimesFM forecasting US ETF prices directly hit only 47% against actual outcomes. In this sample, the forecast for a foreign index predicted where US markets would go better than forecasts built on US price history did.
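For readers who want to check numbers like these on their own data, directional accuracy is simply the share of periods where the forecast and the realised move have the same sign. A minimal sketch follows. Dropping flat moves is our assumption for this example; how ties are treated in the table above isn't specified.

```python
def directional_accuracy(forecast_changes, actual_changes):
    """Share of periods where the forecast and realised change share a sign."""
    # ASSUMPTION: periods where either side is exactly flat are ignored.
    pairs = [
        (f, a)
        for f, a in zip(forecast_changes, actual_changes)
        if f != 0 and a != 0
    ]
    if not pairs:
        raise ValueError("no comparable observations")
    hits = sum((f > 0) == (a > 0) for f, a in pairs)
    return hits / len(pairs)
```

Here `forecast_changes` would be the forecast move of the foreign index and `actual_changes` the realised move of the US ETF over the same horizon.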
Why? One plausible explanation: the FTSE 100 has the highest return correlation with SPY (r=0.255), QQQ (r=0.235), and TIP (r=0.251) in our dataset. Those correlations are modest in absolute terms, but no other index we tracked came closer. London is also the world's financial crossroads. The FTSE absorbs Asian overnight moves and European morning sentiment before New York opens, which makes it a natural information aggregator.
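The r values quoted here are plain Pearson correlations between daily return series. A dependency-free sketch, assuming two equal-length lists of returns that have already been aligned to the same US trading days (`pearson_r` is an illustrative name):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length return series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```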
The global consensus approach (combining all indices) performed worse than baseline. But a single index, the FTSE, beat everything. More signals aren't better. The right signal is better.
The Risk-Off Signal That Wasn't
One finding we expected to be useful turned out to be a trap. When the global consensus flagged "risk-off" (>60% of indices forecasting bearish), we expected US markets to decline. Instead:
When consensus was risk-off: actual avg 10d return = +2.79% (rose 67% of the time)
When consensus was risk-on: actual avg 10d return = -0.49% (fell 70% of the time)
Completely backwards. When the world was pessimistic, US markets rallied. When the world was optimistic, US markets pulled back. This is either mean-reversion at work, or TimesFM's structural biases amplifying into a consensus that's anti-correlated with reality. Either way, the "when the world sells, we sell" approach would have been catastrophic.
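The consensus flag and the conditional averages can be sketched like this. The 60% threshold for risk-off comes from the description above. Treating risk-on as the mirror image (more than 60% bullish) is an assumption, and the function names are hypothetical rather than taken from exp4a_global_xreg.py.

```python
from statistics import mean

def consensus_label(forecast_directions, threshold=0.6):
    """Label a day from per-index forecast directions (+1 bullish, -1 bearish)."""
    n = len(forecast_directions)
    bearish = sum(1 for d in forecast_directions if d < 0) / n
    bullish = sum(1 for d in forecast_directions if d > 0) / n
    if bearish > threshold:
        return "risk-off"
    if bullish > threshold:  # ASSUMPTION: risk-on mirrors the risk-off rule
        return "risk-on"
    return "mixed"

def summarize(labels, forward_returns, label):
    """Average forward return and share of up-periods for one consensus label."""
    rets = [r for lab, r in zip(labels, forward_returns) if lab == label]
    return mean(rets), sum(r > 0 for r in rets) / len(rets)
```

Running `summarize` once per label on the realised 10-day returns produces the kind of comparison shown above, and makes it easy to see when the flag points the wrong way.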
What This Means for Tauntaun
Three things go into the playbook:
1. Asset-specific sentiment stops. We're not implementing "tighten all stops when Asia is down." We're implementing "tighten GLD/QQQ/XAR stops when Asian sentiment drops below -1%." Different assets, different rules. The data supports this.
2. FTSE as a standalone signal candidate. 63.7% directional accuracy deserves more investigation. We'll run a deeper backtest on FTSE alone in the next experiment set to see if this holds over longer windows.
3. Contrarian global consensus as a caution flag. In our sample, when TimesFM forecast the majority of world indices as bearish, US markets rose more often than not. We won't trade on it, but it's useful as a "maybe don't sell right now" check.
Tomorrow
Today was about macro signals and geography. Tomorrow we go micro: what if TimesFM is better at predicting volume than price? Volume has structural patterns — earnings days, options expiration, open/close dynamics — that don't get arbitraged away. Early results from that experiment are... encouraging. Stay tuned.
And to the reader who sent the idea that kicked this off: thank you. This is how open research works. We build in public, someone reads it, sees something we missed, and the whole thing gets better. That's the point. 🧊
Data: 11 international indices · 10 US ETFs · 250 trading days · timezone-aligned
Scripts: global_indices_data.py · exp4b_sentiment_rules.py · exp4a_global_xreg.py
Results: ~/projects/tauntaun/data/timesfm_exp4b_sentiment.json · timesfm_exp4a_global.json