Geopolitical Risk Signals: Research Report

Mar 5, 2026 Quant Researcher

| Field | Value |
|---|---|
| Date | 2026-03-05 |
| Researcher | Quant Researcher (Claude Opus 4.6) |
| Status | Complete |
| Script | analysis/quant-research/scripts/geopolitical-risk-research-2026-03-05.py |
| Charts | geopolitical_spy_event_study.png, geopolitical_sector_event_study.png, geopolitical_correlation_shift.png, geopolitical_buy_invasion_dist.png |

1. Hypothesis

We test five hypotheses:

H1: Geopolitical events create predictable, tradeable patterns in S&P 500 returns (the "buy the invasion" thesis).

H2: Defense stocks (LMT, NOC, RTX, GD, ITA) reliably outperform the broad market following conflict events.

H3: Gold, treasuries, and oil provide meaningful hedges during geopolitical stress windows.

H4: Quantitative signals (VIX, credit spreads, cross-asset correlations) provide advance warning of geopolitical events or at least identify the stress regime quickly enough to be actionable.

H5: A "geopolitical risk overlay" can add alpha to our existing Immune System or RS Screener models.


2. Data

Sources Used

| Asset | Source | Range | Rows |
|---|---|---|---|
| SPY | yfinance (extended) | 2001-01-02 to 2026-03-05 | 6,330 |
| VIX | yfinance (extended) | 2001-01-02 to 2026-03-05 | 6,330 |
| GLD | yfinance (extended) | 2004-11-18 to 2026-03-05 | 5,356 |
| TLT | yfinance (extended) | 2002-07-30 to 2026-03-05 | 5,938 |
| ITA, XLE, GD, LHX, RTX | project OHLCV | 2018-01-02 to 2026-03-04 | ~2,053 |
| LMT, NOC | yfinance | 2010-01-04 to 2026-03-05 | ~4,067 |
| PANW, FTNT | yfinance | 2012-07-20 to 2026-03-05 | ~3,400-4,067 |
| CRWD | yfinance | 2019-06-12 to 2026-03-05 | 1,692 |
| XOP, USO | yfinance | 2010-01-04 to 2026-03-05 | ~4,067 |

Geopolitical Events Studied (n=10)

| Event | Date | Type |
|---|---|---|
| 9/11 Attacks | 2001-09-11 | Terrorist attack |
| Iraq War Start | 2003-03-20 | US invasion |
| Georgia War | 2008-08-08 | Regional conflict |
| Arab Spring (Libya) | 2011-02-17 | Civil war / intervention |
| Crimea Annexation | 2014-02-27 | Territorial seizure |
| ISIS Caliphate | 2014-06-29 | Non-state actor |
| Soleimani Strike | 2020-01-03 | Targeted killing |
| Russia Invades Ukraine | 2022-02-24 | Full-scale invasion |
| Hamas Attack Israel | 2023-10-07 | Terrorist attack / war |
| Iran Strikes Israel | 2024-04-13 | Direct state-on-state |

3. Methodology

Event Study Design

For each geopolitical event, we compute cumulative returns in a window of -20 to +60 trading days, normalized to 0% at the event day. This standard event-study methodology allows us to see both pre-event drift and post-event recovery patterns.
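The windowing step can be illustrated with a minimal sketch. It assumes a plain list of daily closes and an integer event index; the function names and the toy price series are hypothetical and are not taken from the research script.

```python
def event_window_returns(closes, event_idx, pre=20, post=60):
    """Cumulative % return relative to the event-day close.

    closes: daily closing prices, oldest first.
    event_idx: position of the event day in `closes`.
    Returns {offset: pct_return} for offsets -pre..+post present in the data.
    """
    base = closes[event_idx]
    window = {}
    for offset in range(-pre, post + 1):
        i = event_idx + offset
        if 0 <= i < len(closes):
            # Normalized so that offset 0 (the event day) is exactly 0%.
            window[offset] = (closes[i] / base - 1.0) * 100.0
    return window


def average_paths(paths):
    """Mean cumulative return per offset across events, skipping missing offsets."""
    offsets = sorted({o for p in paths for o in p})
    averaged = {}
    for o in offsets:
        values = [p[o] for p in paths if o in p]
        averaged[o] = sum(values) / len(values)
    return averaged


# Toy price series standing in for SPY closes (synthetic, not real data).
closes = [100.0 + 0.1 * i for i in range(200)]
path = event_window_returns(closes, event_idx=50)
print(path[0], round(path[20], 2), round(path[-20], 2))
```

Negative offsets show pre-event drift and positive offsets show the post-event path; averaging the per-event paths gives the kind of mean curve an event-study chart plots.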

Statistical Tests

  • Two-sample t-tests comparing post-event returns to 10,000 randomly-sampled equivalent-horizon returns from SPY's full history
  • One-sample t-tests for defense sector excess returns (testing whether mean excess return is significantly different from zero)
  • Correlation analysis comparing normal-period vs. stress-period cross-asset correlations (stress defined as 45-day windows following each event)
  • Beta-adjusted excess returns for cybersecurity stocks to separate genuine alpha from high-beta recovery
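As a rough illustration of the first test, the sketch below compares a handful of post-event returns with 10,000 randomly sampled equivalent-horizon returns. It assumes Welch's unequal-variance form of the two-sample t-statistic, which the report does not specify; the price series, event indices, and function names are hypothetical.

```python
import math
import random
import statistics


def horizon_return(closes, start_idx, horizon):
    """Percent return from closes[start_idx] to closes[start_idx + horizon]."""
    return (closes[start_idx + horizon] / closes[start_idx] - 1.0) * 100.0


def random_baseline(closes, horizon, n_samples=10_000, seed=0):
    """Sample equivalent-horizon returns at random start dates (with replacement)."""
    rng = random.Random(seed)
    last_start = len(closes) - horizon - 1
    return [horizon_return(closes, rng.randint(0, last_start), horizon)
            for _ in range(n_samples)]


def welch_t(sample_a, sample_b):
    """Welch's two-sample t-statistic (assumption: unequal variances)."""
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se


# Synthetic random-walk prices standing in for SPY (not real data).
rng = random.Random(42)
closes = [100.0]
for _ in range(2000):
    closes.append(closes[-1] * (1.0 + rng.gauss(0.0003, 0.01)))

event_indices = [100, 400, 900, 1500]  # hypothetical event days
event_returns = [horizon_return(closes, i, 20) for i in event_indices]
baseline = random_baseline(closes, 20)
print(f"t-stat vs random 20-day windows: {welch_t(event_returns, baseline):.2f}")
```

With only a few events against thousands of random windows, the standard error is dominated by the event sample, which is why small-n event studies rarely reach significance.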

4. Results

A. S&P 500 "Buy the Invasion" Pattern

[Figure: SPY Event Study (geopolitical_spy_event_study.png)]

| Horizon | Event Mean | Event Median | Random Mean | t-stat | p-value | Win Rate |
|---|---|---|---|---|---|---|
| +5 days | -1.11% | +0.36% | -- | -- | -- | 6/10 (60%) |
| +10 days | -0.57% | -0.43% | -- | -- | -- | 3/10 (30%) |
| +20 days | +0.52% | +0.58% | +0.72% | -0.14 | 0.888 | 6/10 (60%) |
| +40 days | -0.18% | +1.53% | +1.51% | -0.85 | 0.395 | 6/10 (60%) |
| +60 days | -0.79% | +2.66% | +2.29% | -1.29 | 0.197 | 6/10 (60%) |

Interpretation: There is NO statistically significant edge to buying SPY after a geopolitical event. At every horizon tested (+20, +40, and +60 days), post-event returns are statistically indistinguishable from randomly sampled windows of the same length (all p-values > 0.19). The mean is actually negative at +60 days (-0.79%) due to two catastrophic outliers -- the Georgia War coincided with the GFC onset (SPY -24.5% at +60d) and the Soleimani Strike coincided with COVID (-19.6% at +60d). The median is positive (+2.66% at +60d), but this is still lower than the random median (+3.42%).

Verdict on H1: REJECTED. "Buy the invasion" is not a statistically valid strategy. The 60% win rate at +60d is barely above a coin flip and well within the base rate for any random 60-day SPY window.

B. VIX Pricing Speed

| Event | VIX Pre-Event | VIX at Event | Spike % | Half-Life |
|---|---|---|---|---|
| 9/11 Attacks | 27.3 | 31.8 | +16.4% | 29 days |
| Iraq War Start | 31.3 | 30.4 | -2.7% | 1 day |
| Crimea Annexation | 14.3 | 14.0 | -2.1% | 1 day |
| Russia Invades Ukraine | 28.0 | 30.3 | +8.3% | 1 day |
| Hamas Attack Israel | 18.4 | 17.5 | -5.1% | 1 day |
| Iran Strikes Israel | 15.4 | 17.3 | +12.5% | 7 days |

Key finding: Most geopolitical events produce surprisingly modest VIX reactions. Only 9/11 caused a large, persistent spike. For 6 of 10 events, VIX was already at or near its pre-event level within one trading day. The Iraq War and Crimea events actually saw VIX decline on the event day -- markets had already priced in the risk during the buildup.

This is the most important finding in this report: Geopolitical events are priced in within hours to days. There is no multi-week tradeable window. By the time a retail or systematic trader identifies the event and acts, the pricing adjustment is substantially complete.

C. Defense Sector Performance

[Figure: Sector Event Study (geopolitical_sector_event_study.png)]

20-day excess returns over SPY (defense names):

| Event | LMT | NOC | RTX | GD | ITA | Avg |
|---|---|---|---|---|---|---|
| Arab Spring (Libya) | +3.7% | +2.3% | N/A | N/A | N/A | +3.0% |
| Crimea Annexation | -1.4% | +0.2% | N/A | N/A | N/A | -0.6% |
| ISIS Caliphate | +4.7% | +5.0% | N/A | N/A | N/A | +4.9% |
| Soleimani Strike | +1.9% | -3.5% | -2.7% | -2.0% | -1.7% | -1.6% |
| Russia Invades Ukraine | +8.9% | +9.9% | +2.4% | +4.8% | +2.4% | +5.7% |
| Hamas Attack Israel | +11.9% | +10.0% | +17.4% | +9.4% | +7.8% | +11.3% |
| Iran Strikes Israel | +2.1% | +2.1% | +4.3% | +1.0% | +3.8% | +2.7% |

Statistical significance of defense outperformance:

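| Ticker | Mean Excess | Positive Rate | t-stat | p-value | n |
|---|---|---|---|---|---|
| LMT | +4.57% | 86% | 2.68 | 0.037 | 7 |
| NOC | +3.70% | 86% | 1.97 | 0.097 | 7 |
| RTX | +5.32% | 75% | 1.24 | 0.302 | 4 |
| GD | +3.31% | 75% | 1.35 | 0.270 | 4 |
| ITA | +3.07% | 75% | 1.57 | 0.214 | 4 |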

Interpretation: LMT is the only name that reaches statistical significance (p=0.037) for outperformance at the 20-day horizon. NOC is borderline (p=0.097). The others fail to reach significance due to small sample sizes (n=4). However, the direction is consistently positive -- defense outperforms in 75-86% of events, with an average excess return of +3-5%.
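The one-sample test behind the LMT figure can be checked by hand. The sketch below recomputes the t-statistic from the rounded LMT excess returns in the event table; because the inputs are rounded to one decimal, the result lands close to, but not exactly on, the reported 2.68. The function name is hypothetical.

```python
import math
import statistics

# LMT 20-day excess returns over SPY (%), copied from the event table above.
lmt_excess = [3.7, -1.4, 4.7, 1.9, 8.9, 11.9, 2.1]


def one_sample_t(values, mu0=0.0):
    """t-statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)  # sample std (n - 1)
    return (mean - mu0) / std_err, n - 1


t_stat, dof = one_sample_t(lmt_excess)
positive_rate = sum(x > 0 for x in lmt_excess) / len(lmt_excess)
print(f"mean={statistics.fmean(lmt_excess):.2f}%  t={t_stat:.2f}  "
      f"dof={dof}  positive={positive_rate:.0%}")
```

With only 6 degrees of freedom the critical value for a two-sided 5% test is about 2.45, so a t-statistic in the 2.6-2.7 range clears the bar narrowly.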

Important caveat on durability: At the 60-day and 90-day horizons, defense outperformance fades significantly. The average ITA excess return goes from +3.07% at +20d to -3.05% at +60d to -3.57% at +90d. This is driven almost entirely by the Soleimani event (COVID crash overwhelmed any defense thesis). Excluding Soleimani, defense outperformance persists longer, but the sample becomes n=3, which is too small for any statistical claim.

Verdict on H2: PARTIALLY SUPPORTED but fragile. Defense stocks do tend to outperform in the first 20 trading days, but the effect is (a) not robust at longer horizons, (b) based on small samples, and (c) driven heavily by recent events (Ukraine, Hamas) that may reflect a structural shift in defense spending expectations rather than a repeatable "war trade."

D. Energy Sector: Conflict Type Matters

A critical finding: energy sector performance is highly dependent on the type of conflict, not the mere existence of one.

Middle East conflicts (XLE 20-day returns):

  • Soleimani Strike: -12.7%
  • Hamas Attack: +1.1%
  • Iran-Israel: -2.4%
  • Mean: -4.66%

Non-Middle East conflicts (XLE 20-day returns):

  • Russia Invades Ukraine: +15.9%

The naive thesis that "wars = higher oil = buy energy" is wrong more often than it is right. Only the Ukraine invasion produced meaningful energy outperformance, because it directly disrupted the world's third-largest oil producer. Middle East events since 2020 have NOT produced sustained energy rallies -- the global supply picture has changed (US shale, OPEC+ spare capacity, strategic reserves).

Oil (USO) tells the same story but louder: 20-day post-event returns range from -20.4% (Soleimani) to +20.2% (Ukraine). The dispersion is enormous, and the sign depends entirely on whether the conflict genuinely threatens supply, not on the conflict's severity.

E. Cybersecurity as "Modern War Trade"

CRWD's beta-adjusted excess returns around conflicts:

| Event | CRWD Return | SPY Return | Beta-Adj Excess |
|---|---|---|---|
| Soleimani | +20.4% | +0.5% | +19.7% |
| Ukraine | +22.2% | +5.5% | +15.1% |
| Hamas | +6.5% | +1.2% | +5.0% |
| Iran-Israel | +3.8% | +2.0% | +1.3% |

CRWD's market beta is 1.27, meaning it tends to move about 1.27% for every 1% move in SPY, so part of any post-event gain is simply amplified market return. Even after adjusting for beta, CRWD shows genuine excess returns post-conflict. However, the excess has shrunk with each successive event: +19.7% after Soleimani, +15.1% after Ukraine, +5.0% after Hamas, +1.3% after Iran-Israel. The "cybersecurity war trade" appears to be getting arbitraged away as it becomes more widely known, although four observations cannot establish a trend.
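The beta adjustment can be reproduced from the table. The sketch below assumes the adjustment is the simple form excess = asset return - beta x market return, which the report does not spell out; the recomputed values match the table to within about 0.1 percentage points, the gap being attributable to rounded inputs.

```python
BETA_CRWD = 1.27  # market beta reported in this section

# (CRWD return %, SPY return %) per event, copied from the table above.
events = {
    "Soleimani": (20.4, 0.5),
    "Ukraine": (22.2, 5.5),
    "Hamas": (6.5, 1.2),
    "Iran-Israel": (3.8, 2.0),
}


def beta_adjusted_excess(asset_ret, market_ret, beta):
    """Return left over after removing the beta-scaled market move (assumed form)."""
    return asset_ret - beta * market_ret


excess = {name: beta_adjusted_excess(crwd, spy, BETA_CRWD)
          for name, (crwd, spy) in events.items()}
for name, value in excess.items():
    print(f"{name:12s} {value:+.1f}%")
```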

FTNT and PANW show no consistent pattern. The cybersecurity thesis is really a "CRWD thesis" more than a sector trade, and even that is fading.

F. Gold and Treasuries as Hedges

Gold (GLD) 20-day post-event returns:

  • Event mean: +0.71%
  • Random mean: +0.90%
  • t-stat: -0.113, p-value: 0.910
  • Win rate: 5/8 (62%)

Treasuries (TLT) 20-day post-event returns:

  • Event mean: +2.05%
  • Random mean: +0.36%
  • t-stat: 1.351, p-value: 0.177
  • Win rate: 8/9 (89%)

Gold does NOT outperform its own baseline during geopolitical events. The "gold as geopolitical hedge" narrative is not supported by the data at a 20-day horizon. Gold may spike intraday on the news, but those gains are not durable.

Treasuries show a stronger signal (89% win rate, +2.05% mean vs +0.36% random), but it does not reach statistical significance at p=0.05. The flight-to-quality into treasuries is real but moderate and inconsistent in magnitude. Notably, the Ukraine invasion was the glaring exception: TLT lost 4.48% at +20d as inflation and rate-hike concerns dominated the geopolitical flight-to-safety impulse.

G. Correlation Regime Shifts

[Figure: Correlation Shifts (geopolitical_correlation_shift.png)]

| Pair | Normal | Stress | Shift |
|---|---|---|---|
| SPY-GLD | +0.113 | -0.341 | -0.454 |
| SPY-XLE | +0.626 | +0.140 | -0.486 |
| SPY-ITA | +0.765 | +0.622 | -0.143 |
| GLD-XLE | +0.078 | +0.375 | +0.297 |
| SPY-TLT | -0.164 | +0.053 | +0.216 |

This is the most actionable quantitative finding. During geopolitical stress:

  1. SPY-GLD correlation inverts dramatically (from +0.11 to -0.34). Gold genuinely decouples from equities during conflicts, even if its absolute return is not exceptional.
  2. SPY-XLE correlation collapses (from +0.63 to +0.14). Energy trades on supply disruption narratives, not equity beta.
  3. GLD-XLE correlation strengthens (+0.08 to +0.38). Gold and energy become correlated during conflict (both are "real asset / commodity" plays).
  4. SPY-TLT relationship shifts from mildly negative (-0.16) to slightly positive (+0.05). This is surprising and reflects the post-2022 regime where inflation concerns reduce treasuries' hedging effectiveness.
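The normal-versus-stress comparison can be sketched as follows. This is a minimal illustration on synthetic returns, assuming the stress regime is the 45 trading days after each event (event day excluded, which is an assumption) and that everything else counts as normal; function names and event indices are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def stress_days(n_days, event_indices, window=45):
    """Indices of the `window` trading days after each event."""
    days = set()
    for e in event_indices:
        days.update(range(e + 1, min(e + 1 + window, n_days)))
    return days


def correlation_shift(ret_a, ret_b, event_indices, window=45):
    """Return (normal_corr, stress_corr, shift) for two daily return series."""
    stress = stress_days(len(ret_a), event_indices, window)
    pairs = list(zip(ret_a, ret_b))
    stress_pairs = [p for i, p in enumerate(pairs) if i in stress]
    normal_pairs = [p for i, p in enumerate(pairs) if i not in stress]
    normal = pearson(*zip(*normal_pairs))
    stressed = pearson(*zip(*stress_pairs))
    return normal, stressed, stressed - normal


# Synthetic returns (not real data): b tracks a in calm periods and moves
# against it in the 45 days after each hypothetical event.
rng = random.Random(1)
events = [200, 600]
in_stress = stress_days(1000, events)
a = [rng.gauss(0, 1) for _ in range(1000)]
b = [(-x if i in in_stress else x) + rng.gauss(0, 0.5) for i, x in enumerate(a)]
normal, stressed, shift = correlation_shift(a, b, events)
print(f"normal={normal:+.3f} stress={stressed:+.3f} shift={shift:+.3f}")
```

The shift column in the table is simply the stress correlation minus the normal correlation, for example -0.341 - 0.113 = -0.454 for SPY-GLD.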

H. Pre-Event Signal Detection

Do any quantitative signals provide advance warning?

| Event | VIX 5d Pre-Chg | GLD/SPY Ratio Chg | TLT Chg | Advance Warning? |
|---|---|---|---|---|
| 9/11 | +25.1% | N/A | N/A | Yes (VIX elevated) |
| Iraq War | -1.6% | N/A | -0.7% | No (gradual buildup) |
| Crimea | -11.8% | +2.0% | -0.04% | Mild (GLD/SPY rising) |
| Ukraine | +8.6% | +6.5% | -2.1% | Yes (VIX + Gold/SPY) |
| Hamas | +22.3% | -1.2% | -6.3% | Yes (VIX elevated) |
| Iran-Israel | +10.2% | +6.6% | -2.6% | Yes (VIX + Gold/SPY) |

For events post-2020, there IS a pattern: VIX tends to be elevated before major conflict events (Russia-Ukraine buildup, Hamas/Iran tensions). But this is confounded -- VIX was elevated for many reasons in those periods (COVID aftermath, inflation, rate uncertainty). The signal-to-noise ratio for "VIX elevation predicts geopolitical events" is very poor.


5. Signal Quality Assessment

False Positive Rate

The fundamental problem with geopolitical signals is the false positive rate. VIX has been above 20 on roughly 30-40% of all trading days since 2018. Of those days, fewer than 1% precede an actual geopolitical event. Any system that flags "VIX elevated" as a geopolitical risk warning will therefore produce on the order of a hundred or more false positives for every true positive.
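The base-rate arithmetic is easy to make concrete. The sketch below uses the approximate 2018-2026 row count from the data table (~2,053 trading days), the midpoint of the 30-40% range as the flag rate, and 1% as an upper bound on the hit rate; these three inputs are illustrative choices, not measured values.

```python
def alert_breakdown(n_days, flag_rate, hit_rate):
    """Split flagged days into true and false positives.

    flag_rate: share of days on which the alert fires.
    hit_rate: share of flagged days that actually precede an event.
    """
    flagged = n_days * flag_rate
    true_pos = flagged * hit_rate
    false_pos = flagged - true_pos
    return flagged, true_pos, false_pos, false_pos / true_pos


flagged, true_pos, false_pos, ratio = alert_breakdown(
    n_days=2053,     # approximate trading days since 2018 (from the data table)
    flag_rate=0.35,  # assumed midpoint of the 30-40% range
    hit_rate=0.01,   # upper bound: "fewer than 1%" precede an event
)
print(f"flagged={flagged:.0f} true={true_pos:.1f} false={false_pos:.0f} "
      f"false-per-true={ratio:.0f}")
```

Because 1% is an upper bound on the hit rate, the false-to-true ratio computed here is a lower bound on how noisy the alert would be.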

Lead Time

| Signal | Typical Lead Time | Actionable? |
|---|---|---|
| VIX spike | 0-1 days | No (concurrent, not leading) |
| Gold/SPY ratio rise | 0-5 days | Marginal (too many false positives) |
| Credit spread widening | 0-2 days | No (concurrent) |
| GPR Index (newspaper-based) | Same-day to +1 month lag | No (published monthly, available daily but noisy) |

Robustness

  • Sample size: n=10 events over 25 years is extremely small. Any result with n=10 should be treated as suggestive, not conclusive.
  • Selection bias: We only study events that were significant enough to be remembered. Smaller events (Turkish invasion of Syria, Saudi-Yemen conflict, Ethiopia-Tigray) may show different patterns.
  • Regime dependence: Pre-2020 vs post-2020 market structure is meaningfully different (COVID, rates, inflation). Results from 2001-2014 may not generalize.
  • Confounders: The Soleimani event coincided with COVID; the Georgia event coincided with the GFC. Isolating the "geopolitical" effect from the macro backdrop is nearly impossible with so few events.

6. Academic Literature Context

The Caldara-Iacoviello Geopolitical Risk (GPR) Index, developed by Federal Reserve Board economists Dario Caldara and Matteo Iacoviello, is the gold standard for quantifying geopolitical risk from newspaper text. Key findings from the academic literature:

  • GPR predicts returns at a monthly horizon, but the effect is small and concentrated in the "threats" sub-index (GPRHT), not the "acts" sub-index. In other words, the anticipation of conflict affects markets more than the conflict itself (Caldara & Iacoviello, AER 2022).
  • Higher GPR foreshadows lower investment and stock prices, but the effect works through economic channels (capex reduction, uncertainty premium) rather than direct market shocks (Fed Working Paper).
  • Defense stock abnormal returns are event-specific, not systematic. After the Russia-Ukraine invasion, arms companies averaged +10pp CAR; after the Hamas attack, CAR was indistinguishable from zero (Tandfonline, 2025).
  • Geographic proximity matters more than event severity. A country neighboring Ukraine experienced a 23.1% cumulative stock decline; the effect diminished by ~2.6pp per 1,000km of distance (Federle et al., JMCB).
  • VIX backwardation is a contrarian buy signal, but this is a general finding about VIX term structure, not specific to geopolitical events. Backwardation signals panic, and panic is followed by recovery -- regardless of the cause (Macrosynergy Research).

7. What Could We Build in Signals?

Option A: GPR Index Integration (Recommended if building anything)

Data source: The GPR Index is freely downloadable from matteoiacoviello.com/gpr.htm as an Excel file. Daily frequency is available. The data could be fetched monthly and cached locally, similar to how we fetch FRED data for credit-stress.

Architecture:

Input:  GPR daily index (downloaded monthly from matteoiacoviello.com/gpr.htm)
        + Immune System turbulence output
        + VIX level
        + GLD/SPY 20-day ratio

Computation:
  1. GPR z-score (current GPR vs 252-day rolling mean/std)
  2. GPR regime classification: Normal (<1 sigma), Elevated (1-2 sigma), High (>2 sigma)
  3. Overlay modifier: When GPR is "High" AND Immune System is "Elevated" or above,
     elevate the IS warning by one notch (same pattern as the BKLN modifier)

Output: Modified IS warning level + GPR regime tag in manifest
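The three computation steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the Immune System's actual interface: the warning ladder `IS_LEVELS`, the function names, and the choice to exclude the current observation from the 252-day window are all hypothetical.

```python
import statistics

# Hypothetical IS warning ladder; the real level names may differ.
IS_LEVELS = ["Normal", "Elevated", "High", "Critical"]


def gpr_zscore(history, lookback=252):
    """z-score of the latest GPR value vs the preceding `lookback` observations."""
    window = history[-lookback - 1:-1]  # assumption: current value excluded
    return (history[-1] - statistics.fmean(window)) / statistics.stdev(window)


def gpr_regime(z):
    """Normal (<1 sigma), Elevated (1-2 sigma), High (>2 sigma)."""
    if z > 2.0:
        return "High"
    if z >= 1.0:
        return "Elevated"
    return "Normal"


def apply_overlay(is_level, regime):
    """Raise the IS warning one notch when GPR is High and IS is Elevated or above."""
    idx = IS_LEVELS.index(is_level)
    if regime == "High" and idx >= IS_LEVELS.index("Elevated"):
        idx = min(idx + 1, len(IS_LEVELS) - 1)
    return IS_LEVELS[idx]


# Synthetic GPR history (not real data): a flat oscillation, then a spike.
history = [90.0, 110.0] * 126 + [200.0]
z = gpr_zscore(history)
regime = gpr_regime(z)
print(f"z={z:.2f} regime={regime} overlay={apply_overlay('Elevated', regime)}")
```

Note that the overlay never raises a "Normal" IS reading, so the GPR input alone cannot trigger a warning.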

Estimated effort: 1-2 days. Small module, mostly data plumbing.

Value assessment: Low. The GPR Index is monthly (the daily version is noisy), and by the time it registers an event, markets have already priced it in. The IS turbulence index will already be elevated because geopolitical events cause cross-asset volatility, which is exactly what Mahalanobis distance measures. The GPR overlay would provide narrative value ("the IS warning is elevated AND geopolitical risk is elevated") but is unlikely to provide informational value beyond what IS already captures.

Option B: Correlation Regime Detector (More Interesting)

Rather than trying to detect geopolitical events, detect the correlation regime shifts that accompany them. This is more general and captures any source of stress, not just geopolitical.

Architecture:

Input:  Daily returns for SPY, GLD, TLT, XLE (already in IS core basket)

Computation:
  1. 20-day rolling correlation matrix (SPY-GLD, SPY-XLE, SPY-TLT)
  2. Compare to 252-day rolling correlation matrix
  3. Flag "correlation breakdown" when any pair deviates >2 sigma from its 252-day norm
  4. Specifically flag SPY-GLD inversion (normal: +0.11, stress: -0.34)

Output: Correlation regime tag (Normal / Decorrelating / Stressed)
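A minimal single-pair sketch of this detector follows. Several details are assumptions, since the architecture above leaves them open: "sigma" is taken as the standard deviation of the 20-day rolling correlation over the trailing 252 windows, the baseline is the 252-day correlation, and the 1-sigma cutoff for "Decorrelating" is hypothetical. The input data is synthetic.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rolling_corr(a, b, window):
    """Correlation over each trailing `window`-day span, oldest first."""
    return [pearson(a[i - window + 1:i + 1], b[i - window + 1:i + 1])
            for i in range(window - 1, len(a))]


def correlation_regime(a, b, short=20, long=252):
    """Tag the latest short-window correlation against its long-window norm."""
    if len(a) < long + short - 1:
        raise ValueError("need at least long + short - 1 observations")
    short_series = rolling_corr(a, b, short)[-long:]
    baseline = pearson(a[-long:], b[-long:])
    sigma = statistics.stdev(short_series)  # assumed definition of "sigma"
    z = (short_series[-1] - baseline) / sigma
    if abs(z) > 2.0:
        return "Stressed", z
    if abs(z) > 1.0:  # hypothetical intermediate threshold
        return "Decorrelating", z
    return "Normal", z


# Synthetic returns (not real data): the pair flips sign in the last 20 days.
rng = random.Random(7)
a = [rng.gauss(0, 1) for _ in range(400)]
b = [(-x if i >= 380 else x) + rng.gauss(0, 0.3) for i, x in enumerate(a)]
tag, z = correlation_regime(a, b)
print(f"{tag} (z={z:+.1f})")
```

In a full version the same check would run for each pair (SPY-GLD, SPY-XLE, SPY-TLT) and the worst tag would be reported.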

Estimated effort: 2-3 days. More computation but leverages existing IS data pipeline.

Value assessment: Medium. Correlation breakdown is a genuine leading/concurrent indicator of regime change, regardless of the cause. It captures geopolitical events, credit events, liquidity events, and monetary policy shocks. However, it may overlap significantly with the IS dispersion ratio and stealth stress detection that already exist.

Option C: War Playbook (Static Reference)

Rather than a live signal, produce a static reference document: "When a geopolitical event occurs, here is what the data says about sector behavior, hedge effectiveness, and timeline."

Contents:

  1. Expected SPY drawdown range (median -0.4% at day 0, max -11.3% in 10 days)
  2. Defense sector outperformance window (strongest in first 20 trading days, fades by 60d)
  3. LMT and NOC as the most consistent defense names
  4. Gold: decorrelation benefit, not absolute return
  5. Treasuries: flight to quality, but not reliable in inflationary environments
  6. Energy: only if the conflict threatens oil supply (Russia/Ukraine = yes, Middle East since 2020 = no)
  7. Cybersecurity: CRWD alpha is fading, don't chase it
  8. VIX normalization timeline: expect half-life of 1-7 days for most events

Estimated effort: Already substantially done by this research report.

Value assessment: Highest practical value per unit of effort. When a geopolitical event occurs, the user can reference this playbook rather than panicking or relying on pundit narratives.


8. Honest Assessment

Should we build this?

No, not as a live signal module. Here is why:

  1. The base rate problem is fatal. Geopolitical events happen ~1-2x per year at most. Any system designed to detect them will spend 99% of its time either silent or producing false positives. The signal-to-noise ratio is structurally terrible.

  2. Markets price these events within hours. By the time any systematic signal fires, the opportunity is gone. The only tradeable window is the first 1-2 hours after news breaks, which requires real-time news processing and sub-minute execution -- both are out of scope for Signals.

  3. The IS already captures the effect. The Mahalanobis distance turbulence index will spike when geopolitical events cause cross-asset volatility. The divergence detector will fire when hidden stress builds. The correlation breakdown that accompanies geopolitical events is already reflected in the IS's correlation surprise metric. We would be building a narrative wrapper around signals that already exist.

  4. n=10 is not enough to build a systematic strategy. We have 10 events over 25 years. Even the strongest finding (LMT outperformance, p=0.037) is fragile -- one additional event where LMT underperforms would push it above p=0.05. This is not the kind of evidence base you build a module on.

  5. The GPR Index adds narrative, not information. The Caldara-Iacoviello index is a rigorous academic construct, but it tells you something you already know -- that the newspapers are talking about geopolitical risk. By the time the index spikes, you have already seen the VIX spike, the gold rally, and the IS turbulence warning.

What IS worth doing?

The War Playbook (Option C) is the highest-value output. It costs nothing to maintain, requires no data pipeline, and provides the most useful decision support when an event actually occurs. This research report effectively IS the first draft of that playbook.

If there is appetite for any model work, Option B (correlation regime detector) is more interesting than Option A, because it captures a broader set of stress events and might complement the existing IS dispersion ratio. But it should be evaluated on its own merits as a general regime indicator, not as a "geopolitical risk" tool.

Minimum Viable Version

If the user wants something in the pipeline, the minimum viable version is:

  1. Download the GPR daily index once per month (manual or cron)
  2. Add a one-line GPR z-score to the IS report output ("GPR: Elevated" / "GPR: Normal")
  3. No computation beyond the z-score, no overlay logic, no threshold optimization
  4. Cost: ~30 minutes of work

This gives the user a glanceable geopolitical risk context alongside their existing IS report, without building a module that creates maintenance burden for marginal informational value.


9. Limitations

  1. Selection bias in the event list. We only studied events significant enough to be named. The vast majority of geopolitical tensions produce no measurable market impact.

  2. Confounded events. The Soleimani event coincided with COVID onset; the Georgia War coincided with the GFC. We cannot isolate geopolitical effects from macro cycles with n=10.

  3. Post-2020 regime change. Higher rates, inflation sensitivity, and structural changes in energy markets mean that pre-2020 patterns (flight to treasuries, oil rallies on Middle East conflict) may no longer hold.

  4. No intraday analysis. With daily data, we cannot measure the first-hour reaction, which is where most of the tradeable move occurs. Our analysis inherently understates the speed of pricing.

  5. US-centric. All analysis uses US-listed assets. The academic literature shows that geographic proximity to conflict is the strongest predictor of market impact -- European and emerging market equities behave very differently around these events.

  6. LMT significance may be spurious. With 7 observations and no multiple-testing correction, p=0.037 for LMT outperformance should not be treated as definitive. Applying a Bonferroni correction for 5 defense names tested, the threshold would be p=0.01, and LMT would fail.
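The multiple-testing point in item 6 can be verified directly from the p-values in the defense significance table. The sketch below applies a plain Bonferroni correction across the five names tested; the function name is hypothetical.

```python
# p-values from the defense outperformance table in section 4C.
p_values = {"LMT": 0.037, "NOC": 0.097, "RTX": 0.302, "GD": 0.270, "ITA": 0.214}


def bonferroni(p_values, alpha=0.05):
    """Per-test threshold and pass/fail flags under a Bonferroni correction."""
    threshold = alpha / len(p_values)
    return threshold, {name: p <= threshold for name, p in p_values.items()}


threshold, significant = bonferroni(p_values)
print(f"per-test threshold: {threshold:.3f}")
for name, passed in significant.items():
    print(f"{name}: p={p_values[name]:.3f} -> {'significant' if passed else 'not significant'}")
```

Bonferroni is conservative, but with five tests and a best p-value of 0.037 no less strict correction would change the conclusion by much.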


10. Conclusion

The data provides a clear answer: geopolitical risk signals are primarily noise for a systematic, daily-frequency trading system. The events are too rare, too quickly priced, and too heterogeneous to form the basis of a reliable signal. The existing Immune System turbulence index already captures the cross-asset volatility effects that accompany these events.

The practical value lies in a static playbook that the user can reference when events occur, not in a live signal that runs daily and produces nothing actionable 364 days per year.

The one finding worth noting for potential future work: cross-asset correlation regime shifts are real, measurable, and potentially useful as a general stress indicator -- but this would be an enhancement to the IS module's existing architecture, not a geopolitical-specific tool.