Do High Implied Volatility Indicators Predict Stock Price Declines? IVolAI Investigates

A Cross-Sectional Study of S&P 500 Stocks, 2021–2025

Summary

There is a widely held belief among options traders that stocks with extremely high implied volatility (IV) are headed for a crash. The logic is intuitive: when the options market is pricing in large expected moves, something bad must be coming. But is this actually true? Do stocks with the highest IV readings consistently underperform?
At IVolatility, we tasked our artificial intelligence backtesting assistant, IVolAI, with conducting a comprehensive cross-sectional analysis to rigorously test this hypothesis. Using five years of daily data across all S&P 500 stocks, IVolAI evaluated multiple implied volatility indicators and their forward performance characteristics.
Our main finding: the answer depends on which IV indicator you use — and it's far more nuanced than the simplistic "high IV = crash" narrative suggests.
This entire report, including statistical testing, robustness checks, and backtesting validation, was prepared by IVolAI.

Key Findings

IV/HV Ratio (implied volatility divided by historical volatility) is the only indicator that shows consistent underperformance after a high reading: 0.29 percentage points below baseline at 20 days and 0.72 at 30 days. This effect is statistically significant.
IVX30 Raw (absolute implied volatility level) actually shows the opposite — stocks with the highest absolute IV levels gained an average +5.09% over 20 days. But this is misleading: it's driven by "crash survivors" like CVNA and SMCI that had extreme IV during distress and then rebounded.
IV Rank and IV Percentile show no reliable predictive power. Their results are indistinguishable from random noise.
Year-to-year consistency is poor. No indicator works reliably every year. The IV/HV Ratio was negative in 4 out of 5 years, but by small and varying amounts.
The time window matters. We tested five IV/HV ratio variants from IVX7/HV10 to IVX60/HV30. Using a composite score that avoids cherry-picking horizons, IVX14/HV10 (14-day implied vs 10-day realized) emerged as the best all-round predictor, while IVX60/HV30 showed the most year-to-year consistency.
Backtest validation (Section 10): We traded the signal over 5 years across 313 symbols and 8,818 trades using two strategies. Selling naked ATM options on high-IV/HV stocks breaks even at a 30-day hold (PF 1.00, 73% win rate) with extreme tail risk, while shorting the stock loses money consistently (PF 0.69). The -0.29% statistical edge is real but too small to trade profitably on individual names. Contrast this with the same type of signal applied to SPY (Part II), which produced +61% returns.

1. What We Studied

The Question

When a stock has one of the highest implied volatility readings in the S&P 500 on a given day, what happens to its price over the next 1 to 30 trading days?

The Indicators

We tested four different measures of "high implied volatility," each capturing something different:

Indicator	What It Measures	Range	Interpretation
IV/HV Ratio (IVX30/HV20)	How much the options market's expectations exceed recent actual stock movement	0 to ~6	A ratio of 2.0 means the options market expects twice the movement that the stock has actually shown
IV Rank (IVR30)	Where today's IV sits relative to its range over the past year	0 to 100	IVR = 90 means today's IV is near the top of its 12-month range
IV Percentile (IVP30)	What percentage of days in the past year had lower IV than today	0 to 100	IVP = 95 means only 5% of days in the past year had higher IV
IVX30 Raw	The absolute level of 30-day implied volatility	5 to 300+	Higher numbers mean the market expects bigger price moves

These indicators are related but not the same. A stock can have high absolute IV (IVX30) but a low IV Rank if it always trades with high IV. Conversely, a normally quiet stock might have a high IV Rank even when its absolute IV is modest. The IV/HV Ratio specifically measures the gap between what the market expects and what is actually happening.
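To make the distinction concrete, here is a minimal sketch of how the three relative measures could be computed from a daily IV history. IVolatility supplies these values pre-computed, so the production formulas may differ; the function names and the sample history are illustrative assumptions.

```python
def iv_hv_ratio(iv, hv):
    """Implied volatility divided by historical (realized) volatility."""
    return iv / hv


def iv_rank(history, current):
    """Position of `current` inside the min-max range of `history`, 0 to 100."""
    lo, hi = min(history), max(history)
    if hi == lo:
        return 0.0  # flat history: rank is undefined, report 0 by convention
    return 100.0 * (current - lo) / (hi - lo)


def iv_percentile(history, current):
    """Percentage of days in `history` with IV strictly below `current`."""
    below = sum(1 for v in history if v < current)
    return 100.0 * below / len(history)


# Hypothetical one-year IV history: mostly quiet, with a single spike to 80.
history = [20.0] * 200 + [25.0] * 50 + [80.0]
current = 30.0

ratio = iv_hv_ratio(30.0, 10.0)            # the 30% IV / 10% HV case
rank = iv_rank(history, current)           # low: the spike stretches the range
percentile = iv_percentile(history, current)  # high: almost every day was lower
print(ratio, round(rank, 1), round(percentile, 1))
```

In this made-up history a single spike keeps IV Rank low while IV Percentile is close to 100, which is one way the two measures can disagree about the same stock.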

The Data

Universe: All S&P 500 stocks (520 unique tickers over the period)
Period: February 24, 2021 to December 31, 2025 (1,220 trading days)
Source: IVolatility.com /equities/stock-market-data API endpoint (pre-computed daily indicator values)
Total data points: 606,343 stock-days (520 stocks × ~1,220 days)
Signal events: 6,100 per indicator (5 stocks × 1,220 days)

The Method

Every trading day, we:

Ranked all ~500 S&P 500 stocks by each indicator
Selected the top 5 stocks (those with the highest readings)
Measured their forward returns at 1, 3, 5, 7, 10, 20, and 30 trading days

We then compared these "high-IV" stock returns against a baseline: the average forward return of all S&P 500 stocks for the same period and year. The baseline resets each year to account for bull/bear market conditions — comparing a 2022 bear market signal against 2021 bull market averages would be misleading.

This creates a simple but powerful question: did stocks flagged by each indicator do better or worse than the average S&P 500 stock over the same period?
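The daily procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the study's code: the tickers, readings, and prices are invented, and prices are assumed to be stored in a list indexed by trading day.

```python
def top_n(readings, n=5):
    """Return the n tickers with the highest indicator reading for one day."""
    ranked = sorted(readings.items(), key=lambda kv: kv[1], reverse=True)
    return [ticker for ticker, _ in ranked[:n]]


def forward_return(closes, day_index, horizon):
    """Simple return from day_index to day_index + horizon trading days.

    `closes` holds one closing price per trading day, so weekends and
    holidays never count toward the horizon. Returns None when the horizon
    runs past the end of the available data.
    """
    end = day_index + horizon
    if end >= len(closes):
        return None
    return closes[end] / closes[day_index] - 1.0


# Hypothetical indicator readings for one day and a short price path.
readings = {"AAA": 2.9, "BBB": 1.1, "CCC": 3.4, "DDD": 0.8, "EEE": 2.2, "FFF": 1.7}
closes = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0]

picks = top_n(readings, n=5)
ret_3d = forward_return(closes, day_index=0, horizon=3)
print(picks, ret_3d)
```

Returning None for incomplete horizons keeps signals near the end of the sample from being scored against missing prices.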

2. Overall Results

The chart below shows the mean forward return at each time horizon for stocks flagged by each indicator, compared to the S&P 500 baseline:

Figure 1: Mean forward returns after a high-IV signal. The black dashed line is what an average S&P 500 stock earned. Lines above the baseline outperformed; lines below underperformed.

The most striking observation: IVX30 Raw (orange) dramatically outperforms, while IV/HV Ratio (red) steadily falls below baseline as the horizon extends. IV Rank and IV Percentile track close to baseline.

But raw performance can be misleading. What matters is the excess return — how much better or worse than the average stock:

Figure 2: Excess return vs S&P 500 baseline. Negative bars mean high-IV stocks underperformed the average stock.

Full Statistics

Indicator	Horizon	Mean Return	vs Baseline	Median	% Negative	Avg Loss	Avg Win
IV/HV Ratio	1d	+0.03%	-0.02%	+0.05%	47.8%	-1.50%	+1.46%
	5d	+0.20%	-0.06%	+0.29%	46.4%	-3.82%	+3.68%
	10d	+0.35%	-0.16%	+0.34%	46.8%	-5.34%	+5.36%
	20d	+0.72%	-0.29%	+0.62%	46.5%	-7.32%	+7.72%
	30d	+0.79%	-0.72%	+0.73%	47.0%	-8.46%	+9.00%
IV Rank	20d	+1.03%	+0.02%	+0.47%	48.4%	-9.58%	+11.00%
	30d	+1.95%	+0.44%	+0.64%	47.8%	-10.60%	+13.46%
IV Percentile	20d	+0.72%	-0.29%	+0.46%	48.4%	-9.28%	+10.11%
	30d	+1.41%	-0.09%	+0.60%	48.0%	-10.19%	+12.14%
IVX30 Raw	20d	+5.09%	+4.08%	+1.33%	47.4%	-15.98%	+24.08%
	30d	+8.00%	+6.50%	+2.67%	46.2%	-18.91%	+31.13%

Notice the IVX30 Raw indicator: mean +5.09% at 20 days but median only +1.33%. That's a massive gap, meaning a few extreme winners (stocks bouncing back from crashes) are pulling the average up. More on this in Section 5.

3. Year-by-Year Consistency

A signal that only works in certain years is unreliable. The heatmap below shows excess returns (vs that year's baseline) for each indicator, broken down by year and horizon:

Figure 3: Excess forward returns by year and horizon. Green cells = high-IV stocks beat the S&P 500 average. Red cells = they underperformed. Values are in percentage points.

What the heatmap reveals:

IV/HV Ratio (top-left): Consistently red (negative) across most cells, especially at 20d and 30d. This is the most directionally consistent indicator. The strongest underperformance was in 2025 (-3.3% at 30d) and 2023 (-1.7% at 30d). Only 2022 showed slight outperformance at longer horizons.

IV Rank (top-right): Mixed results. Strongly positive in 2022 (stocks flagged during the selloff bounced back) but slightly negative in 2021 and 2024. No consistent pattern.

IV Percentile (bottom-left): Similar to IV Rank but slightly more negative. Positive in 2022, negative in most other years.

IVX30 Raw (bottom-right): Dominated by two extreme years: 2023 (+22.4% at 30d!) and 2025 (+15.8% at 30d). These outlier years completely drive the overall positive result. In 2021 and 2022, IVX30 Raw actually showed negative returns.

Per-Year Baseline Returns

To understand the context, here are the baseline (average S&P 500 stock) 20-day forward returns per year:

Year	Baseline 20d	Market Context
2021	+1.05%	Strong bull market
2022	+0.00%	Bear market (Fed tightening)
2023	+1.80%	Recovery rally
2024	+0.82%	AI-driven bull market
2025	+1.40%	Tariff volatility + recovery

4. Statistical Significance

Are these results real or just random noise? We applied three statistical tests, each with different assumptions:

Welch's t-test: The standard test. Assumes returns are normally distributed (they aren't perfectly, but it's a reasonable approximation with 6,000+ observations).
Mann-Whitney U test (Wilcoxon rank-sum): Does not assume any distribution. Simply asks: are the signal returns systematically shifted compared to baseline returns?
Bootstrap confidence interval: Makes no distributional assumptions. Resamples the signal returns 10,000 times and computes the range of plausible excess returns.
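As a rough illustration of the third test, the sketch below builds a percentile bootstrap interval for the excess return. It assumes the baseline mean is treated as a fixed number and uses synthetic returns, so it shows the technique rather than reproducing the study's figures.

```python
import random
import statistics


def bootstrap_ci(signal_returns, baseline_mean, n_resamples=10_000, seed=0):
    """95% percentile bootstrap interval for mean(signal) - baseline_mean."""
    rng = random.Random(seed)
    n = len(signal_returns)
    diffs = sorted(
        # Resample the signal returns with replacement, same size as the original.
        statistics.fmean(rng.choices(signal_returns, k=n)) - baseline_mean
        for _ in range(n_resamples)
    )
    lower = diffs[n_resamples * 25 // 1000]        # 2.5th percentile
    upper = diffs[n_resamples * 975 // 1000 - 1]   # 97.5th percentile
    return lower, upper


# Synthetic 20-day returns in percent (illustrative only, not study data).
data_rng = random.Random(1)
signal = [data_rng.gauss(0.7, 10.0) for _ in range(200)]

low, high = bootstrap_ci(signal, baseline_mean=1.0)
significant = not (low <= 0.0 <= high)  # True only if the interval excludes zero
print(round(low, 2), round(high, 2), significant)
```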

20-Day Forward Return Results

Indicator	Excess Return	t-test p	Wilcoxon p	Bootstrap 95% CI	Verdict
IV/HV Ratio	-0.29%	0.034	0.044	[-0.55%, -0.02%]	STRONG
IV Rank	+0.02%	0.908	0.015	[-0.35%, +0.39%]	weak
IV Percentile	-0.29%	0.118	0.001	[-0.64%, +0.07%]	weak
IVX30 Raw	+4.08%	<0.001	<0.001	[+3.38%, +4.79%]	STRONG

"STRONG" = all three tests agree the effect is real (p < 0.05 for both parametric and non-parametric tests, and bootstrap CI excludes zero). "Weak" = only some tests significant.

Figure 4: Excess returns with 95% bootstrap confidence intervals. Error bars that cross the zero line indicate the result is not statistically significant at that horizon.

Across All Horizons (t-test significance)

Indicator	1d	3d	5d	7d	10d	20d	30d
IV/HV Ratio	-0.02	-0.02	-0.06	-0.07	-0.16	-0.29*	-0.72*
IV Rank	-0.02	-0.11	-0.10	-0.16	-0.07	+0.02	+0.44
IV Percentile	-0.08	-0.25	-0.26*	-0.27*	-0.28*	-0.29	-0.09
IVX30 Raw	+0.22	+0.63	+1.03	+1.37	+2.00	+4.08	+6.50**

Stars indicate statistical significance: * p<0.05, ** p<0.01, *** p<0.001. Values are excess returns in percentage points.

Key insight: The IV/HV Ratio effect builds over time — barely visible at 1 day but growing to -0.72% at 30 days. This is consistent with a gradual price adjustment story, not a sudden crash.

5. Why IVX30 Raw Shows Positive Returns (The Survivor Bias Problem)

The IVX30 Raw result (+4.08% excess at 20d) seems to contradict the "high IV = crash" hypothesis. But look at the distribution:

Figure 5: Distribution of 20-day forward returns. The IVX30 Raw panel (bottom-right) shows a much wider distribution with a long right tail, indicating extreme positive outliers.

The IVX30 Raw distribution has much fatter tails than the others. The median return is only +1.33% while the mean is +5.09% — classic skew from outliers.

The Crash Survivor Effect

Stocks that reach extremely high absolute IV levels (IVX30 > 100%) are typically in genuine distress — bankruptcy risk, regulatory investigations, massive earnings misses. Some of these stocks do crash permanently. But many of them — especially those that remain in the S&P 500 — eventually recover, and the bounce-back is enormous.

The top appearances in our dataset illustrate this perfectly:

Stock	Signal Appearances	Avg 20d Return	Story
CVNA (Carvana)	933	+7.65%	Near-bankruptcy in 2022-23, then recovered 3,000%+
COIN (Coinbase)	839	+4.16%	Crypto winter → crypto recovery
SMCI (Super Micro)	813	+7.23%	Accounting concerns → AI boom recovery
APP (AppLovin)	774	+5.71%	High-growth volatility → strong rally
HOOD (Robinhood)	524	-1.53%	IPO bust, one of the few persistent losers
DLTR (Dollar Tree)	335	-4.91%	Fundamental deterioration, genuine crash
DG (Dollar General)	257	-6.33%	Similar to DLTR — true high-IV crash case

Figure 6: Most frequent high-IV stocks and their average forward returns. Green = positive, Red = negative. The biggest winners (CVNA, BIIB) dramatically outweigh the losers.

CVNA alone appeared 933 times in our signal database. When it had IVX30 > 200% in early 2023, it was a $4 stock. Twenty trading days later it would regularly be up 30-40%. These extreme recovery events dominate the IVX30 Raw average.

This is survivorship bias at work. We're studying S&P 500 stocks — companies that were already large enough to be in the index. Companies that crash and get delisted leave the sample. Companies that crash and recover generate massive positive returns. The IVX30 Raw indicator is picking up the "crash and bounce" pattern, not a genuine predictive signal.

Excluding Extreme Events

To test this, we excluded several volatile periods:

Exclusion Window	IV/HV Ratio 20d	IVX30 Raw 20d	Change for IVX30
None (full sample)	+0.72%	+5.09%	—
Apr 2025 tariffs removed	+0.75%	+4.93%	-0.16%
Jan 2022 selloff removed	+0.76%	+5.20%	+0.11%
Oct 2022 bottom removed	+0.71%	+5.37%	+0.28%

The IV/HV Ratio result is remarkably stable across exclusions. IVX30 Raw is more sensitive — removing the October 2022 bottom (when crash survivors started bouncing) actually makes it look better, confirming that the positive result comes from broad mean-reversion rather than any single event.

6. Do the Indicators Agree?

An important question: when IV/HV Ratio says "this stock has extreme IV," does IV Rank say the same thing? If different indicators flag different stocks, they may be measuring different phenomena.

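Figure 7: Average daily overlap between top-5 lists. Each cell shows how many stocks (out of 5) appear in both indicators' top-5 on the same day. A value of 5.0 = perfect overlap. For reference, two independent random top-5 lists drawn from ~500 stocks would share about 0.05 names on average.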

Key findings:

IV Rank and IV Percentile overlap heavily (3.1 out of 5 stocks overlap daily). This makes sense — they measure similar things (where IV stands relative to its history).
IV/HV Ratio has minimal overlap with everything (1.1–1.3). It's capturing genuinely different stocks — those where implied vol exceeds realized vol, regardless of the absolute level.
IVX30 Raw is the most independent (1.1–1.2 overlap). Stocks with the highest absolute IV levels aren't necessarily the ones with the highest rank or ratio.
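The overlap statistic itself is simple to compute. A minimal sketch, assuming each indicator's selections are stored as a mapping from date to that day's ticker list (the dates and tickers below are invented):

```python
def average_overlap(top_lists_a, top_lists_b):
    """Mean number of tickers shared by two indicators' daily top-N lists.

    Each argument maps a date string to that day's list of selected tickers.
    Only dates present in both inputs are compared.
    """
    common_days = sorted(set(top_lists_a) & set(top_lists_b))
    if not common_days:
        raise ValueError("no overlapping dates")
    shared = [len(set(top_lists_a[d]) & set(top_lists_b[d])) for d in common_days]
    return sum(shared) / len(shared)


# Hypothetical two-day example.
rank_top5 = {
    "2024-01-02": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "2024-01-03": ["AAA", "BBB", "CCC", "DDD", "EEE"],
}
percentile_top5 = {
    "2024-01-02": ["AAA", "BBB", "CCC", "XXX", "YYY"],  # 3 names in common
    "2024-01-03": ["AAA", "VVV", "WWW", "XXX", "YYY"],  # 1 name in common
}

overlap = average_overlap(rank_top5, percentile_top5)
print(overlap)
```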

This explains why IV/HV Ratio gives different results from the others: it's selecting different stocks. A stock with 30% IV but only 10% recent historical vol (ratio = 3.0) will be flagged by IV/HV Ratio but not by the other indicators if its IV is average for its history. This is precisely the situation where the options market is "overpricing" the stock relative to what's actually happening — and our data suggests these stocks do underperform.

7. Ratio Term Structure: Does the IV/HV Time Window Matter?

Having established that the IV/HV Ratio is the most reliable predictor, we asked a natural follow-up question: does the time window of the ratio matter? Our standard ratio (IVX30/HV20) compares 30-calendar-day implied volatility against 20-trading-day historical volatility. But IVolatility also provides shorter-term IVX at 7, 14, and 21 calendar days, and we have HV10 (10 trading days) and HV30 (30 trading days) available.

We constructed five ratio variants:

Ratio	IV Window	HV Window	What it captures
IVX7/HV10	7 calendar (~5 trading)	10 trading days	Very short-term vol gap
IVX14/HV10	14 calendar (~10 trading)	10 trading days	Short-term vol gap
IVX21/HV10	21 calendar (~15 trading)	10 trading days	Medium-short vol gap
IVX30/HV20	30 calendar (~21 trading)	20 trading days	Standard (Part I study)
IVX60/HV30	60 calendar (~42 trading)	30 trading days	Longer-term vol gap

The hypothesis was intuitive: shorter ratios should predict shorter forward returns, and longer ratios should predict longer horizons. The data partially confirmed this — with a surprise.

Excess Return by Ratio and Horizon

Figure 8: Excess forward return for each ratio at each horizon. Darker green cells indicate stronger underperformance (our signal). Stars show statistical significance.

The heatmap reveals a clear pattern:

IVX14/HV10 is significant earliest, from 7d onward (p < 0.01). It is the best all-around predictor, with the largest average underperformance across all horizons (-0.26%).
IVX60/HV30 and IVX30/HV20 dominate at 20-30 days — the strongest single-horizon effects (-0.79% and -0.72% at 30d).
IVX7/HV10 is too noisy — the 7-day IV term captures too much weekly expiration noise.
IVX21/HV10 underperforms — oddly, it's weaker than both its neighbors (IVX14 and IVX30).

Composite Scoring: A Fair Comparison

Comparing ratios at cherry-picked horizons (e.g., "IVX14/HV10 wins at 7d") is misleading. To compare fairly, we designed a composite score with three components:

Average Excess Return (40% weight): Mean excess across all 7 horizons. A ratio that works at every horizon scores higher than one that only works at 30d.
Significance Breadth (30% weight): How many of the 7 horizons show statistical significance (p < 0.05). More significant horizons = more robust signal.
Year Consistency (30% weight): What percentage of the 5 years showed negative excess at 20d. A ratio that underperforms in 5/5 years is more reliable than one that only works in 3/5 years.

Figure 9: Composite scoring of all five ratio variants. Higher score = better predictor of underperformance. Red = excess return component, blue = significance breadth, green = year consistency.

Ratio	Avg Excess	Sig Horizons	Years Negative	Combined Score
IVX14/HV10	-0.259%	4/7	60%	0.800
IVX60/HV30	-0.251%	2/7	100%	0.774
IVX30/HV20	-0.191%	2/7	60%	0.378
IVX7/HV10	-0.137%	2/7	40%	0.100
IVX21/HV10	-0.157%	1/7	40%	0.064

IVX14/HV10 wins (0.800), narrowly beating IVX60/HV30 (0.774). The two winners have complementary strengths:

IVX14/HV10 excels in significance breadth (4/7 horizons significant) and average effect size.
IVX60/HV30 excels in year consistency (negative all 5 years) but is only significant at 2 horizons.

The standard IVX30/HV20 lands in the middle — decent but outperformed by both the shorter and longer alternatives.
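The report does not state how the three components are put on a common scale before weighting. One plausible reading, sketched below as an assumption, is min-max normalization of each component across the five ratios. With the table's rounded inputs this comes out within about 0.002 of the published combined scores, but that agreement does not confirm it is the method actually used.

```python
WEIGHTS = {"avg_underperf": 0.4, "sig_horizons": 0.3, "years_negative": 0.3}

# Inputs copied from the composite-score table in this section. Average excess
# is stored with its sign flipped (underperformance, in percent) so that a
# larger value always means a stronger signal.
RATIOS = {
    "IVX14/HV10": {"avg_underperf": 0.259, "sig_horizons": 4, "years_negative": 60},
    "IVX60/HV30": {"avg_underperf": 0.251, "sig_horizons": 2, "years_negative": 100},
    "IVX30/HV20": {"avg_underperf": 0.191, "sig_horizons": 2, "years_negative": 60},
    "IVX7/HV10": {"avg_underperf": 0.137, "sig_horizons": 2, "years_negative": 40},
    "IVX21/HV10": {"avg_underperf": 0.157, "sig_horizons": 1, "years_negative": 40},
}


def min_max(values):
    """Rescale a dict of numbers to the 0-1 range (assumed normalization)."""
    lo, hi = min(values.values()), max(values.values())
    return {k: ((v - lo) / (hi - lo) if hi > lo else 0.0) for k, v in values.items()}


def composite_scores(ratios, weights):
    """Weighted sum of min-max normalized components for each ratio."""
    scores = {name: 0.0 for name in ratios}
    for component, weight in weights.items():
        normalized = min_max({name: row[component] for name, row in ratios.items()})
        for name in ratios:
            scores[name] += weight * normalized[name]
    return scores


scores = composite_scores(RATIOS, WEIGHTS)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

One consequence of this kind of normalization is that scores are relative to the set being compared: adding or removing a ratio variant changes every other variant's score.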

Different Stocks, Different Signal

The overlap analysis shows that these ratios are not just trivially repackaging the same information:

Figure 10: Average daily top-5 overlap between ratio variants. Short-term ratios (IVX7-21/HV10) pick similar stocks to each other but very different stocks from IVX30/HV20 and IVX60/HV30.

IVX7/HV10 and IVX14/HV10 share 4.2 out of 5 stocks daily — nearly identical selections. But they share only 1.8 stocks with IVX30/HV20. The short-term ratios are selecting stocks where the implied-realized gap is concentrated in the very near term: the market expects imminent movement that hasn't materialized yet.

Year-to-Year Stability

The critical weakness of all these signals is year-to-year variability:

Figure 11: Per-year excess return at 20d for each ratio. Negative bars (below zero) mean the signal correctly predicted underperformance that year.

IVX60/HV30 is the most stable — negative in all 5 years, with a standard deviation of just 0.39% across years. IVX14/HV10 has larger effects when it works but was positive in 2021 and 2022 (std = 1.11%). For risk-averse applications, IVX60/HV30's consistency may matter more than IVX14/HV10's larger average effect.

Practical Recommendation

For practitioners who need a single ratio:

Short-term signal (7-10 day horizon): Use IVX14/HV10
Medium-term signal (20-30 day horizon): Use IVX60/HV30
General purpose: IVX14/HV10 has the best composite score, but IVX60/HV30 is more year-consistent

Combining both (e.g., requiring agreement between a short-term and long-term ratio) could further improve the signal, though this was not tested in the current study.

8. Practical Implications

For Options Sellers

The data supports a cautious interpretation: stocks with high IV/HV Ratio are statistically more likely to underperform the market over the next 20-30 trading days. The effect size is small (-0.29% at 20d with IVX30/HV20, improving to -0.41% with IVX14/HV10) but consistent. This is compatible with a short premium strategy — selling options on stocks where IV significantly exceeds realized vol — but the edge is modest. Using IVX14/HV10 for shorter-dated positions and IVX60/HV30 for longer-dated positions may improve timing.

For Stock Traders

The popular narrative "high IV = crash incoming" is too simplistic. Most stocks with high IV readings just continue their normal trajectory. The average win (+7.72% at 20d) is actually larger than the average loss (-7.32% at 20d) for IV/HV Ratio signals. The underperformance comes not from crashes but from slightly more losers than winners and slightly smaller wins — consistent with a market that slightly overestimates future moves.

What Doesn't Work

Using IV Rank alone does not predict forward returns. IVR = 100 is not a sell signal.
Using absolute IVX30 levels as a crash indicator is dangerous — the stocks with the highest absolute IV are often in distress situations where recovery (not further decline) is more common.
Short-term (1-5 day) prediction is nearly impossible with any of these indicators. The signal only becomes meaningful at 10+ day horizons.

Caveats

Small effect size. -0.29% excess at 20 days is real but small. After transaction costs, it may not be profitable as a standalone signal.
Year-to-year inconsistency. The IV/HV Ratio effect was negative in 4/5 years but the magnitude varied greatly.
Top-5 selection is arbitrary. We chose the top 5 stocks daily. Different cutoffs (top 10, top 20, or using a threshold like ratio > 2.0) might give different results.
S&P 500 only. Small-cap stocks with high IV might behave very differently.
No earnings control. Many high-IV stocks are approaching earnings dates. Some of the forward return movement may reflect earnings reactions rather than IV mean reversion.

9. Methodology Notes

Data Quality

IVP30 (IV Percentile) had 68% raw data coverage because the API returns rows for weekends and holidays where this field is null (other fields are forward-filled). After filtering to actual trading days, coverage was 97%+.
Duplicate tickers: Some tickers (e.g., DG — Dollar General) returned two rows per date in the API with different values, likely due to corporate actions creating multiple stock IDs. We kept the row with the higher IVX30 value (matching the website display).
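The duplicate-row rule can be expressed as a single pass over the data. The field names (`ticker`, `date`, `ivx30`) and the sample values are hypothetical and do not reflect the actual API response schema.

```python
def dedupe_keep_highest_ivx30(rows):
    """Keep one row per (ticker, date): the one with the highest IVX30."""
    best = {}
    for row in rows:
        key = (row["ticker"], row["date"])
        if key not in best or row["ivx30"] > best[key]["ivx30"]:
            best[key] = row
    return list(best.values())


# Hypothetical rows: DG appears twice on the same date with different values.
rows = [
    {"ticker": "DG", "date": "2024-03-01", "ivx30": 28.4},
    {"ticker": "DG", "date": "2024-03-01", "ivx30": 31.2},
    {"ticker": "DLTR", "date": "2024-03-01", "ivx30": 35.0},
]

clean = dedupe_keep_highest_ivx30(rows)
print(clean)
```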

Weekend/Holiday Handling

The raw API data includes entries for all calendar dates (1,802 days over 5 years). We identified 1,258 actual trading days by filtering to weekdays where at least 200 stocks had non-null IVP30 values. Forward returns were computed using a trading-day index (so "20-day forward" means 20 actual trading days, not calendar days).
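A sketch of the trading-day filter described above, assuming the IVP30 values are held in a mapping from calendar date to per-ticker readings. The sample data uses a threshold of 2 instead of 200 to stay small.

```python
from datetime import date


def trading_days(ivp30_by_date, min_stocks=200):
    """Dates that look like real trading days.

    `ivp30_by_date` maps a datetime.date to a dict of ticker -> IVP30 or None.
    A date qualifies if it is a weekday and at least `min_stocks` tickers
    have a non-null IVP30 value.
    """
    days = []
    for day, values in ivp30_by_date.items():
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            continue
        non_null = sum(1 for v in values.values() if v is not None)
        if non_null >= min_stocks:
            days.append(day)
    return sorted(days)


# Hypothetical miniature dataset.
sample = {
    date(2021, 2, 24): {"AAA": 55.0, "BBB": 71.0, "CCC": 12.0},  # Wednesday
    date(2021, 2, 27): {"AAA": 55.0, "BBB": 71.0, "CCC": 12.0},  # Saturday
    date(2021, 4, 2): {"AAA": None, "BBB": None, "CCC": 40.0},   # holiday, mostly null
}

days = trading_days(sample, min_stocks=2)
print(days)
```

Forward horizons can then be measured as offsets into this sorted list, which is what makes "20-day forward" mean 20 trading days.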

Statistical Tests

Welch's t-test: Tests whether the mean return of signal stocks differs from the mean return of all stocks. Assumes approximately normal distributions. With 6,000+ signal observations and 600,000+ baseline observations, the Central Limit Theorem provides reasonable normality of means even though individual returns are fat-tailed.
Mann-Whitney U (Wilcoxon rank-sum): Non-parametric test. Makes no distributional assumptions. Tests whether signal returns tend to be higher or lower than baseline returns by comparing ranks.
Bootstrap confidence interval: Resamples signal returns with replacement 10,000 times. Computes the 2.5th and 97.5th percentile of the mean difference. If the interval excludes zero, the effect is significant at 95% confidence.
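For readers who want to see the mechanics of the first test, here is a standard-library sketch of Welch's t statistic with the Welch-Satterthwaite degrees of freedom. The p-value helper uses a normal approximation, which is only reasonable for large samples like the ones in this study; an exact p-value needs the t distribution, for example via `scipy.stats.ttest_ind(..., equal_var=False)`. The sample numbers are invented.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance uses the n - 1 denominator (sample variance).
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


def two_sided_p_normal(t):
    """Large-sample approximation: treat t as a standard normal variate."""
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))


# Tiny illustrative samples (far too small for the normal approximation).
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t_stat, 3), round(dof, 2))
```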

Baseline Definition

The baseline is the average forward return of all S&P 500 stocks for each year separately. This is important because a stock gaining +2% in a month where the average stock gained +3% is actually underperforming. Annual baseline reset prevents multi-year market trends from contaminating the comparison.
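The annual reset amounts to grouping the whole universe's forward returns by year and subtracting the matching year's mean from each signal event. A minimal sketch with invented numbers, assuming returns are stored as (year, return) pairs:

```python
from collections import defaultdict
from statistics import fmean


def yearly_baseline(all_returns):
    """Mean forward return of every stock-day, grouped by calendar year.

    `all_returns` is an iterable of (year, forward_return) pairs covering
    the whole universe, not just the flagged stocks.
    """
    by_year = defaultdict(list)
    for year, ret in all_returns:
        by_year[year].append(ret)
    return {year: fmean(rets) for year, rets in by_year.items()}


def excess_returns(signal_returns, baseline):
    """Subtract the same-year baseline from each signal event's return."""
    return [ret - baseline[year] for year, ret in signal_returns]


# Hypothetical universe: average stock gained 3% in 2021 and 0% in 2022.
universe = [(2021, 0.05), (2021, 0.01), (2022, -0.02), (2022, 0.02)]
signals = [(2021, 0.02), (2022, 0.02)]

baseline = yearly_baseline(universe)
excess = excess_returns(signals, baseline)
print(baseline, excess)
```

The same +2% return counts as underperformance in the first year and outperformance in the second, which is the effect the annual reset is meant to capture.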

10. Backtest Validation: Trading the IV/HV Ratio Signal

The statistical study above established that stocks with the highest IV/HV ratios underperform the baseline by 0.29% at 20 days and 0.72% at 30 days. We now test whether this edge is tradeable by running a full 5-year backtest (Feb 2021 – Dec 2025) using realistic option and stock strategies with actual market data: real bid/ask spreads, open interest filters, and intraday stop-loss monitoring.

Methodology

The backtest was built using the IVolatility backtesting framework with the following pipeline:

Signal generation: Each trading day, rank all S&P 500 stocks by IVX30/HV20 ratio and select the top 5. This produces ~1,220 signal days × 5 names = ~6,100 signal events over 5 years.
Option selection: For each signal, find the best ATM option (call or put) with: DTE 20–60 days, open interest ≥ 1,000, bid-ask spread ≤ 30%, |delta| ≤ 0.55. Select the contract with the highest OI. Of 313 unique symbols flagged, 209 (67%) had options meeting these liquidity filters.
Execution: Enter at mid-price on signal day. Hold for 20 or 30 calendar days. Exit at mid-price on the exit date.
Stop-loss: Earnings-style — dormant during the holding period, activates only on the exit date. Baseline resets to the 9:30 AM bar on the exit day, monitored at 1-minute intervals.
Position sizing: 10% of $100K capital per trade. Option margin calculated at 20% of underlying (naked option margin).
Walk-forward design: Run in yearly batches (2021, 2022, 2023, 2024, 2025) to avoid look-ahead bias. Each batch uses only signals generated within that year's date range.
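The option selection step can be sketched as a filter followed by a maximum. This is not the IVolatility backtesting framework's API: the `OptionQuote` class and its fields are hypothetical, and the spread is assumed to be measured relative to the mid-price, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class OptionQuote:
    label: str
    dte: int              # calendar days to expiration
    open_interest: int
    bid: float
    ask: float
    delta: float


def passes_filters(q):
    """Apply the liquidity and moneyness filters from the pipeline above."""
    mid = (q.bid + q.ask) / 2.0
    if mid <= 0:
        return False
    spread_pct = (q.ask - q.bid) / mid  # assumption: spread relative to mid
    return (
        20 <= q.dte <= 60
        and q.open_interest >= 1_000
        and spread_pct <= 0.30
        and abs(q.delta) <= 0.55
    )


def select_contract(chain):
    """Among contracts that pass the filters, pick the highest open interest."""
    candidates = [q for q in chain if passes_filters(q)]
    if not candidates:
        return None
    return max(candidates, key=lambda q: q.open_interest)


# Hypothetical chain for one signal.
chain = [
    OptionQuote("30d call", 30, 5000, 2.00, 2.20, 0.52),   # passes
    OptionQuote("30d put", 30, 8000, 1.00, 1.60, -0.48),   # spread too wide
    OptionQuote("10d call", 10, 9000, 1.00, 1.05, 0.50),   # DTE too short
    OptionQuote("45d put", 45, 3000, 3.00, 3.20, -0.50),   # passes, lower OI
]

chosen = select_contract(chain)
print(chosen.label if chosen else "no tradable contract")
```

Returning None when nothing qualifies mirrors the situation described above, where only some flagged symbols had options that met the liquidity filters.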

Two strategies were tested independently:

Strategy	Position	Thesis	Entry	Exit
SHORT_OPTION	Sell naked ATM option	Capture IV crush — options are overpriced relative to realized vol	IV/HV ratio in daily top 5	Hold 20 or 30 days + SL on exit day
SHORT_STOCK	Short stock shares	Direct underperformance play — stock price should decline	IV/HV ratio in daily top 5	Hold 20 or 30 days + SL on exit day

Optimization Grid

Six parameter combinations were tested per symbol:

Combo	Hold Days	Stop-Loss	Description
cb1	20	None	Baseline, 20-day hold
cb2	20	PL 10%	Tight SL on exit day
cb3	20	PL 40%	Loose SL on exit day
cb4	30	None	Baseline, 30-day hold
cb5	30	PL 10%	Tight SL on exit day
cb6	30	PL 40%	Loose SL on exit day

Short Option Strategy Results

Combo	Trades	Win Rate	Total P&L	Avg Win	Avg Loss	Profit Factor
20d Baseline	547	67.1%	-$129,532	+69.2%	-213.5%	0.70
20d PL10	374	64.7%	-$97,440	+68.2%	-219.4%	0.68
20d PL40	374	64.7%	-$97,440	+68.2%	-219.4%	0.68
30d Baseline	405	73.1%	-$140	+77.2%	-233.8%	1.00
30d PL10	302	71.2%	+$20,568	+76.5%	-229.1%	1.10
30d PL40	302	71.2%	+$20,568	+76.5%	-229.1%	1.10

The 30-day hold period decisively outperforms the 20-day, consistent with the statistical study's finding that the IV/HV effect strengthens from 20d (-0.29%) to 30d (-0.72%). At 30 days with a tight stop-loss, the strategy generates a modest profit (+$20,568 on $100K capital over 5 years, PF 1.10).

The fundamental challenge is asymmetric risk: the average winning option trade earns +77% of premium collected, but the average loser costs -234% of premium. A single catastrophic move (e.g., APP +1,740%, TSLA +1,266%) can wipe out dozens of winners. The median trade is profitable (+74% for 30d), but the mean is dragged negative by tail losses.
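The asymmetry can be made concrete with a back-of-the-envelope calculation. The sketch below takes the 30d Baseline row's average win, average loss, and win rate and asks what win rate would be needed to break even. It works in percent of premium per trade and ignores position size, so it does not reproduce the dollar P&L or profit factor in the table.

```python
def breakeven_win_rate(avg_win, avg_loss):
    """Win rate at which expected P&L per trade is zero.

    Both inputs are positive magnitudes in the same units.
    Solves p * avg_win = (1 - p) * avg_loss for p.
    """
    return avg_loss / (avg_win + avg_loss)


def expectancy(win_rate, avg_win, avg_loss):
    """Expected P&L per trade, in the units of avg_win and avg_loss."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# 30d Baseline row: avg win +77.2%, avg loss -233.8% of premium, 73.1% win rate.
p_star = breakeven_win_rate(77.2, 233.8)
edge = expectancy(0.731, 77.2, 233.8)
print(f"break-even win rate: {p_star:.1%}, expectancy: {edge:+.1f}% of premium")
```

With losses averaging about three times the size of wins, the required win rate is roughly 75%, so a 73% win rate leaves little or no margin.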

Short Option Results by Year

Year	30d Baseline (cb4)	30d + SL (cb5)
2021	81 trades, 72.8% WR, -$11,642	65 trades, 70.8% WR, -$10,846
2022	90 trades, 78.9% WR, +$25,051	63 trades, 76.2% WR, +$14,320
2023	81 trades, 74.1% WR, +$26,029	64 trades, 75.0% WR, +$30,421
2024	71 trades, 64.8% WR, -$35,224	52 trades, 61.5% WR, -$27,046
2025	82 trades, 73.2% WR, -$4,354	58 trades, 70.7% WR, +$13,719

The strategy was profitable in 2022 (bear market — elevated IV provided rich premium) and 2023 (recovery — IV crush as volatility normalized). It lost in 2024 due to several large tail-risk events in AI and tech stocks.

Short Stock Strategy Results

Combo	Trades	Win Rate	Total P&L	Avg Win	Avg Loss	Profit Factor
20d Baseline	1,122	43.3%	-$139,540	+6.5%	-7.1%	0.69
30d Baseline	1,034	46.0%	-$154,310	+7.3%	-9.0%	0.69
30d PL10	1,034	46.0%	-$153,760	+7.3%	-9.0%	0.69

The short stock strategy consistently loses money. The win rate never exceeds 52.4% (2022 bear market), and even then the P&L barely turned positive (+$5,901). Stop-loss is essentially irrelevant — PL10 and PL40 produce nearly identical results because the exit-day-only SL rarely triggers on intraday moves from the 9:30 bar.

Year	20d Stock	30d Stock	Market
2021	-$60,351 (37.3% WR)	-$61,767 (39.7% WR)	Bull +27%
2022	+$5,901 (52.4% WR)	-$4,138 (52.3% WR)	Bear -19%
2023	-$49,463 (37.5% WR)	-$47,681 (45.2% WR)	Recovery +24%
2024	-$31,134 (39.4% WR)	-$35,318 (48.4% WR)	Bull +23%
2025	-$4,493 (46.3% WR)	-$5,407 (44.2% WR)	Volatile +5%

The short stock results confirm the statistical study's caveat: the -0.29% underperformance at 20 days is real but too small to trade profitably as a directional bet. The effect size does not overcome transaction costs, bid-ask slippage, and the inherent difficulty of shorting into a structurally bullish market (S&P 500 rose ~74% over the 5-year period).

Key Findings

1. The IV/HV signal has a real but modest options edge. Selling naked ATM options on high-IV/HV stocks with a 30-day hold breaks even (PF 1.00) without stop-loss and earns a slight profit (PF 1.10) with stop-loss. The signal is not strong enough for a standalone strategy but could serve as a filter within a broader options-selling framework.

2. Naked short options have extreme tail risk. The worst single trade (APP, Oct 2024) lost $38,640, equivalent to 39 average winning trades. This is the core problem: the strategy wins often (73%), but the average loss is roughly 3× the average win and the rare catastrophic loss is far larger still. Position sizing and portfolio diversification are essential.

3. Stop-loss has minimal impact with exit-day-only activation. Because the SL only activates on the final day (earnings-style), it catches very few losses. Intraday moves from the 9:30 bar on day 30 rarely exceed 10-40% of option premium. A more aggressive SL (active throughout the hold) would cut more losses but also kill winning trades that dip temporarily.

4. The signal is overwhelmingly non-earnings. Only 9.2% of signals fire near earnings dates (1.5% on the earnings day, 7.4% in the days before). The remaining 82.5% capture structural IV/HV mispricing on normal trading days. This is important: the edge is not simply "sell premium before earnings."

5. Call options dominate due to liquidity. 61% of trades are calls vs 39% puts, driven by the OI ≥ 1,000 filter. S&P 500 stocks tend to have higher call OI, especially at ATM strikes. This creates a natural short-call bias, which explains some of the losses in strong bull years (2021, 2024) when short calls face unlimited upside risk.

6. Short stock is not viable. Despite the statistically significant -0.29% underperformance, the directional strategy loses money every year except the 2022 bear market. The effect size is too small relative to the structural upward drift of equities.

Comparison with SPY Backtest (Part II)

The SPY Z-score backtest (Section 10 of Part II) found the opposite result — a highly profitable straddle strategy:

Metric	SPY Straddle (Z-score)	Individual Stock Short Option (IV/HV)
Signal	IVX30 Z-score spike on SPY	IV/HV ratio top-5 individual stocks
Trades	26 (5 years)	405 (5 years, 30d baseline)
Win Rate	76.9%	73.1%
Profit Factor	8.43	1.00
Total Return	+61.24%	-$140 (~0%)
Max Drawdown	-2.27%	N/A
Avg Hold	15.5 days	36 days

The SPY strategy outperforms dramatically for three reasons: (1) SPY is a diversified index — individual stock tail risk is eliminated; (2) the Z-score signal on SPY fires only ~26 times in 5 years (high conviction), while IV/HV ratio on individual stocks fires daily (low conviction per trade); (3) SPY straddle captures both calls and puts symmetrically, while individual stocks face asymmetric OI-driven selection.

This comparison illustrates an important principle: a statistically significant signal does not automatically translate into a profitable strategy. The -0.29% cross-sectional underperformance effect is real and replicable, but the practical challenges of trading it on individual names — tail risk, liquidity filters, call/put asymmetry, and the small effect size — erode the theoretical edge. The same type of signal applied to SPY, where these frictions are minimal, produces a dramatically different outcome.

11. Data and Reproducibility

This study uses data from the IVolatility.com API. Two endpoints were used:

/equities/stock-market-data with stockGroup=SP500_STOCKS — provides pre-computed daily values for IVX30, IVX60, HV10, HV20, HV30, IVX30/HV20 ratio, IVR30, and IVP30 for all S&P 500 constituents.
/equities/eod/ivx — provides the Implied Volatility Index at shorter terms (7, 14, 21 calendar days) per individual symbol. This endpoint was called for each of the 520 S&P 500 tickers to obtain IVX7, IVX14, and IVX21 for the ratio term structure analysis (Section 7).

Study parameters:

Date range: February 24, 2021 – February 23, 2026 (data fetch), signals through December 31, 2025
Selection: Top 5 stocks daily per indicator/ratio
Forward horizons: 1, 3, 5, 7, 10, 20, 30 trading days
Part I (indicator comparison): 6,100 events per indicator (24,400 total across 4 indicators)
Part II (ratio term structure): 6,100 events per ratio (30,500 total across 5 ratios)
Statistical tests: Welch's t-test, Mann-Whitney U, Bootstrap (10,000 resamples)
Composite scoring: 40% average excess return, 30% significance breadth, 30% year consistency

The complete analysis code, DuckDB databases, and chart generation scripts are available for reproducibility.

Disclaimer: This study is for informational and educational purposes only and does not constitute investment advice, a recommendation, or a solicitation to buy, sell, or hold any security, option, or financial instrument. Past performance and statistical findings do not guarantee future results. Options trading involves significant risk, including the potential for losses exceeding the initial investment. Always conduct your own due diligence and consult a qualified financial advisor before making any investment decisions.

Research and backtesting by IVolAI. Analysis conducted using IVolatility.com data and API. All implied volatility indicators (IVX, IVR, IVP, HV) are pre-computed by IVolatility. The IVX is calculated using a proprietary weighting technique factoring Delta and Vega of each option, using 8 ATM options (4 calls, 4 puts) per expiration, normalized to fixed tenors of 7, 14, 21, 30, 60, 90, 120, 150, 180, 270, 360, 720, and 1080 days.

IVolAI is currently in the final stage of testing. Please stay tuned for updates on when it becomes available.