
Stationarity & Differencing

Most statistical time series models (ARIMA, SARIMA) assume that the series being modelled is stationary, or can be made stationary by differencing. Stationary means that the mean, variance, and autocorrelation structure do not change over time.
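
In symbols, the usual weak (second-order) form of this definition can be written as below. The symbols \(\mu\), \(\sigma^2\), and \(\gamma(h)\) are notation introduced here for illustration, and \(y_t\) denotes the observation at time \(t\):

\[
\mathbb{E}[y_t] = \mu, \qquad
\operatorname{Var}(y_t) = \sigma^2, \qquad
\operatorname{Cov}(y_t, y_{t+h}) = \gamma(h) \quad \text{for all } t \text{ and each lag } h.
\]

The point is that none of the three quantities depends on \(t\); the covariance depends only on the lag \(h\).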

Why Stationarity Matters

Non-stationary data has trends, changing variance, or seasonal shifts that violate the assumptions of ARIMA-type models. Fitting these models on non-stationary data produces unreliable forecasts.
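
To make "changing variance" concrete, here is a minimal simulation sketch. It assumes standard-normal increments and an arbitrary choice of 2000 simulated random walks of 200 steps each; the variable names are illustrative only. It compares the variance across walks at two time points with the variance of the increments themselves.

```python
import numpy as np

# Assumed setup: 2000 independent random walks of 200 steps each,
# built from standard-normal increments.
rng = np.random.default_rng(0)
steps = rng.normal(0, 1, size=(2000, 200))
walks = steps.cumsum(axis=1)

# Variance across the simulated walks at each time step. For a random
# walk with unit-variance increments this grows roughly in proportion to t.
var_by_time = walks.var(axis=0)
var_at_50 = var_by_time[49]     # after 50 steps
var_at_200 = var_by_time[199]   # after 200 steps

# The increments (the first differences of the walks) keep a roughly
# constant variance at every time step.
step_var_by_time = steps.var(axis=0)

print(f"Walk variance at t=50:  {var_at_50:.1f}")
print(f"Walk variance at t=200: {var_at_200:.1f}")
print(f"Increment variance range: "
      f"{step_var_by_time.min():.2f} to {step_var_by_time.max():.2f}")
```

The walk's variance depends on time, which is exactly what a stationary model cannot represent, while the increments behave like a stationary series.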

Testing for Stationarity

Augmented Dickey-Fuller (ADF) Test

  • Null hypothesis: The series has a unit root (non-stationary).
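  • If p-value < 0.05: Reject the null → the data give evidence against a unit root, so the series can be treated as stationary.
  • If p-value ≥ 0.05: Fail to reject → a unit root cannot be ruled out, so treat the series as non-stationary and try differencing. Failing to reject does not prove non-stationarity.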
import pandas as pd
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Non-stationary: random walk (cumulative sum of zero-mean noise, so no drift)
rng = np.random.default_rng(42)
ts = pd.Series(rng.normal(0, 1, 200).cumsum())

result = adfuller(ts)
print(f"ADF Statistic: {result[0]:.4f}")
print(f"p-value: {result[1]:.4f}")
# A random walk typically gives p-value > 0.05 → cannot reject the unit root

Differencing

Differencing replaces each observation with its change from the previous one, \(y'_t = y_t - y_{t-1}\), which removes trends:

# First difference — removes linear trend
ts_diff1 = ts.diff().dropna()

# Second difference — removes quadratic trend (rarely needed)
ts_diff2 = ts_diff1.diff().dropna()

# Test again
result2 = adfuller(ts_diff1)
print(f"After differencing — p-value: {result2[1]:.4f}")
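
The claim that one difference removes a linear trend and two remove a quadratic trend can be checked on noise-free data. This sketch uses two made-up deterministic series (the coefficients are arbitrary) so that the result is exact rather than statistical.

```python
import numpy as np
import pandas as pd

t = np.arange(10)

# Made-up deterministic series for illustration only.
linear = pd.Series(3.0 + 2.0 * t)         # slope 2
quadratic = pd.Series(1.0 + 0.5 * t**2)   # curvature 0.5

# One difference turns a linear trend into a constant (the slope).
linear_d1 = linear.diff().dropna()

# One difference of a quadratic is still trending (it is linear in t) ...
quad_d1 = quadratic.diff().dropna()
# ... and a second difference turns it into a constant.
quad_d2 = quad_d1.diff().dropna()

print("Linear, 1st difference:   ", linear_d1.to_numpy())
print("Quadratic, 1st difference:", quad_d1.to_numpy())
print("Quadratic, 2nd difference:", quad_d2.to_numpy())
```

Each `diff()` call leaves a `NaN` in the first position, which is why `dropna()` follows it and why each round of differencing shortens the series by one observation.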

Visual Check

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(12, 4))
ts.plot(ax=axes[0], title="Original (Non-Stationary)")
ts_diff1.plot(ax=axes[1], title="First Difference (Stationary)")
plt.tight_layout()
plt.show()

Common Pitfall

Do not over-difference. If one round of differencing makes the series stationary (\(d = 1\)), stop. Over-differencing introduces artificial patterns, typically strong negative autocorrelation at lag 1, and inflates the variance, both of which degrade model performance.
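
A small sketch of what over-differencing does: the series below is simulated white noise, which is already stationary, so differencing it is unnecessary. The seed and sample size are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# White noise is already stationary, so d = 0 would be the right choice.
rng = np.random.default_rng(7)
noise = pd.Series(rng.normal(0, 1, 5000))

# Differencing it anyway (over-differencing).
over = noise.diff().dropna()

# Lag-1 autocorrelation before and after the unnecessary difference.
acf1_noise = noise.autocorr(lag=1)
acf1_over = over.autocorr(lag=1)

# Ratio of the variances after and before differencing.
var_ratio = over.var() / noise.var()

print(f"Lag-1 autocorrelation, original:     {acf1_noise:.3f}")
print(f"Lag-1 autocorrelation, differenced:  {acf1_over:.3f}")
print(f"Variance ratio (differenced / orig): {var_ratio:.2f}")
```

For independent noise, the differenced series has a theoretical lag-1 autocorrelation of -0.5 and twice the variance of the original, so the unnecessary difference creates structure that the model then has to explain.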

KSB Mapping

KSB | Description | How This Addresses It
K4.1 | Statistical models and methods | ARIMA, SARIMA, and exponential smoothing foundations
K4.2 | Predictive analytics and ML techniques | Time series forecasting and model comparison
K5.3 | Common patterns in real-world data | Identifying trends, seasonality, and stationarity
S1 | Scientific methods and hypothesis testing | Stationarity testing, model diagnostics, forecast validation
S4 | Analysis and models to inform outcomes | Building forecasts to support business planning
B5 | Impartial, hypothesis-driven approach | Honest evaluation of forecast accuracy and limitations