How to Evaluate Forecast Accuracy

Forecast accuracy is measured with error metrics such as MAE, RMSE, and MAPE, computed on a chronologically held-out test set.

Golden Rule

Never shuffle time series data. The train/test split must respect temporal order: train on the past, test on the future.

Key Metrics

Metric	Formula	Interpretation
MAE	Mean of |actual − predicted|	Average absolute error in the original units
RMSE	√(Mean of (actual − predicted)²)	Penalises large errors more heavily than MAE; also in the original units
MAPE	Mean of (|actual − predicted| / |actual|) × 100	Percentage error; scale-free, so useful for comparing across series, but undefined when an actual value is 0
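
To make the formulas concrete, the sketch below works through all three metrics by hand on three data points. The values are illustrative numbers chosen for easy arithmetic, not data from this guide.

```python
import numpy as np

# Illustrative values only (not real data).
actual = np.array([100.0, 200.0, 300.0])
predicted = np.array([110.0, 190.0, 330.0])

errors = actual - predicted            # [-10, 10, -30]

mae = np.mean(np.abs(errors))          # (10 + 10 + 30) / 3
rmse = np.sqrt(np.mean(errors ** 2))   # sqrt((100 + 100 + 900) / 3)
mape = np.mean(np.abs(errors) / np.abs(actual)) * 100  # mean of 10%, 5%, 10%

print(f"MAE:  {mae:.2f}")    # 16.67
print(f"RMSE: {rmse:.2f}")   # 19.15
print(f"MAPE: {mape:.2f}%")  # 8.33%
```

RMSE comes out higher than MAE here because the single error of 30 is squared before averaging, so it dominates the result.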

Implementation

import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Simulate actual vs predicted
rng = np.random.default_rng(42)
dates = pd.date_range("2024-01-01", periods=30, freq="D")
y_true = 100 + rng.normal(0, 5, 30).cumsum()
y_pred = y_true + rng.normal(0, 3, 30)  # Predictions with noise

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100

print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"MAPE: {mape:.2f}%")
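
One thing worth checking before relying on MAPE: it divides by the actual value, so it cannot be computed when an actual is zero and it grows very large when actuals are close to zero. The sketch below illustrates this with made-up numbers in which both forecasts miss by exactly one unit.

```python
import numpy as np

# Made-up values: both forecasts are off by 1 unit,
# but the first actual is close to zero.
y_true = np.array([0.1, 100.0])
y_pred = np.array([1.1, 101.0])

abs_err = np.abs(y_true - y_pred)           # about [1.0, 1.0]
mae = abs_err.mean()                        # about 1.0
pct_err = abs_err / np.abs(y_true) * 100    # about [1000.0, 1.0]
mape = pct_err.mean()                       # about 500.5

print(f"MAE:  {mae:.2f}")
print(f"MAPE: {mape:.2f}%")
```

MAE treats the two misses identically, while MAPE is dominated by the near-zero actual. For series that touch or cross zero, MAE or RMSE is usually the safer summary.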

Time Series Train/Test Split

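# Chronological split: no shuffling!
# `ts` is assumed to be a pandas Series (or array) already sorted by time.
split = int(len(ts) * 0.8)  # first 80% of observations go to training
train = ts[:split]
test = ts[split:]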
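
As a rough end-to-end sketch, the split can be combined with the metrics so that errors are measured only on the held-out future portion. The synthetic series and the "repeat the last training value" forecaster below are placeholders assumed for illustration; in practice the forecast would come from a real model.

```python
import numpy as np

# Synthetic series standing in for `ts` (assumed already sorted by time).
rng = np.random.default_rng(0)
ts = 100 + rng.normal(0, 5, 30).cumsum()

# Chronological 80/20 split: the past trains, the future tests.
split = int(len(ts) * 0.8)
train, test = ts[:split], ts[split:]

# Placeholder model: repeat the last training value for every test step.
forecast = np.full(len(test), train[-1])

# Metrics are computed on the test period only.
mae = np.mean(np.abs(test - forecast))
rmse = np.sqrt(np.mean((test - forecast) ** 2))
mape = np.mean(np.abs((test - forecast) / test)) * 100

print(f"train size: {len(train)}, test size: {len(test)}")
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"MAPE: {mape:.2f}%")
```

The forecaster only ever sees `train`, so the reported errors reflect how it would have performed on data it had not yet observed.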

Common Pitfall

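Calling train_test_split with its default shuffle=True on time series data causes data leakage: the training set ends up containing observations that come after some of the test observations, so the model is scored on the past using knowledge of the future. The resulting scores are artificially inflated. Pass shuffle=False or slice by position instead.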
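
The leakage can be seen by looking at time indices alone. The sketch below uses a NumPy permutation as a stand-in for a shuffled split (an assumption made to keep the example dependency-light) and counts how many training observations fall after the earliest test observation.

```python
import numpy as np

n = 30
t = np.arange(n)  # time index: 0 is the oldest observation

# Chronological split: every training index precedes every test index.
split = int(n * 0.8)
train_idx, test_idx = t[:split], t[split:]
print(train_idx.max() < test_idx.min())  # True

# Shuffled split (NumPy stand-in for shuffle=True).
rng = np.random.default_rng(42)
perm = rng.permutation(n)
shuf_train, shuf_test = perm[:split], perm[split:]

# Training observations that occur after the earliest test observation.
n_future = int((shuf_train > shuf_test.min()).sum())
print(f"training points from the test period's future: {n_future}")
```

Any non-zero count means the model was fitted on data that would not have been available at the time of the forecasts being evaluated.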

KSB Mapping

KSB	Description	How This Addresses It
K4.1	Statistical models and methods	ARIMA, SARIMA, and exponential smoothing foundations
K4.2	Predictive analytics and ML techniques	Time series forecasting and model comparison
K5.3	Common patterns in real-world data	Identifying trends, seasonality, and stationarity
S1	Scientific methods and hypothesis testing	Stationarity testing, model diagnostics, forecast validation
S4	Analysis and models to inform outcomes	Building forecasts to support business planning
B5	Impartial, hypothesis-driven approach	Honest evaluation of forecast accuracy and limitations