SHAP Values

SHAP (SHapley Additive exPlanations) is a game-theoretic method for explaining model predictions, both locally (one prediction at a time) and globally (across a whole dataset).

The Concept

SHAP assigns each feature a Shapley value: the weighted average of that feature's marginal contribution over all possible subsets (coalitions) of the other features. It answers: "How much did each feature contribute to pushing this prediction away from the baseline?" Here the baseline is the model's average output over a background dataset.
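To make the definition concrete, here is a minimal sketch that computes exact Shapley values by enumerating every coalition for a toy three-feature model. The function `toy_model`, the input `x`, and the `baseline` are hypothetical, and "absent" features are simply replaced by their baseline value, which is a simplifying assumption rather than a description of what the shap library does internally.

```python
from itertools import combinations
from math import factorial

def toy_model(features):
    """Hypothetical model: two linear terms plus one interaction."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + 0.5 * a * c

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    n = len(x)

    def value(coalition):
        # Features in the coalition keep their real value; the rest use the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition S of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print("Shapley values:", phi)
print("Sum of values:", sum(phi))
print("Prediction minus baseline output:", toy_model(x) - toy_model(baseline))
```

The contributions add up to the gap between the prediction and the baseline output, and the interaction term is split between the two features involved in it. The nested loops also show why this direct approach does not scale: the number of coalitions doubles with every extra feature.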

Installation

pip install shap

Implementation

import shap
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10,
                            n_informative=5, random_state=42)
feature_names = [f"feature_{i}" for i in range(10)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_tr, y_tr)

# Create SHAP explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)

Key Visualisations

Summary Plot (Global)

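Shows feature importance and the direction of each feature's effect. For a binary classifier, plot the SHAP values of one class (here class 1). Older SHAP releases return a list with one array per class, while newer ones return a single array of shape (n_samples, n_features, n_classes), so select the class in a way that works for both:

sv_pos = shap_values[1] if isinstance(shap_values, list) else shap_values[..., 1]  # class 1
shap.summary_plot(sv_pos, X_te, feature_names=feature_names)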

Waterfall Plot (Local — Single Prediction)

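Explains one prediction step by step, starting from the base value and adding each feature's contribution. For a classifier the explanation holds one output per class, so select a single sample and a single class before plotting:

shap.plots.waterfall(explainer(X_te)[0, :, 1])  # sample 0, all features, class 1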
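As a rough illustration of what a waterfall plot displays, the sketch below walks from a base value to the final model output by adding one feature contribution at a time. The base value and the contributions are made-up numbers for illustration, not output from the model above, and the helper `waterfall_steps` is hypothetical.

```python
def waterfall_steps(base_value, contributions):
    """Return (feature, contribution, running_total) rows, largest effects first."""
    ordered = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    running = base_value
    rows = []
    for name, phi in ordered:
        running += phi  # each step moves the running total by one contribution
        rows.append((name, phi, running))
    return rows

# Hypothetical values for a single prediction (illustrative only).
base_value = 0.50
contributions = {"feature_0": 0.20, "feature_3": -0.10, "feature_7": 0.05}

print(f"{'base value':<12}{'':>8}{base_value:>8.2f}")
for name, phi, running in waterfall_steps(base_value, contributions):
    print(f"{name:<12}{phi:>+8.2f}{running:>8.2f}")
```

The last running total is the model output for that sample. Features are sorted by magnitude here for readability; the library's plot may arrange its bars differently.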

Bar Plot (Global Importance)

Simple bar chart of mean absolute SHAP values:

shap.plots.bar(explainer(X_te)[:, :, 1])  # all samples, all features, class 1
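The quantity behind this chart can be sketched in a few lines: take the absolute SHAP value of every feature in every sample, then average per feature. The matrix below is hypothetical (3 samples by 3 features) and only illustrates the arithmetic; the helper `mean_abs_importance` is not part of the shap library.

```python
def mean_abs_importance(shap_matrix, feature_names):
    """Rank features by mean |SHAP value| across rows (one row per sample)."""
    n_rows = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical SHAP values: 3 samples x 3 features (illustrative only).
shap_matrix = [
    [0.30, -0.10, 0.00],
    [-0.20, 0.05, 0.10],
    [0.40, -0.15, -0.05],
]
names = ["feature_0", "feature_1", "feature_2"]

ranking = mean_abs_importance(shap_matrix, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging matters: positive and negative contributions would otherwise cancel, and a feature that pushes predictions strongly in both directions would look unimportant.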

SHAP vs LIME

| Aspect | SHAP | LIME |
| --- | --- | --- |
| Theory | Game-theoretic (Shapley values; exact for tree models) | Local linear approximation |
| Consistency | Guaranteed consistent | No consistency guarantees |
| Speed | Slower (especially KernelSHAP) | Faster |
| Global view | Yes (summary plot) | No (local only) |
| Best for | Thorough analysis and reports | Quick local explanations |

Tree-Specific Speedup

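For tree-based models, TreeExplainer exploits the tree structure to compute exact SHAP values in polynomial time. This is much faster than the model-agnostic KernelExplainer, which treats the model as a black box and approximates the values by evaluating it on many sampled feature coalitions.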
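To see why black-box estimation is costly, the sketch below approximates Shapley values by averaging marginal contributions over random feature orderings and counts the model calls it needs. This is plain permutation sampling, a different estimator from the weighted regression used by KernelSHAP, and it is shown only to illustrate the cost; `toy_model`, `x`, and `baseline` are hypothetical.

```python
import random

def sampled_shapley(model, x, baseline, n_permutations=200, seed=0):
    """Monte Carlo Shapley estimate from random feature orderings."""
    rng = random.Random(seed)
    n = len(x)
    phi = [0.0] * n
    calls = 0
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)
        prev = model(current)
        calls += 1
        for i in order:
            current[i] = x[i]      # switch feature i from baseline to its real value
            new = model(current)
            calls += 1
            phi[i] += new - prev   # marginal contribution of i in this ordering
            prev = new
    return [p / n_permutations for p in phi], calls

def toy_model(features):
    """Hypothetical black-box model with one interaction term."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + 0.5 * a * c

x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
phi, calls = sampled_shapley(toy_model, x, baseline)
print("Estimated Shapley values:", phi)
print("Model calls used:", calls)
```

Each ordering costs one model call per feature plus one for the baseline, and many orderings are needed for a stable estimate, all for a single explained sample. A tree-specific algorithm avoids this repeated querying by reading the contributions off the tree structure directly.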

Workplace Tip

SHAP is one of the most widely used methods for model explainability. Consider including a SHAP summary plot in your ML reports: in a single chart it shows stakeholders which features matter most and in which direction they push predictions.

KSB Mapping

| KSB | Description | How This Addresses It |
| --- | --- | --- |
| S5 | Deployment, value assessment, and ROI | Translating model performance into business impact |
| S6 | Communicate through storytelling and visualisation | Presenting ML results to non-technical stakeholders |
| B4 | Consideration of organisational goals | Framing technical results in terms of business objectives |
| B1 | Inquisitive approach | Exploring creative ways to explain model behaviour |