When Clustering Fails¶

Clustering algorithms will almost always return something. Knowing when those results are meaningless is a critical data science skill.

Common Failure Modes¶

Forcing Structure where None Exists: Data might just be one big blob naturally. k-Means will still cut it into pieces.
The Curse of Dimensionality: In very high-dimensional space, all points look equidistant from each other. Distance metrics break down.
Varying Densities: DBSCAN struggles if cluster A is very dense but cluster B is very sparse.
Non-Spherical Data for k-Means: k-Means functionally draws circles (or spheres/hyper-spheres). If your data is shaped like concentric rings or bananas, k-Means will fail dramatically.

KSB	Description	How This Addresses It
K4.2	Advanced analytics and ML techniques	Unsupervised learning algorithms for pattern discovery
K4.4	Trade-offs in selecting algorithms	Choosing between clustering approaches based on data characteristics
S1	Scientific methods and hypothesis testing	Validating cluster quality without ground truth labels
S4	Analysis and models to inform outcomes	Using clustering to derive actionable segments
B1	Inquisitive approach	Exploring hidden structure in unlabelled data