Clustering vs Classification¶

Both group data, but they operate under fundamentally different paradigms.

The Core Difference¶

Classification (Supervised): You know the answers (labels). You train the algorithm to map inputs to those known answers. The goal is prediction.
Clustering (Unsupervised): You do not know the answers. You ask the algorithm to find natural groupings in the raw data. The goal is exploration and discovery.

Use classification if you have a historically labelled dataset (e.g., predicting if a known customer churned).
Use clustering if you have raw data and want to understand it (e.g., finding groups of similar customers to create targeted marketing campaigns without knowing what those groups are beforehand).

KSB	Description	How This Addresses It
K4.2	Advanced analytics and ML techniques	Unsupervised learning algorithms for pattern discovery
K4.4	Trade-offs in selecting algorithms	Choosing between clustering approaches based on data characteristics
S1	Scientific methods and hypothesis testing	Validating cluster quality without ground truth labels
S4	Analysis and models to inform outcomes	Using clustering to derive actionable segments
B1	Inquisitive approach	Exploring hidden structure in unlabelled data