Chapter 4 Cluster Analysis
Section 3 Agglomerative Hierarchical Clustering
Page 2 Algorithm Definition

Objectives

The objectives of this section are:
to define agglomerative hierarchical clustering
to explain its basic algorithm
to briefly mention key issues it presents

Outcomes

By the time you have completed this section you will be able to:
define agglomerative hierarchical clustering
describe the algorithm
list key issues that this method creates/resolves

Definition

Agglomerative hierarchical clustering is a bottom-up hierarchical clustering algorithm that starts with each point as an individual cluster and, at each step, merges the closest pair of clusters until all clusters have been combined into one. It is usually displayed graphically using a dendrogram, a tree-like diagram that shows both the cluster-subcluster relationships and the order in which the clusters were merged. Figure 2 shows a dendrogram for the points clustered in Figure 1.

[Figure 2: Dendrogram] [Figure 1: Clustered points]
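To make the idea of a dendrogram concrete, the sketch below stores one as nested tuples and prints it as an indented tree. It is only an illustration: the point labels p1 to p4, the merge distances, and the helper names build_dendrogram and render are all hypothetical and are not taken from Figures 1 or 2.

```python
# Hypothetical merge history, NOT the data behind Figures 1 and 2.
# Each entry is (left cluster id, right cluster id, merge distance).
# Points get ids 0..n-1; each merge creates the next unused id.
labels = ["p1", "p2", "p3", "p4"]
merges = [(0, 1, 1.0), (2, 3, 1.5), (4, 5, 4.0)]


def build_dendrogram(labels, merges):
    """Return the dendrogram root as nested tuples (left, right, distance)."""
    nodes = {i: label for i, label in enumerate(labels)}
    next_id = len(labels)
    for left, right, distance in merges:
        # The two merged clusters become the children of a new node.
        nodes[next_id] = (nodes.pop(left), nodes.pop(right), distance)
        next_id += 1
    (root,) = nodes.values()  # exactly one cluster remains at the end
    return root


def render(node, depth=0):
    """Return the tree as a list of indented text lines."""
    indent = "  " * depth
    if isinstance(node, tuple):
        left, right, distance = node
        lines = [f"{indent}merge at distance {distance}"]
        lines += render(left, depth + 1)
        lines += render(right, depth + 1)
        return lines
    return [f"{indent}{node}"]


root = build_dendrogram(labels, merges)
print("\n".join(render(root)))
```

Reading the printed tree from the deepest level upward gives the order of the merges, and each level of nesting shows which subclusters make up a cluster.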

 

Algorithm

Many agglomerative hierarchical clustering techniques are used in cluster analysis, but most are variations of a single basic approach: start with the individual data points as clusters, then repeatedly merge the two closest clusters until only one cluster remains. Figure 3 below presents a high-level formal representation of the algorithm.

[Figure 3: Agglomerative hierarchical clustering algorithm]
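The basic algorithm described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation. It assumes Euclidean distance between points and takes the distance between two clusters to be the smallest distance between any of their points (single link). Neither choice is specified in this section, and the function names and sample points are made up for the example.

```python
import math
from itertools import combinations


def single_link(a, b, points):
    """Cluster distance = smallest point-to-point distance (an assumption)."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, proximity=single_link):
    """Merge the two closest clusters until one remains; return the merges."""
    # Start with every point as its own cluster (tuples of point indices).
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: proximity(pair[0], pair[1], points),
        )
        distance = proximity(a, b, points)
        # Replace the pair with their union.
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b, distance))
    return merges


# Hypothetical sample data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
merges = agglomerate(points)
for a, b, distance in merges:
    print(a, "+", b, "at distance", round(distance, 3))
```

For n points the loop always performs n - 1 merges. The sketch recomputes every cluster distance at each step, which keeps the code short but is slow for large data sets. Other definitions of cluster distance can be passed in through the proximity parameter.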