Chapter 4 Cluster Analysis
Section 1 Clustering Basics
Page 2 Introduction

Objectives

The objectives of this section are to:
define clustering
outline the various applications of clustering
delve into the various types of clustering
define the various types of clusters
introduce some popular clustering algorithms

Outcomes

By the time you have completed this section you will be able to:
list some of the applications of cluster analysis
define clustering
list the various types of clusters and clustering
list some of the well-known clustering algorithms

Cluster Analysis is

Cluster analysis is the process of grouping data objects based only on the information present in the data. The grouping criterion is that objects within a cluster (group) should be similar to one another and different from objects in other clusters. Clustering differs from the previously discussed concept of classification (chapter 1) in that clustering does not assign previously determined labels to the groups; it only separates the data objects into groups. Any labeling that may occur does not depend on previously classified data objects that we have access to.
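To make the definition concrete, the sketch below groups six unlabeled 2-D points into two clusters. It is only an illustration under stated assumptions: the points are made up, the number of clusters and the starting centers are chosen by hand, and the grouping procedure is a bare-bones k-means-style loop picked for brevity rather than a method this section prescribes. Note that the output is just a group index per point; no class labels are involved.

```python
import math

def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, centers, iterations=10):
    """Minimal k-means-style sketch: alternate assignment and mean update."""
    for _ in range(iterations):
        groups = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, g in zip(points, groups) if g == j]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old center if no point was assigned to it.
                new_centers.append(centers[j])
        centers = new_centers
    return assign(points, centers), centers

# Made-up, unlabeled data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

# Assumption: two clusters, with one hand-picked starting center in each region.
groups, centers = kmeans(points, centers=[points[0], points[3]])
print(groups)  # [0, 0, 0, 1, 1, 1]
```

The indices 0 and 1 carry no meaning of their own. Deciding that group 0 represents some named category would be a separate labeling step, which is exactly what distinguishes clustering from classification.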

Applications

Cluster analysis plays a vital role in numerous fields, ranging from biology to machine learning. Its applications fall into two categories, depending on whether clustering is used as a tool for understanding or as a stepping stone and a basis for further analysis.

Understanding: When data is analyzed in order to understand the dataset, cluster analysis is the study of techniques for automatically finding classes, because every cluster is a potential class that only needs a class label. Applications of this kind exist in biology (taxonomy and the grouping of genetic information), in information retrieval, and in climate science, where clustering helps find patterns in the atmosphere and ocean. In psychology and medicine, clustering is used in the diagnosis of diseases, and in business it is used to segment customers into small groups that can later be targeted in marketing activities.

Utility: Cluster analysis can also be used as the basis for other data analysis or processing techniques. In this context, cluster analysis is similar to visualization: it is the study of techniques for finding the most representative clusters, each of which can be summarized by a cluster prototype (a data object that is representative of the cluster). Applications of this kind include summarization, in which an algorithm is applied to the cluster prototypes rather than to the entire dataset, reducing the cost of the analysis. Clustering can also be used to find nearest neighbors efficiently.
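The two utility uses can be sketched together. In the example below, the clusters are hypothetical and written out by hand (in practice they would come from a clustering algorithm), and the centroid is assumed as the cluster prototype. The prototypes act as a summary of the data, and the nearest-neighbor lookup first compares the query against the prototypes and then searches only inside the closest cluster. This shortcut is approximate: it can miss the true nearest neighbor when that point lies in a neighboring cluster.

```python
import math

# Hypothetical clusters, written out by hand for illustration.
clusters = {
    "A": [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)],
    "B": [(8.0, 8.0), (8.2, 7.9), (7.8, 8.1)],
}

def centroid(members):
    """Coordinate-wise mean of a list of points (assumed prototype)."""
    return tuple(sum(c) / len(members) for c in zip(*members))

# Summarization: two prototypes stand in for six data objects.
prototypes = {name: centroid(members) for name, members in clusters.items()}

def nearest_neighbor(query):
    """Approximate nearest neighbor using the prototypes as a filter."""
    # Step 1: find the closest prototype (one distance per cluster).
    best = min(prototypes, key=lambda name: math.dist(query, prototypes[name]))
    # Step 2: search only the members of that cluster.
    return best, min(clusters[best], key=lambda p: math.dist(query, p))

cluster_name, neighbor = nearest_neighbor((7.9, 8.3))
print(cluster_name, neighbor)  # B (7.8, 8.1)
```

With only six points the saving is negligible, but the same two-step lookup avoids most distance computations when each cluster holds many objects.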