Chapter 4 Cluster Analysis
Section 1 Clustering Basics
Page 2 Introduction

Objectives

The objectives of this section are:
define clustering
outline the various applications of clustering
delve into the various types of clustering
define the various types of clusters
introduce some of the major clustering algorithms

Outcomes

By the time you have completed this section you will be able to:
list some of the applications of cluster analysis
define clustering
list the various types of clusters and clustering
list the various types of clustering algorithms

Clustering in the Real World

The human mind is a beautiful instrument, and its beauty lies in its ability to perform complex actions with apparent simplicity. You walk into a room and, without knowing the names of the various ethnicities, your mind is able to group people into categories based on their attributes. This is what clustering is all about. It is the process of grouping data objects based only on the information present in the data. The grouping criterion is that objects within a cluster, or group, be similar to one another and different from objects in other groups. Clustering differs from the previously discussed concept of classification (Chapter 1) in that it does not assign predetermined labels to the groups; it only separates the data objects into groups. Any labeling that may occur does not depend on previously classified data objects.
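
The grouping criterion can be made concrete with a small sketch. The Python snippet below is only an illustration under stated assumptions: the data points, the two-group split, and the helper names `mean_within_distance` and `mean_between_distance` are hypothetical, and Euclidean distance is assumed as the measure of (dis)similarity. It does not perform clustering itself; it only checks whether a given grouping places similar objects together and dissimilar objects apart.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical 2-D data objects, already split into two groups.
# The group ids (0, 1) are arbitrary; they are not predefined class labels.
clusters = {
    0: [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)],
    1: [(8.0, 8.0), (9.0, 8.5), (8.5, 9.5)],
}

def mean_within_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    distances = [
        dist(p, q)
        for points in clusters.values()
        for p, q in combinations(points, 2)
    ]
    return sum(distances) / len(distances)

def mean_between_distance(clusters):
    """Average distance between pairs of points from different clusters."""
    distances = [
        dist(p, q)
        for (_, a), (_, b) in combinations(clusters.items(), 2)
        for p in a
        for q in b
    ]
    return sum(distances) / len(distances)

within = mean_within_distance(clusters)
between = mean_between_distance(clusters)
print(f"mean within-cluster distance:  {within:.2f}")
print(f"mean between-cluster distance: {between:.2f}")
```

For a grouping that satisfies the criterion, the within-cluster average should be noticeably smaller than the between-cluster average. No labels such as "class A" or "class B" appear anywhere in the computation, which is the point of contrast with classification.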

Applications

Cluster analysis plays a vital role in numerous fields, ranging from biology to machine learning. Its applications are far reaching: in some situations it serves as a stepping stone for further analysis, and in others it is the final analytical tool. Some of the applications include the following:

In biology and bioinformatics, it is used in taxonomy and in grouping genetic information.
In information retrieval, it aids in internet queries.
In climatology, it helps find patterns in the atmosphere and ocean.
In psychology and medicine, it is used in the diagnosis of diseases and illnesses.
In business, it is used to segment customers into small groups that can later be targeted in marketing activities.
In machine learning and data mining, it is used to find nearest neighbors efficiently and in summarization.