Table of Contents

The table of contents presented below reflects the current version of the UH Data Mining hypertextbook. Until this online resource is complete, it should be used in conjunction with traditional course textbooks.
Note also that the placement of chapters is subject to revision, and in some cases content may be moved to improve clarity. One of the appealing features of a hypertextbook is that it can evolve dynamically. Indeed, content that is useful even in draft form can be included and used immediately. We will use this dynamic aspect of the hypertextbook to bring you updates in real time.
Please note that the chapters below are not linked to the content.  To access a chapter, hover your mouse over the Chapters dropdown menu in the top bar of this page and click on the desired chapter.

Chapter 1. Decision Trees

This chapter provides an introduction to one of the major fields of data mining, called classification. It also outlines some of the real-world applications of classification tools and introduces the widely used decision tree classifier. What is a classifier? What is a decision tree? How does one construct a decision tree? These are just some of the questions answered in this chapter. Currently the novice (green) and intermediate (blue) tracks are active. More content will be added to this chapter over time.

Chapter 2. Association Analysis

In this chapter we explore another major field of data mining, called association analysis. Association analysis focuses on discovering association rules, which are interesting and useful hidden relationships that can be found in large data sets. This chapter is divided into sections that explain the key concepts in association analysis and introduce you, the reader, to the basic algorithms used in generating association rules. Currently the novice (green) and intermediate (blue) tracks are active. More content will be added to this chapter over time.

Chapter 3. Visualization

In this chapter we take a step back from data mining algorithms and techniques and focus on the visualization of data. This step is crucial and normally takes place before any pre-processing techniques or data mining algorithms have been applied. It is useful because in some situations it helps us pinpoint which algorithms should be used in future analysis. This chapter is divided into three main sections: the first introduces you, the reader, to visualization; the second defines pertinent general concepts; and the third explores a couple of visualization techniques. This chapter also includes a brief introduction to OLAP. More content will be added to this chapter over time.

Chapter 4. Cluster Analysis

In this chapter we pick up where classification left off and delve a little deeper into the world of grouping data objects. Cluster analysis aims to group data objects based on the available information that describes the objects and their relationships. This chapter first introduces the concept of cluster analysis and its real-world applications, and then explores some of the popular clustering techniques, such as the k-means clustering algorithm and agglomerative hierarchical clustering. More content will be added to this chapter over time.

Appendix 1.  Includes direct links to Java Applets, online links to additional resources and a list of references

As citations are made to literature, the corresponding references are kept in this appendix.  Links to this appendix accompany the citations.