Chapter 2 Association Analysis
Section 4 Compact F. I. Representations
Page 3 Closed Frequent Itemset

Objectives

The objectives of this section are:
to introduce alternative representations for frequent itemsets
to define the maximal frequent itemset representation
to define the closed frequent itemset representation

Outcomes

By the time you have completed this section you will be able to:
explain and identify the maximal frequent itemset
explain and identify the closed frequent itemset

Closed Frequent Itemset

Definition:

A closed frequent itemset is an itemset that is closed and whose support is greater than or equal to minsup.
An itemset is closed in a data set if no proper superset of it has the same support count as the itemset itself.
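
The two conditions can also be written symbolically. The notation is an assumption made for this sketch and does not come from the section itself: \sigma(X) stands for the support count of itemset X, and minsup is taken as a count threshold.

```latex
X \text{ is closed} \iff \neg\exists\, Y \supset X \ \text{such that}\ \sigma(Y) = \sigma(X)

X \text{ is a closed frequent itemset} \iff X \text{ is closed and } \sigma(X) \ge \mathit{minsup}
```

Because adding items to an itemset can never increase its support, \sigma(Y) \le \sigma(X) holds for every superset Y of X, so "not the same support" here always means "strictly smaller support".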

Identification

  1. First identify all frequent itemsets.
  2. Then, from this group, find those that are closed: for each frequent itemset, check whether any superset has the same support. If one does, the itemset is disqualified; if none can be found, the itemset is closed.
    An alternative method is to first identify the closed itemsets and then use minsup to determine which of them are frequent.
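
The two identification steps can be sketched in Python as follows. This is a minimal brute-force illustration, not an efficient mining algorithm. The transactions, the minsup value, and all function names are hypothetical and are NOT taken from the lattice diagram in this section.

```python
from itertools import combinations

# Hypothetical toy data set (an assumption for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c", "d"},
    {"b", "c", "e"},
    {"a", "c", "d", "e"},
    {"d", "e"},
]
minsup = 2  # assumed minimum support count


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, minsup):
    """Step 1: enumerate every itemset and keep those meeting minsup."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = support_count(candidate, transactions)
            if count >= minsup:
                result[candidate] = count
    return result


def closed_frequent_itemsets(frequent):
    """Step 2: drop any frequent itemset that has a proper superset
    with the same support count."""
    closed = {}
    for itemset, count in frequent.items():
        disqualified = any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
        if not disqualified:
            closed[itemset] = count
    return closed


frequent = frequent_itemsets(transactions, minsup)
closed = closed_frequent_itemsets(frequent)
for itemset, count in sorted(closed.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), count)
```

Step 2 only compares against other frequent itemsets. That is sufficient because any superset with the same support as a frequent itemset also meets minsup and is therefore itself frequent.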

Illustration

[Figure: Closed Frequent Itemset]

The lattice diagram above shows the maximal, closed, and frequent itemsets. The itemsets circled in blue are the frequent itemsets. Those circled in thick blue are the closed frequent itemsets, and those circled in thick blue with a yellow fill are the maximal frequent itemsets. To determine which of the frequent itemsets are closed, check whether any of their supersets has the same support; if one does, the itemset is not closed.
For example, ad is a frequent itemset but has the same support as its superset abd, so it is NOT a closed frequent itemset. On the other hand, c is a closed frequent itemset because all of its supersets ac, bc, and cd have supports less than 3, which is the support of c.
As you can see, there are a total of 9 frequent itemsets. Of these, 4 are closed frequent itemsets, and of those 4, 2 are maximal frequent itemsets. This brings us to the relationship between the three representations of frequent itemsets.

Relationship between Frequent Itemset Representations

[Figure: Circle Relationship]
In conclusion, it is important to point out the relationship between frequent itemsets, closed frequent itemsets, and maximal frequent itemsets. As mentioned earlier, closed and maximal frequent itemsets are both subsets of the frequent itemsets, but maximal frequent itemsets are the more compact representation because they are in turn a subset of the closed frequent itemsets. The diagram to the right shows the relationship between these three types of itemsets. Closed frequent itemsets are more widely used than maximal frequent itemsets because, when efficiency is more important than space, they preserve the support of their subsets, so no additional pass over the data is needed to find this information.
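
To make the last point concrete, the sketch below shows one way the support of any frequent itemset could be recovered from the closed frequent itemsets alone, and how the maximal frequent itemsets fall out as a subset of the closed ones. The itemsets, support counts, and function names are hypothetical illustration values, not the ones from the diagrams in this section.

```python
# Hypothetical closed frequent itemsets with their support counts
# (assumed values for illustration only).
closed = {
    frozenset("c"): 4,
    frozenset("d"): 3,
    frozenset("e"): 3,
    frozenset("ac"): 3,
    frozenset("bc"): 3,
    frozenset("ce"): 2,
    frozenset("de"): 2,
    frozenset("abc"): 2,
    frozenset("acd"): 2,
}


def support_from_closed(itemset, closed):
    """Support of an itemset = the largest support among the closed
    frequent itemsets that contain it. Returns None when no closed
    frequent itemset contains it, i.e. the itemset is not frequent."""
    itemset = frozenset(itemset)
    counts = [count for s, count in closed.items() if itemset <= s]
    return max(counts) if counts else None


def maximal_from_closed(closed):
    """Maximal frequent itemsets are the closed frequent itemsets
    that have no proper superset among the closed frequent itemsets."""
    return {s for s in closed if not any(s < other for other in closed)}


print(support_from_closed("a", closed))   # a is frequent but not closed
print(support_from_closed("ae", closed))  # ae is not frequent
print(sorted("".join(sorted(s)) for s in maximal_from_closed(closed)))
```

Doing the same with only the maximal frequent itemsets would tell us which itemsets are frequent, but not their supports, which is the trade-off between the two compact representations described above.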