Eclat Algorithm

 

Eclat stands for Equivalence Class Clustering and bottom-up Lattice Traversal. It is an algorithm for finding frequent itemsets in a transaction database, and it is one of the well-known methods of Association Rule Learning.

[Figure: Eclat Algorithm 1 (i2tutorials)]

 

The Eclat algorithm uses a depth-first search to discover frequent itemsets, whereas the Apriori algorithm uses a breadth-first search. It represents the data vertically (each item is stored with the set of transactions that contain it), unlike Apriori, which represents the data horizontally (each transaction is stored with its items). This vertical layout generally makes Eclat faster than Apriori, so Eclat is often regarded as a more efficient and scalable approach to Association Rule Learning.
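The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration: the function name `to_vertical` and the sample transactions are hypothetical and do not come from the original example.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Turn a horizontal database (tid -> items) into a vertical one (item -> tidset)."""
    vertical = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            # Record that this transaction id contains the item.
            vertical[item].add(tid)
    return dict(vertical)


# Hypothetical horizontal database: transaction id -> set of items.
horizontal = {
    1: {"bread", "butter"},
    2: {"bread", "jam"},
    3: {"bread", "butter", "jam"},
}

vertical = to_vertical(horizontal)
for item in sorted(vertical):
    print(item, sorted(vertical[item]))
```

In the vertical form, the support of a single item is simply the size of its tidset, with no further pass over the transactions.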

 

Transaction ID sets, also called tidsets, are used to compute the support of an itemset and to avoid generating itemsets that do not exist in the prefix tree. In the first call of the function, all single items are used along with their respective tidsets. The function is then called recursively: in each recursive call, each item-tidset pair is checked and combined with the other item-tidset pairs. This process is repeated until no candidate item-tidset pairs can be combined.
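The recursive procedure described above might be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `eclat` and `run_eclat`, the absolute threshold `min_count`, and the sample data are all hypothetical.

```python
def eclat(prefix, pairs, min_count, results):
    """Depth-first extension of `prefix` using (item, tidset) pairs that share it."""
    for i, (item, tids) in enumerate(pairs):
        itemset = prefix + (item,)
        results[itemset] = len(tids)  # support = size of the tidset
        # Combine this pair with every later pair by intersecting tidsets.
        extensions = []
        for other, other_tids in pairs[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_count:
                extensions.append((other, common))
        # Recurse only while some pair can still be combined.
        if extensions:
            eclat(itemset, extensions, min_count, results)


def run_eclat(vertical, min_count):
    """First call: start from all frequent single items and their tidsets."""
    start = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    results = {}
    eclat((), start, min_count, results)
    return results


# Hypothetical vertical database: item -> tidset.
vertical_db = {"a": {1, 2, 3, 4}, "b": {1, 2, 4}, "c": {2, 3, 4}, "d": {5}}
frequent = run_eclat(vertical_db, min_count=2)
for itemset, count in sorted(frequent.items()):
    print(itemset, count)
```

Each recursive call works only on the tidsets produced by the previous intersection, so the original transactions are never rescanned.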

 

The input to the Eclat algorithm is a transaction database and a minimum support threshold, a value in the range of 0 to 100 (that is, a percentage of the transactions).
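If the threshold is read as a percentage of the transactions (an assumption here, since the text only gives the range), it can be converted to an absolute transaction count before mining. The helper name `min_support_count` is hypothetical, and rounding up is one possible convention.

```python
import math


def min_support_count(threshold_percent, num_transactions):
    """Convert a 0-100 threshold into the smallest qualifying transaction count."""
    if not 0 <= threshold_percent <= 100:
        raise ValueError("threshold must be in the range 0 to 100")
    # Multiply before dividing to limit floating-point rounding, then round up.
    return math.ceil(threshold_percent * num_transactions / 100)


print(min_support_count(40, 5))
print(min_support_count(50, 5))
```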

 

A transaction database is a set of transactions, where each transaction is a set of items. Note that an item should not appear more than once in the same transaction, and the items within a transaction are assumed to be sorted in lexicographical order.

 

Each frequent itemset is reported with its corresponding support value. The support of an itemset is the number of transactions in the database that contain the itemset.
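With a vertical layout, the support of an itemset can be obtained by intersecting the tidsets of its items. The sketch below assumes a dictionary mapping each item to its tidset; the function name `support` and the sample data are hypothetical.

```python
def support(itemset, vertical):
    """Count the transactions that contain every item of a non-empty itemset."""
    tidsets = [vertical[item] for item in itemset]
    # A transaction contains the itemset only if it is in every item's tidset.
    return len(set.intersection(*tidsets))


# Hypothetical vertical database: item -> tidset.
tidsets_by_item = {"bread": {1, 2, 3}, "butter": {1, 3}, "jam": {2, 3}}

print(support(["bread", "butter"], tidsets_by_item))
print(support(["bread", "butter", "jam"], tidsets_by_item))
```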

 

The transaction data should be given as a Boolean matrix in which each cell (i, j) indicates whether the jth item is included in the ith transaction. Here, 1 means true and 0 means false.
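A Boolean matrix of this kind can be turned into item-tidset pairs by reading it column by column. The following is a small sketch; the function name `matrix_to_tidsets`, the item names, and the use of zero-based row indices as transaction ids are assumptions made for illustration.

```python
def matrix_to_tidsets(matrix, item_names):
    """Build item -> tidset from a 0/1 matrix (rows = transactions, columns = items)."""
    tidsets = {name: set() for name in item_names}
    for i, row in enumerate(matrix):
        for j, flag in enumerate(row):
            if flag:
                # Cell (i, j) is 1: the jth item occurs in the ith transaction.
                tidsets[item_names[j]].add(i)
    return tidsets


# Hypothetical data: three transactions over three items.
names = ["bread", "butter", "jam"]
matrix = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]

item_tidsets = matrix_to_tidsets(matrix, names)
for name in names:
    print(name, sorted(item_tidsets[name]))
```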

 

We first call the function with every single item arranged alongside its tidset in a table. The function is then called repeatedly until no more item-tidset pairs can be combined.

 

An example of the Eclat algorithm is given below:

[Figure: Eclat Algorithm 2 (i2tutorials)]

 

Finally, the Eclat algorithm gives us the frequent itemsets in the transaction database.

 

Advantages

  1. Since the Eclat algorithm uses a depth-first search approach, it consumes less memory than the Apriori algorithm.
  2. The Eclat algorithm is generally faster than the Apriori algorithm.
  3. The Eclat algorithm does not repeatedly scan the data to calculate the individual support values.
  4. The Eclat algorithm is better suited to small and medium datasets, whereas the Apriori algorithm is used for large datasets.
  5. The Eclat algorithm scans the currently generated dataset, unlike Apriori, which scans the original dataset.

 

Disadvantages

The intermediate tidsets created by the Eclat algorithm consume additional memory.