
ID3 Algorithm and Hypothesis space in Decision Tree Learning

 

The collection of potential decision trees is the hypothesis space searched by ID3. ID3 searches this hypothesis space in a hill-climbing fashion, starting with the empty tree and moving on to increasingly detailed hypotheses in pursuit of a decision tree that properly classifies the training data.

 

In this blog, we’ll have a look at the Hypothesis space in Decision Trees and the ID3 Algorithm. 

 

ID3 Algorithm: 

The ID3 algorithm (Iterative Dichotomiser 3) is a classification technique that builds a decision tree greedily, at each step picking the attribute that delivers the highest Information Gain (IG), or equivalently the lowest Entropy (H).

 

What are Information Gain and Entropy? 

Information Gain: 

Information gain measures the change in entropy after a dataset is split on an attribute.

 

It establishes how much information a feature provides about a class.

 

A node is split, and the decision tree is grown, according to the value of information gain.

 

A decision tree algorithm always tries to maximize information gain, so the node/attribute with the greatest information gain is split first. 

 

The formula for Information Gain: 

Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each feature)]

 

Entropy is a metric for measuring the degree of impurity in a given attribute. It denotes the randomness in the data. The following formula may be used to compute entropy:

Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)

Where,

S stands for the total set of samples,

P(yes) denotes the probability of a "yes" answer, and

P(no) denotes the probability of a "no" answer.
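To make the two formulas concrete, here is a minimal Python sketch of entropy and information gain for categorical attributes (the function names entropy and information_gain are illustrative, not from any library):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, e.g. ['yes', 'no', 'yes']."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    """Entropy(S) minus the weighted average entropy of the subsets
    produced by splitting on the attribute at attribute_index."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute_index], []).append(label)
    weighted_avg = sum((len(subset) / total) * entropy(subset)
                       for subset in subsets.values())
    return entropy(labels) - weighted_avg

# Toy example: one attribute "Wind" (index 0) and a yes/no class label.
rows = [["weak"], ["strong"], ["weak"], ["strong"]]
labels = ["yes", "yes", "no", "yes"]
print(information_gain(rows, labels, 0))   # about 0.311
```

In this toy dataset, splitting on Wind removes about 0.311 bits of uncertainty about the class label.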

 

Steps of the ID3 Algorithm: 

 

Determine the entropy for each of the category values.

 

Calculate the information gain for each feature, split on the attribute with the greatest gain, and repeat the process for every branch until the samples are classified. A recursive sketch of these steps is shown below.
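The sketch below reuses the entropy and information_gain helpers from the earlier example; the names id3 and attribute_indices are illustrative, and the code assumes categorical attributes:

```python
from collections import Counter

def id3(rows, labels, attribute_indices):
    """Grow a decision tree as nested dicts; leaves are class labels.
    Relies on the information_gain helper defined above."""
    if len(set(labels)) == 1:          # pure node: every sample has the same class
        return labels[0]
    if not attribute_indices:          # no attributes left: fall back to a majority vote
        return Counter(labels).most_common(1)[0][0]

    # Greedy step: split on the attribute with the greatest information gain.
    best = max(attribute_indices,
               key=lambda i: information_gain(rows, labels, i))
    remaining = [i for i in attribute_indices if i != best]

    tree = {best: {}}
    for value in set(row[best] for row in rows):
        branch = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branch_rows, branch_labels = zip(*branch)
        tree[best][value] = id3(list(branch_rows), list(branch_labels), remaining)
    return tree
```

Because each split is chosen greedily and never revisited, this search moves through the hypothesis space of decision trees in the hill-climbing fashion described above.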

 

Characteristics of ID3: 

 

Overfitting: 

Good generalization is the desired property in our decision trees (and, indeed, in all classification problems), as we noted before. 

 

This means we want a model fitted to the labeled training data to make predictions that are as accurate on new, unseen observations as they are on the training data.
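One common way to check generalization is to compare accuracy on the training data with accuracy on held-out data. A small sketch using scikit-learn (whose DecisionTreeClassifier implements CART rather than ID3, but the check is the same):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# A fully grown tree can memorize the training data, so compare its
# accuracy on the training set with its accuracy on unseen test data.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", tree.score(X_train, y_train))
print("test accuracy: ", tree.score(X_test, y_test))
```

A large gap between the two scores is the usual symptom of overfitting; limiting tree depth or pruning narrows it.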

 

Capabilities and Limitations of ID3:

 

Hypothesis Space Search by ID3: 

 

