
A Concept Learning Task and the Inductive Learning Hypothesis

 

Concept learning is the task of finding, from labeled training examples, the hypotheses (concepts) that are consistent with those examples. This article will help you understand the idea in detail.

 

We covered designing the learning system in the previous article; to complete that design, we need a good representation of the target concept.

 

Why Concept learning? 

A lot of our learning revolves around grouping or categorizing a large data set. Each concept can be viewed as describing some subset of objects or events defined over a larger set: for example, the subset of vehicles that are cars.

 

Each instance in a dataset is described by certain attributes. For example, if you consider a car, its attributes might be color, size, number of seats, etc. In the simplest case, each attribute is binary valued.

 

Let’s take a more elaborate example, EnjoySport. The attribute EnjoySport indicates whether a person enjoys their favorite water sport on a particular day.

 

The goal is to learn to predict the value of EnjoySport on an arbitrary day, based on the values of its other attributes.

To simplify,

 

Task T: determine the value of EnjoySport for an arbitrary day, based on the values of the day’s attributes.

 

Performance measure P: the total proportion of days for which EnjoySport is predicted correctly.

 

Experience E: A collection of days with pre-determined labels (EnjoySport: Yes/No).

 

Each hypothesis can be represented as a conjunction of six constraints, specifying the values of the six attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast.

 

Sky     AirTemp   Humidity   Wind     Water   Forecast   EnjoySport
Sunny   Warm      Normal     Strong   Warm    Same       Yes
Sunny   Warm      High       Strong   Warm    Same       Yes
Rainy   Cold      High       Strong   Warm    Change     No
Sunny   Warm      High       Strong   Cool    Change     Yes
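For concreteness, the four training examples above can be written down directly as data. A minimal Python sketch (the variable names are my own, not from the text):

```python
# EnjoySport training examples: six attribute values plus the Yes/No label.
ATTRIBUTES = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast"]

training_examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]

# The positive examples are the days on which the sport was enjoyed.
positives = [x for x, label in training_examples if label == "Yes"]
```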

 

Here the concept = < Sky, AirTemp, Humidity, Wind, Water, Forecast >.

 

If every attribute were binary, then

The number of possible instances = 2^d.

The total number of concepts = 2^(2^d),

 

where d is the number of features or attributes. In this case, d = 6 (Sky, AirTemp, Humidity, Wind, Water, and Forecast), so under the binary simplification,

 

=> The number of possible instances = 2^6 = 64.

=> The total number of concepts = 2^(2^6) = 2^64.

 

Your machine does not have to learn all 2^64 of these concepts. Only a few of them are chosen for the machine to learn.
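The counting rule can be checked in a few lines of Python (a sketch assuming each of the six attributes is binary; in EnjoySport, Sky actually has three values, so this is a simplification):

```python
# Counting instances and concepts for a concept learning task with
# d binary attributes.
d = 6  # six attributes: Sky, AirTemp, Humidity, Wind, Water, Forecast

# Each binary attribute doubles the number of possible instances.
binary_instances = 2 ** d

# A concept labels every instance Yes or No, so there are
# 2^(number of instances) possible concepts.
binary_concepts = 2 ** binary_instances
```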

 

The chosen concept must be consistent with all the training examples; this single concept to be learned is called the target concept. The set of candidate hypotheses the learner considers is called the hypothesis space.

 

Hypothesis Space:

Formally, the hypothesis space is the collection of all legal hypotheses that can be expressed in the chosen representation. This is the set from which the machine learning algorithm selects the one function that best describes the target function.

 

For each attribute, the hypothesis will do one of the following:

  • Indicate with a “?” that any value is acceptable for this attribute.
  • Define a specific necessary value (e.g., Warm).
  • Indicate with a “0” that no value is acceptable for this attribute.

 

  • The expression that represents the hypothesis that the person enjoys their favorite sport only on cold days with high humidity (regardless of the values of the other attributes) is

  < ?, Cold, High, ?, ?, ? >

  • The most general hypothesis, that every day is a positive example, is represented by

                   <?, ?, ?, ?, ?, ?> 

 

  • The most specific hypothesis, that no day is a positive example, is represented by

                         <0, 0, 0, 0, 0, 0>
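Checking whether a hypothesis in this representation classifies a given day as positive reduces to an attribute-by-attribute comparison. A minimal Python sketch (the helper name `matches` is my own, not from the text):

```python
def matches(hypothesis, instance):
    """Return True if the hypothesis classifies the instance as positive.

    Each hypothesis slot is "?" (any value is acceptable), "0" (no value
    is acceptable), or a specific required value such as "Warm".
    """
    for constraint, value in zip(hypothesis, instance):
        if constraint == "0":        # "0" rejects every instance
            return False
        if constraint != "?" and constraint != value:
            return False             # specific constraint not satisfied
    return True

day = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
most_general  = ("?", "?", "?", "?", "?", "?")
most_specific = ("0", "0", "0", "0", "0", "0")
cold_humid    = ("?", "Cold", "High", "?", "?", "?")
```

The most general hypothesis matches every day, the most specific matches none, and `cold_humid` rejects the warm day above.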

 

Concept Learning as Search: 

The main goal is to find the hypothesis that best fits the training data set. 

Consider the instance space X and the hypothesis space H of the EnjoySport learning task.

 

With three possible values for the attribute Sky and two each for AirTemp, Humidity, Wind, Water, and Forecast, the instance space X contains exactly

 

=> The number of distinct instances = 3*2*2*2*2*2 = 96.
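This count is easy to verify programmatically. The same reasoning also sizes the hypothesis space: each slot can take any of its attribute's values plus "?" and "0", giving 5*4*4*4*4*4 = 5120 syntactically distinct hypotheses (a standard figure for this task, included here as an aside):

```python
from math import prod

# Number of values per attribute: Sky has 3, the other five have 2.
domain_sizes = [3, 2, 2, 2, 2, 2]

# Instance space: one choice per attribute.
num_instances = prod(domain_sizes)  # 3*2*2*2*2*2 = 96

# Syntactically distinct hypotheses: each slot allows the attribute's
# values plus the wildcards "?" and "0".
num_syntactic_hypotheses = prod(n + 2 for n in domain_sizes)
```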

 

Inductive Learning Hypothesis

The aim of learning is to find a hypothesis h that agrees with the target concept c over all instances X, when the only available knowledge about c is its value on the training examples.

 

The inductive learning hypothesis states: any hypothesis that approximates the target function well over a sufficiently large set of training examples will also approximate the target function well over unobserved examples.

 

Inductive learning algorithms can only guarantee that the output hypothesis fits the target concept over the training data.

 

We assume that the hypothesis that best fits the observed training data is also the best hypothesis for unseen instances. This is the fundamental assumption of inductive learning.

 

Assumptions of inductive learning algorithms:

  • The training sample is representative of the population.
  • The input attributes allow the classes to be discriminated.

 

Concept learning can therefore be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.

 

The purpose of this search is to identify the hypothesis that best fits the training examples.
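One classic strategy for carrying out this search, not described above but standard for this task, is Find-S: start from the most specific hypothesis and generalize it just enough to cover each positive example. A hedged sketch:

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positives.

    examples: list of (attribute_tuple, label) pairs, label "Yes"/"No".
    """
    n = len(examples[0][0])
    h = ["0"] * n  # start with the most specific hypothesis
    for instance, label in examples:
        if label != "Yes":
            continue  # Find-S ignores negative examples
        for i, value in enumerate(instance):
            if h[i] == "0":
                h[i] = value   # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"     # generalize any conflicting constraint
    return tuple(h)

training_examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
```

On the four EnjoySport examples this yields < Sunny, Warm, ?, Strong, ?, ? >, the most specific hypothesis covering all three positive days.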

 
