
Inductive Bias in Machine Learning

 

The phrase “inductive bias” refers to a collection of (explicit or implicit) assumptions made by a learning algorithm in order to conduct induction, or generalize a limited set of observations (training data) into a general model of the domain. 

 

In this article, we’ll look at what Inductive Bias is and how it helps the machine make better decisions.

 

Why Inductive Bias? 

As seen in the previous article on the Candidate-Elimination Algorithm, we end up with two hypotheses, one specific and one general, as the final solution. 

 

Now, we also need to check whether the hypothesis produced by the algorithm is actually correct, and make decisions such as which training examples the machine should learn from next. 

 

Some of the fundamental questions for inductive inference are: 

  • What happens if the target concept isn’t in the hypothesis space?
  • Is it possible to avoid this problem by adopting a hypothesis space that contains all potential hypotheses?
  • What effect does the size of the hypothesis space have on the algorithm’s capacity to generalize to unseen instances?
  • What effect does the size of the hypothesis space have on the number of training instances required?

 

Let’s have a look at what Inductive and Deductive learning are to understand more about Inductive Bias. 

 

Inductive Learning: 

This basically means learning from examples, learning on the go. 

 

In inductive learning, we are given input samples (x) and output samples (f(x)), and the objective is to estimate the function (f). The goal is to generalize from the samples so that the output can be estimated for fresh samples in the future.

 

In practice, estimating the function exactly is nearly always too difficult, so we instead look for very good approximations of it.
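As a minimal sketch of this setup (assuming scikit-learn is available; the toy numbers and the choice of estimator are invented purely for illustration), the learner is handed pairs (x, f(x)) and asked to predict f(x) for inputs it has never seen:

# Minimal sketch of inductive learning: approximate f from sample pairs (x, f(x)).
# The data below is made up for illustration (loosely, income and debt -> approved?).
from sklearn.tree import DecisionTreeClassifier

X_train = [[45, 5], [80, 20], [30, 25], [95, 10], [25, 30], [60, 8]]   # input samples x
y_train = [1, 1, 0, 1, 0, 1]                                           # output samples f(x)

# The learner induces a general model of f from the limited observations.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Generalization: estimate f(x) for fresh samples not seen during training.
print(model.predict([[70, 12], [20, 28]]))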

 

The following are some instances of induction in practice:

Assessment of credit risk: 

The x represents the customer’s properties.

The f(x) is whether or not the customer is approved for credit.

 

The diagnosis of disease:

The x represents the patient’s characteristics.

The f(x) is the illness they are afflicted with.

 

Face recognition:

Bitmaps of people’s faces make up the x.

The f(x) is the name assigned to the face.

 

Deductive Learning: 

Learners are initially exposed to concepts and generalizations, followed by particular examples and exercises to aid learning.

 

Already existing rules are applied to the training examples. 

 

Biased Hypothesis Space: 

A biased hypothesis space does not contain a hypothesis for every possible target concept. The issue is that we have biased the learner to consider only conjunctive hypotheses, so concepts such as disjunctions cannot be represented at all. In this case, a more expressive hypothesis space is required.
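To make this concrete, here is a small sketch (the attribute values and the target concept are invented for illustration): a purely conjunctive hypothesis cannot represent a disjunctive concept such as "Sky is Sunny OR Sky is Cloudy", so no hypothesis in the biased space is consistent with all of the training examples.

# Sketch: a conjunctive hypothesis is a tuple of required attribute values,
# where "?" means "any value is acceptable" for that attribute.
def conjunctive_match(hypothesis, instance):
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))

# Instances described by a single attribute (Sky), kept tiny on purpose.
examples = [
    (("Sunny",),  True),   # positive
    (("Cloudy",), True),   # positive
    (("Rainy",),  False),  # negative
]

# Every conjunctive hypothesis over Sky.
candidates = [("Sunny",), ("Cloudy",), ("Rainy",), ("?",)]

# None of them is consistent with all three examples: the target concept
# "Sunny OR Cloudy" lies outside the conjunctive hypothesis space.
for h in candidates:
    consistent = all(conjunctive_match(h, x) == label for x, label in examples)
    print(h, "consistent:", consistent)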

 

Unbiased Hypothesis Space: 

The obvious answer to the challenge of ensuring that the target concept is contained in the hypothesis space H is to use a hypothesis space that can represent every teachable concept, i.e., every possible subset of the instance space X.
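A quick way to see what "every teachable concept" costs is to count hypotheses. Assuming the attribute counts from Mitchell's EnjoySport example (one 3-valued attribute and five 2-valued attributes), the unbiased space is the power set of the instance space and dwarfs the conjunctive space:

# Rough size comparison (assuming the attribute counts from Mitchell's
# EnjoySport example: one 3-valued attribute and five 2-valued attributes).
n_instances = 3 * 2 * 2 * 2 * 2 * 2        # 96 distinct instances
unbiased = 2 ** n_instances                 # every subset of instances is a concept
conjunctive = 1 + 4 * 3 * 3 * 3 * 3 * 3     # semantically distinct conjunctions

print(f"instances: {n_instances}")
print(f"conjunctive hypotheses: {conjunctive}")   # 973
print(f"unbiased hypotheses: {unbiased:.3e}")     # about 7.9e28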

 

What is Inductive Bias?

As discussed in the introduction, Inductive bias refers to a set of assumptions made by a learning algorithm in order to conduct induction or generalize a limited set of observations (training data) into a general model of the domain. 

 

Induction would be impossible without such a bias, because observations may generally be extended in a variety of ways. 

 

If all of these options were treated equally, that is, without any bias in the sense of a preference for certain forms of generalization (representing prior knowledge about the target function to be learned), no predictions could be formed for new scenarios.

 

The idea of inductive bias is to let the learner generalize beyond the observed training examples, i.e., to infer the classification of new, unseen examples. 

The symbol > is read as “inductively inferred from.”

For example,

x > y means that y is inductively inferred from x. 
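Formally (following Mitchell’s textbook definition, stated here as an assumption about the intended notation), the inductive bias of a learner L is the minimal set of assertions B such that, for every new instance x_i and training data D_c,

(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)

i.e., the classification that L assigns to x_i follows deductively from B together with the training data and the instance itself.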

 

Types of Inductive Bias: 

  • Maximum conditional independence: It aims to maximize conditional independence if the hypothesis can be put in a Bayesian framework. The Naive Bayes classifier employs this bias.

 

  • Minimum cross-validation error: When deciding between hypotheses, select the one with the lowest cross-validation error. Although cross-validation may appear to be bias-free, the “no free lunch” theorems show that it must itself be biased.

 

  • Maximum margin: When drawing a boundary between two classes, try to make the margin as wide as possible. This is the bias used in support vector machines. The assumption is that distinct classes are usually separated by wide margins.

 

  • Minimum hypothesis description length: When constructing a hypothesis, try to keep its description as short as possible, on the view that simpler hypotheses are more likely to be correct. Strictly speaking, this is not quite what Occam’s razor says: simpler models are easier to test, not necessarily “more likely to be true.” See the principle of Occam’s Razor.

 

  • Minimum features: features should be removed unless there is strong evidence that they are helpful. Feature selection methods are based on this premise.

 

  • Nearest neighbors: Assume that the majority of the examples in a local neighborhood in feature space are from the same class.

 

If the class of a case is unknown, assume that it belongs to the same class as the majority of the examples in its immediate neighborhood. This is the bias used by the k-nearest neighbors algorithm: cases that are close to each other are assumed to belong to the same class. (The sketch after this list shows how several of these biases map onto common classifiers.)
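As a hedged illustration of how some of these biases surface in a common library (the dataset is synthetic and the estimator choices are only one possible mapping), the same training data can be fit by learners embodying the conditional-independence, maximum-margin, and nearest-neighbors biases:

# Sketch: three classifiers whose inductive biases were listed above.
# The data is synthetic; any labeled 2-D dataset would do.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB            # maximum conditional independence
from sklearn.svm import SVC                           # maximum margin
from sklearn.neighbors import KNeighborsClassifier    # nearest neighbors

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

for clf in (GaussianNB(), SVC(kernel="linear"), KNeighborsClassifier(n_neighbors=5)):
    clf.fit(X, y)
    print(type(clf).__name__, "training accuracy:", clf.score(X, y))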

 
