
Evaluating Hypotheses: Estimating Hypothesis Accuracy

 

For estimating hypothesis accuracy, statistical methods are applied. In this blog, we’ll have a look at evaluating hypotheses and estimating their accuracy. 

 

Evaluating hypotheses: 

Whenever you form a hypothesis from a given training data set, you need a way to judge how good it is. Take, for example, the hypothesis you came up with for the EnjoySport problem, where the attributes of an instance decide whether a person will be able to enjoy their favorite sport or not. 

 

Now, to test or evaluate how accurate the considered hypothesis is, we use different statistical measures. Evaluating hypotheses is an important step in training the model. 

 

To evaluate a hypothesis precisely, focus on the following questions when statistical methods are applied to estimate its accuracy: 

  • First, given the observed accuracy of a hypothesis over a limited sample of data, how well does this estimate its accuracy over additional examples?

 

  • Second, if one hypothesis outperforms another over a given set of data, how likely is it that this hypothesis is more accurate in general?

 

  • Third, when the data is limited, what is the best way to use it both to learn a hypothesis and to estimate its accuracy?

 

Motivation: 

There are cases where the accuracy of the model plays a huge role in whether the model is adopted or not. For example, consider a model trained to support medical treatment decisions: we need high accuracy in order to depend on the information the model provides. 

 

When we need to learn a hypothesis and estimate its future accuracy based on a small collection of data, we face two major challenges:

 

Bias in the estimate

First, the observed accuracy of the learned hypothesis over the training instances is often a poor predictor of its accuracy over future cases.

 

Because the learned hypothesis was derived from these training examples, they will typically give an optimistically biased estimate of the hypothesis’s accuracy over future examples.
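
To make this bias concrete, here is a small sketch of my own (not from the original text): a hypothesis that memorizes its training sample via nearest-neighbour lookup is perfectly accurate on that sample, yet noticeably less accurate on freshly drawn examples. The distribution, noise rate, and sample sizes are invented purely for the illustration.

import random

random.seed(0)
NOISE = 0.2

def draw_example():
    # Draw (x, f(x)): x ~ Uniform(0, 1); the target is 1 for x >= 0.5,
    # but the label is flipped with probability NOISE.
    x = random.random()
    label = 1 if x >= 0.5 else 0
    if random.random() < NOISE:
        label = 1 - label
    return x, label

train = [draw_example() for _ in range(30)]

def h(x):
    # A "memorizing" hypothesis: predict the label of the nearest training point.
    nearest = min(train, key=lambda ex: abs(ex[0] - x))
    return nearest[1]

def accuracy(sample):
    return sum(h(x) == y for x, y in sample) / len(sample)

print("accuracy over the training sample:", accuracy(train))   # 1.0 by construction
fresh = [draw_example() for _ in range(20000)]
print("accuracy over fresh examples:", accuracy(fresh))        # roughly 0.7

The gap between the two printed numbers is exactly the optimistic bias described above.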

 

Variance in the estimate

Second, even if the hypothesis’s accuracy is measured over an unbiased set of test examples, independent of the training examples, the measured accuracy can still vary from the true accuracy, depending on the particular makeup of that test set. 

 

The anticipated variance increases as the number of test examples decreases.
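
A short simulation (again my own, with an assumed true error rate) makes the point about variance: when a hypothesis whose true error is 0.30 is evaluated on test sets of different sizes, the observed error fluctuates around 0.30, and the fluctuation shrinks as the test set grows.

import random
import statistics

random.seed(1)
TRUE_ERROR = 0.30   # assumed error_D(h) for this illustration

def sample_error(n):
    # Draw n independent test examples; each one is misclassified
    # with probability TRUE_ERROR. Return the observed error rate.
    mistakes = sum(random.random() < TRUE_ERROR for _ in range(n))
    return mistakes / n

for n in (30, 300, 3000):
    estimates = [sample_error(n) for _ in range(1000)]
    print(f"n = {n:4d}   mean = {statistics.mean(estimates):.3f}   "
          f"std dev = {statistics.stdev(estimates):.3f}")

The standard deviation of the estimates falls roughly as 1/sqrt(n), which is the behaviour the confidence intervals later in this post quantify.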

 

When evaluating a learned hypothesis, we want to know how accurate it will be at classifying future instances.

 

We also want to know the probable error in that accuracy estimate. There is some space X of possible instances, and we assume that different instances of X will be encountered with different frequencies. 

 

A convenient way to model this is to assume there is some unknown probability distribution D that defines the probability of encountering each instance in X.

 

A trainer draws each instance independently, according to the distribution D, and then passes the instance x together with its correct target value f(x) to the learner as a training example of the target function f.
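
In code, this data-generation model amounts to the small sketch below, where draw_x stands in for the unknown distribution D and f for the target function; both are assumptions made only for illustration.

import random

random.seed(3)

def training_examples(f, draw_x, n):
    examples = []
    for _ in range(n):
        x = draw_x()                 # instance drawn independently according to D
        examples.append((x, f(x)))   # paired with its correct target value f(x)
    return examples

draw_x = lambda: random.randint(0, 9)   # assumed D: uniform over the integers 0..9
f = lambda x: x % 2 == 0                # assumed target function
print(training_examples(f, draw_x, 5))  # e.g. [(3, False), (8, True), ...]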

 

The following two questions are of particular relevance to us in this context: 

 

  1. Given a hypothesis h and a data sample containing n examples drawn at random according to the distribution D, what is the best estimate of the accuracy of h over future instances drawn from the same distribution?

 

  2. What is the probable error in this accuracy estimate?

 

True Error and Sample Error: 

We must distinguish between two notions of accuracy or, to put it another way, error. One is the error rate of the hypothesis over the sample of data that is available. 

 

The other is the error rate of the hypothesis over the entire, unknown distribution D of examples. These are referred to as the sample error and the true error, respectively.

 

The sample error of a hypothesis, with respect to some sample S of examples drawn from X, is the fraction of S that the hypothesis misclassifies.

 

Sample Error:

The sample error of hypothesis h with respect to target function f and data sample S, denoted error_S(h), is

error_S(h) = (1/n) * Σ_{x ∈ S} δ(f(x), h(x))

where n is the number of examples in S, and the quantity δ(f(x), h(x)) is 1 if f(x) != h(x), and 0 otherwise. 
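
A direct translation of this definition into code looks like the following sketch; the target f, hypothesis h, and sample S at the bottom are placeholders chosen only to exercise the function.

def sample_error(h, f, S):
    # error_S(h): the fraction of the sample S that h misclassifies,
    # i.e. (1/n) * sum over x in S of delta(f(x), h(x)).
    n = len(S)
    mistakes = sum(1 for x in S if f(x) != h(x))   # delta is 1 exactly when f(x) != h(x)
    return mistakes / n

f = lambda x: x % 2 == 0    # placeholder target: "x is even"
h = lambda x: x < 6         # placeholder hypothesis: "x is less than 6"
S = list(range(10))
print(sample_error(h, f, S))   # 0.5: h and f disagree on 1, 3, 5, 6, 8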

 

True Error: 

The true error of hypothesis h with respect to target function f and distribution D, denoted error_D(h), is the probability that h will misclassify an instance drawn at random according to D.
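
Because D is unknown in practice, error_D(h) cannot be computed directly; it can only be estimated. The sketch below assumes, for illustration only, that we can draw instances from D, and approximates the misclassification probability by Monte Carlo sampling (the distribution, target, and hypothesis are the same made-up placeholders as above).

import random

random.seed(2)

def estimate_true_error(h, f, draw_x, trials=100_000):
    # Approximate error_D(h) = Pr[f(x) != h(x)] for x drawn according to D,
    # by sampling many instances (via draw_x) and counting mismatches.
    mistakes = 0
    for _ in range(trials):
        x = draw_x()
        if f(x) != h(x):
            mistakes += 1
    return mistakes / trials

draw_x = lambda: random.randint(0, 9)      # assumed D: uniform over the integers 0..9
f = lambda x: x % 2 == 0                   # placeholder target
h = lambda x: x < 6                        # placeholder hypothesis
print(estimate_true_error(h, f, draw_x))   # close to 0.5 for this toy setup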

 

Confidence Intervals for Discrete-Valued Hypotheses:

The question here is: how good an estimate of error_D(h) is provided by error_S(h), in the case of a discrete-valued hypothesis h?

 

To estimate the true error of a discrete-valued hypothesis h from its observed sample error over a sample S, we assume that

  • the sample S contains n examples drawn independently of one another, and independently of h, according to the probability distribution D;
  • n >= 30; and
  • hypothesis h commits r errors over these n examples, so that error_S(h) = r/n.

 

Under these circumstances, statistical theory permits us to state the following:

  • If no additional information is available, the most probable value of error_D(h) is error_S(h).
  • With approximately 95% probability, the true error error_D(h) lies in the interval

    error_S(h) ± 1.96 * sqrt(error_S(h) * (1 - error_S(h)) / n)

A more precise rule of thumb is that the above approximation works well when n * error_S(h) * (1 - error_S(h)) >= 5.
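
Putting the interval and the rule of thumb together, here is a small sketch; the counts r = 12 and n = 40 are made-up numbers used only to show the calculation.

import math

def confidence_interval_95(r, n):
    # 95% confidence interval for error_D(h), given r misclassifications
    # observed on n independently drawn test examples (n >= 30).
    error_s = r / n
    margin = 1.96 * math.sqrt(error_s * (1 - error_s) / n)
    return error_s, error_s - margin, error_s + margin

r, n = 12, 40
error_s, lower, upper = confidence_interval_95(r, n)
print(f"error_S(h) = {error_s:.3f}")
print(f"approximate 95% interval: [{lower:.3f}, {upper:.3f}]")
# Rule of thumb for when the normal approximation is reasonable:
print("n * error_S * (1 - error_S) >= 5:", n * error_s * (1 - error_s) >= 5)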