
Designing a learning system

 

The formal definition of machine learning, as discussed in the earlier posts of this Machine Learning series, is: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

 

One of the examples discussed was learning to play checkers. For this example, the parameters T, P, and E are:

T -> Play the checkers game.

P -> Percentage of games won against the opponent.

E -> Playing practice games against itself. 

 

Steps to design a learning system: 

A successful learning system depends on sound design choices, so we work through the design in a fixed order. For the checkers example, designing a learning system is a five-step process. The steps are:

 

  1. Choosing the Training Experience
  2. Choosing the Target Function
  3. Choosing a Representation for the Target Function
  4. Choosing a Function Approximation Algorithm
  5. The Final Design

 

Let’s look at each of them briefly.

 

 1. Choosing the Training Experience

The type of training experience chosen has a considerable impact on how well the learner succeeds. The training data should have characteristics similar to those of the data on which the system will eventually be evaluated.

 

To choose the right training experience for your algorithm, consider these three attributes:

 

a) Type of Feedback: Check whether the training experience provides direct or indirect feedback on the choices made by the performance system.

With direct feedback, each individual choice is labeled with the correct answer, for example a board state paired with the best move for it. With indirect feedback, the learner sees only a sequence of moves and the final outcome of that sequence, and must infer from the outcome how good each move was.
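The difference between the two kinds of feedback is easiest to see in the shape of the training data. The sketch below is only an illustration: the class names, the string encoding of boards and moves, and the +1/0/-1 outcome coding are assumptions made for this example, not part of any checkers library.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical record types; boards and moves are plain strings here.

@dataclass
class DirectExample:
    board: str        # a single board state
    best_move: str    # the correct move for that state, supplied by a teacher

@dataclass
class IndirectExample:
    moves: List[str]  # every move played in one complete game
    outcome: int      # only the final result: +1 win, 0 draw, -1 loss

direct = DirectExample(board="board_17", best_move="11-15")
indirect = IndirectExample(moves=["11-15", "22-18", "15x22"], outcome=1)

def has_per_move_label(example) -> bool:
    """True when every decision comes with its own feedback."""
    return isinstance(example, DirectExample)
```

With the indirect record, the learner knows the game was won but not which of the three moves deserves the credit, which is what makes learning from indirect feedback harder.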

 

b) Degree of Control: This is the extent to which the learner controls the sequence of training examples.

For example, the learner might rely on a teacher to select board states and supply the correct move for each one, or it might propose board states itself and ask for help only on the ones it finds confusing. It might also have complete control, as when it learns by playing against itself.

 

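c) Representativeness of the Training Distribution: The third crucial attribute is how well the training experience represents the distribution of examples over which the final performance will be measured.

Learning is most reliable when the training examples follow a distribution similar to the one the system will be tested on. A checkers program that trains only by playing against itself, for instance, may never see board states that a human opponent would commonly produce.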

 

2. Choosing the Target Function:

The next design decision is to determine exactly what type of knowledge will be learned and how the performance program will use it.

 

Let’s return to the checkers example. Assume the program can already generate the legal moves from any board state (the legal moves are all the moves the rules permit from that state). It then only needs to learn how to select the best move from among them.

 

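The choice of the target function is a key decision in designing the entire system. Here the target function is V: B -> R, where B is the set of legal board states and R is the set of real numbers. This notation says that V maps any legal board state to a real-valued score, with higher scores assigned to better board states.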

 

The value V(b) of a board state b in a checkers game is defined as follows:

  1. V(b) = 100 if b is a final board state that is won.
  2. V(b) = -100 if b is a final board state that is lost.
  3. V(b) = 0 if b is a final board state that is drawn.
  4. V(b) = V(b’) if b is not a final state, where b’ is the best final board state that can be reached starting from b and playing optimally until the end of the game.
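The four rules above can be traced on a tiny hand-made game tree. This is a sketch under stated assumptions: the tree, the state names, and the `our_turn` flag are invented for illustration, and "playing optimally" is taken to mean that we pick the best successor on our turn while the opponent picks the worst one for us on theirs.

```python
from typing import Dict, List

# Hypothetical toy game tree: each non-final state maps to the states
# reachable from it in one move.
SUCCESSORS: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["a_win", "a_draw"],
    "b": ["b_loss", "b_draw"],
}

# Rules 1 to 3: values of final states (won, lost, drawn).
FINAL_VALUES: Dict[str, int] = {
    "a_win": 100,
    "a_draw": 0,
    "b_loss": -100,
    "b_draw": 0,
}

def V(state: str, our_turn: bool = True) -> int:
    """Value of a state when both sides play optimally to the end."""
    if state in FINAL_VALUES:
        return FINAL_VALUES[state]
    # Rule 4: search all the way to the final states.
    values = [V(s, not our_turn) for s in SUCCESSORS[state]]
    return max(values) if our_turn else min(values)
```

From `start` on our turn, the opponent replies in `a` or `b`, so the best we can force is a draw and `V("start")` is 0. Computing V this way means searching every line of play to the end of the game, which is not feasible for real checkers; that is why the later steps learn an approximation instead.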

 

3. Choosing a Representation for the Target Function:

Having chosen the target function, we next choose a representation that the learning program will use to describe it. Many representations are possible, such as a linear function of board features, a hierarchical graph, or a table with one entry per board state.

 

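Given the legal moves available in a position, the NextMove function evaluates the board state that results from each move and selects the move with the highest estimated value, which is the move most likely to lead to a win. For example, if a game-playing program has four alternative moves, it scores the four resulting boards and plays the move that leads to the highest-scoring one.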
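As a minimal sketch of the linear option, the estimate can be written as w0 + w1*x1 + ... + wn*xn, where the x values are numeric features of a board and the w values are weights. Everything concrete below is an assumption for illustration: the two features (our piece count and the opponent's piece count), the weight values, and the names `v_hat` and `next_move`.

```python
from typing import Dict, Sequence

def v_hat(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear evaluation: w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def next_move(candidates: Dict[str, Sequence[float]],
              weights: Sequence[float]) -> str:
    """Return the move whose resulting board has the highest estimate."""
    return max(candidates, key=lambda move: v_hat(candidates[move], weights))

# Hypothetical features of the board reached by each of four legal moves:
# (number of our pieces, number of opponent pieces).
candidates = {
    "m1": (12, 12),
    "m2": (12, 11),  # captures an opponent piece
    "m3": (11, 12),  # loses one of our pieces
    "m4": (11, 11),
}
weights = [0.0, 1.0, -1.0]  # w0, weight for our pieces, weight for theirs

best = next_move(candidates, weights)
```

With these weights the boards score 0, 1, -1, and 0, so `best` is `"m2"`. The quality of the choice depends entirely on the weights, which is what the next step sets out to learn.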

 

4. Choosing a Function Approximation Algorithm:

In this step, we choose a learning algorithm that can approximate the chosen target function from training examples. The step consists of two sub-steps: (a) estimating the training values and (b) adjusting the weights.

 

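To estimate the training value for a board state, we use the current estimate of the value of its successor, that is, the next board state at which it is again the program's turn to move. To adjust the weights of a linear function, we use an algorithm such as least mean squares (LMS), which nudges each weight in the direction that reduces the error between the training value and the current estimate.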
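The two sub-steps can be sketched as follows, assuming a linear evaluation function over numeric board features. The function names, the two-feature boards, and the learning rate of 0.01 are assumptions for illustration. The update shown is the standard LMS rule: each weight w_i moves by lr * (training value - current estimate) * x_i.

```python
from typing import List, Optional, Sequence

def v_hat(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear evaluation: w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def training_value(successor_features: Sequence[float],
                   weights: Sequence[float],
                   final_value: Optional[float] = None) -> float:
    """Sub-step (a): use the known outcome if the game has ended,
    otherwise the current estimate for the successor board."""
    if final_value is not None:
        return final_value
    return v_hat(successor_features, weights)

def lms_update(weights: Sequence[float], features: Sequence[float],
               v_train: float, lr: float = 0.01) -> List[float]:
    """Sub-step (b): one LMS step toward the training value."""
    error = v_train - v_hat(features, weights)
    xs = [1.0, *features]  # x0 = 1 pairs with the bias weight w0
    return [w + lr * error * x for w, x in zip(weights, xs)]

# Hypothetical example: a board with features (3, 1) led directly to a win.
old_weights = [0.0, 0.0, 0.0]
board = (3.0, 1.0)
target = training_value(board, old_weights, final_value=100.0)
new_weights = lms_update(old_weights, board, target)
```

After this single step the estimate for `board` has moved from 0 toward the training value of 100. Repeating the update over many boards and games gradually fits the weights.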

 

5. The Final Design: 

 

The final design consists of four modules, as described in the picture. 

  1. Performance System: Takes a new problem (a new game) as input and solves the performance task using the currently learned target function, producing the history of the game as output.
  2. Critic: Takes the history of the game as input and generates training examples of the target function.
  3. Generalizer: Takes the training examples as input and outputs a hypothesis, which is its estimate of the target function.
  4. Experiment Generator: Takes the current hypothesis as input and creates a new problem for the performance system to explore.
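The way the four modules feed each other can be sketched as a loop. This is a toy, not a checkers program: a "board" is just a non-negative integer, a move subtracts 1 or 2, reaching 0 counts as a win worth 100, and the hypothesis is a linear function w0 + w1*x of that integer. All names and constants are assumptions chosen to keep the sketch short and runnable.

```python
import random
from typing import List, Tuple

Weights = List[float]          # hypothesis: V_hat(x) = w0 + w1 * x
Example = Tuple[float, float]  # (board feature x, training value)

def v_hat(x: float, w: Weights) -> float:
    return w[0] + w[1] * x

def performance_system(start: int, w: Weights) -> List[int]:
    """Play one toy game with the current hypothesis; return its history."""
    trace, state = [start], start
    while state > 0:
        options = [s for s in (state - 1, state - 2) if s >= 0]
        state = max(options, key=lambda s: v_hat(s, w))
        trace.append(state)
    return trace

def critic(trace: List[int], w: Weights) -> List[Example]:
    """Turn a game history into (board, training value) examples."""
    examples = []
    for state, successor in zip(trace, trace[1:]):
        target = 100.0 if successor == 0 else v_hat(successor, w)
        examples.append((float(state), target))
    return examples

def generalizer(examples: List[Example], w: Weights,
                lr: float = 0.01) -> Weights:
    """Fit the hypothesis to the examples with LMS updates."""
    for x, target in examples:
        error = target - v_hat(x, w)
        w = [w[0] + lr * error, w[1] + lr * error * x]
    return w

def experiment_generator(w: Weights, rng: random.Random) -> int:
    """Propose a new starting board. This sketch ignores the hypothesis
    and picks a random start; a smarter generator could use it."""
    return rng.randint(1, 5)

def train(rounds: int = 50, seed: int = 0) -> Weights:
    rng = random.Random(seed)
    w: Weights = [0.0, 0.0]
    for _ in range(rounds):
        problem = experiment_generator(w, rng)
        trace = performance_system(problem, w)
        examples = critic(trace, w)
        w = generalizer(examples, w)
    return w
```

Calling `train()` runs the cycle 50 times and returns the final weights. Each pass through the loop matches one trip around the four modules: new problem, game history, training examples, updated hypothesis.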

 
