Reinforcement Learning

Reinforcement learning is a type of machine learning in which a model (the agent) learns in an interactive environment by trial and error, using feedback from its own actions and experiences. It trains the model to make a sequence of decisions.

In reinforcement learning, an artificial intelligence faces a game-like situation. It receives rewards or penalties for the actions it performs, and these signals steer it toward the behavior the programmer wants. The goal of reinforcement learning is to maximize the total reward.
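
The reward-and-penalty loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CoinGuessEnv` environment, its `step` method, and the +1/-1 reward values are hypothetical names and numbers chosen for the example, not part of any library.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy game: guess the side of a biased coin."""

    def __init__(self, p_heads=0.8, seed=0):
        self.rng = random.Random(seed)
        self.p_heads = p_heads

    def step(self, action):
        # The environment reacts to the action with a reward (+1) or a penalty (-1).
        outcome = "heads" if self.rng.random() < self.p_heads else "tails"
        return 1 if action == outcome else -1


def run_episode(env, policy, n_steps=100):
    """Play n_steps rounds and return the total reward collected."""
    total_reward = 0
    for _ in range(n_steps):
        action = policy()
        total_reward += env.step(action)
    return total_reward


# Two fixed policies played against identical coin flips (same seed).
always_heads = run_episode(CoinGuessEnv(), lambda: "heads")
always_tails = run_episode(CoinGuessEnv(), lambda: "tails")
print(always_heads, always_tails)
```

Because the coin favors heads, the policy that always guesses heads collects the larger total reward. A learning agent would have to discover this from the rewards alone.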

Although both supervised and reinforcement learning learn a mapping between input and output, reinforcement learning uses rewards and punishments as signals for positive and negative behavior. In supervised learning, by contrast, the feedback given to the model is the correct set of actions for accomplishing the task.

Compared with unsupervised learning, reinforcement learning differs in its goal. In unsupervised learning, the goal is to find similarities and differences between data points. In reinforcement learning, the goal is to find a suitable action model that maximizes the agent's total cumulative reward.
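
One common way to write this goal down is shown below. The notation is an assumption made for illustration and does not come from this tutorial: $r_t$ is the reward received at step $t$, $T$ is the last step of an episode, $G$ is the total cumulative reward, and $\pi$ is the action model (policy).

```latex
G = \sum_{t=0}^{T} r_t,
\qquad
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}\left[\, G \mid \pi \,\right]
```

Read aloud: among all candidate action models, pick the one whose expected total reward is highest.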

Reinforcement learning is currently one of the most effective ways to draw out a machine's creativity, because it leverages the power of search and many trials. Although the programmer defines the reward policy (the rules of the game), they give the model no hints or suggestions on how to solve the problem. The model itself decides how to accomplish the task so as to maximize the reward. It begins with random trials and ends with sophisticated techniques.
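
The shift from random trials to informed choices can be illustrated with an epsilon-greedy strategy on a multi-armed bandit. This is one simple technique chosen for the example, not the only way to do it. The function name, the decay schedule `1 / sqrt(t)`, the payout means, and the noise level are all assumptions.

```python
import random


def epsilon_greedy_bandit(true_means, n_steps=2000, noise=0.1, seed=0):
    """Learn which arm pays best, starting from purely random trials."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was tried
    estimates = [0.0] * n_arms   # running average reward per arm

    for t in range(1, n_steps + 1):
        epsilon = 1.0 / t ** 0.5  # exploration rate shrinks over time
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # random trial
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # best guess so far

        reward = rng.gauss(true_means[arm], noise)
        counts[arm] += 1
        # Incremental mean: move the estimate toward the observed reward.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


counts, estimates = epsilon_greedy_bandit([0.1, 0.5, 0.9])
print(counts)
```

The programmer only supplies the rewards. The preference for the best-paying arm emerges from the trials themselves.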

Key Points in Reinforcement Learning

The input to a reinforcement learning model should be an initial state from which the model begins.
There are many possible outputs, because a specific problem can have a variety of solutions.
Training is based on the input provided: the model returns a state, and the user decides whether to reward or punish the model according to its output.
The model continues to learn.
The best solution is the one that yields the maximum reward.
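
The points above can be made concrete with tabular Q-learning, one widely used reinforcement learning algorithm, on a hypothetical five-cell corridor. Everything specific here is an assumption made for the sketch: the corridor, the reward values (+1 at the goal, -0.1 per other move), and the hyperparameters `alpha`, `gamma`, and `epsilon`.

```python
import random

N_STATES = 5        # cells 0..4; the agent starts in cell 0, the goal is cell 4
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.1  # reward at the goal, small penalty otherwise
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False  # every episode begins from the initial state
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: random action
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move toward reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Best action in each non-goal cell according to the learned values.
greedy_policy = [ACTIONS[0 if row[0] > row[1] else 1] for row in q_table[:-1]]
print(greedy_policy)
```

Many action sequences reach the goal, but the learned values single out the one with the highest total reward, which here is to move right in every cell.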

Types of Reinforcement

Positive Reinforcement

Positive reinforcement occurs when an event that follows a specific behavior increases the strength and frequency of that behavior. Put simply, it has a positive effect on the behavior.
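
As a rough numerical illustration of this definition, the sketch below keeps a "strength" score per behavior and raises it each time the behavior is followed by a reward. The behavior names, the update rule, and the `learning_rate` value are all hypothetical and chosen only to show the effect.

```python
def reinforce(strengths, behavior, reward, learning_rate=0.5):
    """Return new strengths after a behavior is followed by a reward."""
    updated = dict(strengths)
    updated[behavior] += learning_rate * reward
    return updated


def frequencies(strengths):
    """Convert strengths into the share of time each behavior is chosen."""
    total = sum(strengths.values())
    return {name: value / total for name, value in strengths.items()}


strengths = {"sit": 1.0, "bark": 1.0}
before = frequencies(strengths)

# Reward the "sit" behavior four times in a row.
for _ in range(4):
    strengths = reinforce(strengths, "sit", reward=1.0)

after = frequencies(strengths)
print(before["sit"], after["sit"])
```

The rewarded behavior grows in both strength and relative frequency, while the unrewarded one becomes comparatively rarer.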

Merits of reinforcement learning