Reinforcement Learning

Reinforcement learning is a subfield of machine learning. Reinforcement means taking suitable actions to maximize the reward in a given task. It is used in software, games, and machines to find the best possible behavior in a given situation.

Reinforcement learning is quite different from supervised learning. In supervised learning we train the model like a teacher, using predefined labeled data that holds the answer to every question. In simple words, the model is given a set of questions together with their answers, so it always trains against the correct answer. In reinforcement learning, on the other hand, no labeled data is given to the model. The model is forced to find its own answer to a given question, and it learns that answer from experience.

It is similar to finding a route from a source to a destination without Google Maps or any other guidance. You pick the road that looks best and follow it. It may or may not lead to your destination, but you learn from each attempt, and over time you find the shortest possible route from source to destination.
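The route-finding idea can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the road map `ROADS`, its travel costs, and the helper names `learn_routes` and `best_route` are all made up for this example. The traveller keeps a score for every road, tries roads by trial and error, and after each step sets the score to the cost just paid plus the best score known from the next place (a simple form of tabular Q-learning that fits here because the map is fixed and deterministic).

```python
import random

# Hypothetical road map (an assumption for illustration):
# ROADS[place][next_place] = travel cost of that road.
ROADS = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "D": {},  # destination, no onward roads
}


def learn_routes(source="A", goal="D", episodes=500, epsilon=0.2, seed=0):
    """Learn a score for every road by repeatedly travelling from source to goal."""
    rng = random.Random(seed)
    # Every road starts with a score of 0.0, i.e. "nothing known yet".
    q = {place: {nxt: 0.0 for nxt in exits} for place, exits in ROADS.items()}
    for _ in range(episodes):
        place = source
        while place != goal:
            options = list(q[place])
            if rng.random() < epsilon:
                nxt = rng.choice(options)              # explore: try any road
            else:
                nxt = max(options, key=q[place].get)   # exploit: best road so far
            reward = -ROADS[place][nxt]                # a longer road is a worse reward
            best_onward = max(q[nxt].values(), default=0.0)
            # Learning rate 1 is enough here because the map never changes.
            q[place][nxt] = reward + best_onward
            place = nxt
    return q


def best_route(q, source="A", goal="D"):
    """Follow the highest-scoring road from each place until the goal is reached."""
    route, place = [source], source
    while place != goal:
        place = max(q[place], key=q[place].get)
        route.append(place)
    return route


scores = learn_routes()
print(best_route(scores))
```

On this small map the learned scores should lead the traveller along A, B, C, D, which is the cheapest route (total cost 3), even though the direct-looking roads A to C and B to D are available.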

Let’s imagine a situation where a baby comes across a burning candle. The baby does not know what happens if it touches the flame. Out of curiosity, the baby touches the flame and gets burned. After this event, the baby understands that repeating the same action might burn it again. So the next time the baby sees a burning candle, it will be more cautious. Reinforcement learning works in much the same way.

Reinforcement learning is a machine learning approach in which the model being trained for a specific job learns on its own, from its previous experiences and their consequences while performing similar jobs.

Reinforcement learning can be explained using the idea of an agent that takes actions in an environment and, depending on the resulting state, receives a positive or negative reward.


Reinforcement Learning elements:

  • Agent: The learning algorithm, which takes the actions.
  • Environment: The world in which the agent operates. Based on the agent's current state and its action, the environment gives the agent a reward.
  • State: The situation in which the agent finds itself, for example a moment in time, obstacles, a specific place, or the tools available.
  • Action: Any of the possible moves the agent can make to perform the given task; together they form the set of available actions.
  • Reward: Positive or negative feedback, used to measure the success or failure of the agent's actions.
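To see how these five elements fit together, here is a minimal Python sketch of the agent-environment loop. Everything in it is an assumption made for illustration: the corridor world, the class names `CorridorEnvironment` and `RandomAgent`, and the reward values are not taken from any library. The agent here acts at random and does not learn yet; the point is only to show where state, action, and reward appear in the loop.

```python
import random


class CorridorEnvironment:
    """Hypothetical toy environment: a corridor of cells 0..4 with the goal at cell 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0  # State: the cell the agent currently stands on

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action: -1 (move left) or +1 (move right); the walls clamp the position.
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1
        # Reward: +1 for reaching the goal, a small penalty for every other step.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


class RandomAgent:
    """Agent that picks one of its actions at random (no learning yet)."""

    def __init__(self, actions=(-1, 1), seed=0):
        self.actions = actions
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice(self.actions)


def run_episode(env, agent, max_steps=100):
    """Run the loop: observe state, choose action, receive reward and next state."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward, state


total_reward, final_state = run_episode(CorridorEnvironment(), RandomAgent())
print("total reward:", total_reward, "final state:", final_state)
```

A learning agent would replace `RandomAgent.act` with a rule that uses the rewards seen so far to prefer better actions.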

We can understand the elements of reinforcement learning through a simple example, the self-driving car. The agent is the self-driving car system, and the environment is the road, the other cars, and the traffic signals where the car has to drive.

The state is the car's current situation at a traffic signal: stopped or moving. The action is to start and drive on a green signal, or to stop on a red signal. If the car does not start moving on green, another car may hit it, which is a negative reward. If the car starts and moves away, that is a positive reward. From these rewards the agent learns what to do in each situation.
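The traffic-signal example can be written as a tiny learning program. The sketch below is a simplification under stated assumptions: the reward values (+1 and -1), the penalty for driving on red, and the names `reward` and `train` are chosen for illustration only. It also uses a one-step value update with epsilon-greedy exploration rather than a full reinforcement learning algorithm, because each decision at the signal is treated on its own.

```python
import random

STATES = ("green", "red")
ACTIONS = ("drive", "stop")


def reward(signal, action):
    # Assumed reward table: +1 for the correct action at this signal, -1 otherwise.
    correct = "drive" if signal == "green" else "stop"
    return 1.0 if action == correct else -1.0


def train(episodes=500, alpha=0.1, epsilon=0.2, seed=0):
    """Learn a value for each (signal, action) pair from rewards alone."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    for _ in range(episodes):
        signal = rng.choice(STATES)                      # the environment shows a signal
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                 # explore: try something at random
        else:
            action = max(q[signal], key=q[signal].get)   # exploit: best action so far
        r = reward(signal, action)
        # Move the stored value a small step toward the reward just received.
        q[signal][action] += alpha * (r - q[signal][action])
    return q


values = train()
policy = {s: max(values[s], key=values[s].get) for s in STATES}
print(policy)
```

The agent is never told the traffic rules. It only sees rewards, and the learned policy should still end up as "drive" on green and "stop" on red.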

Practical applications of Reinforcement Learning

Reinforcement learning requires a huge amount of data, which is why it is most widely used in domains where simulated data is available, such as robotics and gameplay.

  • In industrial automation and robotics, it enables machines to build efficient automated control systems that learn from their own behavior and experience.

  • It is widely used in building AI-based games, where an AI agent plays a computer game instead of a human and learns by repeatedly attempting the game's challenges. DeepMind's AlphaGo became the first program to defeat a world champion at the ancient Chinese game of Go, and its successor AlphaGo Zero learned the game purely through self-play. Other examples include Backgammon, CoastRunners 7, Pacman, Atari Breakout, and Mario.

  • Text summarization applications and dialog agents (speech, text) that can learn from user behavior and interactions and improve over time.
  • Reinforcement learning based agents for online stock trading.
  • Learning optimal treatment guidelines in healthcare.

Wrapping up: Reinforcement learning is reward-based learning. For each action it takes while performing a task, the agent receives a positive or negative reward, and from these rewards it learns how to perform the task. This lets a software agent learn from its own behavior, based on feedback and rewards from its working environment.

