TensorFlow.js: Reinforcement Learning

Train a model to balance a pole on a cart using reinforcement learning.

Description

This example illustrates how to use TensorFlow.js to perform simple reinforcement learning (RL). Specifically, it showcases an implementation of the policy-gradient method in TensorFlow.js. This implementation is used to solve the classic cart-pole control problem.

Through self-play, the agent learns to balance the pole for as many steps as it can.
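Policy-gradient methods typically weight each action by the discounted sum of the rewards that followed it, which is where a reward discount rate such as the one in the training parameters comes in. The following is a minimal, dependency-free sketch of that step. The function names `discountRewards` and `normalizeRewards` are hypothetical and are not taken from this example's source, and the normalization step is a common variance-reduction practice that is assumed here rather than confirmed by this page.

```javascript
// Illustrative sketch only: names and structure are assumptions,
// not the actual code behind this demo.

// Compute, for every step, the discounted sum of that step's reward
// and all later rewards: G[t] = r[t] + discountRate * G[t + 1].
function discountRewards(rewards, discountRate) {
  const discounted = new Array(rewards.length);
  let running = 0;
  // Walk backwards so each step can reuse the value of the next step.
  for (let i = rewards.length - 1; i >= 0; i--) {
    running = rewards[i] + discountRate * running;
    discounted[i] = running;
  }
  return discounted;
}

// Shift and scale values to zero mean and unit standard deviation.
// Assumed here as a common way to stabilize policy-gradient updates.
function normalizeRewards(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  const std = Math.sqrt(variance);
  // Guard against division by zero when all values are identical.
  return values.map((v) => (v - mean) / (std || 1));
}
```

For a three-step game with a reward of 1 per step and a discount rate of 0.5, `discountRewards([1, 1, 1], 0.5)` gives `[1.75, 1.5, 1]`: earlier steps receive more credit because the pole stayed up longer after them.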

Instructions

Choose a hidden layer size and click "Create Model".
Select training parameters and then click "Train".
While the model is training, it periodically saves a copy of itself to local browser storage. This means you can refresh the page and continue training from the last save point. If at any point you want to start training from scratch, click "Delete stored Model".
Once the model has finished training, you can click "Test" to see how many steps the agent can balance the pole for. You can also click "Stop" to pause the training after the current iteration ends if you want to test the model sooner.
During training and testing, a small simulation of the agent's behaviour is rendered.
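The first step takes the hidden layer sizes as a comma-separated string, such as "5" or "8,6" in the "Initialize Model" form. As a rough sketch of how such input could be turned into a list of layer widths before building a model, consider the helper below. The name `parseHiddenLayerSizes` and its validation rules are assumptions made for illustration and do not describe how this page actually parses the field.

```javascript
// Illustrative sketch only: a hypothetical parser for the
// "Hidden layer size(s)" field, not the demo's actual implementation.
function parseHiddenLayerSizes(text) {
  const sizes = text
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0) // Ignore empty entries like "8,,6".
    .map((part) => Number(part));

  // Assumed rule: every size must be a positive whole number.
  const allValid = sizes.every((n) => Number.isInteger(n) && n > 0);
  if (sizes.length === 0 || !allValid) {
    throw new Error(`Invalid hidden layer sizes: "${text}"`);
  }
  return sizes;
}
```

With this sketch, `parseHiddenLayerSizes("8,6")` returns `[8, 6]`, which would correspond to two hidden layers of 8 and 6 units, while input such as "abc" or "0" is rejected with an error.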

Status

Standing by.

Initialize Model

Hidden layer size(s) (e.g.: "5", "8,6"):

Locally-stored model

Training Parameters

Number of iterations:

Games per iteration:

Max. steps per game:

Reward discount rate:

Learning rate:

Render during training: Uncheck me to speed up training.

Training Progress

Game #:

Iteration #:

Training speed:

Simulation