Train a model to balance a pole on a cart using reinforcement learning.
This example illustrates how to use TensorFlow.js to perform simple
reinforcement learning (RL).
Specifically, it showcases an implementation of the policy-gradient method in TensorFlow.js.
This implementation is used to solve the classic cart-pole problem.
Through self-play, the agent learns to balance
the pole for as many steps as it can.
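The core bookkeeping of a policy-gradient method can be illustrated without any library code. The following is a minimal sketch, assuming a REINFORCE-style setup in which the agent earns a reward of 1 for each step the pole stays up and later rewards are discounted by a factor `gamma`. The function names `discountRewards` and `normalize` and the chosen `gamma` are illustrative assumptions, not taken from the example's source.

```javascript
// Compute the discounted return for every step of one episode.
// rewards[t] is the reward received at step t; gamma is in (0, 1].
function discountRewards(rewards, gamma) {
  const returns = new Array(rewards.length);
  let running = 0;
  // Walk backwards so each step accumulates all of its future rewards.
  for (let t = rewards.length - 1; t >= 0; t--) {
    running = rewards[t] + gamma * running;
    returns[t] = running;
  }
  return returns;
}

// Shift and scale values to zero mean and unit variance.
// This is a common way to reduce the variance of the gradient estimate.
function normalize(values) {
  const n = values.length;
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const variance = values.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
  const std = Math.sqrt(variance) || 1; // avoid division by zero
  return values.map((v) => (v - mean) / std);
}

const episodeReturns = discountRewards([1, 1, 1], 0.5);
console.log(episodeReturns); // → [1.75, 1.5, 1]
console.log(normalize(episodeReturns));
```

In a policy-gradient update, normalized returns like these would weight the gradient of the log-probability of each action taken, so that actions followed by longer balancing become more likely.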
Choose a hidden layer size and click "Create Model".
Select training parameters and then click "Train".
Note that while the model is training, it periodically saves a copy of itself
to local browser storage. This means you can refresh the page and continue training
from the last save point. If at any point you want to start training from scratch, click
"Delete stored Model".
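TensorFlow.js can persist a model to the browser's local storage through the `localstorage://` URL scheme. The following is a minimal sketch of the save, restore, and delete cycle described above. It assumes a hypothetical storage key `cart-pole-demo` and takes the `tf` namespace as an argument so the helpers stay self-contained; the key and helper names actually used by the example are not given in the text.

```javascript
// Hypothetical storage key; the real example may use a different one.
const MODEL_NAME = 'cart-pole-demo';

// Build the URL that TensorFlow.js maps to browser local storage.
function checkpointUrl(name) {
  return `localstorage://${name}`;
}

// Save a copy of the model. `model` is assumed to be a tf.LayersModel.
async function saveCheckpoint(model, name = MODEL_NAME) {
  return model.save(checkpointUrl(name));
}

// Load a previously stored model, or return null if none exists.
// `tf` is the TensorFlow.js namespace, e.g. from
// `import * as tf from '@tensorflow/tfjs'`.
async function loadCheckpoint(tf, name = MODEL_NAME) {
  const stored = await tf.io.listModels();
  if (!(checkpointUrl(name) in stored)) return null;
  return tf.loadLayersModel(checkpointUrl(name));
}

// Remove the stored copy so that training starts from scratch.
async function deleteCheckpoint(tf, name = MODEL_NAME) {
  return tf.io.removeModel(checkpointUrl(name));
}
```

A training loop could call `saveCheckpoint` every few iterations and call `loadCheckpoint` once on page load, falling back to a freshly created model when it returns `null`.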
Once the model has finished training, you can click "Test" to see how many steps the agent
can balance the pole for. You can also click "Stop" to pause the training after the current iteration
ends if you want to test the model sooner.
During training and testing, a small simulation of the agent's behaviour is rendered.
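The physics behind such a simulation can be sketched in plain JavaScript. The following is a minimal sketch, assuming the commonly used cart-pole equations of motion with Euler integration and the constants often quoted for this problem (a 2.4-unit track limit and a 12-degree pole limit). All names and constants here are assumptions for illustration, not the example's actual code.

```javascript
// Constants from a commonly used cart-pole formulation (assumed here).
const GRAVITY = 9.8;
const MASS_CART = 1.0;
const MASS_POLE = 0.1;
const TOTAL_MASS = MASS_CART + MASS_POLE;
const HALF_POLE_LENGTH = 0.5;
const POLE_MASS_LENGTH = MASS_POLE * HALF_POLE_LENGTH;
const FORCE_MAG = 10.0;
const TAU = 0.02; // seconds per simulation step
const X_LIMIT = 2.4; // cart position limit
const THETA_LIMIT = (12 * Math.PI) / 180; // pole angle limit in radians

// Advance the state by one time step using Euler integration.
// action: 1 pushes the cart right, anything else pushes it left.
function step(state, action) {
  const { x, xDot, theta, thetaDot } = state;
  const force = action === 1 ? FORCE_MAG : -FORCE_MAG;
  const cos = Math.cos(theta);
  const sin = Math.sin(theta);
  const temp =
    (force + POLE_MASS_LENGTH * thetaDot * thetaDot * sin) / TOTAL_MASS;
  const thetaAcc =
    (GRAVITY * sin - cos * temp) /
    (HALF_POLE_LENGTH * (4 / 3 - (MASS_POLE * cos * cos) / TOTAL_MASS));
  const xAcc = temp - (POLE_MASS_LENGTH * thetaAcc * cos) / TOTAL_MASS;
  return {
    x: x + TAU * xDot,
    xDot: xDot + TAU * xAcc,
    theta: theta + TAU * thetaDot,
    thetaDot: thetaDot + TAU * thetaAcc,
  };
}

// The episode ends when the cart leaves the track or the pole tips too far.
function isDone(state) {
  return Math.abs(state.x) > X_LIMIT || Math.abs(state.theta) > THETA_LIMIT;
}

// Count how many steps a policy keeps the pole up, capped at maxSteps.
function runEpisode(policy, maxSteps) {
  let state = { x: 0, xDot: 0, theta: 0.01, thetaDot: 0 };
  let steps = 0;
  while (steps < maxSteps && !isDone(state)) {
    state = step(state, policy(state));
    steps++;
  }
  return steps;
}

// A naive hand-written policy: push in the direction the pole is leaning.
const leanPolicy = (s) => (s.theta > 0 ? 1 : 0);
console.log(runEpisode(leanPolicy, 500));
```

The step count returned by `runEpisode` corresponds to the number of steps reported when testing; a trained model would take the place of `leanPolicy`.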