In this third Part of Applying Temporal Difference Methods to Machine Learning, I will be experimenting with the intra-sequence update variant of TD learning: a method where the parameters are updated after each time step, rather than waiting until the end of the sequence.
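To make the idea concrete, here is a minimal sketch of an intra-sequence TD(λ) update for a linear predictor, in the spirit of Sutton's formulation. The function name, parameters, and setup are my own illustrative assumptions, not code from the project: the key point is that the weight vector `w` changes inside the loop, so predictions later in the same sequence already use the updated weights.

```python
import numpy as np

def td_lambda_intra(sequence, z, alpha=0.1, lam=0.9, w=None):
    """Illustrative intra-sequence TD(lambda) for linear prediction.

    sequence: list of feature vectors x_1..x_m (numpy arrays)
    z: scalar outcome observed at the end of the sequence
    Weights are updated after every step, rather than accumulating
    the updates and applying them once the sequence ends.
    """
    if w is None:
        w = np.zeros_like(sequence[0], dtype=float)
    e = np.zeros_like(w)  # eligibility trace
    for t in range(len(sequence)):
        x_t = sequence[t]
        p_t = w @ x_t  # prediction at step t (uses current w)
        # next prediction, or the true outcome at the final step
        p_next = z if t == len(sequence) - 1 else w @ sequence[t + 1]
        e = lam * e + x_t  # accumulate eligibility trace
        w = w + alpha * (p_next - p_t) * e  # update immediately
    return w
```

In the end-of-sequence variant, the increments `alpha * (p_next - p_t) * e` would instead be summed and added to `w` only after the loop; the intra-sequence variant applies each one as soon as it is computed.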
In this post I detail my project for the course Reinforcement Learning (COMP767) taken at McGill, applying Temporal Difference (TD) methods in a Machine Learning setting.
This concept was first discussed by Sutton in the paper that introduced this family of learning algorithms. I aim to go over what was discussed in that paper and see how the method performs on a traditional machine learning problem.
In this part, I will be covering the concepts underlying this application.