Real-time control of continuous-valued plants using TD(λ) reinforcement learning is detailed. This problem is significantly more difficult than the case of a discrete control space, as in bang-bang control or Q-learning. The methodology employs a combination of Stochastic Real-Valued units, Mixtures of Experts and RBF partitioning. To do so, the significance of both Maximum-Likelihood and Square Error cost functions is emphasised, as is provision for RBF covariances during training. The resulting architecture is demonstrated on benchmark problems.
Publication status
Published
Journal
International Journal of Knowledge-Based Intelligent Engineering Systems