Mode-Constrained Exploration for Model-Based Reinforcement Learning
Aidan Scannell, Carl Henrik Ek, Arthur Richards
Sep 24, 2022
reinforcement-learning
machine-learning
gaussian-processes
optimal-control
robotics
python
TensorFlow
GPflow
research
Publications
We present a model-based RL algorithm that constrains training to a single dynamic mode with high probability. This is a difficult problem because the mode constraint is a hidden variable associated with the environment’s dynamics. As such, it is 1) unknown a priori and 2) never observed as an output of the environment, so it cannot be learned with supervised learning.