research

Implicitly Quantized Representations for Reinforcement Learning

Learning representations for reinforcement learning (RL) has shown much promise for continuous control. In this project, we investigate using vector quantization to prevent representation collapse when learning representations for RL using a self-supervised latent-state consistency loss.

Aidan Scannell, Kalle Kujanpää, Yi Zhao, Mohammadreza Nakhaei, Arno Solin, Joni Pajarinen

Function-Space Bayesian Deep Learning for Sequential Learning

Sequential learning paradigms pose challenges for gradient-based deep learning due to difficulties incorporating new data and retaining prior knowledge. While Gaussian processes elegantly tackle these problems, they struggle with scalability and handling rich inputs, such as images.

Aidan Scannell, Riccardo Mereu, Paul Chang, Ella Tamir, Joni Pajarinen, Arno Solin

Investigating Bayesian Neural Network Dynamics Models for Model-Based Reinforcement Learning

This project seeks to evaluate and compare different approaches for learning dynamics models in model-based RL. In particular, we plan to compare different approximate inference techniques (e.g. Laplace approximation, MC dropout, variational inference), as well as ensemble methods, to understand why they either succeed or fail in different environments.

Aidan Scannell, Arno Solin, Joni Pajarinen

Mode-Constrained Exploration for Model-Based Reinforcement Learning

This work presents a learning-based control method for navigating to a target state in unknown, or partially unknown, multimodal dynamical systems. In particular, it develops a model-based reinforcement learning algorithm that can remain in a desired dynamics mode with high probability. This is useful, for example, when some of the dynamics modes are believed to be inoperable.

Aidan Scannell, Carl Henrik Ek, Arthur Richards

Trajectory Optimisation in Learned Multimodal Dynamical Systems

This work presents a two-stage method to perform trajectory optimisation in multimodal dynamical systems with unknown nonlinear stochastic transition dynamics. The method finds trajectories that remain in a preferred dynamics mode where possible and in regions of the transition dynamics model that have been observed and can be predicted confidently.

Aidan Scannell

Identifiable Mixtures of Sparse Variational Gaussian Process Experts

This work introduces a variational lower bound for the Mixture of Gaussian Process Experts model with a GP-based gating network based on sparse GPs. The model and its inference are implemented in GPflow/TensorFlow.

Aidan Scannell
