Multimedia materials are an engaging way to demonstrate research. Below are slides, short videos, and full YouTube videos supporting my publications and talks.
- GECCO 2020: Learning behaviour-performance maps with meta-evolution
- GECCO 2021: On the use of feature-maps for improved quality-diversity meta-evolution
- ICRA 2021: Rapidly adapting robot swarms with Swarm Map-based Optimisation
- Talk at DCE Reading group 2021: Beyond MDPs: reinforcement learning in unknown long-term environments
Below are short videos to support qualitative analyses. For full, narrated videos, please have a look at the next section (YouTube videos).
RHex robot behaviours
- adapting to damage: a comparison of QD meta-evolution with traditional MAP-Elites shows improved recovery from damage
- adapting to obstacles: the effect of different meta-objectives in QD meta-evolution. Injecting obstacles into the environment during training leads to jumpy behaviours, which generalise well across environments. Damaging the robot's legs during training leads to gaits that disable a particular leg.
Find videos of my work on the Artificial Minds channel. For further context, I have listed the videos below with a brief explanation and links to the papers.
Learning to learn with active adaptive perception
This work (Neural Networks journal paper available here) shows how learning how to learn can overcome the problem of exploration in non-episodic environments with sparse rewards and partial observability. The agent may never get any constructive feedback from the environment, and therefore learning to learn effectively from the sparse rewards the agent does receive is essential. In Active Adaptive Perception, this is achieved by encoding learning operations, exploration methods, and the selective use of perceptual memory in a Self-Modifying Policy. Over the long term, the Self-Modifying Policy learns which learning mechanisms lead to self-improvement, and it gradually avoids getting stuck in distracting corridors. This contrasts with the Deep Recurrent Q-Network, which still gets stuck even in the late phases of its lifetime.
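To make the idea concrete, here is a deliberately minimal sketch of a policy that modifies its own learning mechanism: it keeps a running estimate of how much self-improvement each learning operation yields and gradually favours the best one. The operation names and the bandit-style update are illustrative assumptions, not the paper's implementation (which encodes these choices in a Self-Modifying Policy rather than a value table).

```python
import random

# Hypothetical stand-ins for the learning operations, exploration
# methods, and memory usage the Self-Modifying Policy chooses among.
OPERATIONS = ["greedy_update", "random_exploration", "memory_recall"]

class SelfModifyingPolicy:
    def __init__(self, operations):
        # Running estimate of the self-improvement each operation yields.
        self.value = {op: 0.0 for op in operations}
        self.count = {op: 0 for op in operations}

    def choose(self, epsilon=0.1):
        # Mostly pick the operation with the best track record;
        # occasionally explore an alternative.
        if random.random() < epsilon:
            return random.choice(list(self.value))
        return max(self.value, key=self.value.get)

    def update(self, op, improvement):
        # Incremental mean of the observed improvement per operation.
        self.count[op] += 1
        self.value[op] += (improvement - self.value[op]) / self.count[op]
```

Even in this toy form, the key property survives: when rewards are sparse, the policy's long-term statistics over its own learning mechanisms, not the immediate reward signal, drive which mechanism gets used.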
QED: using Quality-Environment-Diversity to evolve resilient robot swarms
This work (IEEE Transactions on Evolutionary Computation paper here) explores evolving robot swarms in diverse environments — a system called QED. For several tasks (aggregation, dispersion, flocking, patrolling the area, and patrolling the border), we define a large variety of environmental settings by varying obstacle densities, swarm size, and the sensory-motor properties of the robots. Environmental diversity in this sense implicitly characterises behavioural diversity: after many generations, each local region in the environment space will have its best solution representing a unique behaviour; for example, if the robots have high speed and there are many obstacles, the robots will behave cautiously as soon as their proximity sensors fire. The key result of the paper is that QED yields improved adaptation to damage and a unique behavioural diversity profile compared to traditional behavioural diversity approaches.
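As a rough illustration of the archive idea behind QED, the sketch below keeps a MAP-Elites-style grid whose cells correspond to regions of *environment* space (here just obstacle density × swarm size), each holding the best-performing controller evaluated in that environment. The toy genome, the random-search loop in place of a real evolutionary algorithm, and the `evaluate` callback are all simplifying assumptions.

```python
import random

def descriptor_to_cell(obstacle_density, swarm_size, bins=5):
    # Discretise the environment descriptor into a grid cell.
    d = min(int(obstacle_density * bins), bins - 1)  # density in [0, 1)
    s = min(int(swarm_size / 10), bins - 1)          # swarm size in 1..50
    return (d, s)

def evolve_archive(evaluate, generations=100, bins=5):
    archive = {}  # cell -> (fitness, controller)
    for _ in range(generations):
        controller = [random.uniform(-1, 1) for _ in range(4)]  # toy genome
        # Sample a random environment for this evaluation.
        density, size = random.random(), random.randint(1, 50)
        fitness = evaluate(controller, density, size)
        cell = descriptor_to_cell(density, size, bins)
        # Keep only the best controller per environment cell.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, controller)
    return archive
```

The point of the structure is visible even here: because selection pressure is local to each environment cell, the filled archive ends up containing one specialist per environmental niche rather than a single generalist.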
Rapidly adapting robot swarms with Swarm Map-based Optimisation
This work (ICRA conference paper available here) demonstrates a two-phase system for rapid adaptation in robot swarms. In the first phase, which can be done in controlled conditions, a large archive of behaviours is evolved using a quality-diversity algorithm. In the second phase, which is the application of interest, the environment is uncontrolled and may change at will, so the system adapts rapidly by searching across the behavioural archive using a novel variant of Bayesian optimisation. By partitioning the swarm into groups that selectively share a Gaussian process, the method accounts for different groups of robots being affected by different environmental conditions. For rapid adaptation when not much coordination is required, the paper further proposes to exploit the robots within each such group as independent workers in a decentralised, batch-based Bayesian optimisation process.
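A minimal sketch of the second, online phase is given below: given a pre-evolved archive of behaviour descriptors, it repeatedly picks the archive entry with the highest upper-confidence-bound score and tests it on the real system. For brevity the Gaussian process surrogate of the paper is replaced here with a simple distance-weighted performance estimate plus an exploration bonus; the archive layout and the `test_on_robot` callback are assumptions for illustration.

```python
import math

def predict(behaviour, trials, length_scale=0.2):
    # Distance-weighted estimate of performance; the uncertainty
    # shrinks as nearby behaviours accumulate observations.
    if not trials:
        return 0.0, 1.0
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(behaviour, bd))
                        / length_scale ** 2) for bd, _ in trials]
    total = sum(weights)
    if total < 1e-9:
        return 0.0, 1.0
    mean = sum(w * obs for w, (_, obs) in zip(weights, trials)) / total
    uncertainty = 1.0 / (1.0 + total)
    return mean, uncertainty

def adapt(archive, test_on_robot, budget=10, kappa=1.0):
    # archive: list of (behaviour_descriptor, expected_fitness) pairs.
    trials = []  # (descriptor, performance observed on the real robots)
    for _ in range(budget):
        # Upper-confidence-bound acquisition over the archive entries.
        best = max(archive, key=lambda entry: predict(entry[0], trials)[0]
                   + kappa * predict(entry[0], trials)[1])
        descriptor, _ = best
        trials.append((descriptor, test_on_robot(descriptor)))
    return max(trials, key=lambda t: t[1])
```

The same loop carries over to the decentralised variant in spirit: each group of robots would maintain its own trial history (its own surrogate), and robots within a group could evaluate several acquisition candidates per iteration as a batch.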