Abstract
Surprise has been cast as a cognitive-emotional phenomenon that influences many aspects of behaviour, from creativity to learning to decision-making. Why are some events more surprising than others? Why do different people find the same event surprising to different degrees? In this project, we seek a reasonable definition of "surprise" and apply it in reinforcement learning. A surprise-driven agent can learn to explore without receiving any reward signal from the environment. It does so by building a model of the environment: "surprise" is the inconsistency between the model's prediction and the observed outcome. The agent then learns by maximising this surprise.
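The core idea above can be sketched in a few lines. This is a hypothetical minimal illustration, not the variational method from the papers below: it assumes a simple linear dynamics model trained by gradient descent, with "surprise" measured as squared prediction error. As the model fits a transition, that transition's surprise shrinks, which is what drives the agent toward novel experience.

```python
import numpy as np

rng = np.random.default_rng(0)

class DynamicsModel:
    """Hypothetical linear model of the environment's dynamics:
    predicts the next state from the current state and action."""

    def __init__(self, state_dim, action_dim, lr=0.1):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def surprise(self, state, action, next_state):
        # "Surprise" = inconsistency between the model's prediction and
        # the observed outcome, here the squared prediction error.
        err = next_state - self.predict(state, action)
        return float(err @ err)

    def update(self, state, action, next_state):
        # One gradient step on the mean-squared prediction error.
        x = np.concatenate([state, action])
        err = next_state - self.W @ x
        self.W += self.lr * np.outer(err, x)

# Toy environment whose true dynamics are linear but unknown to the model.
true_A = rng.normal(size=(2, 4))
model = DynamicsModel(state_dim=2, action_dim=2)

s = rng.normal(size=2)
a = rng.normal(size=2)
s_next = true_A @ np.concatenate([s, a])

surprise_before = model.surprise(s, a, s_next)
for _ in range(200):
    model.update(s, a, s_next)
surprise_after = model.surprise(s, a, s_next)
# Once learned, this transition is no longer surprising, so a
# surprise-maximising agent is pushed toward unvisited transitions.
```

In a full agent, `surprise` would serve as the intrinsic reward handed to an ordinary reinforcement-learning algorithm in place of (or alongside) the environment's reward.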
Related Publications
H. Xu, L. Szymanski and B. McCane. VASE: Variational Assorted Surprise Exploration for Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 34(3):1243-1252, 2023.
@article{xu.etal:2023,
author={Xu, Haitao and Szymanski, Lech and McCane, Brendan},
journal={IEEE Transactions on Neural Networks and Learning Systems},
title={VASE: Variational Assorted Surprise Exploration for Reinforcement Learning},
year={2023},
volume={34},
number={3},
pages={1243--1252},
url={https://doi.org/10.1109/TNNLS.2021.3105140},
doi={10.1109/TNNLS.2021.3105140}
}
H. Xu, B. McCane, L. Szymanski and C. Atkinson. MIME: Mutual Information Minimisation Exploration. arXiv preprint arXiv:2001.05636, 2020.
@article{xu.etal:2020,
author={Xu, Haitao and McCane, Brendan and Szymanski, Lech and Atkinson, Craig},
journal={arXiv preprint arXiv:2001.05636},
title={MIME: Mutual Information Minimisation Exploration},
year={2020},
url={https://arxiv.org/abs/2001.05636}
}