Abstract
Imitation from observation is a computational technique that teaches an agent to mimic the behavior of an expert by observing only the sequence of states in the expert's demonstrations. Recent approaches learn the inverse dynamics of the environment and an imitation policy by interleaving training epochs of both models while changing the demonstration data. However, such approaches often get stuck in sub-optimal solutions that are far from the expert's behavior, limiting their imitation effectiveness. We address this problem with a novel approach that avoids bad local minima by exploring: (i) a self-attention mechanism that better captures global features of the states; and (ii) a sampling strategy that regulates the observations that are used for learning. We show empirically that our approach outperforms state-of-the-art approaches in four different environments by a large margin.
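As a rough illustration of the generic pipeline the abstract refers to (an inverse dynamics model used to label state-only demonstrations, followed by behavioral cloning), the following is a minimal tabular sketch. It is not the paper's method: it omits the self-attention mechanism and the sampling strategy, performs a single pass rather than interleaved epochs, and the 1-D chain environment, the function names, and all constants are hypothetical choices made only for this example.

```python
import random
from collections import Counter, defaultdict

N_STATES = 6          # hypothetical 1-D chain with states 0..5
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy deterministic dynamics, clipped at both ends of the chain."""
    delta = 1 if action == 1 else -1
    return min(max(state + delta, 0), N_STATES - 1)


def collect_transitions(n_samples, rng):
    """Gather (s, a, s') samples from a uniformly random agent."""
    samples = []
    for _ in range(n_samples):
        s = rng.randrange(N_STATES)
        a = rng.choice(ACTIONS)
        samples.append((s, a, step(s, a)))
    return samples


def fit_inverse_dynamics(samples):
    """Tabular inverse dynamics: count which action produced each (s, s')."""
    counts = defaultdict(Counter)
    for s, a, s_next in samples:
        counts[(s, s_next)][a] += 1
    return counts


def label_expert(expert_states, inverse_model):
    """Infer the action behind each consecutive pair of expert states."""
    labelled = []
    for s, s_next in zip(expert_states, expert_states[1:]):
        seen = inverse_model.get((s, s_next))
        if seen:  # skip transitions the agent never experienced
            labelled.append((s, seen.most_common(1)[0][0]))
    return labelled


def behavioral_cloning(labelled):
    """Tabular policy: most frequent inferred action per state."""
    per_state = defaultdict(Counter)
    for s, a in labelled:
        per_state[s][a] += 1
    return {s: c.most_common(1)[0][0] for s, c in per_state.items()}


rng = random.Random(0)
expert_states = list(range(N_STATES))  # expert shows states only, no actions
inverse_model = fit_inverse_dynamics(collect_transitions(500, rng))
policy = behavioral_cloning(label_expert(expert_states, inverse_model))

# Roll out the cloned policy from the expert's start state.
state, trajectory = 0, [0]
while state in policy:
    state = step(state, policy[state])
    trajectory.append(state)
print(trajectory)
```

In this toy setting the cloned policy should retrace the expert's state sequence, because every expert transition can only be produced by the "step right" action. Real environments have continuous or high-dimensional states, where the inverse dynamics model and the policy would be learned function approximators and labelling errors can accumulate.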
Original language | English |
---|---|
Title of host publication | 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781728169262 |
Publication status | Published - Jul 2020 |
Event | 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Virtual, Glasgow, United Kingdom; Duration: 19 Jul 2020 → 24 Jul 2020 |
Conference
Conference | 2020 International Joint Conference on Neural Networks, IJCNN 2020 |
---|---|
Country/Territory | United Kingdom |
City | Virtual, Glasgow |
Period | 19/07/20 → 24/07/20 |
Bibliographical note
Funding Information: This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and CAPES/FAPERGS agreement (DOCFIX 04/2018) process number 18/2551-0000500-2. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the graphics cards used for this research.
Publisher Copyright:
© 2020 IEEE.
Keywords
- Behavioral Cloning
- Deep Learning
- Imitation Learning
- Learning from Demonstration