Augmented Behavioral Cloning from Observation

Juarez Monteiro; Nathan Gavenski; Roger Granada; Felipe Meneguzzi; Rodrigo Barros

doi:10.1109/IJCNN48605.2020.9207672

Pontifícia Universidade Católica do Rio Grande do Sul

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

4 Citations (Scopus)

Abstract

Imitation from observation is a computational technique that teaches an agent how to mimic the behavior of an expert by observing only the sequence of states from the expert demonstrations. Recent approaches learn the inverse dynamics of the environment and an imitation policy by interleaving epochs of both models while changing the demonstration data. However, such approaches often get stuck in sub-optimal solutions that are distant from the expert, limiting their imitation effectiveness. We address this problem with a novel approach that avoids bad local minima by exploring: (i) a self-attention mechanism that better captures global features of the states; and (ii) a sampling strategy that regulates the observations that are used for learning. We show empirically that our approach outperforms the state-of-the-art approaches in four different environments by a large margin.

Original language	English
Title of host publication	2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728169262
DOIs	https://doi.org/10.1109/IJCNN48605.2020.9207672
Publication status	Published - Jul 2020
Event	2020 International Joint Conference on Neural Networks, IJCNN 2020 - Virtual, Glasgow, United Kingdom
Duration	19 Jul 2020 → 24 Jul 2020

Conference

Conference	2020 International Joint Conference on Neural Networks, IJCNN 2020
Country/Territory	United Kingdom
City	Virtual, Glasgow
Period	19/07/20 → 24/07/20

Bibliographical note

Funding Information:
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and CAPES/FAPERGS agreement (DOCFIX 04/2018) process number 18/2551-0000500-2. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the graphics cards used for this research.

Publisher Copyright:
© 2020 IEEE.

Keywords

Behavioral Cloning
Deep Learning
Imitation Learning
Learning from Demonstration

Access to Document

10.1109/IJCNN48605.2020.9207672

Cite this

Monteiro, J, Gavenski, N, Granada, R, Meneguzzi, F & Barros, R 2020, Augmented Behavioral Cloning from Observation. in 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings., 9207672, Institute of Electrical and Electronics Engineers Inc., 2020 International Joint Conference on Neural Networks, IJCNN 2020, Virtual, Glasgow, United Kingdom, 19/07/20. https://doi.org/10.1109/IJCNN48605.2020.9207672

@inproceedings{ce89348fcc974ce6868a0c252d6ec88e,

title = "Augmented Behavioral Cloning from Observation",

abstract = "Imitation from observation is a computational technique that teaches an agent on how to mimic the behavior of an expert by observing only the sequence of states from the expert demonstrations. Recent approaches learn the inverse dynamics of the environment and an imitation policy by interleaving epochs of both models while changing the demonstration data. However, such approaches often get stuck into sub-optimal solutions that are distant from the expert, limiting their imitation effectiveness. We address this problem with a novel approach that overcomes the problem of reaching bad local minima by exploring: (i) a self-attention mechanism that better captures global features of the states; and (ii) a sampling strategy that regulates the observations that are used for learning. We show empirically that our approach outperforms the state-of-the-art approaches in four different environments by a large margin.",

keywords = "Behavioral Cloning, Deep Learning, Imitation Learning, Learning from Demonstration",

author = "Juarez Monteiro and Nathan Gavenski and Roger Granada and Felipe Meneguzzi and Rodrigo Barros",

note = "Funding Information: This study was financed in part by the Coordena{\c{c}}{\~a}o de Aperfei{\c{c}}oamento de Pessoal de N{\'i}vel Superior - Brasil (CAPES) - Finance Code 001, and CAPES/FAPERGS agreement (DOCFIX 04/2018) process number 18/2551-0000500-2. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the graphics cards used for this research. Publisher Copyright: {\textcopyright} 2020 IEEE.; 2020 International Joint Conference on Neural Networks, IJCNN 2020 ; Conference date: 19-07-2020 Through 24-07-2020",

year = "2020",

month = jul,

doi = "10.1109/IJCNN48605.2020.9207672",

language = "English",

booktitle = "2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

address = "United States",

}

TY - GEN

T1 - Augmented Behavioral Cloning from Observation

AU - Monteiro, Juarez

AU - Gavenski, Nathan

AU - Granada, Roger

AU - Meneguzzi, Felipe

AU - Barros, Rodrigo

N1 - Funding Information: This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and CAPES/FAPERGS agreement (DOCFIX 04/2018) process number 18/2551-0000500-2. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the graphics cards used for this research. Publisher Copyright: © 2020 IEEE.

PY - 2020/7

Y1 - 2020/7

N2 - Imitation from observation is a computational technique that teaches an agent on how to mimic the behavior of an expert by observing only the sequence of states from the expert demonstrations. Recent approaches learn the inverse dynamics of the environment and an imitation policy by interleaving epochs of both models while changing the demonstration data. However, such approaches often get stuck into sub-optimal solutions that are distant from the expert, limiting their imitation effectiveness. We address this problem with a novel approach that overcomes the problem of reaching bad local minima by exploring: (i) a self-attention mechanism that better captures global features of the states; and (ii) a sampling strategy that regulates the observations that are used for learning. We show empirically that our approach outperforms the state-of-the-art approaches in four different environments by a large margin.

KW - Behavioral Cloning

KW - Deep Learning

KW - Imitation Learning

KW - Learning from Demonstration

UR - http://www.scopus.com/inward/record.url?scp=85093860540&partnerID=8YFLogxK

U2 - 10.1109/IJCNN48605.2020.9207672

DO - 10.1109/IJCNN48605.2020.9207672

M3 - Published conference contribution

AN - SCOPUS:85093860540

BT - 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2020 International Joint Conference on Neural Networks, IJCNN 2020

Y2 - 19 July 2020 through 24 July 2020

ER -
