Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay

Tianhong Dai; Hengyan Liu; Kai Arulkumaran; Guangyu Ren; Anil Anthony Bharath

doi:10.1007/978-3-030-89370-5_3

Imperial College London

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

6 Citations (Scopus)

Abstract

Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent’s experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
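The two selection steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an RBF similarity kernel over goal states with a hypothetical `bandwidth` parameter, scores a trajectory by the determinant of that kernel (the quantity a DPP assigns to a set), and replaces true k-DPP sampling with a simpler greedy determinant-maximising selection. All function names are invented for illustration.

```python
import numpy as np


def diversity_score(goals, bandwidth=1.0):
    """Determinant of an RBF similarity kernel over a set of goal states.

    Near-identical goals give a determinant close to 0; well-separated
    goals give a determinant close to 1. The RBF kernel and `bandwidth`
    are assumptions made for this sketch.
    """
    goals = np.asarray(goals, dtype=float)
    sq_dists = np.sum((goals[:, None, :] - goals[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return float(np.linalg.det(kernel))


def sample_trajectories(trajectory_goals, batch_size, rng):
    """Sample trajectory indices with probability proportional to diversity."""
    scores = np.array([diversity_score(g) for g in trajectory_goals])
    # Guard against tiny negative determinants from floating-point error.
    scores = np.clip(scores, 0.0, None) + 1e-12
    probs = scores / scores.sum()
    return rng.choice(len(trajectory_goals), size=batch_size, p=probs)


def greedy_diverse_subset(goals, k, bandwidth=1.0):
    """Greedily pick k goal indices that maximise the kernel determinant.

    This is a deterministic stand-in for k-DPP sampling, not a k-DPP
    sampler. It requires k <= len(goals).
    """
    goals = np.asarray(goals, dtype=float)
    chosen = []
    for _ in range(k):
        best, best_det = None, -np.inf
        for i in range(len(goals)):
            if i in chosen:
                continue
            det = diversity_score(goals[chosen + [i]], bandwidth)
            if det > best_det:
                best, best_det = i, det
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    spread = [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]]
    clustered = [[0.0, 0.0], [0.01, 0.0], [0.0, 0.01]]
    rng = np.random.default_rng(0)
    picks = sample_trajectories([spread, clustered], batch_size=10, rng=rng)
    print(picks)
    print(greedy_diverse_subset([[0, 0], [0.01, 0], [5, 5], [0, 0.02]], k=2))
```

In this sketch a trajectory whose goal states are spread out is sampled far more often than one whose goal states are clustered, which is the intuition behind replacing uniform sampling with diversity-based sampling.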

Original language	English
Title of host publication	Pacific Rim International Conference on Artificial Intelligence
Subtitle of host publication	PRICAI 2021: Trends in Artificial Intelligence
Pages	32-45
DOIs	https://doi.org/10.1007/978-3-030-89370-5_3
Publication status	Published - 2021

Publication series

Name	Lecture Notes in Computer Science
Volume	13033

Access to Document

10.1007/978-3-030-89370-5_3

Cite this

Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay. / Dai, Tianhong; Liu, Hengyan; Arulkumaran, Kai et al.
Pacific Rim International Conference on Artificial Intelligence: PRICAI 2021: Trends in Artificial Intelligence. 2021. p. 32-45 (Lecture Notes in Computer Science; Vol. 13033).

@inproceedings{86bafa1ebc68439480ed197989486f6c,
  title = "Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay",
  abstract = "Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent{\textquoteright}s experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.",
  author = "Dai, Tianhong and Liu, Hengyan and Arulkumaran, Kai and Ren, Guangyu and Bharath, {Anil Anthony}",
  year = "2021",
  doi = "10.1007/978-3-030-89370-5_3",
  language = "English",
  isbn = "9783030893699",
  series = "Lecture Notes in Computer Science",
  volume = "13033",
  pages = "32--45",
  booktitle = "Pacific Rim International Conference on Artificial Intelligence",
}

TY  - GEN
T1  - Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay
AU  - Dai, Tianhong
AU  - Liu, Hengyan
AU  - Arulkumaran, Kai
AU  - Ren, Guangyu
AU  - Bharath, Anil Anthony
PY  - 2021
Y1  - 2021
AB  - Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent’s experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
UR  - http://dx.doi.org/10.1007/978-3-030-89370-5_3
U2  - 10.1007/978-3-030-89370-5_3
DO  - 10.1007/978-3-030-89370-5_3
M3  - Published conference contribution
SN  - 9783030893699
SN  - 9783030893705
T3  - Lecture Notes in Computer Science
SP  - 32
EP  - 45
BT  - Pacific Rim International Conference on Artificial Intelligence
ER  - 
