Synthesising Reward Machines for Cooperative Multi-Agent Reinforcement Learning

Giovanni Varricchione* (Corresponding Author), Natasha Alechina, Mehdi Dastani, Brian Logan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Published conference contribution

Abstract

Reward machines have recently been proposed as a means of encoding team tasks in cooperative multi-agent reinforcement learning. The resulting multi-agent reward machine is then decomposed into individual reward machines, one for each member of the team, allowing agents to learn in a decentralised manner while still achieving the team task. However, current work assumes the multi-agent reward machine to be given. In this paper, we show how reward machines for team tasks can be synthesised automatically from an Alternating-Time Temporal Logic specification of the desired team behaviour and a high-level abstraction of the agents’ environment. We present results suggesting that our automated approach achieves sample efficiency comparable to, if not better than, that of reward machines generated by hand for multi-agent tasks.

Original language: English
Title of host publication: Multi-Agent Systems - 20th European Conference, EUMAS 2023, Proceedings
Editors: Vadim Malvone, Aniello Murano
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 328-344
Number of pages: 17
ISBN (Print): 9783031432637
Publication status: Published - 7 Sept 2023
Event: Proceedings of the 20th European Conference on Multi-Agent Systems, EUMAS 2023 - Naples, Italy
Duration: 14 Sept 2023 to 15 Sept 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14282 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Proceedings of the 20th European Conference on Multi-Agent Systems, EUMAS 2023
Country/Territory: Italy
City: Naples
Period: 14/09/23 to 15/09/23

Bibliographical note

Code is available at github.com/giovannivarr/SynthesisingRMsMARL.
