TY - GEN
T1 - Synthesising Reward Machines for Cooperative Multi-Agent Reinforcement Learning
AU - Varricchione, Giovanni
AU - Alechina, Natasha
AU - Dastani, Mehdi
AU - Logan, Brian
N1 - Code is available at github.com/giovannivarr/SynthesisingRMsMARL.
PY - 2023/9/7
Y1 - 2023/9/7
AB - Reward machines have recently been proposed as a means of encoding team tasks in cooperative multi-agent reinforcement learning. The resulting multi-agent reward machine is then decomposed into individual reward machines, one for each member of the team, allowing agents to learn in a decentralised manner while still achieving the team task. However, current work assumes the multi-agent reward machine to be given. In this paper, we show how reward machines for team tasks can be synthesised automatically from an Alternating-Time Temporal Logic specification of the desired team behaviour and a high-level abstraction of the agents’ environment. We present results suggesting that our automated approach has comparable, if not better, sample efficiency than reward machines generated by hand for multi-agent tasks.
UR - http://www.scopus.com/inward/record.url?scp=85171998898&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-43264-4_21
DO - 10.1007/978-3-031-43264-4_21
M3 - Published conference contribution
AN - SCOPUS:85171998898
SN - 9783031432637
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 328
EP - 344
BT - Multi-Agent Systems - 20th European Conference, EUMAS 2023, Proceedings
A2 - Malvone, Vadim
A2 - Murano, Aniello
PB - Springer Science and Business Media Deutschland GmbH
T2 - Proceedings of the 20th European Conference on Multi-Agent Systems, EUMAS 2023
Y2 - 14 September 2023 through 15 September 2023
ER -