In making practical decisions, agents are expected to comply with ideals of behaviour, or norms. In reality, it may not be possible for an individual, or a team of agents, to be fully compliant; actual behaviour often differs from the ideal. The question we address in this paper is how to design agents that select collective strategies to avoid the more critical failures (norm violations) and to mitigate the effects of violations that do occur. We model the normative requirements of a system through contrary-to-duty obligations and violation severity levels, and propose a novel multi-agent planning mechanism based on Decentralised POMDPs that uses a qualitative reward function to capture levels of compliance: N-Dec-POMDPs. We develop mechanisms for solving this type of multi-agent planning problem and show, through empirical analysis, that the joint policies generated are as good as those produced by existing methods, with significant reductions in execution time.
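As a rough illustration of how contrary-to-duty obligations and severity levels could feed a qualitative comparison of joint policies, the following Python sketch may help. It is not the paper's N-Dec-POMDP formulation: the `Obligation` class, the `violation_profile` function, the obligation and policy names, the severity values, and the lexicographic ordering are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Obligation:
    """Hypothetical obligation with a violation severity level."""
    name: str
    severity: int  # assumption: a higher number means a more critical violation
    # Contrary-to-duty obligation: comes into force only if this one is violated.
    contrary_to_duty: Optional["Obligation"] = None


def violation_profile(obligation: Obligation,
                      fulfilled: Set[str],
                      max_severity: int) -> Tuple[int, ...]:
    """Count violations per severity level, most severe level first."""
    counts = [0] * (max_severity + 1)
    current: Optional[Obligation] = obligation
    # Walk the contrary-to-duty chain until an obligation is fulfilled.
    while current is not None and current.name not in fulfilled:
        counts[max_severity - current.severity] += 1
        current = current.contrary_to_duty
    return tuple(counts)


MAX_SEVERITY = 2

# Illustrative norms loosely themed on harbour protection (names are made up).
escort = Obligation("escort_vessel_out", severity=2)
primary = Obligation("keep_vessel_out_of_zone", severity=1,
                     contrary_to_duty=escort)

# Each hypothetical joint policy is summarised by the obligations it fulfils.
policies = {
    "both_patrol": {"keep_vessel_out_of_zone"},
    "one_patrols_one_escorts": {"escort_vessel_out"},
    "neither_acts": set(),
}

profiles = {name: violation_profile(primary, done, MAX_SEVERITY)
            for name, done in policies.items()}

# Tuples compare lexicographically, so the policy with the fewest
# violations at the most severe level is preferred first.
best = min(profiles, key=profiles.get)
```

Under these assumptions, a policy that violates only the primary obligation but fulfils its contrary-to-duty obligation ranks above one that violates both, which is the kind of mitigation the abstract describes. A numeric reward would need hand-tuned weights to obtain the same ordering; the tuple comparison avoids that, at the cost of ignoring trade-offs between severity levels.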
Bibliographical note: This research was funded by Selex ES. The software developed during this research, including the norm analysis and planning algorithms, the simulator, and the harbour protection scenario used during evaluation, is freely available from doi:10.5258/SOTON/D0139
- Multi-agent planning