A Tensor-Based Markov Decision Process Representation

Daniela Kuinchtner* (Collaborator), Felipe Meneguzzi* (Collaborator), Afonso Sales* (Collaborator)

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Citation (Scopus)

Abstract

A Markov Decision Process (MDP) is a sequential decision problem for a fully observable and stochastic environment. MDPs are widely used to model reinforcement learning problems. Researchers have developed multiple solvers of increasing efficiency, each requiring fewer computational resources to find solutions for large MDPs. However, few of these solvers leverage advances in tensor processing, such as Google’s TPUs (https://cloud.google.com/tpu) and TensorFlow (https://www.tensorflow.org/), to further increase solver efficiency. In this paper, we formalize an MDP problem in terms of Tensor Algebra by representing the transition models of MDPs compactly, using tensors as vectors with fewer elements than their total size. Our method aims to facilitate the implementation of various efficient MDP solvers by reducing the computational cost of generating monolithic MDPs.
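As a rough illustration of what a compact transition model can look like, the sketch below stores a transition tensor T[s, a, s'] in sparse coordinate form, keeping only its nonzero entries, and runs value iteration directly on that form. This is not the construction from the chapter (its keywords point to a CANDECOMP/PARAFAC decomposition); the toy MDP, its rewards, the discount factor, and every name in the code are assumptions made for illustration only.

```python
# Hypothetical toy MDP: 3 states, 2 actions. All numbers are illustrative.
N_STATES, N_ACTIONS = 3, 2

# Sparse coordinate form of the transition tensor T[s, a, s']:
# only nonzero probabilities are stored, keyed by (s, a, s').
transitions = {
    (0, 0, 0): 0.5, (0, 0, 1): 0.5,
    (0, 1, 2): 1.0,
    (1, 0, 1): 1.0,
    (1, 1, 2): 1.0,
    (2, 0, 2): 1.0,
    (2, 1, 2): 1.0,
}
rewards = [0.0, 0.0, 1.0]  # assumed reward for being in each state
GAMMA = 0.9                # assumed discount factor


def value_iteration(transitions, rewards, gamma, n_states, n_actions, tol=1e-9):
    """Iterate V(s) = R(s) + gamma * max_a sum_s' T[s, a, s'] * V(s')."""
    values = [0.0] * n_states
    while True:
        # q[s][a] accumulates sum_s' T[s, a, s'] * V(s'), visiting nonzeros only.
        q = [[0.0] * n_actions for _ in range(n_states)]
        for (s, a, s_next), p in transitions.items():
            q[s][a] += p * values[s_next]
        new_values = [rewards[s] + gamma * max(q[s]) for s in range(n_states)]
        # Stop once the largest change between sweeps falls below tol.
        if max(abs(n - v) for n, v in zip(new_values, values)) < tol:
            return new_values
        values = new_values


values = value_iteration(transitions, rewards, GAMMA, N_STATES, N_ACTIONS)
dense_size = N_STATES * N_ACTIONS * N_STATES  # entries in the full tensor
stored = len(transitions)                     # entries actually kept
```

In this toy case the dictionary holds 7 entries where the dense tensor would hold 18. Each sweep costs time proportional to the number of stored entries rather than to the full tensor size.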
Original language: English
Title of host publication: Advances in Soft Computing
Subtitle of host publication: 19th Mexican International Conference on Artificial Intelligence, MICAI 2020, Mexico City, Mexico, October 12–17, 2020, Proceedings, Part I
Publisher: Springer
Pages: 313–324
Volume: 12468
ISBN (Electronic): 978-3-030-60884-2
ISBN (Print): 978-3-030-60883-5
DOIs
Publication status: Published - 7 Oct 2020

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer International Publisher
Volume: 12468
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • Artificial intelligence
  • CANDECOMP/PARAFAC decomposition
  • Compact transition model
  • Markov Decision Process
  • Tensor algebra
  • Tensor decomposition
