TY - CHAP
T1 - A Tensor-Based Markov Decision Process Representation
AU - Kuinchtner, Daniela
AU - Meneguzzi, Felipe
AU - Sales, Afonso
PY - 2020/10/7
Y1 - 2020/10/7
N2 - A Markov Decision Process (MDP) is a sequential decision problem for a fully observable, stochastic environment. MDPs are widely used to model reinforcement learning problems. Researchers have developed multiple solvers of increasing efficiency, each requiring fewer computational resources to find solutions for large MDPs. However, few of these solvers leverage advances in tensor processing, such as Google’s TPUs (https://cloud.google.com/tpu) and TensorFlow (https://www.tensorflow.org/), to further increase solver efficiency. In this paper, we formalize an MDP problem in terms of Tensor Algebra by representing the transition models of MDPs compactly, using tensors as vectors with fewer elements than their total size. Our method aims to facilitate the implementation of various efficient MDP solvers, reducing the computational cost of generating monolithic MDPs.
KW - Artificial intelligence
KW - CANDECOMP/PARAFAC decomposition
KW - Compact transition model
KW - Markov Decision Process
KW - Tensor algebra
KW - Tensor decomposition
UR - https://doi.org/10.1007/978-3-030-60884-2_23
U2 - 10.1007/978-3-030-60884-2_23
DO - 10.1007/978-3-030-60884-2_23
M3 - Chapter
SN - 978-3-030-60883-5
VL - 12468
T3 - Lecture Notes in Computer Science
SP - 313
EP - 324
BT - Advances in Soft Computing
PB - Springer
ER -