A continuity result for optimal memoryless planning in POMDPs
Publication Type
Preprint
Date Issued
2021-03-06
Language
English
Author(s)
Citation
Preprint (2021-03-09)
Consider an infinite-horizon partially observable Markov decision process (POMDP). We show that the optimal discounted reward under memoryless stochastic policies is continuous under perturbations of the observation channel. This implies that we can find approximately optimal memoryless policies by solving an approximate problem with a simpler observation channel.
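The continuity statement can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the preprint's method: the two-state model, all numeric values, the uniform-mixing perturbation, and the function names (`discounted_reward`, `best_memoryless_value`, `perturb`) are hypothetical, and a coarse grid search stands in for an actual optimizer over memoryless stochastic policies.

```python
# Toy POMDP (all numbers are illustrative assumptions, not from the preprint).
GAMMA = 0.9
P = [  # P[s][a][t]: probability of moving from state s to state t under action a
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.1, 0.9]],
]
R = [[1.0, 0.0], [0.0, 1.0]]     # R[s][a]: immediate reward in [0, 1]
MU = [0.5, 0.5]                  # initial state distribution
BETA = [[0.8, 0.2], [0.3, 0.7]]  # BETA[s][o]: observation channel


def discounted_reward(beta, pi, iters=300):
    """Expected discounted reward of the memoryless policy pi[o][a]."""
    n_s, n_o, n_a = len(beta), len(pi), len(pi[0])
    # State-level action probabilities induced by the channel and the policy.
    tau = [[sum(beta[s][o] * pi[o][a] for o in range(n_o)) for a in range(n_a)]
           for s in range(n_s)]
    v = [0.0] * n_s
    for _ in range(iters):  # fixed-point iteration for policy evaluation
        v = [sum(tau[s][a] * (R[s][a]
                              + GAMMA * sum(P[s][a][t] * v[t] for t in range(n_s)))
                 for a in range(n_a))
             for s in range(n_s)]
    return sum(MU[s] * v[s] for s in range(n_s))


def best_memoryless_value(beta, grid=11):
    """Grid search over policies pi[o] = (p_o, 1 - p_o); a crude stand-in for optimization."""
    probs = [i / (grid - 1) for i in range(grid)]
    return max(discounted_reward(beta, [[p0, 1 - p0], [p1, 1 - p1]])
               for p0 in probs for p1 in probs)


def perturb(beta, eps):
    """Mix each row of the channel with the uniform distribution over observations."""
    n_o = len(beta[0])
    return [[(1 - eps) * b + eps / n_o for b in row] for row in beta]


v_exact = best_memoryless_value(BETA)
v_perturbed = best_memoryless_value(perturb(BETA, 0.01))
gap = abs(v_exact - v_perturbed)
print(v_exact, v_perturbed, gap)
```

Rerunning with smaller values of `eps` should shrink `gap`, which is consistent with the continuity claim; the sketch only illustrates the statement on one toy instance and does not prove it.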