TUHH Open Research
Help
  • Log In
    New user? Click here to register.Have you forgotten your password?
  • English
  • Deutsch
  • Communities & Collections
  • Publications
  • Research Data
  • People
  • Institutions
  • Projects
  • Statistics
  1. Home
  2. TUHH
  3. Publication References
  4. Task-agnostic constraining in average reward POMDPs
 
Options

Task-agnostic constraining in average reward POMDPs

Publikationstyp
Preprint
Date Issued
2021-03-09
Sprache
English
Author(s)
Montúfar, Guido  
Rauh, Johannes  
Ay, Nihat  
TORE-URI
http://hdl.handle.net/11420/10380
Citation
Preprint (2021-03-09)
Publisher Link
https://www.mis.mpg.de/de/publications/mis-preprints/2021/2021-9.html
We study the shape of the average reward as a function over the memoryless stochastic policies in infinite-horizon partially observed Markov decision processes. We show that for any given instantaneous reward function on state-action pairs, there is an optimal policy that satisfies a series of constraints expressed solely in terms of the observation model. Our analysis extends and improves previous descriptions for discounted rewards or which covered only special cases.
TUHH
Weiterführende Links
  • Contact
  • Send Feedback
  • Cookie settings
  • Privacy policy
  • Impress
DSpace Software

Built with DSpace-CRIS software - Extension maintained and optimized by 4Science
Design by effective webwork GmbH

  • Deutsche NationalbibliothekDeutsche Nationalbibliothek
  • ORCiD Member OrganizationORCiD Member Organization
  • DataCiteDataCite
  • Re3DataRe3Data
  • OpenDOAROpenDOAR
  • OpenAireOpenAire
  • BASE Bielefeld Academic Search EngineBASE Bielefeld Academic Search Engine
Feedback