Curriculum goal masking for continuous deep reinforcement learning

Publication Type
Conference Paper
Date Issued
2019-08
Language
English
Author(s)
Eppe, Manfred  
Magg, Sven  
Wermter, Stefan  
TORE-URI
http://hdl.handle.net/11420/12355
Start Page
183
End Page
188
Article Number
8850721
Citation
19th Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob 2019)
Contribution to Conference
19th Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, ICDL-EpiRob 2019  
Publisher DOI
10.1109/DEVLRN.2019.8850721
Scopus ID
2-s2.0-85073698767
ISBN of container
9781538681282
Abstract
Deep reinforcement learning has recently focused on problems where policy or value functions are based on universal value function approximators (UVFAs), which renders them independent of goals. Evidence exists that the sampling of goals has a strong effect on learning performance, and the problem of optimizing goal sampling is frequently tackled with intrinsic motivation methods. However, there is a lack of general mechanisms that focus on goal sampling in the context of deep reinforcement learning based on UVFAs. In this work, we introduce goal masking as a method to estimate a goal's difficulty level and to exploit this estimate to realize curriculum learning. Our results indicate that focusing on goals of medium difficulty is appropriate for deep deterministic policy gradient (DDPG) methods, while an "aim for the stars and reach the moon" strategy, where difficult goals are sampled much more often than simple goals, leads to the best learning performance when DDPG is combined with hindsight experience replay (HER).
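The difficulty-based goal sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`masked_goal`, `curriculum_weights`, `sample_mask`), the representation of a mask as a tuple of 0/1 flags, the use of a per-mask success rate as the difficulty estimate, and both weighting formulas are assumptions made only for illustration.

```python
import random

def masked_goal(goal, achieved, mask):
    """Hypothetical masking step: a masked dimension (flag 1) is replaced by
    the value the agent has already achieved, so only unmasked dimensions
    remain to be solved."""
    return [a if m else g for g, a, m in zip(goal, achieved, mask)]

def curriculum_weights(success_rates, strategy="medium"):
    """Turn estimated success rates per mask into sampling weights.

    'medium' favours masks with a success rate near 0.5;
    'hard' favours masks with a low success rate.
    Both formulas are illustrative assumptions, not taken from the paper.
    """
    weights = {}
    for mask, rate in success_rates.items():
        if strategy == "medium":
            w = 1.0 - abs(rate - 0.5) * 2.0
        elif strategy == "hard":
            w = 1.0 - rate
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        # Keep a small floor so that no mask is excluded entirely.
        weights[mask] = max(w, 1e-3)
    return weights

def sample_mask(success_rates, strategy="medium", rng=random):
    """Draw one mask with probability proportional to its curriculum weight."""
    weights = curriculum_weights(success_rates, strategy)
    masks = list(weights)
    return rng.choices(masks, weights=[weights[m] for m in masks], k=1)[0]

if __name__ == "__main__":
    # Made-up success rates for three masks over a 3-dimensional goal.
    rates = {(0, 0, 0): 0.1, (1, 0, 0): 0.5, (1, 1, 0): 0.9}
    mask = sample_mask(rates, strategy="hard")
    print(mask, masked_goal([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], mask))
```

Under these assumptions, the "medium" strategy corresponds to the setting the abstract reports as appropriate for plain DDPG, and the "hard" strategy to the setting reported as best for DDPG combined with HER.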