Authors: Eppe, Manfred; Magg, Sven; Wermter, Stefan
Date added: 2022-04-25
Date issued: 2019-08
Conference: 9th Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob 2019)
Handle: http://hdl.handle.net/11420/12355
Abstract: Deep reinforcement learning has recently gained a focus on problems where policy or value functions are based on universal value function approximators (UVFAs), which renders them independent of specific goals. There is evidence that the sampling of goals has a strong effect on learning performance, and the problem of optimizing goal sampling is frequently tackled with intrinsic motivation methods. However, general mechanisms that focus on goal sampling in the context of UVFA-based deep reinforcement learning are lacking. In this work, we introduce goal masking as a method to estimate a goal's difficulty level and to exploit this estimate to realize curriculum learning. Our results indicate that focusing on goals of medium difficulty is appropriate for deep deterministic policy gradient (DDPG) methods, while an 'aim for the stars and reach the moon' strategy, in which difficult goals are sampled much more often than simple goals, leads to the best learning performance when DDPG is combined with hindsight experience replay (HER).
Language: en
Title: Curriculum goal masking for continuous deep reinforcement learning
Type: Conference Paper
DOI: 10.1109/DEVLRN.2019.8850721
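
For illustration only: a minimal sketch of difficulty-based curriculum goal sampling in the spirit of the abstract, assuming goals are grouped into discrete buckets and difficulty is estimated from empirical success rates. The class name `CurriculumGoalSampler` and the `target_difficulty` and `temperature` parameters are assumptions for this sketch, not the paper's actual goal-masking implementation.

```python
import numpy as np


class CurriculumGoalSampler:
    """Illustrative difficulty-based goal sampler (assumption, not the paper's algorithm).

    Tracks an empirical success rate per goal bucket and samples goals whose
    estimated difficulty is close to a target difficulty: ~0.5 biases toward
    medium goals (as reported for plain DDPG), while a target near 1.0 mimics
    an 'aim for the stars' bias toward hard goals (as reported for DDPG+HER).
    """

    def __init__(self, n_goal_buckets, target_difficulty=0.5, temperature=0.1):
        self.successes = np.zeros(n_goal_buckets)
        self.attempts = np.zeros(n_goal_buckets)
        self.target_difficulty = target_difficulty
        self.temperature = temperature

    def difficulty(self):
        # Difficulty = 1 - empirical success rate; unvisited buckets default to 0.5.
        rate = np.where(self.attempts > 0,
                        self.successes / np.maximum(self.attempts, 1),
                        0.5)
        return 1.0 - rate

    def sample(self, rng=np.random):
        # Softmax over negative distance to the target difficulty:
        # buckets near the target difficulty are sampled most often.
        d = self.difficulty()
        logits = -np.abs(d - self.target_difficulty) / self.temperature
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return rng.choice(len(probs), p=probs)

    def update(self, bucket, success):
        # Record the outcome of one rollout toward a goal from this bucket.
        self.attempts[bucket] += 1
        self.successes[bucket] += float(success)


if __name__ == "__main__":
    # Example: bias sampling toward hard goals, as in the HER setting.
    sampler = CurriculumGoalSampler(n_goal_buckets=10, target_difficulty=0.9)
    bucket = sampler.sample()
    sampler.update(bucket, success=False)
```

A usage note on the design choice assumed here: the temperature controls how sharply sampling concentrates on the target difficulty, so annealing it over training would gradually tighten the curriculum.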